SLMs for Natural Language database interaction with the SENSE Citiverse of Cartagena City
DOI: https://doi.org/10.62161/sauc.v11.6023
Keywords: SLM, text-to-sql, Trino, Datahub, KServe, KubeFlow, Sense Citiverse, Smart City, Data Spaces
Abstract
European data spaces host rich catalogues yet rarely turn data into timely answers. We present a self‑contained natural-language data assistant for Cartagena’s SENSE Citiverse. It orchestrates three SLMs (Entities, SQL, and Domain Expert) on a DataHub, Trino, and KServe/Kubeflow stack. Two SLM stacks (Compact and Expanded) were evaluated on six air‑quality and port‑operations tasks against the Gemini 2.5 Pro LLM. Entity extraction achieved perfect table discovery. SQL generation solved 4/6 tasks with the Compact stack and 5/6 with the Expanded stack; Gemini solved 6/6. The SLM design preserves data sovereignty and reduces cost. This work demonstrates a new, natural-language way to interact with the Citiverse.
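The three-stage orchestration described in the abstract can be pictured with a minimal sketch. It assumes each stage (table discovery, SQL generation, query execution, and the domain-expert answer) is exposed as a plain callable. Every name here is hypothetical and the stand-in stages return canned values; nothing below is the authors' implementation or the actual DataHub, Trino, or KServe API.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical stage signatures; real stages would call served SLMs,
# a metadata catalogue, and a query engine.
EntityModel = Callable[[str], list[str]]         # question -> candidate table names
SqlModel = Callable[[str, Sequence[str]], str]   # question + tables -> SQL text
QueryEngine = Callable[[str], list[tuple]]       # SQL text -> result rows
ExpertModel = Callable[[str, list[tuple]], str]  # question + rows -> answer text


@dataclass
class Assistant:
    find_tables: EntityModel
    write_sql: SqlModel
    run_query: QueryEngine
    explain: ExpertModel

    def answer(self, question: str) -> str:
        # Stage 1: map the question to tables in the catalogue.
        tables = self.find_tables(question)
        if not tables:
            return "No matching tables were found in the catalogue."
        # Stage 2: generate SQL restricted to the discovered tables.
        sql = self.write_sql(question, tables)
        # Stage 3: execute the query and phrase the result for the user.
        rows = self.run_query(sql)
        return self.explain(question, rows)


# Stand-in stages with canned behaviour, for illustration only.
demo = Assistant(
    find_tables=lambda q: ["air_quality"] if "NO2" in q else [],
    write_sql=lambda q, t: f"SELECT avg(no2) FROM {t[0]}",
    run_query=lambda sql: [(21.5,)],
    explain=lambda q, rows: f"The average NO2 value is {rows[0][0]}.",
)

print(demo.answer("What is the average NO2 level?"))
```

Injecting the stages as callables keeps the control flow visible and lets a Compact or Expanded model set be swapped in without changing the orchestration code.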
License
Copyright (c) 2025 Authors retain copyright and transfer to the journal the right of first publication and publishing rights

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Those authors who publish in this journal accept the following terms:
- Authors retain copyright.
- Authors transfer to the journal the right of first publication. The journal also owns the publishing rights.
- All published contents are governed by an Attribution-NoDerivatives 4.0 International License. Access the informative version and legal text of the license. By virtue of this, third parties are allowed to use what is published as long as they mention the authorship of the work and the first publication in this journal. If you transform the material, you may not distribute the modified work.
- Authors may make other independent and additional contractual arrangements for non-exclusive distribution of the version of the article published in this journal (e.g., inclusion in an institutional repository or publication in a book) as long as they clearly indicate that the work was first published in this journal.
- Authors are allowed and recommended to publish their work on the Internet (for example on institutional and personal websites), following the publication of, and referencing the journal, as this could lead to constructive exchanges and a more extensive and quick circulation of published works (see The Effect of Open Access).







