SLMs for Natural Language database interaction with the SENSE Citiverse of Cartagena City

Alejandro Carmona-Martínez; Antonio Jara; Germán Molina

doi:10.62161/sauc.v11.6023

Authors

Alejandro Carmona-Martínez, Libelium Lab, Murcia, Spain. ORCID: https://orcid.org/0009-0006-7034-2403
Antonio Jara, Libelium Lab, Murcia, Spain. ORCID: https://orcid.org/0000-0002-2651-6684
Germán Molina, Libelium Lab, Murcia, Spain

DOI:

https://doi.org/10.62161/sauc.v11.6023

Keywords:

SLM, text-to-SQL, Trino, DataHub, KServe, Kubeflow, SENSE Citiverse, Smart City, Data Spaces

Abstract

European data spaces host rich catalogues yet rarely turn data into timely answers. We present a self-contained natural-language data assistant for Cartagena's SENSE Citiverse. It orchestrates three SLMs (Entities, SQL, and Domain Expert) on a DataHub, Trino, and KServe/Kubeflow stack. Two SLM stacks (Compact and Expanded) were evaluated on six air-quality and port-operations tasks against the Gemini 2.5 Pro LLM. Entity extraction achieved perfect table discovery. SQL generation solved 4/6 tasks with the Compact stack and 5/6 with the Expanded stack; Gemini solved 6/6. The SLM design preserves data sovereignty and reduces cost. This work demonstrates a new, natural-language way to interact with the Citiverse.
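
The orchestration described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class, function names, prompts, and stub values are all hypothetical. In a real deployment the three model callables would presumably wrap SLM endpoints served through KServe, the table lookup would query the DataHub catalogue, and the query runner would submit SQL to Trino.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical type: prompt in, completion out (for example, an SLM endpoint).
ModelFn = Callable[[str], str]


@dataclass
class Assistant:
    """Hypothetical three-SLM pipeline: Entities -> SQL -> Domain Expert."""
    entities_slm: ModelFn
    sql_slm: ModelFn
    expert_slm: ModelFn
    # Catalogue lookup (assumed interface): entities -> {table: [columns]}
    find_tables: Callable[[List[str]], Dict[str, List[str]]]
    # Query-engine call (assumed interface): SQL text -> result rows
    run_query: Callable[[str], Sequence[tuple]]

    def ask(self, question: str) -> str:
        # 1) Entities SLM extracts the terms used for table discovery.
        raw = self.entities_slm(f"List the data entities in: {question}")
        entities = [e.strip().lower() for e in raw.split(",") if e.strip()]

        # 2) Look the entities up in the data catalogue.
        schemas = self.find_tables(entities)
        if not schemas:
            return "No matching tables were found in the catalogue."

        # 3) SQL SLM writes a query grounded in the discovered schemas.
        schema_text = "\n".join(
            f"{table}({', '.join(cols)})" for table, cols in schemas.items()
        )
        sql = self.sql_slm(
            f"Schema:\n{schema_text}\nQuestion: {question}\nSQL:"
        ).strip()

        # 4) Run the query on the query engine.
        rows = self.run_query(sql)

        # 5) Domain Expert SLM turns the rows into a natural-language answer.
        return self.expert_slm(
            f"Question: {question}\nRows: {list(rows)}\nAnswer:"
        )


def demo() -> str:
    """Wire the pipeline with stubs; every value here is made up."""
    catalogue = {"air_quality": ["station", "no2", "ts"]}
    assistant = Assistant(
        entities_slm=lambda prompt: "air quality, NO2",
        sql_slm=lambda prompt: "SELECT avg(no2) FROM air_quality",
        expert_slm=lambda prompt: "Summary of "
        + prompt.split("Rows: ")[1].split("\n")[0],
        find_tables=lambda ents: {
            t: c for t, c in catalogue.items()
            if any(e.replace(" ", "_") == t for e in ents)
        },
        run_query=lambda sql: [(21.5,)],
    )
    return assistant.ask("What is the average NO2 level?")
```

Passing the models, the catalogue lookup, and the query runner in as plain callables keeps the sketch independent of any particular client library, and it mirrors the idea of swapping a Compact stack for an Expanded one without touching the pipeline logic.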


ISSN 2183-9956
