SLMs for Natural-Language Database Interaction with the SENSE Citiverse of the City of Cartagena

Authors

Carmona, A., Jara, A., & Molina, G.

DOI:

https://doi.org/10.62161/sauc.v11.6023

Keywords:

SLM, text-to-SQL, Trino, DataHub, KServe, Kubeflow, SENSE Citiverse, Smart City, Data spaces

Abstract

European data spaces host extensive catalogs but rarely turn their data into timely answers. We present an autonomous natural-language data assistant for the SENSE Citiverse of Cartagena. It coordinates three SLMs (entity extraction, SQL generation, and domain expert) on a stack built from DataHub, Trino, and KServe/Kubeflow. Two SLM stacks (Compact and Expanded) were evaluated on six air-quality and port-operations tasks against the Gemini 2.5 Pro LLM. Entity extraction achieved perfect table discovery. In SQL generation, the Compact stack solved 4/6 tasks and the Expanded stack 5/6, while Gemini solved 6/6. The SLM design preserves data sovereignty and reduces costs. This work demonstrates a new way of interacting with the Citiverse based on natural language.
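The coordination summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: every name below (`Assistant`, the stage signatures, the stub functions, and the catalog and table names) is hypothetical, and each stage is a plain Python callable standing in for what would presumably be an SLM endpoint, a catalog lookup, or a query engine in the real system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stage signatures (assumptions, not taken from the paper).
# In a deployment these might wrap a served SLM, a metadata catalog
# search, or a SQL engine; here they are ordinary functions.
EntityModel = Callable[[str], List[str]]         # question -> entities
CatalogSearch = Callable[[List[str]], List[str]]  # entities -> table names
SqlModel = Callable[[str, List[str]], str]        # question, tables -> SQL
QueryRunner = Callable[[str], List[dict]]         # SQL -> result rows
ExpertModel = Callable[[str, List[dict]], str]    # question, rows -> answer


@dataclass
class Assistant:
    extract_entities: EntityModel
    find_tables: CatalogSearch
    generate_sql: SqlModel
    run_query: QueryRunner
    explain: ExpertModel

    def answer(self, question: str) -> Dict[str, object]:
        """Run the stages in order and return the intermediate artifacts."""
        entities = self.extract_entities(question)
        tables = self.find_tables(entities)
        if not tables:
            # Stop early rather than generating SQL against unknown tables.
            return {"tables": [], "sql": None, "answer": "No matching table found."}
        sql = self.generate_sql(question, tables)
        rows = self.run_query(sql)
        return {"tables": tables, "sql": sql, "answer": self.explain(question, rows)}


# Stub stages with made-up data, used only to exercise the control flow.
CATALOG = {"air quality": "air_quality.measurements", "port": "port.vessel_calls"}


def stub_entities(question: str) -> List[str]:
    return [key for key in CATALOG if key in question.lower()]


def stub_tables(entities: List[str]) -> List[str]:
    return [CATALOG[entity] for entity in entities]


def stub_sql(question: str, tables: List[str]) -> str:
    return f"SELECT COUNT(*) AS n FROM {tables[0]}"


def stub_run(sql: str) -> List[dict]:
    return [{"n": 42}]  # fixed fake result; no database is contacted


def stub_explain(question: str, rows: List[dict]) -> str:
    return f"The query returned {rows[0]['n']} records."


assistant = Assistant(stub_entities, stub_tables, stub_sql, stub_run, stub_explain)
result = assistant.answer("How many air quality readings are there?")
```

Keeping each stage behind a plain callable is one possible design choice: it would let a Compact or Expanded model set be swapped in without touching the orchestration logic, which is the kind of comparison the evaluation describes.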



Published

2025-11-28

How to Cite

Carmona, A., Jara, A., & Molina, G. (2025). SLM para la interacción de bases de datos en lenguaje natural con el SENSE Citiverse de la ciudad de Cartagena [SLMs for natural-language database interaction with the SENSE Citiverse of the city of Cartagena]. Street Art & Urban Creativity, 11(7), 137–157. https://doi.org/10.62161/sauc.v11.6023

Issue

Vol. 11, No. 7 (2025)

Section

SmartCityExpo Monograph