Privacy Centric Offline Chatbot using Large Language Models
DOI:
https://doi.org/10.62760/iteecs.4.2.2025.134
Keywords:
Artificial Intelligence, Large Language Models, Local Models, Offline Chatbot, Retrieval-Augmented Generation (RAG)
Abstract
From ELIZA in the 1960s, which mimicked a psychotherapist, to ChatGPT, which helps users write code, chatbots have come a long way and now play a prominent role in the digital era. Chatbots, a form of conversational AI, make it possible for humans and machines to converse. Recent advances in AI have made them more complex and capable, but drawbacks remain. Many current solutions depend on cloud-based models, which require internet connectivity, and they may collect and store user data, creating a risk of privacy breaches. This paper addresses these problems. The proposed methodology runs Large Language Models (LLMs) locally using tools such as Ollama, with the bge-m3 model for embeddings. Operating offline means that no data is collected or shared, and the system can also work in remote places without internet connectivity. The system integrates Retrieval-Augmented Generation (RAG) to generate context-aware, accurate answers, and it uses vector index search to find visually related images and files from which to extract information.
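The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `embed`, `VectorIndex`, and `build_prompt` are hypothetical, and the term-count embedder is only a stand-in so the example runs with no network access. In a system like the one proposed, `embed` would presumably be replaced by a locally served bge-m3 model, and the assembled prompt would be passed to a local LLM (for example one served through Ollama), neither of which is shown here.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedder (assumption): a sparse term-count vector.
    A real system would call a local embedding model such as bge-m3."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Hypothetical in-memory index; everything stays on the local machine."""

    def __init__(self):
        self.items = []  # list of (chunk_text, vector) pairs

    def add(self, chunk):
        self.items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, contexts):
    """Assemble a retrieval-augmented prompt for a local LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\nAnswer:"
    )


index = VectorIndex()
for chunk in [
    "The clinic opens at nine in the morning on weekdays.",
    "Backups of the records run every night at two.",
    "Visitors must sign in at the front desk.",
]:
    index.add(chunk)

question = "When do backups of the records run?"
top = index.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Because indexing, retrieval, and prompt assembly all run in-process, no document text or query leaves the machine, which is the privacy property the abstract emphasises.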
License
Copyright (c) 2025 K. Anjali, K. Vipunsai, K. Ruchitha, M. Bhavani, Ch. China Subba Reddy

This work is licensed under a Creative Commons Attribution 4.0 International License.

This Journal and its metadata are licenced under a