LLM and RAG Powered Chatbot for the College of Computer Science and Mathematics at the University of Mosul

Abstract

This paper proposes the design and implementation of a chatbot for the website of the College of Computer Science and Mathematics at the University of Mosul. The chatbot uses a Large Language Model (LLM) supported by Retrieval-Augmented Generation (RAG) and is the first of its kind at the university. Combining the strengths of pre-trained language models with RAG enables the chatbot to generate informative and accurate responses to a wide range of user inquiries. It is designed to provide immediate support and act as a virtual assistant for students, faculty, staff, and visitors to the college website. It covers topics related to the college, including faculty profiles, research activities, events, academic programs, admission requirements, and other college-related matters.

Country: Iraq

Ban Sharief Mustafa (1), Yusuf Ersayyem Madhi (2)

  1. Instructor Dr, Department of Computer Science, College of Computer Science and Mathematics, University of Mosul, Mosul, Iraq
  2. Student, Department of Computer Science, College of Computer Science and Mathematics, University of Mosul, Mosul, Iraq

IRJIET, Volume 8, Issue 10, October 2024, pp. 59-61

doi.org/10.47001/IRJIET/2024.810010
