Abstract
This paper presents a novel hybrid search approach to improve information retrieval from the Salesian Polytechnic University website, addressing the challenge of efficiently managing and accessing a growing volume of information. Leveraging virtual assistant technology, the study combines vector similarity and keyword-based techniques to optimize data retrieval. The methodology follows a structured process: information gathering, architecture design, search execution, and analysis of the results. The system architecture consists of three key layers: the intelligent layer, which uses the OpenAI API for query processing; the data layer, which uses the Qdrant database for storage; and the logic layer, which is responsible for query execution. Two search methods are applied: vector similarity search, which retrieves data based on contextual relevance, and keyword search with BM25, which ranks documents by keyword relevance. Testing and analysis confirm that the hybrid search method significantly improves the efficiency and accuracy of information retrieval. The results show a significant improvement in the measures obtained for the requests, where the four highest percentages were selected to form the context from which the answer is derived. The highest similarity values were 5.56, followed by 3.84, supporting the effectiveness of this method across various knowledge areas of the university website. In conclusion, the hybrid search approach presented in this paper offers a promising way to retrieve information efficiently from the Salesian Polytechnic University website, improving accessibility and, ultimately, user satisfaction.
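The combination of BM25 keyword ranking and vector similarity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the two scores are merged, so the min-max normalization and the weight `alpha` are assumptions, as are the whitespace tokenizer, the helper names, and the BM25 defaults `k1=1.5` and `b=0.75`. Embeddings are passed in as plain lists of floats; in the described system they would presumably come from the OpenAI API and be stored in Qdrant, neither of which is called here. The default `k=4` mirrors the four top results that the abstract says are used as context.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would do more.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return the Okapi BM25 score of every document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so the two rankings are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_top_k(query, query_vec, docs, doc_vecs, k=4, alpha=0.5):
    """Fuse vector and BM25 scores (assumed weighted sum) and keep the top k."""
    keyword = min_max(bm25_scores(query, docs))
    vector = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * v + (1 - alpha) * s for v, s in zip(vector, keyword)]
    ranked = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return [(docs[i], fused[i]) for i in ranked[:k]]
```

A call such as `hybrid_top_k("engineering admission", query_vec, docs, doc_vecs)` returns up to four `(document, fused_score)` pairs, which could then be concatenated into the context passed to the language model. Setting `alpha` closer to 1 favors contextual similarity, while values closer to 0 favor exact keyword matches.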
Original language | English |
---|---|
Title of host publication | Information Technology and Systems - ICITS 2024 |
Editors | Alvaro Rocha, Jorge Hochstetter Diez, Carlos Ferras, Mauricio Dieguez Rebolledo |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 247-257 |
Number of pages | 11 |
ISBN (Print) | 9783031542343 |
State | Published - 2024 |
Externally published | Yes |
Event | International Conference on Information Technology and Systems, ICITS 2024 - Temuco, Chile. Duration: 24 Jan 2024 → 26 Jan 2024 |
Publication series
Name | Lecture Notes in Networks and Systems |
---|---|
Volume | 932 LNNS |
ISSN (Print) | 2367-3370 |
ISSN (Electronic) | 2367-3389 |
Conference
Conference | International Conference on Information Technology and Systems, ICITS 2024 |
---|---|
Country/Territory | Chile |
City | Temuco |
Period | 24/01/24 → 26/01/24 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
Keywords
- BM25
- Education
- Hybrid search
- Innovation
- Natural Language Processing
- Vector Model