A New Hybrid Search Approach to Optimize the Retrieval of Information from the Website at the Universidad Politécnica Salesiana

Juan P. Salgado-Guerrero, Diego F. Quisi-Peralta, Martin Lopez-Nores, Luis D. Paguay-Palaguachi, Jordan F. Murillo-Valarezo, Gabriela Cajamarca-Morquecho

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper presents a novel hybrid search approach to improve information retrieval from the Salesian Polytechnic University website, addressing the challenge of efficiently managing and accessing a growing volume of information. Leveraging virtual assistant technology, the study combines vector similarity and keyword-based techniques to optimize data retrieval. The methodology follows a structured process: information gathering, architecture design, search execution, and analysis of the results. The system architecture consists of three key layers: the intelligent layer, which uses the OpenAI API for query processing; the data layer, which uses the Qdrant database for storage; and the logic layer, which is responsible for query execution. Two search methods are applied: vector similarity search, which retrieves data based on contextual relevance, and keyword search with BM25, which ranks documents by keyword relevance. Testing and analysis confirm that the hybrid search method significantly improves the efficiency and accuracy of information retrieval. The results show a significant improvement in the measures obtained for each request, where the four highest percentages were selected to form the context from which the answer is derived. The highest similarity values were 5.56, followed by 3.84, indicating the effectiveness of this method across various knowledge areas of the university website. In conclusion, the hybrid search approach presented in this paper offers a promising way to retrieve information efficiently from the Salesian Polytechnic University website, improving accessibility and, ultimately, user satisfaction.
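The combination of the two search methods described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two scores are fused, so the min-max normalization and the weight `alpha` are assumptions. Term-count vectors stand in for the learned embeddings that the paper obtains through the OpenAI API and stores in Qdrant, and all function names here are hypothetical. The default `top_k=4` mirrors the four results the abstract says are kept as context.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would normalize further.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query (docs must be non-empty)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = (sum(len(t) for t in tokenized) / n) or 1.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as Counters.
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def vector_scores(query, docs):
    # ASSUMPTION: term-count vectors stand in for learned embeddings.
    q = Counter(tokenize(query))
    return [cosine(q, Counter(tokenize(d))) for d in docs]


def min_max(xs):
    # Rescale scores to [0, 1] so the two methods are comparable.
    lo, hi = min(xs), max(xs)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in xs]


def hybrid_search(query, docs, top_k=4, alpha=0.5):
    """Rank docs by a weighted sum of normalized vector and BM25 scores.

    ASSUMPTION: alpha and the weighted-sum fusion are illustrative choices.
    """
    vec = min_max(vector_scores(query, docs))
    kw = min_max(bm25_scores(query, docs))
    fused = [alpha * v + (1 - alpha) * k for v, k in zip(vec, kw)]
    ranked = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return [(docs[i], fused[i]) for i in ranked[:top_k]]
```

Calling `hybrid_search("enrollment dates", docs)` on a small list of page snippets returns up to four `(document, score)` pairs, which could then be concatenated into the context passed to the language model. Setting `alpha` closer to 1 favors contextual similarity, while a value closer to 0 favors exact keyword matches.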

Original language: English
Title of host publication: Information Technology and Systems - ICITS 2024
Editors: Alvaro Rocha, Jorge Hochstetter Diez, Carlos Ferras, Mauricio Dieguez Rebolledo
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 247-257
Number of pages: 11
ISBN (Print): 9783031542343
DOIs
State: Published - 2024
Externally published: Yes
Event: International Conference on Information Technology and Systems, ICITS 2024 - Temuco, Chile
Duration: 24 Jan 2024 to 26 Jan 2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 932 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: International Conference on Information Technology and Systems, ICITS 2024
Country/Territory: Chile
City: Temuco
Period: 24/01/24 to 26/01/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords

  • BM25
  • Education
  • Hybrid search
  • Innovation
  • Natural Language Processing
  • Vector Model
