CN116431878A - Vector retrieval service method, device, equipment and storage medium thereof - Google Patents
- Publication number
- CN116431878A (application number CN202310383382.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/907—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2237—Vectors, bitmaps or matrices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Library & Information Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the present application belong to the technical field of infrastructure operation and maintenance, are applied to the optimization of semantic retrieval processes, and relate to a vector retrieval service method, device, equipment and storage medium. The method comprises: obtaining target data; vectorizing the target data to obtain a target vector value; acquiring the unique identifier corresponding to the target vector value from a first vector retrieval library; obtaining the similar question or standard question corresponding to the target data from a second vector retrieval library; and using the similar question or standard question as a search substitute field, and completing the retrieval service through the search engine and the search substitute field. A vector retrieval service with real-time knowledge synchronization is built on ElasticSearch and Milvus (or a message queue plus Faiss), so that ElasticSearch only maintains the mapping between vectors and the similar questions and standard questions, Milvus or Faiss only needs to maintain the mapping between vectors and unique identifiers, and the message queue is monitored to ensure that the retrieval service is synchronized promptly, thereby avoiding a poor service experience for the retrieving party.
Description
Technical Field
The present disclosure relates to the field of semantic search process optimization technologies, and in particular, to a vector search service method, device, apparatus, and storage medium thereof.
Background
Vector retrieval is a common way of putting artificial intelligence technology into practice. After unstructured data such as text, pictures, voice and video are converted into vectors by artificial intelligence techniques, the most similar vectors can be obtained through vector computation and retrieval, which enables applications such as similar-text retrieval, commodity search and picture search.
In industrial practice, vector retrieval currently has to be offered as a service built on vector retrieval algorithms, then packaged and optimized to suit each usage scenario. A vector retrieval service is therefore generally used only for managing and applying the vectors produced from unstructured data, and it runs as an independent service, so an implementation must handle both the relationship between the vectors and the unstructured data and the synchronization of data. In existing approaches, when the underlying data changes, the retrieval service usually has to be paused first, the changed data is updated on a schedule to achieve data synchronization, and the retrieval service is then restarted. The prior art therefore cannot update the retrieval service in time, which easily leads to a poor service experience for the retrieving party.
Disclosure of Invention
An embodiment of the application aims to provide a vector retrieval service method, device and equipment and a storage medium thereof, so as to solve the problems that the prior art cannot timely update retrieval service synchronously and is easy to cause poor service experience for a retrieval party.
In order to solve the above technical problems, the embodiments of the present application provide a vector retrieval service method, which adopts the following technical schemes:
a vector retrieval service method, comprising the steps of:
acquiring target data input by a searcher in a preset search engine, wherein the target data is data for which a similar question or a standard question is to be retrieved;
vectorizing the target data according to a preset vectorization model to obtain a target vector value;
taking the target vector value as a first search field, and acquiring the unique identifier corresponding to the target vector value from a preset first vector retrieval library;
taking the unique identifier as a second search field, and acquiring the similar question or standard question corresponding to the target data from a preset second vector retrieval library;
and taking the similar question or standard question corresponding to the target data as a search substitute field, and completing the retrieval service through the search engine and the search substitute field.
Further, before the step of acquiring the target data input by the searcher in the preset search engine, the method further includes:
acquiring in advance the full data in a preset question-answer knowledge base, wherein the full data comprises the similar questions or standard questions corresponding to the target data, and the question-answer knowledge base is a TiDB database;
and naming each item of the full data distinctively according to a preset identifier naming rule, and taking the result as the unique identifier of that item, wherein the identifier naming rule may name each item of the full data using a distinct ID corresponding to that item.
Further, after the step of naming each item of the full data distinctively according to the preset identifier naming rule and taking the result as the unique identifier of that item, the method further includes:
storing the full data and the unique identifier of each item of the full data into the second vector retrieval library as one-to-one pairs according to their unique association, wherein the second vector retrieval library is an ElasticSearch vector retrieval library.
Further, after the step of storing the full data and the unique identifier of each item of the full data into the second vector retrieval library as one-to-one pairs, the method further includes:
obtaining the vector value of each item of the full data according to the vectorization model;
and taking each vector value and the unique identifier of the corresponding item of the full data as paired data, and synchronously caching the paired data into the first vector retrieval library, wherein the first vector retrieval library is a Faiss vector retrieval library or a Milvus vector retrieval library.
Further, the step of taking the target vector value as the first search field and acquiring the unique identifier corresponding to the target vector value from the preset first vector retrieval library specifically includes:
obtaining, according to a cosine similarity algorithm, the vector value in the first vector retrieval library that is closest to the first search field;
and acquiring the unique identifier corresponding to the closest vector value as the unique identifier of the target vector value.
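For reference, the cosine similarity selection in this step can be written as follows. The symbols are illustrative notation introduced here and do not appear in the application: q denotes the target vector value, v_i the i-th vector value cached in the first vector retrieval library, and id_i its unique identifier.

```latex
\operatorname{sim}(q, v_i) = \frac{q \cdot v_i}{\lVert q \rVert \, \lVert v_i \rVert},
\qquad
i^{*} = \arg\max_{i} \operatorname{sim}(q, v_i),
\qquad
\mathrm{id}(q) = \mathrm{id}_{i^{*}}
```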
Further, before the step of taking the unique identifier as the second search field and acquiring the similar question or standard question corresponding to the target data from the preset second vector retrieval library, the method further includes:
judging, by means of a preset message queue monitoring component, whether the full data in the question-answer knowledge base has been updated, wherein a data update specifically refers to an add, delete or modify operation on the full data;
if the full data in the question-answer knowledge base has been updated, updating the full data and the unique identifiers in the second vector retrieval library according to the execution logic corresponding to the add, delete or modify operation.
Further, after the step of updating the full data and the unique identifiers in the second vector retrieval library according to the execution logic corresponding to the add, delete or modify operation, the method further includes:
obtaining, according to the vectorization model, the vector value of each item of the updated full data;
and taking the vector value and the unique identifier of each item of the updated full data as paired data, and updating the cached records in the first vector retrieval library.
In order to solve the above technical problems, the embodiments of the present application further provide a vector retrieval service apparatus, which adopts the following technical schemes:
a vector retrieval service apparatus comprising:
the target data acquisition module is used for acquiring target data input by a searcher in a preset search engine, wherein the target data is data for which a similar question or a standard question is to be retrieved;
the vectorization processing module is used for vectorizing the target data according to a preset vectorization model to obtain a target vector value;
the first retrieval module is used for taking the target vector value as a first search field and acquiring the unique identifier corresponding to the target vector value from a preset first vector retrieval library;
the second retrieval module is used for taking the unique identifier as a second search field and acquiring the similar question or standard question corresponding to the target data from a preset second vector retrieval library;
and the third retrieval module is used for taking the similar question or standard question corresponding to the target data as a search substitute field and completing the retrieval service through the search engine and the search substitute field.
In order to solve the above technical problems, the embodiments of the present application further provide a computer device, which adopts the following technical schemes:
a computer device comprising a memory having stored therein computer readable instructions which when executed by a processor implement the steps of the vector retrieval service method described above.
In order to solve the above technical problems, embodiments of the present application further provide a computer readable storage medium, which adopts the following technical solutions:
a computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the vector retrieval service method as described above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
According to the vector retrieval service method of the present application, target data input by a searcher in a preset search engine is acquired, the target data being data for which a similar question or a standard question is to be retrieved; the target data is vectorized according to a preset vectorization model to obtain a target vector value; with the target vector value as a first search field, the unique identifier corresponding to the target vector value is acquired from a preset first vector retrieval library; with the unique identifier as a second search field, the similar question or standard question corresponding to the target data is acquired from a preset second vector retrieval library; and the similar question or standard question corresponding to the target data is used as a search substitute field, and the retrieval service is completed through the search engine and the search substitute field. A vector retrieval service with real-time knowledge synchronization is built on ElasticSearch and Milvus (or a message queue plus Faiss), so that ElasticSearch only maintains the mapping between vectors and the similar questions and standard questions, Milvus or Faiss only needs to maintain the mapping between vectors and unique identifiers, and the message queue is monitored to ensure that the retrieval service is synchronized promptly, thereby avoiding a poor service experience for the retrieving party.
Drawings
For a clearer description of the solution in the present application, a brief description will be given below of the drawings that are needed in the description of the embodiments of the present application, it being obvious that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a vector retrieval service method according to the present application;
FIG. 3 is a schematic diagram of one embodiment of a vector retrieval service apparatus according to the present application;
FIG. 4 is a schematic structural diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to better understand the technical solutions of the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings.
As shown in FIG. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the vector search service method provided in the embodiments of the present application is generally executed by a server/terminal device, and accordingly, the vector search service apparatus is generally disposed in the server/terminal device.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
For ease of understanding, the architecture of the retrieval service support libraries used in the present application is introduced first. This embodiment may adopt either of two architectures. The first is: TiDB database + ElasticSearch vector retrieval library + Faiss vector retrieval library + message queue. The second is: TiDB database + ElasticSearch vector retrieval library + Milvus vector retrieval library.
The TiDB database, the ElasticSearch vector retrieval library, the Faiss vector retrieval library, and the Milvus vector retrieval library are described below, respectively.
The TiDB database is an open-source distributed relational database, a converged distributed database product that supports both online transaction processing and online analytical processing (HTAP). Its main features include horizontal scale-out and scale-in, financial-grade high availability, real-time HTAP, a cloud-native distributed architecture, and compatibility with the MySQL 5.7 protocol and the MySQL ecosystem. It suits application scenarios that demand high availability, strong consistency and large data scale, such as financial-industry scenarios with strict requirements on data consistency, reliability, system availability, scalability and disaster recovery. Here it mainly stores financial-industry data, such as the standard questions and/or similar questions that arise in financial business consultation. The question-answer knowledge base in the present application uses a TiDB database in which the similar questions and standard questions are cached in advance, so that the retrieval service suits financial and insurance business scenarios with high availability, strong consistency and large data scale;
ElasticSearch is a distributed, highly scalable, near-real-time search and data analysis engine that can provide near-real-time search and analysis for all types of data. The database used with the ElasticSearch search engine is the ElasticSearch vector retrieval library, which provides data support for distributed search and analysis;
Faiss (Facebook AI Similarity Search) is a clustering and similarity search library open-sourced by the Facebook AI team. It provides efficient similarity search and clustering for dense vectors, supports search over billions of vectors, and is currently a mature approximate nearest neighbor search library. In the present application, the Faiss library caches only the unique identifiers and vector values corresponding to the similar questions and standard questions, and maintains the relationship between the unique identifiers and the vector values;
Milvus is an open-source feature vector similarity search engine that is convenient to use, practical, reliable, easy to scale, stable, efficient and fast to search. It integrates mainstream third-party index libraries such as Faiss, Annoy and hnswlib, offers high performance, and supports similarity search over massive vector data. Following the "log as data" concept, it uses message queue technologies such as Pulsar and Kafka for communication among its components.
With continued reference to fig. 2, a flow chart of one embodiment of a vector retrieval service method according to the present application is shown. The vector retrieval service method comprises the following steps:
Step S1: acquiring target data input by a searcher in a preset search engine, wherein the target data is data for which a similar question or a standard question is to be retrieved.
In this embodiment, the search engine may be an ElasticSearch search engine.
In this embodiment, before the step of acquiring the target data input by the searcher in the preset search engine, the method further includes: acquiring in advance the full data in a preset question-answer knowledge base, wherein the full data comprises the similar questions or standard questions corresponding to the target data, and the question-answer knowledge base is a TiDB database; and naming each item of the full data distinctively according to a preset identifier naming rule, and taking the result as the unique identifier of that item, wherein the identifier naming rule may name each item of the full data using a distinct ID corresponding to that item.
Naming each item of the full data distinctively gives every item a unique identifier, which simplifies the operation of the retrieval service, saves retrieval time, and prevents items from being confused during retrieval.
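The identifier naming step can be sketched as follows. This is a minimal illustration only: the application does not specify the naming rule, so the prefix-plus-running-number scheme, the function name assign_unique_ids and the sample questions are all assumptions made for this example.

```python
from typing import Dict, List


def assign_unique_ids(questions: List[str], prefix: str = "q") -> Dict[str, str]:
    """Give every item of the full data a distinct identifier.

    Hypothetical naming rule: a prefix plus a zero-padded running number.
    Returns a mapping of unique identifier -> question text.
    """
    return {f"{prefix}-{i:06d}": text for i, text in enumerate(questions, start=1)}


# Sample full data (invented for illustration).
full_data = [
    "How do I reset my password?",
    "What is the claim settlement period?",
]
ids = assign_unique_ids(full_data)
```

Any rule works as long as it never assigns the same identifier to two items, since both retrieval libraries are keyed by this identifier.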
In this embodiment, after the step of naming each item of the full data distinctively according to the preset identifier naming rule and taking the result as the unique identifier of that item, the method further includes: storing the full data and the unique identifier of each item of the full data into the second vector retrieval library as one-to-one pairs according to their unique association, wherein the second vector retrieval library is an ElasticSearch vector retrieval library.
Because the full data and the unique identifier of each item are stored into the second vector retrieval library as one-to-one pairs, the ElasticSearch vector retrieval library only maintains the similar questions, the standard questions and the unique identifiers, and the vector values do not need to be maintained a second time in the ElasticSearch vector retrieval library.
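The division of responsibility described above can be illustrated with a small in-memory stand-in for the second retrieval library. This is a sketch under stated assumptions: the class QuestionStore and its methods are hypothetical and do not use the ElasticSearch client API; they only show that this library holds identifier-to-question pairs and no vectors.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QuestionStore:
    """In-memory stand-in for the second retrieval library.

    It keeps only unique identifier -> question text; no vector values.
    """
    _docs: Dict[str, str] = field(default_factory=dict)

    def put(self, uid: str, question: str) -> None:
        # Insert or overwrite the question stored under this identifier.
        self._docs[uid] = question

    def get(self, uid: str) -> Optional[str]:
        # Return the question for this identifier, or None if unknown.
        return self._docs.get(uid)

    def delete(self, uid: str) -> None:
        # Remove the pair; deleting an unknown identifier is a no-op.
        self._docs.pop(uid, None)


store = QuestionStore()
store.put("q-000001", "How do I reset my password?")
```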
In this embodiment, after the step of storing the full data and the unique identifier of each item of the full data into the second vector retrieval library as one-to-one pairs, the method further includes: obtaining the vector value of each item of the full data according to the vectorization model; and taking each vector value and the unique identifier of the corresponding item of the full data as paired data, and synchronously caching the paired data into the first vector retrieval library, wherein the first vector retrieval library is a Faiss vector retrieval library or a Milvus vector retrieval library.
Because each vector value and the unique identifier of the corresponding item are cached into the first vector retrieval library as paired data, the first vector retrieval library only maintains the relationship between vector values and unique identifiers, and the similar questions and standard questions do not need to be stored in the first vector retrieval library.
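A matching stand-in for the first retrieval library shows the other half of the split: it holds identifier-to-vector pairs and no question text. The class VectorStore is hypothetical and is not the Faiss or Milvus API; a real deployment would rely on those libraries' own indexing.

```python
from typing import Dict, List, Tuple


class VectorStore:
    """In-memory stand-in for the first retrieval library.

    It keeps only unique identifier -> vector value; no question text.
    """

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._vectors: Dict[str, List[float]] = {}

    def upsert(self, uid: str, vector: List[float]) -> None:
        # Reject vectors whose length does not match the index dimension.
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self._vectors[uid] = list(vector)

    def remove(self, uid: str) -> None:
        # Removing an unknown identifier is a no-op.
        self._vectors.pop(uid, None)

    def items(self) -> List[Tuple[str, List[float]]]:
        return list(self._vectors.items())

    def __len__(self) -> int:
        return len(self._vectors)


vs = VectorStore(dim=3)
vs.upsert("q-000001", [0.1, 0.2, 0.3])
```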
Step S2: vectorizing the target data according to a preset vectorization model to obtain a target vector value.
In this embodiment, the vectorization model specifically refers to a model that converts non-numeric data into numeric data.
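To make the idea of converting non-numeric data into numeric data concrete, the sketch below uses a toy hashed bag-of-words vectorizer. This is an assumption made purely for illustration: the application does not name a model, and a real system would more likely use a trained text embedding model. The function name vectorize and the dimension of 64 are hypothetical.

```python
import hashlib
import math
from typing import List


def vectorize(text: str, dim: int = 64) -> List[float]:
    """Toy vectorization model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # A stable hash (unlike built-in hash()) gives the same bucket on every run.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Empty input yields the zero vector; otherwise scale to unit length.
    return [x / norm for x in vec] if norm else vec


target_vector = vectorize("reset my password")
```

Whatever model is used, the same model must vectorize both the knowledge base items and the searcher's input, otherwise the two sets of vectors are not comparable.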
Step S3: taking the target vector value as a first search field, and acquiring the unique identifier corresponding to the target vector value from a preset first vector retrieval library.
In this embodiment, the step of taking the target vector value as the first search field and acquiring the unique identifier corresponding to the target vector value from the preset first vector retrieval library specifically includes: obtaining, according to a cosine similarity algorithm, the vector value in the first vector retrieval library that is closest to the first search field; and acquiring the unique identifier corresponding to the closest vector value as the unique identifier of the target vector value.
The unique identifier of the target vector value can be obtained directly through the similarity algorithm built into the Faiss vector retrieval library or the Milvus vector retrieval library, so no separate similarity calculation program needs to be written, which reduces the amount of logic code.
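The lookup in step S3 can be sketched with a brute-force cosine similarity scan. This is only an illustration of what the step computes; in the described system the built-in search of Faiss or Milvus performs this work, and the function names and sample vectors here are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_id(query: List[float], index: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the identifier whose vector is closest to the query, with its score."""
    if not index:
        raise ValueError("the index is empty")
    return max(
        ((uid, cosine_similarity(query, vec)) for uid, vec in index.items()),
        key=lambda pair: pair[1],
    )


# Two cached identifier/vector pairs (invented for illustration).
index = {"q-000001": [1.0, 0.0], "q-000002": [0.0, 1.0]}
best_id, score = nearest_id([0.9, 0.1], index)
```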
S4: using the unique identifier as a second search field, and obtaining the similarity question or standard question corresponding to the target data from a preset second vector search library.
In this embodiment, before the step of using the unique identifier as the second search field and obtaining the similarity question or standard question corresponding to the target data from the preset second vector search library, the method further includes: determining, by means of a preset message queue monitoring component, whether the full data in the question-answer knowledge base has been updated, where a data update specifically refers to an add, delete, or modify operation on the full data; if the full data in the question-answer knowledge base has not been updated, executing step S4; and if the full data in the question-answer knowledge base has been updated, updating the full data and the unique identifiers in the second vector search library according to the execution logic corresponding to the add, delete, or modify operation.
Monitoring a message queue to determine whether the full data in the question-answer knowledge base has been updated allows the data in the second vector search library to be updated promptly, which keeps the retrieval service accurate and up to date.
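The update check before step S4 can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: Python's standard `queue.Queue` stands in for the message queue, a plain dict (unique identifier to question text) stands in for the second vector search library, and the event layout with `op`, `id`, and `text` keys is hypothetical.

```python
import queue

def apply_update(second_library, event):
    """Apply one add/delete/modify event to the second library.

    second_library is a dict (unique_id -> question text) standing in for
    the second vector search library; the event layout is hypothetical.
    """
    op, unique_id = event["op"], event["id"]
    if op in ("add", "modify"):
        second_library[unique_id] = event["text"]
    elif op == "delete":
        second_library.pop(unique_id, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")

def drain_updates(message_queue, second_library):
    """Apply every pending event; return how many were applied (0 means no update)."""
    applied = 0
    while True:
        try:
            event = message_queue.get_nowait()
        except queue.Empty:
            return applied
        apply_update(second_library, event)
        applied += 1

second_library = {"q-001": "How do I reset my password?"}
updates = queue.Queue()
updates.put({"op": "add", "id": "q-002", "text": "How do I close my account?"})
updates.put({"op": "modify", "id": "q-001", "text": "How can I reset my password?"})
updates.put({"op": "delete", "id": "q-002"})

applied = drain_updates(updates, second_library)
print(applied, second_library)
```

When `drain_updates` returns 0, no update is pending and step S4 can proceed directly against the existing data.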
In this embodiment, after the step of updating the full data and the unique identifiers in the second vector search library according to the execution logic corresponding to the add, delete, or modify operation, the method further includes: obtaining, according to the vectorization model, the vector value corresponding to each item of data in the updated full data; and taking the vector value and the unique identifier corresponding to each item of data in the updated full data as paired data, and updating the cache records in the first vector search library.
Because the message queue is monitored to determine whether the full data in the question-answer knowledge base has been updated, the data in the first vector search library is also updated promptly, which keeps the retrieval service accurate and up to date.
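The refresh of the first vector search library after an update might look like the sketch below. The function `toy_vectorize` is a hypothetical placeholder for the preset vectorization model (it simply counts characters into buckets), and both libraries are modeled as plain dicts keyed by unique identifier; none of these names come from the original text.

```python
def toy_vectorize(text, dim=8):
    """Hypothetical stand-in for the vectorization model: bucketed character counts."""
    vec = [0.0] * dim
    for ch in text.lower():
        if ch.isalnum():
            vec[ord(ch) % dim] += 1.0
    return vec

def refresh_first_library(first_library, second_library, vectorize=toy_vectorize):
    """Rebuild the (unique_id -> vector) cache from the updated full data.

    Identifiers that were deleted from second_library disappear from the
    cache; added or modified entries receive freshly computed vectors.
    """
    first_library.clear()
    for unique_id, text in second_library.items():
        first_library[unique_id] = vectorize(text)
    return first_library

# "q-009" is stale: it no longer exists in the updated full data.
first_library = {
    "q-001": toy_vectorize("old text"),
    "q-009": toy_vectorize("stale entry"),
}
second_library = {"q-001": "new text", "q-002": "another question"}

refresh_first_library(first_library, second_library)
print(sorted(first_library))
```

The sketch recomputes every vector, which mirrors the text's statement that vector values are obtained for each item in the updated full data; recomputing only the changed entries would be a possible optimization, but the source does not describe it.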
S5: using the similarity question or standard question corresponding to the target data as a search substitute field, and completing the search service through the search engine and the search substitute field.
Using the similarity question or standard question as the search substitute field means that whatever each searcher enters is converted into a similarity question or standard question before the search is run, so the search field is more standardized.
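The effect of step S5 can be illustrated with a small sketch. Everything here is a toy stand-in chosen for the example: `resolve_question` replaces steps S2 to S4 with a simple keyword rule, and `search_engine` replaces the preset search engine with a dict lookup. The point is only that differently worded inputs reach the search engine as the same standardized field.

```python
def search_with_substitute(raw_input, resolve_question, search_engine):
    """Replace the searcher's raw input with its similarity/standard question, then search.

    resolve_question stands in for steps S2-S4 and search_engine for the
    preset search engine; both are hypothetical callables.
    """
    substitute_field = resolve_question(raw_input)
    return search_engine(substitute_field)

# Toy stand-in for vector retrieval: a keyword rule.
def resolve_question(raw_input):
    if "password" in raw_input.lower():
        return "How do I reset my password?"
    return raw_input

# Toy stand-in for the search engine: an exact-match dict lookup.
answers = {"How do I reset my password?": "Use the 'Forgot password' link on the login page."}

def search_engine(query):
    return answers.get(query, "no result")

print(search_with_substitute("forgot my PASSWORD!!", resolve_question, search_engine))
print(search_with_substitute("password reset pls", resolve_question, search_engine))
```

Both calls print the same answer, because each raw input is replaced by the same standard question before the search runs.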
The method obtains target data input by a searcher in a preset search engine, where the target data is data to be searched by similarity question or data to be searched by standard question; performs vectorization processing on the target data according to a preset vectorization model to obtain a target vector value; uses the target vector value as a first search field to obtain the corresponding unique identifier from a preset first vector search library; uses the unique identifier as a second search field to obtain the similarity question or standard question corresponding to the target data from a preset second vector search library; and uses that similarity question or standard question as a search substitute field, completing the search service through the search engine and the search substitute field. A vector retrieval service with real-time knowledge synchronization is built jointly from Elasticsearch and Milvus (or a message queue plus Faiss), so that Elasticsearch only needs to maintain the mapping between vectors and the similarity questions and standard questions, and Milvus or Faiss only needs to maintain the mapping between vectors and unique identifiers. Monitoring the message queue ensures that the retrieval service is updated synchronously and promptly, which avoids a poor service experience for the searcher.
The embodiments of the present application can acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.
Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
In the embodiments of the present application, for the vector retrieval service process within artificial intelligence technology, a vector retrieval service with real-time knowledge synchronization is built jointly from Elasticsearch and Milvus (or a message queue plus Faiss), so that Elasticsearch only needs to maintain the mapping between vectors and the similarity questions and standard questions, and Milvus or Faiss only needs to maintain the mapping between vectors and unique identifiers. Monitoring the message queue ensures that the retrieval service is updated synchronously and promptly, which avoids a poor service experience for the searcher.
With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a vector search service apparatus, where an embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 3, the vector search service apparatus 300 according to the present embodiment includes: a target data acquisition module 301, a vectorization processing module 302, a first retrieval module 303, a second retrieval module 304, and a third retrieval module 305. Wherein:
the target data acquisition module 301 is configured to acquire target data input by a searcher in a preset search engine, where the target data is data to be searched for a similar query or data to be searched for a standard query;
the vectorization processing module 302 is configured to perform vectorization processing on the target data according to a preset vectorization model, and obtain a target vector value;
the first search module 303 is configured to use the target vector value as a first search field, and obtain a unique identifier corresponding to the target vector value from a preset first vector search library;
the second search module 304 is configured to use the unique identifier as a second search field, and obtain a similarity question or a standard question corresponding to the target data from a preset second vector search library;
the third search module 305 is configured to use the similarity question or standard question corresponding to the target data as a search substitute field, and complete the search service through the search engine and the search substitute field.
In some specific embodiments of the present application, the vector search service apparatus 300 further includes a unique identifier adding module, where the unique identifier adding module is configured to perform differential naming on each data in the full-size data according to a preset identifier naming rule, and use a differential naming result as a unique identifier corresponding to each data in the full-size data, where the identifier naming rule may perform differential naming on each data in the full-size data according to a differential ID corresponding to each data in the full-size data.
In some specific embodiments of the present application, the vector search service apparatus 300 further includes a first dump module, where the first dump module is configured to store the full data and the unique identifier corresponding to each item of data in the full data in the second vector search library as one-to-one pairs according to their unique association relationship, where the second vector search library is an Elasticsearch vector search library.
In some specific embodiments of the present application, the vector search service apparatus 300 further includes a second dump module, where the second dump module is configured to obtain, according to the vectorization model, the vector value corresponding to each item of data in the full data; and to take each vector value and the unique identifier of the corresponding item of data in the full data as paired data, and synchronously cache the paired data in the first vector search library, where the first vector search library is a Faiss vector search library or a Milvus vector search library.
In some specific embodiments of the present application, the vector search service apparatus 300 further includes a monitoring identification module, where the monitoring identification module is configured to determine, by means of a preset message queue monitoring component, whether the full data in the question-answer knowledge base has been updated, where a data update specifically refers to an add, delete, or modify operation on the full data.
In some specific embodiments of the present application, the vector search service apparatus 300 further includes a synchronization update module, where the synchronization update module is configured to update the full data and the unique identifiers in the second vector search library according to the execution logic corresponding to the add, delete, or modify operation; to obtain, according to the vectorization model, the vector value corresponding to each item of data in the updated full data; and to take the vector value and the unique identifier corresponding to each item of data in the updated full data as paired data and update the cache records in the first vector search library.
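The initial load performed by the unique identifier adding module and the two dump modules can be sketched as follows. The naming rule (a prefix plus a running number), the placeholder `length_vectorize` model, and the use of plain dicts for both libraries are all assumptions made for this illustration; the source does not specify the identifier format.

```python
def assign_unique_ids(full_data, prefix="q"):
    """Name each record with a distinct ID (hypothetical rule: prefix plus running number)."""
    return {f"{prefix}-{number:06d}": text for number, text in enumerate(full_data, start=1)}

def build_libraries(full_data, vectorize):
    """Return (first_library, second_library) as plain dicts keyed by unique identifier."""
    # Second library: unique identifier paired with the question text.
    second_library = assign_unique_ids(full_data)
    # First library: unique identifier paired with the vector value.
    first_library = {uid: vectorize(text) for uid, text in second_library.items()}
    return first_library, second_library

def length_vectorize(text):
    """Hypothetical placeholder for the vectorization model."""
    return [float(len(text)), float(text.count(" "))]

full_data = ["How do I reset my password?", "How do I close my account?"]
first_library, second_library = build_libraries(full_data, length_vectorize)
print(second_library)
print(first_library)
```

Because both dicts share the same keys, a unique identifier found in the first library can always be resolved to its question text in the second.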
Those skilled in the art will appreciate that all or part of the methods in the above embodiments may be implemented by computer-readable instructions stored on a computer-readable storage medium, and that the instructions, when executed, may include the steps of the method embodiments described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (ROM), or it may be a random access memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to this order and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include a plurality of sub-steps or stages, which are not necessarily performed at the same time and may be performed at different times; they are not necessarily performed sequentially either, and may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
To solve the above technical problems, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of the basic structure of the computer device according to this embodiment.
The computer device 4 comprises a memory 4a, a processor 4b, and a network interface 4c, which are communicatively connected to each other via a system bus. It should be noted that only a computer device 4 having components 4a-4c is shown in the figure, but it should be understood that not all of the illustrated components are required, and that more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and that its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), and embedded devices.
The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The computer device can interact with a user through a keyboard, a mouse, a remote control, a touch pad, a voice control device, or the like.
The memory 4a includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 4a may be an internal storage unit of the computer device 4, such as a hard disk or internal memory of the computer device 4. In other embodiments, the memory 4a may be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device 4. The memory 4a may also comprise both an internal storage unit and an external storage device of the computer device 4. In this embodiment, the memory 4a is typically used to store the operating system and the various application software installed on the computer device 4, such as the computer-readable instructions of the vector search service method. The memory 4a may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 4b may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 4b is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 4b is configured to execute the computer-readable instructions stored in the memory 4a or to process data, for example, to execute the computer-readable instructions of the vector search service method.
The network interface 4c may comprise a wireless network interface or a wired network interface, which network interface 4c is typically used to establish a communication connection between the computer device 4 and other electronic devices.
The computer device provided by this embodiment belongs to the technical field of semantic retrieval process optimization.
The present application also provides another embodiment, namely, a computer readable storage medium storing computer readable instructions executable by a processor to cause the processor to perform the steps of the vector retrieval service method as described above.
The computer-readable storage medium provided by this embodiment belongs to the technical field of semantic retrieval process optimization.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method described in the embodiments of the present application.
Obviously, the embodiments described above are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application but do not limit its patent scope. The present application may be embodied in many different forms; these embodiments are provided so that the disclosure of the present application will be understood more thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features. Any equivalent structure made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the protection scope of the present application.
Claims (10)
1. A vector search service method, comprising the steps of:
acquiring target data input by a searcher in a preset search engine, wherein the target data is data to be subjected to similar query search or data to be subjected to standard query search;
carrying out vectorization processing on the target data according to a preset vectorization model to obtain a target vector value;
taking the target vector value as a first search field, and acquiring a unique identifier corresponding to the target vector value from a preset first vector search library;
taking the unique identifier as a second search field, and acquiring a similarity question or a standard question corresponding to the target data from a preset second vector search library;
and taking the similarity question or the standard question corresponding to the target data as a search substitute field, and completing search service through the search engine and the search substitute field.
2. The vector search service method according to claim 1, wherein before the step of acquiring target data input by a searcher in a preset search engine is performed, the method further comprises:
acquiring full data in a preset question-answer knowledge base in advance, wherein the full data comprises similar questions or standard questions corresponding to the target data, and the question-answer knowledge base is a TiDB database;
and carrying out differential naming on each data in the full data according to a preset identification naming rule, and taking a differential naming result as a unique identification corresponding to each data in the full data, wherein the identification naming rule can carry out differential naming on each data in the full data by using a differential ID corresponding to each data in the full data.
3. The vector search service method according to claim 2, wherein after the step of performing the differential naming for each data in the full-size data according to a preset identifier naming rule and taking a differential naming result as a unique identifier corresponding to each data in the full-size data, the method further comprises:
and transferring the unique identifiers corresponding to the full-quantity data and each data in the full-quantity data into the second vector retrieval library in pairs according to the unique association relation, wherein the second vector retrieval library is an Elasticsearch vector retrieval library.
4. The vector search service method according to claim 3, wherein after performing the step of storing the unique identifications corresponding to the full-size data and each of the full-size data, respectively, in the second vector search library, one by one, according to a unique association relationship, the method further comprises:
obtaining vector values corresponding to all the data in the full-quantity data according to the vectorization model;
and taking each vector value and the unique identifier of the corresponding data in the full-quantity data as paired data, and synchronously caching and recording the paired data into the first vector retrieval library, wherein the first vector retrieval library is a Faiss vector retrieval library or a Milvus vector retrieval library.
5. The method according to claim 4, wherein the step of obtaining the unique identifier corresponding to the target vector value from a preset first vector search library by using the target vector value as a first search field specifically includes:
obtaining a vector value closest to the first search field in the first vector search library according to a cosine similarity algorithm;
and acquiring the unique identifier corresponding to the closest vector value as the unique identifier of the target vector value.
6. The vector search service method according to claim 3, wherein before said step of taking said unique identification as a second search field to obtain a similarity question or a standard question corresponding to said target data from a preset second vector search library, said method further comprises:
judging whether data update exists in the total data in the question-answer knowledge base or not according to a preset message queue monitoring component, wherein the data update specifically refers to adding, deleting and modifying operation on the total data;
if the data updating exists in the full-volume data in the question-answer knowledge base, updating the full-volume data and the unique identification in the second vector retrieval base according to the execution logic relation corresponding to the adding, deleting and modifying operation.
7. The vector search service method according to claim 6, wherein after performing the step of updating the full-size data and the unique identifier in the second vector search library according to the execution logic relationship corresponding to the adding, deleting and modifying operation, the method further comprises:
according to the vectorization model, vector values corresponding to all the data in the full data after updating are obtained;
and taking the vector value and the unique identifier corresponding to each data in the full-quantity data after updating as paired data, and updating the cache record in the first vector retrieval library.
8. A vector retrieval service apparatus, comprising:
the target data acquisition module is used for acquiring target data input by a searcher in a preset search engine, wherein the target data is data to be subjected to similar query search or data to be subjected to standard query search;
the vectorization processing module is used for vectorizing the target data according to a preset vectorization model to obtain a target vector value;
the first retrieval module is used for taking the target vector value as a first retrieval field and acquiring a unique identifier corresponding to the target vector value from a preset first vector retrieval library;
the second retrieval module is used for taking the unique identifier as a second retrieval field and acquiring a similarity question or a standard question corresponding to the target data from a preset second vector retrieval library;
and the third retrieval module is used for taking the similarity question or the standard question corresponding to the target data as a retrieval substitute field and completing retrieval service through the search engine and the retrieval substitute field.
9. A computer device comprising a memory having stored therein computer readable instructions which when executed implement the steps of the vector retrieval service method of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the vector retrieval service method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310383382.XA CN116431878A (en) | 2023-04-06 | 2023-04-06 | Vector retrieval service method, device, equipment and storage medium thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310383382.XA CN116431878A (en) | 2023-04-06 | 2023-04-06 | Vector retrieval service method, device, equipment and storage medium thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116431878A (en) | 2023-07-14 |
Family
ID=87090312
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310383382.XA Pending CN116431878A (en) | 2023-04-06 | 2023-04-06 | Vector retrieval service method, device, equipment and storage medium thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116431878A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117688163A (en) * | 2024-01-29 | 2024-03-12 | 杭州有赞科技有限公司 | Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation |
CN117687867A (en) * | 2023-11-30 | 2024-03-12 | 广州三叠纪元智能科技有限公司 | Elastic search log recording method, electronic equipment, storage medium and product |
CN118656482A (en) * | 2024-08-21 | 2024-09-17 | 山东浪潮科学研究院有限公司 | Mixed retrieval method and system for RAG question-answering system |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117687867A (en) * | 2023-11-30 | 2024-03-12 | 广州三叠纪元智能科技有限公司 | Elastic search log recording method, electronic equipment, storage medium and product |
CN117688163A (en) * | 2024-01-29 | 2024-03-12 | 杭州有赞科技有限公司 | Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation |
CN117688163B (en) * | 2024-01-29 | 2024-04-23 | 杭州有赞科技有限公司 | Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation |
CN118656482A (en) * | 2024-08-21 | 2024-09-17 | 山东浪潮科学研究院有限公司 | Mixed retrieval method and system for RAG question-answering system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9934260B2 (en) | Streamlined analytic model training and scoring system | |
US20190392258A1 (en) | Method and apparatus for generating information | |
CN116431878A (en) | Vector retrieval service method, device, equipment and storage medium thereof | |
US20200349226A1 (en) | Dictionary Expansion Using Neural Language Models | |
CN112861662B (en) | Target object behavior prediction method based on face and interactive text and related equipment | |
CN118070072A (en) | Problem processing method, device, equipment and storage medium based on artificial intelligence | |
CN112528040B (en) | Detection method for guiding drive corpus based on knowledge graph and related equipment thereof | |
CN117216114A (en) | Data stream association method, device, equipment and storage medium thereof | |
CN117275466A (en) | Business intention recognition method, device, equipment and storage medium thereof | |
CN117094729A (en) | Request processing method, device, computer equipment and storage medium | |
CN116703520A (en) | Product recommendation method based on improved K-means algorithm and related equipment thereof | |
CN116701593A (en) | Chinese question-answering model training method based on GraphQL and related equipment thereof | |
CN115062136A (en) | Event disambiguation method based on graph neural network and related equipment thereof | |
JP2022111020A (en) | Transfer learning method of deep learning model based on document similarity learning and computer device | |
CN118299064B (en) | Rare disease-based graph model training method, application method and related equipment | |
CN116737833A (en) | CDC data resource synchronization method based on partition mode and related equipment thereof | |
CN116662418A (en) | Report realization method, device and equipment based on configuration and storage medium thereof | |
CN117332012A (en) | Data association method, device, equipment and storage medium thereof | |
CN117874073A (en) | Search optimization method, device, equipment and storage medium thereof | |
CN116701512A (en) | Inter-server data call acceleration method, inter-server data call acceleration device, inter-server data call acceleration equipment and storage medium of inter-server data call acceleration equipment | |
CN115879465A (en) | Search engine word segmentation model construction method and related equipment thereof | |
CN118585545A (en) | Sample data acquisition method, device, equipment, storage medium and program product | |
CN117407469A (en) | Cluster deployment method and device, computer equipment and storage medium | |
CN116431607A (en) | Data model reconstruction method, device, equipment and storage medium thereof | |
CN118113710A (en) | Database falling method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||