CN116414941A - Information query method, device, equipment and storage medium - Google Patents
- Publication number
- CN116414941A (CN202111647549.6A)
- Authority
- CN
- China
- Prior art keywords
- query
- vector
- information
- preset
- search
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3347—Query execution using vector based model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an information query method, device, equipment, and storage medium, belonging to the technical field of the Internet. The information query method comprises: when a query instruction is received, determining query data according to the query instruction; determining a query text and a query vector according to the query data; performing keyword retrieval according to the query text and vector retrieval according to the query vector; and determining an information query result by combining the keyword retrieval result and the vector retrieval result. In this scheme, keyword retrieval and vector retrieval each produce retrieval results, and the information query result is obtained by combining them. The query result is thereby generalized, so that the data is fully utilized, the queried information is generalized more efficiently, the information query effect is improved, and user needs are better met.
Description
Technical Field
The present invention relates to the field of internet technologies, and in particular, to an information query method, an information query device, and a storage medium.
Background
In search, recommendation, and advertising systems, there are often scenarios in which specific content is presented only when certain text is hit exactly. Such scenarios usually prioritize precision, so typically only an exact hit takes effect, and how to generalize the hit text (usually the query in search, hereinafter referred to as the query) is a difficult problem.
The prior art generally applies different methods to different types of queries. Expansion commonly relies on offline mining that then takes effect online, or on complete-hit or rule-based matching. For example, a translation model or a strategy such as co-click is used to mine a candidate set of queries, algorithms such as semantic matching are then used to apply a relevance threshold, and finally the mined queries are put online. This approach suffers from poor timeliness, poor scalability, and limited generalization capability: because the query mapping relationship must be processed offline, unseen data cannot be matched, and because the mining means and methods differ by type, the approach cannot be applied to all data.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The main purpose of the invention is to provide an information query method, device, equipment, and storage medium that address the technical problem of how to make full use of data, generalize queried information more efficiently, and improve the information query effect.
In order to achieve the above object, the present invention provides an information query method, including:
when a query instruction is received, determining query data according to the query instruction;
determining a query text and a query vector according to the query data;
performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector;
and determining an information query result by combining the keyword search result and the vector search result.
Optionally, performing keyword retrieval according to the query text and vector retrieval according to the query vector includes:
performing keyword retrieval according to the query text and a preset keyword index, and performing vector retrieval according to the query vector and a preset vector index.
Optionally, before performing keyword retrieval according to the query text and the preset keyword index and vector retrieval according to the query vector and the preset vector index, the method further includes:
acquiring sample data of various service types;
generating a candidate set to be matched according to the sample data;
and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
Optionally, the generating a preset keyword index and a preset vector index according to the candidate set to be matched includes:
generating an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model;
and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
Optionally, before the generating the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, the method further includes:
acquiring first training data;
and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
Optionally, performing keyword retrieval according to the query text and the preset keyword index, and vector retrieval according to the query vector and the preset vector index, includes:
configuring a search engine according to a preset keyword index and a preset vector index;
inputting the query text and the query vector into the search engine;
and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
Optionally, the determining query text and query vector according to the query data includes:
performing demand recognition processing and vectorization processing on the query data, respectively;
and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
Optionally, performing the demand recognition processing and the vectorization processing on the query data, respectively, includes:
performing demand recognition processing on the query data through a demand recognition technology;
and vectorizing the query data through a preset deep learning representation model.
Optionally, the determining the information query result by combining the keyword search result and the vector search result includes:
determining first retrieval information corresponding to the query data according to the keyword retrieval result;
determining second retrieval information corresponding to the query data according to the vector retrieval result;
generating target retrieval information according to the first retrieval information and the second retrieval information;
and generating an information inquiry result according to the target retrieval information.
Optionally, the generating the target retrieval information according to the first retrieval information and the second retrieval information includes:
combining the first retrieval information and the second retrieval information to determine a set of the first retrieval information and the second retrieval information;
and generating the target retrieval information according to the set of the first retrieval information and the second retrieval information.
Optionally, after determining the information query result by combining the keyword search result and the vector search result, the method further includes:
carrying out semantic relevance analysis on the information inquiry result through a preset deep learning interactive model;
detecting semantic consistency according to semantic relevance analysis results;
and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
Optionally, before the semantic relevance analysis is performed on the information query result by the preset deep learning interactive model, the method further includes:
acquiring second training data;
and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
In addition, in order to achieve the above object, the present invention also provides an information query apparatus, including:
the query data module is used for determining query data according to the query instruction when the query instruction is received;
the text vector module is used for determining a query text and a query vector according to the query data;
the information retrieval module is used for carrying out keyword retrieval according to the query text and carrying out vector retrieval according to the query vector;
and the query result module is used for determining information query results by combining the keyword search results and the vector search results.
Optionally, the information retrieval module is further configured to perform keyword retrieval according to the query text and a preset keyword index, and perform vector retrieval according to the query vector and a preset vector index.
Optionally, the information query apparatus further includes:
the index generation module is used for acquiring sample data of various service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
Optionally, the index generating module is further configured to generate an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model; and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
Optionally, the information query apparatus further includes:
the model training module is used for acquiring first training data; and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
Optionally, the information retrieval module is further configured to configure a retrieval engine according to a preset keyword index and a preset vector index; inputting the query text and the query vector into the search engine; and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
In addition, to achieve the above object, the present invention also provides information query equipment including: a memory, a processor, and an information query program stored in the memory and executable on the processor, wherein the information query program, when executed by the processor, implements the information query method described above.
In addition, in order to achieve the above object, the present invention also proposes a storage medium having stored thereon an information inquiry program which, when executed by a processor, implements the information inquiry method as described above.
In the information query method provided by the invention, when a query instruction is received, query data is determined according to the query instruction; a query text and a query vector are determined according to the query data; keyword retrieval is performed according to the query text, and vector retrieval is performed according to the query vector; and an information query result is determined by combining the keyword retrieval result and the vector retrieval result. In this scheme, keyword retrieval and vector retrieval each produce retrieval results, and the information query result is obtained by combining them. The query result is thereby generalized, so that the data is fully utilized, the queried information is generalized more efficiently, the information query effect is improved, and user needs are better met.
Drawings
FIG. 1 is a schematic diagram of a structure of an information query device in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a first embodiment of an information query method according to the present invention;
FIG. 3 is a generalized query overall flowchart of an embodiment of an information query method of the present invention;
FIG. 4 is a flowchart of a second embodiment of the information query method of the present invention;
FIG. 5 is a flowchart of a third embodiment of an information query method according to the present invention;
Fig. 6 is a schematic functional block diagram of a first embodiment of the information query apparatus of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic diagram of an information query device of a hardware running environment according to an embodiment of the present invention.
As shown in fig. 1, the information query device may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables connection and communication among these components. The user interface 1003 may include a display and an input unit such as keys, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory (NVM), such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001.
It will be appreciated by those skilled in the art that the device structure shown in fig. 1 is not limiting of the information query device, and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and an information inquiry program may be included in the memory 1005 as one type of storage medium.
In the information query device shown in fig. 1, the network interface 1004 is mainly used for connecting to an external network and performing data communication with other network devices; the user interface 1003 is mainly used for connecting user equipment and communicating data with the user equipment; the apparatus of the present invention calls the information inquiry program stored in the memory 1005 through the processor 1001 and executes the information inquiry method provided by the embodiment of the present invention.
Based on the hardware structure, the embodiment of the information query method is provided.
Referring to fig. 2, fig. 2 is a flowchart of a first embodiment of the information query method of the present invention.
In a first embodiment, the information query method includes:
Step S10, when a query instruction is received, determining query data according to the query instruction.
It should be noted that, the execution body of the embodiment may be an information query device, and the information query device may be a computer device with a data processing function, or may be another device that may implement the same or similar functions.
It should be noted that, the information query in this embodiment may include, but is not limited to, a text query, and may also include other types of information queries, which are not limited in this embodiment, and in this embodiment, a text query is taken as an example.
It should be noted that, in this embodiment, the query data refers to query condition data input by the user, where the query data may include but is not limited to text data, and may also include other types of data, and in this embodiment, text data is taken as an example for illustration.
It should be appreciated that a user may enter query data in a query interface and then click a query button to send a query instruction to a computer device when an information query is desired. When the computing device receives the query instruction, the computing device may determine corresponding query data according to the query instruction, and then determine an information query result according to the query data.
In a specific implementation, for example, suppose the user wants to ask about tomorrow's weather. The query data is the text input by the user, "what is the weather tomorrow", and the final information query result may be "tomorrow will be cloudy, turning clear".
Step S20, determining a query text and a query vector according to the query data.
It should be noted that, unlike the prior art, this scheme performs retrieval in two different ways, by query text and by query vector, obtains two retrieval results, and combines them into the information query result. The information query result is thereby generalized into a broader result that better meets user needs.
It should be appreciated that the demand recognition process and the vectorization process may be performed on the query data, respectively, to obtain the query text and the query vector.
It can be understood that, when retrieval is needed, the query data can be vectorized with the preset deep learning representation model to obtain the query vector, while query demand recognition technology applies standardization processing to the query data to obtain the query text.
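As an illustration only, the following Python sketch shows how one piece of query data could yield both a query text and a query vector. The patent does not specify the demand recognition technology or the representation model, so `normalize_query` (simple text standardization) and `embed` (hashed character trigrams) are hypothetical stand-ins, and the dimension `DIM` is an assumed value.

```python
import hashlib
import math
import re
from typing import List, Tuple

DIM = 64  # assumed embedding size, not taken from the patent


def normalize_query(raw: str) -> str:
    """Hypothetical stand-in for demand recognition: lowercase the text,
    replace punctuation with spaces, and collapse repeated whitespace."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(text.split())


def embed(text: str, dim: int = DIM) -> List[float]:
    """Hypothetical stand-in for the deep learning representation model:
    count hashed character trigrams, then L2-normalize the vector."""
    vec = [0.0] * dim
    padded = "  " + text + " "
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        bucket = int(hashlib.md5(gram.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def prepare_query(raw: str) -> Tuple[str, List[float]]:
    """Return (query text, query vector) for one piece of query data."""
    query_text = normalize_query(raw)
    return query_text, embed(query_text)


query_text, query_vector = prepare_query("  How to OPEN the box?? ")
```

In a real system the vector would come from the same trained representation model that produced the offline vectors, so that query and candidate vectors are comparable.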
Step S30, keyword retrieval is carried out according to the query text, and vector retrieval is carried out according to the query vector.
It should be noted that a preset keyword index and a preset vector index may be generated in advance according to the candidate set to be matched. After the query text and the query vector are determined in the above manner, keyword retrieval may be performed according to the query text and the preset keyword index, so that data related to the query text is selected from the candidate set to be matched and a keyword retrieval result is obtained. Likewise, vector retrieval may be performed according to the query vector and the preset vector index, so that data related to the query vector is selected from the candidate set to be matched and a vector retrieval result is obtained.
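The two retrieval paths can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the keyword index is assumed to be a token-to-ids mapping, the vector index an id-to-vector mapping, ranking by token overlap and by cosine similarity is an assumed choice, and the ids `d1` to `d3`, the toy indexes, and `top_k` are hypothetical.

```python
import math

# Toy preset indexes (hypothetical data): token -> candidate ids, id -> vector.
KEYWORD_INDEX = {"box": {"d1", "d3"}, "wind": {"d1"}, "open": {"d3"}, "weather": {"d2"}}
VECTOR_INDEX = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.7, 0.7]}


def keyword_retrieve(query_text, keyword_index, top_k=3):
    """Rank candidates by how many distinct query tokens they contain."""
    hits = {}
    for token in set(query_text.lower().split()):
        for doc_id in keyword_index.get(token, ()):
            hits[doc_id] = hits.get(doc_id, 0) + 1
    ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked[:top_k]]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def vector_retrieve(query_vector, vector_index, top_k=3):
    """Rank candidates by cosine similarity to the query vector (brute force)."""
    scored = sorted(
        ((cosine(query_vector, vec), doc_id) for doc_id, vec in vector_index.items()),
        key=lambda s: (-s[0], s[1]),
    )
    return [doc_id for _, doc_id in scored[:top_k]]


keyword_result = keyword_retrieve("how to open the box", KEYWORD_INDEX)
vector_result = vector_retrieve([1.0, 0.1], VECTOR_INDEX, top_k=2)
```

A production vector index would normally use approximate nearest-neighbor search rather than the brute-force scan shown here.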
Step S40, combining the keyword search result and the vector search result to determine the information query result.
It should be understood that first retrieval information corresponding to the query data may be determined according to the keyword retrieval result, and second retrieval information corresponding to the query data may be determined according to the vector retrieval result. The first retrieval information is the data related to the query text selected from the candidate set to be matched, and the second retrieval information is the data related to the query vector selected from the candidate set to be matched.
It is understood that after the first retrieval information and the second retrieval information are obtained, the target retrieval information may be generated according to them, and the information query result may then be generated according to the target retrieval information.
It can be understood that, in order to improve the efficiency of generating the target retrieval information, the first retrieval information and the second retrieval information may be combined to determine a set of the two, and the target retrieval information is then generated according to this set. That is, the target retrieval information includes the retrieval information obtained from both the first retrieval information and the second retrieval information.
In a specific implementation, for example, assuming that the first retrieval information determined by keyword retrieval includes information A, information B, and information C, and that the second retrieval information determined by vector retrieval includes information D and information E, the target retrieval information may include information A, information B, information C, information D, and information E.
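The combination in the example above can be sketched in Python as an order-preserving union. Removing items that appear in both lists is an assumption made here (the text only says the two are combined into a set), and the function name `merge_results` is hypothetical.

```python
def merge_results(first, second):
    """Combine keyword and vector retrieval information into one list,
    keeping first-seen order and dropping repeated items (assumed behavior)."""
    seen = set()
    merged = []
    for item in list(first) + list(second):
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


target = merge_results(["A", "B", "C"], ["D", "E"])
```

With the inputs from the example, `target` holds information A through E; an item returned by both retrieval paths would appear only once.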
It can be understood that after the information query result is determined, the information query result can be displayed in the query interface, so that the user can conveniently know the information query result, the generalized query result can be provided for the user, and the query requirement of the user is met.
Further, after the generalized query, the query result may contain some data that differs considerably from the query data. To avoid this and to improve the accuracy of the information query result on the basis of the generalized query, after step S40 the method further includes:
carrying out semantic relevance analysis on the information query result through a preset deep learning interactive model; detecting semantic consistency according to the semantic relevance analysis result; and obtaining a target information query result according to the detection result, and displaying the target information query result.
It can be understood that after the information query result is obtained in the above manner, semantic relevance analysis can be performed on it through the preset deep learning interactive model, and the semantic consistency of each piece of data in the information query result is then judged according to the semantic relevance analysis result. If the semantic consistency detection passes, the information query result is used directly as the target information query result. If it does not pass, the data with larger semantic differences is removed from the information query result to obtain the target information query result, or the process returns to the initial step and the information is queried again; this embodiment does not limit this.
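The consistency check can be sketched as a filter over the merged results. The deep learning interactive model is not specified in the patent, so the token-overlap scorer `jaccard` below is only a hypothetical stand-in for it, and the `threshold` value is an assumption.

```python
def jaccard(a, b):
    """Hypothetical stand-in for the interactive model: token-set overlap in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def filter_consistent(query, results, score=jaccard, threshold=0.3):
    """Keep results whose relevance score reaches the threshold.

    Returns (kept_results, passed), where passed is True only when
    no result had to be removed."""
    kept = [r for r in results if score(query, r) >= threshold]
    return kept, len(kept) == len(results)


kept, passed = filter_consistent(
    "open box wind game", ["box wind game", "weather tomorrow"]
)
```

Here the unrelated result is removed and `passed` is false, which corresponds to the branch in which data with larger semantic differences is dropped to obtain the target information query result.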
It can be appreciated that after the target information query result is determined, it can be displayed in the query interface so that the user can view it. Referring to fig. 3, fig. 3 is the overall flowchart of the generalized query. In this scheme, the search engine uses the query text and the query vector to query the previously established indexes, and the deep learning interactive model judges the semantic consistency of the results returned by the query, so that the matching results achieve both good generalization and good accuracy.
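The overall flow can be summarized as a composition of pluggable steps. The sketch below is an assumption-laden outline rather than the patented implementation: every callable passed in (`to_text`, `to_vector`, `keyword_search`, `vector_search`, `is_consistent`) is a hypothetical placeholder for the corresponding component.

```python
def generalized_query(raw_query, to_text, to_vector,
                      keyword_search, vector_search, is_consistent):
    """Run the generalized query flow with caller-supplied components."""
    query_text = to_text(raw_query)          # demand recognition (placeholder)
    query_vector = to_vector(raw_query)      # representation model (placeholder)
    first = keyword_search(query_text)       # keyword retrieval result
    second = vector_search(query_vector)     # vector retrieval result
    # Order-preserving union of both result lists.
    merged = list(dict.fromkeys(list(first) + list(second)))
    # Semantic consistency check (placeholder for the interactive model).
    return [item for item in merged if is_consistent(raw_query, item)]


result = generalized_query(
    "Open Box",
    str.lower,
    lambda q: [float(len(q))],
    lambda text: ["a", "b"],
    lambda vec: ["b", "c"],
    lambda q, item: item != "c",
)
```

Passing the components as arguments keeps the outline independent of any particular search engine or model.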
In a specific implementation, for example, assume that the query data is "how to open the box surrounded by wind in XX game" and that the candidate set to be matched contains the data "box surrounded by a circle of wind in XX game". Under the original rule-matching approach, these two texts are very difficult to recall and match, and it is even difficult to know that the candidate set to be matched contains data that can satisfy the query. After processing by this scheme, "how to open the box surrounded by wind in XX game" can retrieve "box surrounded by a circle of wind in XX game", and the computed semantic relevance is very high, so the requirements of users are better met.
It should be noted that, for scenarios in which certain special results must be recalled through exact matching, this scheme approaches exact matching and rule matching from the perspective of retrieval and semantic computation: keyword retrieval and vector retrieval, together with a semantic model, achieve precise synonymous matching online. The scheme makes maximal use of the data, is applicable to all similar scenarios, and greatly improves the online effect.
It can be understood that the scheme can be applied to the onebox recall system of a search engine. After the generalization method is applied, onebox recall within the top 3 search results increases by 3.5%, with a net gain of more than about 1%; for key types such as selected abstracts, 13% of the recalled queries come from the generalized query method, so the generalized query effect is greatly improved.
Further, in order to achieve a better semantic relevance analysis effect, so that the target information query result is more accurate, before the semantic relevance analysis is performed on the information query result through the preset deep learning interactive model, the method further comprises:
acquiring second training data; and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
It should be appreciated that the second training data may be sample training data related to semantics, and the training data may be mined to train the initial deep learning model to obtain a deep learning interactive model with semantic relevance analysis functionality.
In this embodiment, when a query instruction is received, query data is determined according to the query instruction; a query text and a query vector are determined according to the query data; keyword retrieval is performed according to the query text, and vector retrieval is performed according to the query vector; and an information query result is determined by combining the keyword retrieval result and the vector retrieval result. In this scheme, keyword retrieval and vector retrieval each produce retrieval results, and the information query result is obtained by combining them. The query result is thereby generalized, so that the data is fully utilized, the queried information is generalized more efficiently, the information query effect is improved, and user needs are better met.
In an embodiment, as shown in fig. 4, a second embodiment of the information query method of the present invention is provided based on the first embodiment, and the step S30 includes:
step S301, performing keyword retrieval according to the query text and the preset keyword index, and performing vector retrieval according to the query vector and the preset vector index.
It should be appreciated that the preset keyword index and the preset vector index may be generated in advance, a library may be built based on the preset keyword index and the preset vector index, and then the search engine may be configured. In the information query process, after the query text and the query vector are determined, the query text and the query vector can be input into a search engine, the query text can be subjected to keyword search through a preset keyword index in the search engine, and meanwhile, the query vector can also be subjected to vector search through a preset vector index in the search engine.
The keyword search and the vector search in this embodiment are not in a fixed sequence, and the keyword search and the vector search may be performed simultaneously, or the keyword search may be performed first and then the vector search may be performed, or the vector search may be performed first and then the keyword search may be performed.
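Because the two retrievals have no fixed order, one possible arrangement is to run them concurrently. The following is a minimal sketch under stated assumptions: `CANDIDATE_VECTORS`, `keyword_search`, `vector_search` and `retrieve_both` are hypothetical names, and the two search functions are toy stand-ins rather than the retrieval logic of the embodiment.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus: each candidate text has a hand-written 2-D vector.
CANDIDATE_VECTORS = {
    "wind box puzzle": (1.0, 0.0),
    "weather tomorrow": (0.0, 1.0),
    "open the box": (0.8, 0.6),
}


def keyword_search(query_text):
    """Toy keyword retrieval: keep candidates sharing at least one word."""
    words = set(query_text.lower().split())
    return [c for c in CANDIDATE_VECTORS if words & set(c.split())]


def vector_search(query_vector):
    """Toy vector retrieval: return the candidate with the largest dot product."""
    def score(text):
        return sum(q * d for q, d in zip(query_vector, CANDIDATE_VECTORS[text]))
    return [max(CANDIDATE_VECTORS, key=score)]


def retrieve_both(query_text, query_vector):
    """Submit both retrievals at once and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        keyword_future = pool.submit(keyword_search, query_text)
        vector_future = pool.submit(vector_search, query_vector)
        return keyword_future.result(), vector_future.result()
```

Calling `keyword_search` and `vector_search` one after the other, in either order, would produce the same pair of results; the thread pool only changes when the work happens.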
Further, in order to more efficiently generate the preset keyword index and the preset vector index, the requirements of keyword matching and vector matching are satisfied, and before step S301, the method further includes:
acquiring sample data of various service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
It should be understood that sample data of multiple service types may be obtained, a candidate set to be matched is generated according to the sample data of multiple service types, and then an offline vector corresponding to the candidate set to be matched is generated through a preset deep learning representation model. And then generating a preset keyword index according to the candidate set to be matched, generating a preset vector index according to the offline vector, and then establishing a search engine to call the preset keyword index and the preset vector index.
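The index generation described above can be sketched as follows. This is only an illustration under stated assumptions: `tokenize`, `embed` and `build_indexes` are hypothetical names, the keyword index is modeled as a plain inverted index, and the hashed bag-of-words `embed` function is a simple stand-in for the preset deep learning representation model, which the text does not specify.

```python
import math
import zlib
from collections import defaultdict


def tokenize(text):
    """Whitespace tokenizer (an assumption; real systems use proper word segmentation)."""
    return text.lower().split()


def embed(text, dim=32):
    """Stand-in for the representation model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in tokenize(text):
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_indexes(candidates):
    """Build a keyword (inverted) index and an offline vector index from the candidate set."""
    keyword_index = defaultdict(set)
    vector_index = []
    for doc_id, text in enumerate(candidates):
        for token in set(tokenize(text)):
            keyword_index[token].add(doc_id)
        vector_index.append(embed(text))  # offline vector for this candidate
    return dict(keyword_index), vector_index
```

Both indexes are keyed by the same candidate position, so a hit from either index can be mapped back to the same entry in the candidate set.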
Further, in order to achieve a better vectorization effect, before the generating the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, the method further includes:
acquiring first training data; and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
It should be appreciated that the first training data may be sample training data related to vectors, and the training data may be mined to train the initial deep learning model to obtain a deep learning representation model with a vectorization function. The first training data and the second training data may be the same, different, or partially the same, which is not limited in this embodiment.
In this embodiment, a preset keyword index and a preset vector index are generated in advance, a search engine is configured, keyword retrieval is performed according to the query text and the preset keyword index, and vector retrieval is performed according to the query vector and the preset vector index. When a scenario requiring matching arises, the query data is used to search the constructed system, and query generalization is performed using the approach of a search engine.
In an embodiment, as shown in fig. 5, a third embodiment of the information query method according to the present invention is provided based on the first embodiment or the second embodiment, and in this embodiment, the description is given based on the first embodiment, and the step S20 includes:
Step S201, performing a requirement recognition process and a vectorization process according to the query data, respectively.
It should be appreciated that after determining the query data corresponding to the query instruction, in order to determine the query text and the query vector, a requirement recognition process and a vectorization process may be performed according to the query data, respectively, and then the query text and the query vector may be determined according to the two processing results, respectively.
It can be understood that, in order to achieve a better processing effect, the requirement recognition technology can be used for carrying out requirement recognition processing on the query data, and the vectorization processing is carried out on the query data through a preset deep learning representation model. The preset deep learning representation model can have a vectorization function, wherein vectorization refers to converting text into vectors, and after query data are determined, the query data can be vectorized through the preset deep learning representation model, so that query vectors are obtained. The requirement recognition technology may have a function of text recognition and text specification, and may be set and adopted according to requirements of actual situations, which is not limited in this embodiment, and after determining query data, the requirement recognition technology may perform text recognition and text specification on the query data, so as to obtain a query text.
Step S202, determining a query text according to the requirement recognition processing result, and determining a query vector according to the vectorization processing result.
It can be understood that after the requirement recognition processing and the vectorization processing are performed on the query data, the query text can be determined according to the requirement recognition processing result, and the query vector can be determined according to the vectorization processing result. Keyword retrieval is then performed according to the query text and the preset keyword index, and vector retrieval is performed according to the query vector and the preset vector index, to obtain the keyword retrieval result and the vector retrieval result. The target retrieval information is then generated by combining the first retrieval information corresponding to the keyword retrieval result and the second retrieval information corresponding to the vector retrieval result, and the information query result is generated according to the target retrieval information. To improve semantic relevance and ensure query accuracy while achieving query generalization, semantic relevance analysis can further be performed on the information query result to obtain the target information query result. The method is therefore suitable for all scenarios requiring accurate text matching, makes full use of the data, and performs query generalization more efficiently and accurately.
In this embodiment, the requirement recognition processing and the vectorization processing are performed according to the query data, the query text is determined according to the requirement recognition processing result, and the query vector is determined according to the vectorization processing result, so that the query text and the query vector can be determined respectively, then the keyword retrieval and the vector retrieval are performed simultaneously, and the accuracy of the query result is improved on the basis that query generalization can be realized by matching with the semantic model.
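Steps S201 and S202 can be sketched as two independent passes over the same query data. The code below is a minimal sketch under stated assumptions: `recognize_requirement`, `vectorize` and `prepare_query` are hypothetical names, text normalization stands in for the unspecified requirement recognition technology, and a hashed bag-of-words vector stands in for the preset deep learning representation model.

```python
import math
import re
import zlib


def recognize_requirement(query_data):
    """Stand-in for requirement recognition: lowercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^\w\s]", " ", query_data.lower())
    return " ".join(text.split())


def vectorize(query_data, dim=32):
    """Stand-in for the representation model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in query_data.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def prepare_query(query_data):
    """Return (query_text, query_vector), each derived separately from the query data."""
    return recognize_requirement(query_data), vectorize(query_data)
```

The query text feeds keyword retrieval and the query vector feeds vector retrieval, so neither processing step depends on the output of the other.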
In addition, the embodiment of the invention also provides a storage medium, wherein the storage medium stores an information inquiry program, and the information inquiry program realizes the steps of the information inquiry method when being executed by a processor.
Because the storage medium adopts all the technical schemes of all the embodiments, the storage medium has at least all the beneficial effects brought by the technical schemes of the embodiments, and the description is omitted here.
In addition, referring to fig. 6, an embodiment of the present invention further provides an information query apparatus, where the information query apparatus includes:
and the query data module 10 is used for determining query data according to the query instruction when the query instruction is received.
It should be noted that, the information query in this embodiment may include, but is not limited to, a text query, and may also include other types of information queries, which are not limited in this embodiment, and in this embodiment, a text query is taken as an example.
It should be noted that, in this embodiment, the query data refers to query condition data input by the user, where the query data may include but is not limited to text data, and may also include other types of data, and in this embodiment, text data is taken as an example for illustration.
It should be appreciated that a user may enter query data in a query interface and then click a query button to send a query instruction to a computer device when an information query is desired. When the computing device receives the query instruction, the computing device may determine corresponding query data according to the query instruction, and then determine an information query result according to the query data.
In a specific implementation, for example, assuming that the user wants to query what the weather will be like tomorrow, the query data is the text "what is the weather like tomorrow" input by the user, and the final information query result may be "tomorrow will be cloudy, turning clear".
A text vector module 20 for determining a query text and a query vector from the query data.
It should be noted that, unlike the prior art, the scheme determines two search results according to the query text and the query vector in two different ways, and then combines the two search results to obtain an information query result, so that the information query result is generalized to obtain a wider information query result, and the requirements of users are better met.
It should be appreciated that the demand recognition process and the vectorization process may be performed on the query data, respectively, to obtain the query text and the query vector.
It can be understood that when the search is needed, the query data can be vectorized by using the preset deep learning representation model to obtain the query vector, and meanwhile, the query data can be subjected to some standard processing by using the query demand recognition technology to obtain the query text.
The information retrieval module 30 is configured to perform keyword retrieval according to the query text, and perform vector retrieval according to the query vector.
It should be noted that, a preset keyword index and a preset vector index may be generated in advance according to the candidate set to be matched, after the query text and the query vector are determined in the above manner, keyword retrieval may be performed according to the query text and the preset keyword index, so as to select data related to the query text from the candidate set to be matched, and obtain a keyword retrieval result. And vector retrieval can be performed according to the query vector and a preset vector index, so that data related to the query vector is selected from the candidate set to be matched, and a vector retrieval result is obtained.
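The two retrieval paths can be illustrated with a small sketch. It assumes, hypothetically, that the preset keyword index is an inverted index mapping tokens to candidate ids and that the preset vector index is a list of normalized vectors compared by dot product; `keyword_retrieve`, `vector_retrieve` and `top_k` are illustrative names, not terms from the text.

```python
def keyword_retrieve(query_text, keyword_index):
    """Return ids of candidates that contain at least one token of the query text."""
    hits = set()
    for token in query_text.lower().split():
        hits |= keyword_index.get(token, set())
    return sorted(hits)


def vector_retrieve(query_vector, vector_index, top_k=2):
    """Return ids of the top_k candidates ranked by dot-product similarity."""
    scored = [
        (sum(q * d for q, d in zip(query_vector, doc_vector)), doc_id)
        for doc_id, doc_vector in enumerate(vector_index)
    ]
    # Highest score first; ties are broken by the smaller candidate id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]
```

A production vector index would normally use approximate nearest-neighbor search instead of the exhaustive scan shown here; the scan is used only to keep the sketch self-contained.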
The query result module 40 is configured to determine an information query result by combining the keyword search result and the vector search result.
It should be understood that the first search information corresponding to the query data may be determined according to the keyword search result, and the second search information corresponding to the query data may be determined according to the vector search result, where the first search information refers to selecting data related to the query text from the candidate set to be matched, and the second search information refers to selecting data related to the query vector from the candidate set to be matched.
It is understood that after the first search information and the second search information are obtained, the target search information may be generated according to the first search information and the second search information, and then the information query result may be generated according to the target search information.
It can be understood that, in order to improve the generation efficiency of the target search information, the first search information and the second search information may be combined to determine their set (union), and the target search information is then generated according to this set. That is, the target search information includes the search information contained in both the first search information and the second search information.
In a specific implementation, for example, assuming that the first search information determined by the keyword search includes information a, information B, and information C, and assuming that the second search information determined by the vector search includes information D and information E, the target search information may include information a, information B, information C, information D, and information E.
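The combination in this example can be sketched as an order-preserving union. `merge_results` is a hypothetical helper name, and removing duplicates that appear in both results is an assumption that follows from treating the combination as a set.

```python
def merge_results(first_info, second_info):
    """Union of keyword and vector retrieval results, keeping first-seen order."""
    seen = set()
    merged = []
    for item in list(first_info) + list(second_info):
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


target = merge_results(["A", "B", "C"], ["D", "E"])
print(target)
```

If the same item is returned by both retrievals, it appears only once in the target search information.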
It can be understood that after the information query result is determined, the information query result can be displayed in the query interface, so that the user can conveniently know the information query result, the generalized query result can be provided for the user, and the query requirement of the user is met.
Further, since some data may be far away from the query data in the query result after the generalization query is performed, in order to avoid such a situation, the accuracy of the information query result is improved on the basis of the generalization query, and the information query device further comprises a semantic relevance analysis module, which is used for performing semantic relevance analysis on the information query result through a preset deep learning interactivity model; detecting semantic consistency according to semantic relevance analysis results; and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
It can be understood that after the information query result is obtained in the above manner, semantic relevance analysis can be performed on the information query result through a preset deep learning interactive model, then the consistency of the semantics of each data in the information query result is judged according to the semantic relevance analysis result, and if the consistency detection of the semantics is passed, the information query result is directly processed. If the consistency detection of the semantics is not passed, removing part of data with larger semantic difference from the information query result to obtain a target information query result, or returning to the initial step again to query the information, which is not limited in the embodiment.
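The removal branch of this consistency check can be sketched as a threshold filter. This is a sketch under stated assumptions: the token-overlap (Jaccard) score in `relevance` is only a stand-in for the preset deep learning interactive model, and `filter_consistent` and the `threshold` value are hypothetical.

```python
def relevance(query, candidate):
    """Stand-in relevance score: Jaccard overlap between the two token sets."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def filter_consistent(query, results, threshold=0.3):
    """Keep results whose relevance to the query reaches the threshold."""
    return [r for r in results if relevance(query, r) >= threshold]
```

Results scoring below the threshold are treated as semantically inconsistent and dropped; the alternative mentioned in the text, returning to the initial step and querying again, is not shown.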
It can be appreciated that after determining the target information query result, the target information query result can be displayed in the query interface, so that the user can conveniently view it. Referring to fig. 3, fig. 3 is an overall flowchart of the generalized query. In this scheme, the query text and the query vector are used to search the previously established indexes through a search engine, and a deep learning interactive model judges the semantic consistency of the returned results, so that the matching results achieve good generalization and accuracy.
In a specific implementation, for example, assume that the query data is "how to open the box surrounded by wind in XX game" and that the candidate set to be matched contains "the box surrounded by a ring of wind in XX game". Under the original rule-matching approach, these two texts are very difficult to recall and match, and it is even difficult to know that the candidate set contains data that could satisfy the query. With this scheme, the query "how to open the box surrounded by wind in XX game" can retrieve "the box surrounded by a ring of wind in XX game", and the computed semantic relevance is very high, so that the requirements of users are better met.
It should be noted that, for scenarios where certain special results must be recalled through exact matching, this scheme improves on exact matching and rule matching by adopting ideas from retrieval and semantic computation. It combines keyword retrieval and vector retrieval with a semantic model to achieve accurate online synonym matching, makes the fullest use of the data, is applicable to all similar scenarios, and considerably improves the online effect.
It can be understood that the scheme can be applied to the onebox recall system of a search engine. After the generalization method is applied, onebox recall within the top-3 search results increases by 3.5%, the net gain is about 1% or more, and for key types such as featured snippets, 13% of recalled queries come from the generalized query method, so the effect of generalized query is greatly improved.
Further, in order to achieve a better semantic relativity analysis effect and enable the target information query result to be more accurate, the model training module is further used for acquiring second training data; and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
It should be appreciated that the second training data may be sample training data related to semantics, and the training data may be mined to train the initial deep learning model to obtain a deep learning interactive model with semantic relevance analysis functionality.
In the embodiment, when a query instruction is received, query data is determined according to the query instruction; determining a query text and a query vector according to the query data; keyword retrieval is carried out according to the query text, and vector retrieval is carried out according to the query vector; and determining an information query result by combining the keyword search result and the vector search result. According to the scheme, after the search results are obtained by utilizing the keyword search and the vector search respectively, the information query results are obtained by synthesizing the search results, so that the query results can be generalized, and the purpose of generalized query is realized, so that the data is fully utilized, the queried information is more efficiently generalized, the information query effect is improved, and the requirements of users are better met.
In an embodiment, the information retrieval module 30 is further configured to perform keyword retrieval according to the query text and a preset keyword index, and perform vector retrieval according to the query vector and a preset vector index.
In an embodiment, the information query device further includes an index generating module, configured to obtain sample data of multiple service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
In an embodiment, the index generating module is further configured to generate an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model; and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
In an embodiment, the information query device further includes a model training module, configured to obtain first training data; and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
In one embodiment, the information retrieval module 30 is further configured to configure a retrieval engine according to a preset keyword index and a preset vector index; inputting the query text and the query vector into the search engine; and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
In an embodiment, the text vector module 20 is further configured to perform a requirement recognition process and a vectorization process according to the query data, respectively; and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
In one embodiment, the text vector module 20 is further configured to perform a requirement recognition process on the query data through a requirement recognition technology; and vectorizing the query data through a preset deep learning representation model.
In an embodiment, the query result module 40 is further configured to determine first search information corresponding to the query data according to the keyword search result; determining second retrieval information corresponding to the query data according to the vector retrieval result; generating target retrieval information according to the first retrieval information and the second retrieval information; and generating an information inquiry result according to the target retrieval information.
In an embodiment, the query result module 40 is further configured to combine the first search information and the second search information to determine a set of the first search information and the second search information; and generate the target search information according to the set of the first search information and the second search information.
In an embodiment, the information query device further includes a semantic relevance analysis module, configured to perform semantic relevance analysis on the information query result through a preset deep learning interactive model; detecting semantic consistency according to semantic relevance analysis results; and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
In an embodiment, the model training module is further configured to obtain second training data; and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
Other embodiments or specific implementation methods of the information query apparatus of the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a computer-readable storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a smart device (which may be a mobile phone, a computer, an information query device, a network information query device, etc.) to perform the methods of the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.
The invention discloses A1, an information query method, which comprises the following steps:
when a query instruction is received, determining query data according to the query instruction;
determining a query text and a query vector according to the query data;
keyword retrieval is carried out according to the query text, and vector retrieval is carried out according to the query vector;
and determining an information query result by combining the keyword search result and the vector search result.
A2, the information query method according to A1, wherein the keyword retrieval is performed according to the query text, and the vector retrieval is performed according to the query vector, and the method comprises the following steps:
and carrying out keyword retrieval according to the query text and a preset keyword index, and carrying out vector retrieval according to the query vector and a preset vector index.
A3, the information query method as described in A2, wherein before the keyword retrieval is performed according to the query text and the preset keyword index and the vector retrieval is performed according to the query vector and the preset vector index, the method further comprises:
Acquiring sample data of various service types;
generating a candidate set to be matched according to the sample data;
and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
A4, the information query method as described in A3, wherein the generating a preset keyword index and a preset vector index according to the candidate set to be matched includes:
generating an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model;
and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
A5, the information query method as described in A4, before generating the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, further includes:
acquiring first training data;
and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
A6, the information query method as set forth in A2, wherein the keyword search is performed according to the query text and a preset keyword index, and the vector search is performed according to the query vector and a preset vector index, and the method comprises:
configuring a search engine according to a preset keyword index and a preset vector index;
Inputting the query text and the query vector into the search engine;
and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
A7, the information query method of any one of A1 to A6, wherein the determining a query text and a query vector according to the query data comprises:
carrying out demand identification processing and vectorization processing according to the query data respectively;
and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
A8, the information query method as described in A7, wherein the performing the requirement identification process and the vectorization process according to the query data respectively includes:
performing demand recognition processing on the query data through a demand recognition technology;
and vectorizing the query data through a preset deep learning representation model.
A9, the information query method according to any one of A1 to A6, wherein the determining the information query result by combining the keyword search result and the vector search result includes:
determining first retrieval information corresponding to the query data according to the keyword retrieval result;
Determining second retrieval information corresponding to the query data according to the vector retrieval result;
generating target retrieval information according to the first retrieval information and the second retrieval information;
and generating an information inquiry result according to the target retrieval information.
A10, the information query method of A9, the generating target search information according to the first search information and the second search information, includes:
combining the first search information and the second search information to determine a set of the first search information and the second search information;
generating target search information according to the set of the first search information and the second search information.
A11, the information query method according to any one of A1 to A6, wherein after the information query result is determined by combining the keyword search result and the vector search result, the method further comprises:
carrying out semantic relevance analysis on the information inquiry result through a preset deep learning interactive model;
detecting semantic consistency according to semantic relevance analysis results;
and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
A12, the information query method as described in A11, before the semantic relevance analysis is performed on the information query result by a preset deep learning interactivity model, further comprises:
Acquiring second training data;
and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
The invention also discloses B13, an information query device, wherein the information query device comprises:
the query data module is used for determining query data according to the query instruction when the query instruction is received;
the text vector module is used for determining a query text and a query vector according to the query data;
the information retrieval module is used for carrying out keyword retrieval according to the query text and carrying out vector retrieval according to the query vector;
and the query result module is used for determining information query results by combining the keyword search results and the vector search results.
B14, the information query device as described in B13, wherein the information retrieval module is further configured to perform keyword retrieval according to the query text and a preset keyword index, and perform vector retrieval according to the query vector and a preset vector index.
B15, the information inquiry apparatus of B14, the information inquiry apparatus further comprising:
the index generation module is used for acquiring sample data of various service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
B16, the information query device according to B15, wherein the index generation module is further configured to generate an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model, generate the preset keyword index according to the candidate set to be matched, and generate the preset vector index according to the offline vector.
B17, the information query device according to B16, further comprising:
a model training module, configured to acquire first training data, and to train an initial deep learning model according to the first training data to obtain the preset deep learning representation model.
B18, the information query device according to B14, wherein the information retrieval module is further configured to configure a search engine according to the preset keyword index and the preset vector index; input the query text and the query vector into the search engine; and, through the search engine, call the preset keyword index to perform keyword retrieval on the query text and call the preset vector index to perform vector retrieval on the query vector.
The invention also discloses C19, information query equipment, comprising: a memory, a processor, and an information query program stored in the memory and executable on the processor, wherein the information query program, when executed by the processor, implements the information query method described above.
The invention also discloses D20, a storage medium storing an information query program, wherein the information query program, when executed by a processor, implements the information query method described above.
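The post-processing step described in A11 (semantic relevance analysis, semantic consistency detection, then selection of the target result) might be sketched as follows. This is only an illustration under stated assumptions: the names `relevance`, `detect_consistency` and the `threshold` parameter are hypothetical, and a simple token-overlap (Jaccard) score stands in for the preset deep learning interactive model, which the text does not specify.

```python
def relevance(query_text: str, candidate_text: str) -> float:
    """Placeholder relevance score (Jaccard token overlap).

    Assumption: a real system would call the preset deep learning
    interactive model here instead of comparing token sets.
    """
    q = set(query_text.lower().split())
    c = set(candidate_text.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def detect_consistency(query_text: str, results: list[str],
                       threshold: float = 0.3) -> list[str]:
    """Keep only results whose relevance reaches the (hypothetical) threshold,
    ordered from most to least relevant."""
    scored = [(relevance(query_text, r), r) for r in results]
    kept = [(s, r) for s, r in scored if s >= threshold]
    kept.sort(key=lambda sr: -sr[0])  # stable sort keeps ties in input order
    return [r for _, r in kept]


if __name__ == "__main__":
    merged = ["track order status", "reset account password"]
    print(detect_consistency("track order", merged))
```

Results that fail the consistency check are dropped before display; the cut-off value is a tuning choice and not something the source prescribes.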
Claims (10)
1. An information query method, characterized in that the information query method comprises:
when a query instruction is received, determining query data according to the query instruction;
determining a query text and a query vector according to the query data;
performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector;
and determining an information query result by combining the keyword search result and the vector search result.
2. The information query method of claim 1, wherein the performing keyword retrieval according to the query text and performing vector retrieval according to the query vector comprises:
performing keyword retrieval according to the query text and a preset keyword index, and performing vector retrieval according to the query vector and a preset vector index.
3. The information query method of claim 2, wherein before the keyword retrieval is performed according to the query text and the preset keyword index and the vector retrieval is performed according to the query vector and the preset vector index, the method further comprises:
acquiring sample data of multiple service types;
generating a candidate set to be matched according to the sample data;
and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
4. The information query method of claim 3, wherein the generating a preset keyword index and a preset vector index according to the candidate set to be matched comprises:
generating an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model;
and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
5. The information query method of claim 4, wherein before the generating of the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, the method further comprises:
acquiring first training data;
and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
6. The information query method of claim 2, wherein the performing keyword retrieval according to the query text and a preset keyword index and performing vector retrieval according to the query vector and a preset vector index comprises:
configuring a search engine according to the preset keyword index and the preset vector index;
inputting the query text and the query vector into the search engine;
and calling the preset keyword index through the search engine to perform keyword retrieval on the query text, and calling the preset vector index through the search engine to perform vector retrieval on the query vector.
7. The information query method of any one of claims 1 to 6, wherein the determining a query text and a query vector according to the query data comprises:
performing demand recognition processing and vectorization processing, respectively, on the query data;
and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
8. An information query device, characterized in that the information query device comprises:
a query data module, configured to determine query data according to a query instruction when the query instruction is received;
a text vector module, configured to determine a query text and a query vector according to the query data;
an information retrieval module, configured to perform keyword retrieval according to the query text and vector retrieval according to the query vector;
and a query result module, configured to determine an information query result by combining the keyword search result and the vector search result.
9. Information query equipment, characterized in that the information query equipment comprises: a memory, a processor, and an information query program stored on the memory and executable on the processor, wherein the information query program, when executed by the processor, implements the information query method of any one of claims 1 to 7.
10. A storage medium having stored thereon an information query program which, when executed by a processor, implements the information query method according to any one of claims 1 to 7.
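As a rough illustration of the flow recited in claims 1 to 4 (derive a query text and a query vector, retrieve against a keyword index and a vector index, then combine both result lists), a minimal sketch is given below. Everything in it is an assumption made for illustration: the candidate set, the function names, the bag-of-words `embed` function standing in for the preset deep learning representation model, and reciprocal rank fusion as the combining rule, since the claims do not name a specific fusion method.

```python
import math
from collections import Counter

# Hypothetical candidate set to be matched; the source gives no sample data.
CANDIDATES = ["reset account password", "change delivery address", "track order status"]


def tokenize(text):
    return text.lower().split()


def embed(text, vocab):
    """Stand-in for the representation model: L2-normalised bag-of-words vector."""
    counts = Counter(tokenize(text))
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_indexes(candidates):
    """Build an inverted keyword index and an offline vector index."""
    vocab = sorted({w for c in candidates for w in tokenize(c)})
    keyword_index = {}
    for doc_id, text in enumerate(candidates):
        for w in set(tokenize(text)):
            keyword_index.setdefault(w, set()).add(doc_id)
    vector_index = [embed(text, vocab) for text in candidates]
    return vocab, keyword_index, vector_index


def keyword_retrieval(query_text, keyword_index):
    """Rank documents by the number of query terms they contain."""
    hits = Counter()
    for w in set(tokenize(query_text)):
        for doc_id in keyword_index.get(w, ()):
            hits[doc_id] += 1
    return [d for d, _ in sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))]


def vector_retrieval(query_vector, vector_index, top_k=3):
    """Rank documents by cosine similarity (vectors are already normalised)."""
    scores = [(sum(q * d for q, d in zip(query_vector, vec)), doc_id)
              for doc_id, vec in enumerate(vector_index)]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in scores[:top_k] if score > 0]


def combine(keyword_hits, vector_hits, k=60):
    """Reciprocal rank fusion, chosen here only as one possible combining rule."""
    fused = Counter()
    for hits in (keyword_hits, vector_hits):
        for rank, doc_id in enumerate(hits):
            fused[doc_id] += 1.0 / (k + rank + 1)
    return [d for d, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


def information_query(query_data):
    vocab, keyword_index, vector_index = build_indexes(CANDIDATES)
    query_text = query_data.strip()          # simplified "demand recognition"
    query_vector = embed(query_text, vocab)  # simplified "vectorization"
    merged = combine(keyword_retrieval(query_text, keyword_index),
                     vector_retrieval(query_vector, vector_index))
    return [CANDIDATES[i] for i in merged]


if __name__ == "__main__":
    print(information_query("track my order"))
```

In practice the indexes would be built offline once and served by a search engine, as claim 6 suggests; rebuilding them per query here merely keeps the sketch short.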
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111647549.6A CN116414941A (en) | 2021-12-29 | 2021-12-29 | Information query method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111647549.6A CN116414941A (en) | 2021-12-29 | 2021-12-29 | Information query method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116414941A (en) | 2023-07-11 |
Family
ID=87054874
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111647549.6A Pending CN116414941A (en) | Information query method, device, equipment and storage medium | 2021-12-29 | 2021-12-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116414941A (en) |
- 2021-12-29: CN application CN202111647549.6A (publication CN116414941A), status: Pending
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109086303B (en) | Intelligent conversation method, device and terminal based on machine reading understanding | |
CN109635273B (en) | Text keyword extraction method, device, equipment and storage medium | |
CN108170859B (en) | Voice query method, device, storage medium and terminal equipment | |
CN110457431A (en) | Answering method, device, computer equipment and the storage medium of knowledge based map | |
US9268767B2 (en) | Semantic-based search system and search method thereof | |
CN111666399A (en) | Intelligent question and answer method and device based on knowledge graph and computer equipment | |
CN116991977B (en) | Domain vector knowledge accurate retrieval method and device based on large language model | |
CN110795541B (en) | Text query method, text query device, electronic equipment and computer readable storage medium | |
CN111274822A (en) | Semantic matching method, device, equipment and storage medium | |
CN111625638B (en) | Question processing method, device, equipment and readable storage medium | |
CN111400340A (en) | Natural language processing method and device, computer equipment and storage medium | |
CN118113815B (en) | Content searching method, related device and medium | |
CN117521625A (en) | Question answering method, question answering device, electronic equipment and medium | |
CN110795547A (en) | Text recognition method and related product | |
CN113822039A (en) | Method and related equipment for mining similar meaning words | |
CN113343692A (en) | Search intention recognition method, model training method, device, medium and equipment | |
CN115203378B (en) | Retrieval enhancement method, system and storage medium based on pre-training language model | |
CN107368525B (en) | Method and device for searching related words, storage medium and terminal equipment | |
CN116414941A (en) | Information query method, device, equipment and storage medium | |
JP6495206B2 (en) | Document concept base generation device, document concept search device, method, and program | |
CN112308016B (en) | Expression image acquisition method and device, electronic equipment and storage medium | |
CN117931858B (en) | Data query method, device, computer equipment and storage medium | |
CN114242047B (en) | Voice processing method and device, electronic equipment and storage medium | |
CN118132791A (en) | Image retrieval method, device, equipment, readable storage medium and product | |
JP6334491B2 (en) | Concept base generation device, concept search device, method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||