CN116414941A - Information query method, device, equipment and storage medium - Google Patents

Publication number
CN116414941A
Authority
CN
China
Prior art keywords
query
vector
information
preset
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111647549.6A
Other languages
Chinese (zh)
Inventor
李灏舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN202111647549.6A priority Critical patent/CN116414941A/en
Publication of CN116414941A publication Critical patent/CN116414941A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an information query method, apparatus, device, and storage medium, and belongs to the technical field of the Internet. The information query method comprises the following steps: when a query instruction is received, determining query data according to the query instruction; determining a query text and a query vector according to the query data; performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector; and determining an information query result by combining the keyword retrieval result and the vector retrieval result. In this scheme, keyword retrieval and vector retrieval each produce a retrieval result, and the information query result is obtained by combining the two. The query result is thereby generalized, so that the data is fully utilized, the queried information is generalized more efficiently, the information query effect is improved, and the requirements of users are better met.

Description

Information query method, device, equipment and storage medium
Technical Field
The present invention relates to the field of Internet technologies, and in particular to an information query method, apparatus, device, and storage medium.
Background
In search, recommendation, and advertising systems, there are often scenarios in which specific content must be presented when a certain text is hit exactly. Such scenarios prioritize precision, and usually only an exact hit takes effect. How to generalize the hit text (generally the query in search, hereinafter referred to as the query) is therefore a difficult problem.
The prior art basically applies different methods to different types of queries. A common approach is to expand queries through offline mining and then bring the results online, or to rely on exact-hit or rule-based matching. For example, a translation model or a strategy such as co-click is used to mine a candidate set of queries, algorithms such as semantic matching are then used to gate relevance, and finally the mined queries are brought online. This technical scheme suffers from poor timeliness, poor scalability, and limited generalization capability. Because the query mapping relation must be processed offline, unseen data cannot be matched; and because different mining means and methods are used for different types, the approach cannot be applied to the full data.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The invention mainly aims to provide an information query method, an information query device and a storage medium, and aims to solve the technical problems of how to make full use of data, generalize queried information more efficiently and improve information query effects.
In order to achieve the above object, the present invention provides an information query method, including:
when a query instruction is received, determining query data according to the query instruction;
determining a query text and a query vector according to the query data;
keyword retrieval is carried out according to the query text, and vector retrieval is carried out according to the query vector;
and determining an information query result by combining the keyword search result and the vector search result.
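The four steps above can be sketched as a small pipeline. This is only an illustrative skeleton under stated assumptions: the function `run_query`, its parameter names, the `"query"` key of the instruction, and the toy stand-in callables are all hypothetical and are not taken from the patent, which does not prescribe any concrete implementation.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def run_query(
    instruction: Dict[str, str],
    to_text: Callable[[str], str],
    to_vector: Callable[[str], Vector],
    keyword_search: Callable[[str], List[str]],
    vector_search: Callable[[Vector], List[str]],
    combine: Callable[[List[str], List[str]], List[str]],
) -> List[str]:
    # Step 1: determine the query data from the query instruction.
    # The "query" key is an assumption made for this sketch.
    query_data = instruction["query"]

    # Step 2: derive both representations from the same query data.
    query_text = to_text(query_data)
    query_vector = to_vector(query_data)

    # Step 3: run the two retrievals independently.
    keyword_hits = keyword_search(query_text)
    vector_hits = vector_search(query_vector)

    # Step 4: combine both retrieval results into the information query result.
    return combine(keyword_hits, vector_hits)


# Toy stand-ins (hypothetical) so the skeleton can be executed.
result = run_query(
    {"query": "  Weather Tomorrow "},
    to_text=lambda s: s.strip().lower(),
    to_vector=lambda s: [float(len(s))],
    keyword_search=lambda text: ["doc-1"] if "weather" in text else [],
    vector_search=lambda vec: ["doc-1", "doc-2"],
    combine=lambda a, b: a + [x for x in b if x not in a],
)
print(result)
```

Passing each step in as a callable keeps the skeleton independent of any particular recognition technique, representation model, or search engine.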
Optionally, the keyword searching according to the query text and the vector searching according to the query vector include:
and carrying out keyword retrieval according to the query text and a preset keyword index, and carrying out vector retrieval according to the query vector and a preset vector index.
Optionally, before the keyword searching is performed according to the query text and the preset keyword index and the vector searching is performed according to the query vector and the preset vector index, the method further includes:
acquiring sample data of various service types;
generating a candidate set to be matched according to the sample data;
and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
Optionally, the generating a preset keyword index and a preset vector index according to the candidate set to be matched includes:
generating an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model;
and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
Optionally, before the generating the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, the method further includes:
acquiring first training data;
and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
Optionally, the keyword searching according to the query text and the preset keyword index, and the vector searching according to the query vector and the preset vector index, includes:
configuring a search engine according to a preset keyword index and a preset vector index;
inputting the query text and the query vector into the search engine;
and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
Optionally, the determining query text and query vector according to the query data includes:
carrying out demand identification processing and vectorization processing according to the query data respectively;
and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
Optionally, the performing the requirement identification process and the vectorization process according to the query data respectively includes:
performing demand recognition processing on the query data through a demand recognition technology;
and vectorizing the query data through a preset deep learning representation model.
Optionally, the determining the information query result by combining the keyword search result and the vector search result includes:
determining first retrieval information corresponding to the query data according to the keyword retrieval result;
determining second retrieval information corresponding to the query data according to the vector retrieval result;
generating target retrieval information according to the first retrieval information and the second retrieval information;
and generating an information inquiry result according to the target retrieval information.
Optionally, the generating the target search information according to the first search information and the second search information includes:
combining the first retrieval information and the second retrieval information to determine a set of the first retrieval information and the second retrieval information;
generating target retrieval information according to the set of the first retrieval information and the second retrieval information.
Optionally, after determining the information query result by combining the keyword search result and the vector search result, the method further includes:
carrying out semantic relevance analysis on the information inquiry result through a preset deep learning interactive model;
detecting semantic consistency according to semantic relevance analysis results;
and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
Optionally, before the semantic relevance analysis is performed on the information query result through the preset deep learning interactive model, the method further includes:
acquiring second training data;
and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
In addition, in order to achieve the above object, the present invention also provides an information query apparatus, including:
the query data module is used for determining query data according to the query instruction when the query instruction is received;
The text vector module is used for determining a query text and a query vector according to the query data;
the information retrieval module is used for carrying out keyword retrieval according to the query text and carrying out vector retrieval according to the query vector;
and the query result module is used for determining information query results by combining the keyword search results and the vector search results.
Optionally, the information retrieval module is further configured to perform keyword retrieval according to the query text and a preset keyword index, and perform vector retrieval according to the query vector and a preset vector index.
Optionally, the information query apparatus further includes:
the index generation module is used for acquiring sample data of various service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
Optionally, the index generating module is further configured to generate an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model; and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
Optionally, the information query apparatus further includes:
The model training module is used for acquiring first training data; and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
Optionally, the information retrieval module is further configured to configure a retrieval engine according to a preset keyword index and a preset vector index; inputting the query text and the query vector into the search engine; and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
In addition, to achieve the above object, the present invention also proposes an information query device including: a memory, a processor, and an information query program that is stored in the memory and executable on the processor, wherein the information query program, when executed by the processor, implements the information query method described above.
In addition, in order to achieve the above object, the present invention also proposes a storage medium having stored thereon an information inquiry program which, when executed by a processor, implements the information inquiry method as described above.
In the information query method provided by the invention, when a query instruction is received, query data is determined according to the query instruction; a query text and a query vector are determined according to the query data; keyword retrieval is performed according to the query text, and vector retrieval is performed according to the query vector; and an information query result is determined by combining the keyword retrieval result and the vector retrieval result. In this scheme, keyword retrieval and vector retrieval each produce a retrieval result, and the information query result is obtained by combining the two. The query result is thereby generalized, so that the data is fully utilized, the queried information is generalized more efficiently, the information query effect is improved, and the requirements of users are better met.
Drawings
FIG. 1 is a schematic diagram of a structure of an information query device in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a first embodiment of an information query method according to the present invention;
FIG. 3 is a generalized query overall flowchart of an embodiment of an information query method of the present invention;
FIG. 4 is a flowchart of a second embodiment of the information query method of the present invention;
FIG. 5 is a flowchart of a third embodiment of an information query method according to the present invention;
FIG. 6 is a schematic functional block diagram of a first embodiment of the information query apparatus of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic diagram of an information query device of a hardware running environment according to an embodiment of the present invention.
As shown in FIG. 1, the information query device may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as keys, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory (NVM), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
It will be appreciated by those skilled in the art that the device structure shown in fig. 1 is not limiting of the information query device, and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and an information inquiry program may be included in the memory 1005 as one type of storage medium.
In the information query device shown in fig. 1, the network interface 1004 is mainly used for connecting to an external network and performing data communication with other network devices; the user interface 1003 is mainly used for connecting user equipment and communicating data with the user equipment; the apparatus of the present invention calls the information inquiry program stored in the memory 1005 through the processor 1001 and executes the information inquiry method provided by the embodiment of the present invention.
Based on the hardware structure, the embodiment of the information query method is provided.
Referring to fig. 2, fig. 2 is a flowchart of a first embodiment of the information query method of the present invention.
In a first embodiment, the information query method includes:
Step S10, when a query instruction is received, determining query data according to the query instruction.
It should be noted that, the execution body of the embodiment may be an information query device, and the information query device may be a computer device with a data processing function, or may be another device that may implement the same or similar functions.
It should be noted that, the information query in this embodiment may include, but is not limited to, a text query, and may also include other types of information queries, which are not limited in this embodiment, and in this embodiment, a text query is taken as an example.
It should be noted that, in this embodiment, the query data refers to query condition data input by the user, where the query data may include but is not limited to text data, and may also include other types of data, and in this embodiment, text data is taken as an example for illustration.
It should be appreciated that, when an information query is desired, a user may enter query data in a query interface and then click a query button to send a query instruction to the computer device. When the computer device receives the query instruction, it may determine the corresponding query data according to the query instruction, and then determine an information query result according to the query data.
In a specific implementation, for example, assuming that the user wants to know what the weather will be like tomorrow, the query data is the text entered by the user, "what is the weather like tomorrow", and the final information query result may be "tomorrow will be cloudy, turning clear".
Step S20, determining a query text and a query vector according to the query data.
It should be noted that, unlike the prior art, this scheme performs retrieval in two different ways, one based on the query text and one based on the query vector, and then combines the two retrieval results to obtain the information query result. The information query result is thereby generalized and covers a wider range, which better meets the requirements of users.
It should be appreciated that the demand recognition process and the vectorization process may be performed on the query data, respectively, to obtain the query text and the query vector.
It can be understood that, when retrieval is needed, the query data can be vectorized by the preset deep learning representation model to obtain the query vector; meanwhile, the query data can undergo standardized processing by the query demand recognition technology to obtain the query text.
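As a minimal sketch of this step, the following derives a query text and a query vector from the same query data. Both helpers are stand-ins chosen for illustration: `normalize_query` applies simple text normalization in place of the unspecified demand recognition technology, and `embed` uses a hashed bag-of-words vector in place of the preset deep learning representation model. The names and the dimension `DIM` are hypothetical.

```python
import hashlib
import math
import re
from typing import List, Tuple

DIM = 16  # hypothetical embedding dimension


def normalize_query(query_data: str) -> str:
    """Stand-in for demand recognition: lowercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^\w\s]", " ", query_data.lower())
    return " ".join(text.split())


def embed(text: str, dim: int = DIM) -> List[float]:
    """Stand-in for the representation model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # hashlib gives a hash that is stable across runs, unlike built-in hash().
        bucket = hashlib.md5(token.encode("utf-8")).digest()[0] % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def prepare_query(query_data: str) -> Tuple[str, List[float]]:
    # Both results are derived from the same query data, each by its own process.
    return normalize_query(query_data), embed(query_data)


query_text, query_vector = prepare_query("  What is the WEATHER tomorrow?? ")
print(query_text)
```

A real system would replace `embed` with a trained model; the hashing trick is used here only so the example runs with the standard library.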
Step S30, performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector.
It should be noted that, a preset keyword index and a preset vector index may be generated in advance according to the candidate set to be matched, after the query text and the query vector are determined in the above manner, keyword retrieval may be performed according to the query text and the preset keyword index, so as to select data related to the query text from the candidate set to be matched, and obtain a keyword retrieval result. And vector retrieval can be performed according to the query vector and a preset vector index, so that data related to the query vector is selected from the candidate set to be matched, and a vector retrieval result is obtained.
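The two retrievals described above can be illustrated with a toy inverted index and a brute-force cosine search. Everything in this sketch is an assumption made for illustration: the index contents, the candidate ids, the two-dimensional vectors, and the ranking rules are hypothetical, and a production system would typically use a search engine and an approximate nearest-neighbor index instead.

```python
import math
from typing import Dict, List, Sequence, Set

# Hypothetical preset keyword index: term -> ids of candidates containing it.
KEYWORD_INDEX: Dict[str, Set[str]] = {
    "wind": {"c1"},
    "box": {"c1"},
    "weather": {"c2"},
    "chest": {"c3"},
}

# Hypothetical preset vector index: candidate id -> offline vector.
VECTOR_INDEX: Dict[str, List[float]] = {
    "c1": [1.0, 0.0],
    "c2": [0.0, 1.0],
    "c3": [0.8, 0.6],
}


def keyword_retrieve(query_text: str, index: Dict[str, Set[str]]) -> List[str]:
    """Rank candidates by how many distinct query terms they contain."""
    scores: Dict[str, int] = {}
    for term in set(query_text.split()):
        for cand_id in index.get(term, set()):
            scores[cand_id] = scores.get(cand_id, 0) + 1
    return sorted(scores, key=lambda c: (-scores[c], c))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_retrieve(
    query_vector: Sequence[float], index: Dict[str, List[float]], top_k: int = 2
) -> List[str]:
    """Return the top_k candidates by cosine similarity (exhaustive scan)."""
    ranked = sorted(index, key=lambda c: (-cosine(query_vector, index[c]), c))
    return ranked[:top_k]


keyword_result = keyword_retrieve("open wind box", KEYWORD_INDEX)
vector_result = vector_retrieve([0.9, 0.2], VECTOR_INDEX)
print(keyword_result, vector_result)
```

In this toy data, candidate `c3` shares no term with the query and is found only by the vector retrieval, which is the kind of generalization the scheme aims for.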
Step S40, combining the keyword search result and the vector search result to determine the information query result.
It should be understood that the first search information corresponding to the query data may be determined according to the keyword search result, and the second search information corresponding to the query data may be determined according to the vector search result, where the first search information refers to selecting data related to the query text from the candidate set to be matched, and the second search information refers to selecting data related to the query vector from the candidate set to be matched.
It is understood that after the first search information and the second search information are obtained, the target search information may be generated according to the first search information and the second search information, and then the information query result may be generated according to the target search information.
It can be understood that, in order to improve the efficiency of generating the target search information, the first search information and the second search information may be combined to determine a set of the two, and the target search information is then generated according to that set. That is, the target search information includes the search information from both the first search information and the second search information.
In a specific implementation, for example, assuming that the first search information determined by the keyword search includes information A, information B, and information C, and that the second search information determined by the vector search includes information D and information E, the target search information may include information A, information B, information C, information D, and information E.
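The example above amounts to taking the union of the two result lists. A minimal sketch follows; the function name `merge_results` is hypothetical, and the choice to keep first-seen order and drop duplicates is an assumption, since the text only says the two are combined into a set.

```python
from typing import Iterable, List


def merge_results(first: Iterable[str], second: Iterable[str]) -> List[str]:
    """Union of both result lists, keeping first-seen order and dropping duplicates."""
    merged: List[str] = []
    seen = set()
    for item in list(first) + list(second):
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


# The example from the text: keyword retrieval finds A, B, C; vector retrieval finds D, E.
target = merge_results(["A", "B", "C"], ["D", "E"])
# When both retrievals return the same item, it appears only once.
overlap = merge_results(["A", "B"], ["B", "D"])
print(target, overlap)
```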
It can be understood that after the information query result is determined, the information query result can be displayed in the query interface, so that the user can conveniently know the information query result, the generalized query result can be provided for the user, and the query requirement of the user is met.
Further, after the generalized query is performed, the information query result may contain some data that is far removed from the query data. To avoid this and to improve the accuracy of the information query result on the basis of the generalized query, after step S40, the method further includes:
Carrying out semantic relevance analysis on the information inquiry result through a preset deep learning interactive model; detecting semantic consistency according to semantic relevance analysis results; and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
It can be understood that, after the information query result is obtained in the above manner, semantic relevance analysis can be performed on it through the preset deep learning interactive model, and the semantic consistency of each piece of data in the information query result is then judged according to the semantic relevance analysis result. If the semantic consistency detection passes, the information query result is used directly as the target information query result. If it does not pass, the data with a large semantic difference is removed from the information query result to obtain the target information query result, or the process returns to the initial step and queries the information again; this embodiment does not limit which is used.
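The filtering branch of this check can be sketched as follows. The scoring function here is a simple token-overlap (Jaccard) measure used purely as a stand-in for the preset deep learning interactive model, and the threshold value, function names, and sample strings are all hypothetical.

```python
from typing import Callable, List


def jaccard_relevance(query: str, candidate: str) -> float:
    """Stand-in relevance score: token overlap between query and candidate."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def filter_consistent(
    query: str,
    results: List[str],
    score: Callable[[str, str], float] = jaccard_relevance,
    threshold: float = 0.3,  # hypothetical cut-off
) -> List[str]:
    """Keep results whose relevance to the query reaches the threshold."""
    return [r for r in results if score(query, r) >= threshold]


query = "how to open the wind box"
results = ["open the wind box guide", "tomorrow weather forecast"]
target = filter_consistent(query, results)
print(target)
```

Because the scorer is passed in as a parameter, a trained model that scores a (query, candidate) pair jointly could be substituted without changing the filtering logic.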
It can be appreciated that, after the target information query result is determined, it can be displayed in the query interface so that the user can conveniently view it. Referring to FIG. 3, FIG. 3 is an overall flowchart of the generalized query. In this scheme, a search engine uses the query text and the query vector to query the previously established indexes, and the deep learning interactive model judges semantic consistency on the results returned by the query, so that the resulting matches are both well generalized and accurate.
In a specific implementation, for example, assume that the query data is "how to open the box surrounded by wind in XX game" and that the candidate set to be matched contains the data "box surrounded by a ring of wind in XX game". In the original rule-matching scenario, it is very difficult to recall one of these two texts from the other, and even difficult to know that the candidate set contains data that could satisfy the query. After processing by this scheme, "how to open the box surrounded by wind in XX game" can retrieve "box surrounded by a ring of wind in XX game", and the computed semantic relevance is very high, so the requirements of users are better met.
It should be noted that, for scenarios in which certain special results must be recalled by exact matching, this scheme optimizes exact matching and rule matching from the perspective of retrieval and semantic computation. It uses keyword retrieval and vector retrieval, together with a semantic model, to achieve accurate synonymous matching online. It can make maximal use of the data, is applicable to all similar scenarios, and greatly improves the online effect.
It can be understood that this scheme can be applied to the onebox recall system of a search engine. After the generalization method is used, onebox recall in the top 3 search results increases by 3.5%, with a net gain of roughly 1% or more, and for key types such as selected abstracts, 13% of recalled queries come from the generalized query method, so the generalized query effect is greatly improved.
Further, in order to achieve a better semantic relevance analysis effect, so that the target information query result is more accurate, before the semantic relevance analysis is performed on the information query result through the preset deep learning interactive model, the method further comprises:
acquiring second training data; and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
It should be appreciated that the second training data may be sample training data related to semantics, and the training data may be mined to train the initial deep learning model to obtain a deep learning interactive model with semantic relevance analysis functionality.
In this embodiment, when a query instruction is received, query data is determined according to the query instruction; a query text and a query vector are determined according to the query data; keyword retrieval is performed according to the query text, and vector retrieval is performed according to the query vector; and an information query result is determined by combining the keyword retrieval result and the vector retrieval result. In this scheme, keyword retrieval and vector retrieval each produce a retrieval result, and the information query result is obtained by combining the two. The query result is thereby generalized, so that the data is fully utilized, the queried information is generalized more efficiently, the information query effect is improved, and the requirements of users are better met.
In an embodiment, as shown in fig. 4, a second embodiment of the information query method of the present invention is provided based on the first embodiment, and the step S30 includes:
step S301, performing keyword retrieval according to the query text and the preset keyword index, and performing vector retrieval according to the query vector and the preset vector index.
It should be appreciated that the preset keyword index and the preset vector index may be generated in advance, a library may be built based on the preset keyword index and the preset vector index, and then the search engine may be configured. In the information query process, after the query text and the query vector are determined, the query text and the query vector can be input into a search engine, the query text can be subjected to keyword search through a preset keyword index in the search engine, and meanwhile, the query vector can also be subjected to vector search through a preset vector index in the search engine.
The keyword search and the vector search in this embodiment are not in a fixed sequence, and the keyword search and the vector search may be performed simultaneously, or the keyword search may be performed first and then the vector search may be performed, or the vector search may be performed first and then the keyword search may be performed.
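The order independence described above can be illustrated with a minimal sketch. The candidate data, the token-overlap keyword match, the dot-product vector match, the threshold, and all function names below are assumptions made for illustration only; the embodiment does not prescribe them.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy candidate set: id -> (text, vector). All values are illustrative.
CANDIDATES = {
    "doc1": ("weather tomorrow", (1.0, 0.0)),
    "doc2": ("forecast for the next day", (0.9, 0.1)),
    "doc3": ("game walkthrough", (0.0, 1.0)),
}


def keyword_search(query_text):
    """Return ids of candidates that share at least one token with the query text."""
    tokens = set(query_text.lower().split())
    return sorted(cid for cid, (text, _) in CANDIDATES.items()
                  if tokens & set(text.split()))


def vector_search(query_vector, threshold=0.8):
    """Return ids whose dot product with the query vector reaches the threshold."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sorted(cid for cid, (_, vec) in CANDIDATES.items()
                  if dot(query_vector, vec) >= threshold)


def retrieve_concurrently(query_text, query_vector):
    """Run both retrievals at the same time on two worker threads."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        kw_future = pool.submit(keyword_search, query_text)
        vec_future = pool.submit(vector_search, query_vector)
        return kw_future.result(), vec_future.result()


def retrieve_sequentially(query_text, query_vector, keyword_first=True):
    """Run the two retrievals one after the other, in either order."""
    if keyword_first:
        kw = keyword_search(query_text)
        vec = vector_search(query_vector)
    else:
        vec = vector_search(query_vector)
        kw = keyword_search(query_text)
    return kw, vec
```

Because the two retrievals do not depend on each other, all three execution orders return the same pair of results in this sketch; only latency differs.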
Further, in order to generate the preset keyword index and the preset vector index more efficiently and to satisfy the requirements of keyword matching and vector matching, before step S301, the method further includes:
acquiring sample data of various service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
It should be understood that sample data of multiple service types may be obtained, a candidate set to be matched is generated according to the sample data of multiple service types, and then an offline vector corresponding to the candidate set to be matched is generated through a preset deep learning representation model. And then generating a preset keyword index according to the candidate set to be matched, generating a preset vector index according to the offline vector, and then establishing a search engine to call the preset keyword index and the preset vector index.
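The offline index generation described above might be sketched as follows. The `embed` function is a hypothetical hashed bag-of-words stand-in for the preset deep learning representation model, and the de-duplication step, the inverted-index layout, and all names are likewise assumptions rather than details given in this embodiment.

```python
import math
import zlib
from collections import defaultdict

DIM = 8  # Illustrative vector size; a real representation model would define its own.


def embed(text):
    """Hypothetical stand-in for the preset deep learning representation model."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_candidate_set(samples_by_service):
    """Flatten sample data of several service types into one de-duplicated list."""
    seen, candidates = set(), []
    for service, texts in samples_by_service.items():
        for text in texts:
            if text not in seen:
                seen.add(text)
                candidates.append({"id": len(candidates), "service": service, "text": text})
    return candidates


def build_keyword_index(candidates):
    """Inverted index: token -> set of candidate ids containing that token."""
    index = defaultdict(set)
    for cand in candidates:
        for token in cand["text"].lower().split():
            index[token].add(cand["id"])
    return dict(index)


def build_vector_index(candidates):
    """Offline vectors: candidate id -> embedding of the candidate text."""
    return {cand["id"]: embed(cand["text"]) for cand in candidates}


samples = {
    "weather": ["weather tomorrow"],
    "games": ["box surrounded by wind", "weather tomorrow"],
}
candidates = build_candidate_set(samples)
keyword_index = build_keyword_index(candidates)
vector_index = build_vector_index(candidates)
```

In this sketch both indexes are built from the same candidate set, so a candidate id returned by either retrieval path refers to the same entry.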
Further, in order to achieve a better vectorization effect, before the generating the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, the method further includes:
acquiring first training data; and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
It should be appreciated that the first training data may be sample training data related to vectors, and the training data may be mined to train the initial deep learning model to obtain a deep learning representation model with a vectorization function. The first training data and the second training data may be the same, different, or partially the same, which is not limited in this embodiment.
In this embodiment, a preset keyword index and a preset vector index are generated in advance, a search engine is configured, keyword retrieval is performed according to the query text and the preset keyword index, and vector retrieval is performed according to the query vector and the preset vector index. When a scenario requiring matching arises, the query data is used to search the constructed system, so that query generalization is performed using the approach of a search engine.
In an embodiment, as shown in fig. 5, a third embodiment of the information query method according to the present invention is provided based on the first embodiment or the second embodiment, and in this embodiment, the description is given based on the first embodiment, and the step S20 includes:
Step S201, performing a requirement recognition process and a vectorization process according to the query data, respectively.
It should be appreciated that after determining the query data corresponding to the query instruction, in order to determine the query text and the query vector, a requirement recognition process and a vectorization process may be performed according to the query data, respectively, and then the query text and the query vector may be determined according to the two processing results, respectively.
It can be understood that, in order to achieve a better processing effect, requirement recognition processing can be performed on the query data through a requirement recognition technology, and vectorization processing can be performed on the query data through a preset deep learning representation model. The preset deep learning representation model has a vectorization function, where vectorization refers to converting text into vectors; after the query data is determined, it can be vectorized through the preset deep learning representation model to obtain the query vector. The requirement recognition technology may provide text recognition and text normalization functions and may be selected according to actual requirements, which is not limited in this embodiment; after the query data is determined, text recognition and text normalization can be performed on it through the requirement recognition technology to obtain the query text.
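As a minimal sketch of step S201, the two kinds of processing can be applied to the same query data independently. The normalization rules in `recognize_requirement` and the hashed bag-of-words `vectorize` function are hypothetical placeholders for the requirement recognition technology and the preset deep learning representation model, neither of which is specified in detail here.

```python
import math
import re
import zlib


def recognize_requirement(query_data):
    """Hypothetical normalization: lowercase, drop punctuation, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", query_data.lower())
    return " ".join(text.split())


def vectorize(query_data, dim=8):
    """Hypothetical stand-in for the preset deep learning representation model."""
    vec = [0.0] * dim
    for token in query_data.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def prepare_query(query_data):
    """Apply both kinds of processing to the same query data."""
    return recognize_requirement(query_data), vectorize(query_data)


query_text, query_vector = prepare_query("  What is the   Weather, tomorrow?? ")
```

Here `query_text` would feed the keyword retrieval and `query_vector` the vector retrieval.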
Step S202, determining a query text according to the requirement recognition processing result, and determining a query vector according to the vectorization processing result.
It can be understood that after the requirement recognition processing and the vectorization processing are performed on the query data, the query text can be determined according to the requirement recognition processing result, and the query vector can be determined according to the vectorization processing result. Keyword retrieval is then performed according to the query text and the preset keyword index, and vector retrieval is performed according to the query vector and the preset vector index, to obtain the keyword retrieval result and the vector retrieval result. The target retrieval information is then generated by combining the first retrieval information corresponding to the keyword retrieval result and the second retrieval information corresponding to the vector retrieval result, and the information query result is generated according to the target retrieval information. In order to improve semantic relevance and to ensure query accuracy while realizing query generalization, semantic relevance analysis can be performed on the information query result to obtain the target information query result. The method is therefore suitable for all scenarios that require accurate text matching, makes full use of the data, and performs query generalization more efficiently and accurately.
In this embodiment, the requirement recognition processing and the vectorization processing are performed according to the query data, the query text is determined according to the requirement recognition processing result, and the query vector is determined according to the vectorization processing result, so that the query text and the query vector can be determined respectively, then the keyword retrieval and the vector retrieval are performed simultaneously, and the accuracy of the query result is improved on the basis that query generalization can be realized by matching with the semantic model.
In addition, the embodiment of the invention also provides a storage medium, wherein the storage medium stores an information inquiry program, and the information inquiry program realizes the steps of the information inquiry method when being executed by a processor.
Because the storage medium adopts all the technical schemes of all the embodiments, the storage medium has at least all the beneficial effects brought by the technical schemes of the embodiments, and the description is omitted here.
In addition, referring to fig. 6, an embodiment of the present invention further provides an information query apparatus, where the information query apparatus includes:
and the query data module 10 is used for determining query data according to the query instruction when the query instruction is received.
It should be noted that, the information query in this embodiment may include, but is not limited to, a text query, and may also include other types of information queries, which are not limited in this embodiment, and in this embodiment, a text query is taken as an example.
It should be noted that, in this embodiment, the query data refers to query condition data input by the user, where the query data may include but is not limited to text data, and may also include other types of data, and in this embodiment, text data is taken as an example for illustration.
It should be appreciated that, when an information query is desired, a user may enter query data in a query interface and then click a query button to send a query instruction to the computer device. When the computer device receives the query instruction, it may determine the corresponding query data according to the query instruction, and then determine an information query result according to the query data.
In a specific implementation, for example, assuming that the user wants to query tomorrow's weather, the query data is "what is the weather tomorrow" as input by the user, and the final information query result may be "tomorrow will be cloudy, turning clear".
A text vector module 20 for determining a query text and a query vector from the query data.
It should be noted that, unlike the prior art, the scheme determines two search results according to the query text and the query vector in two different ways, and then combines the two search results to obtain an information query result, so that the information query result is generalized to obtain a wider information query result, and the requirements of users are better met.
It should be appreciated that the demand recognition process and the vectorization process may be performed on the query data, respectively, to obtain the query text and the query vector.
It can be understood that when a search is needed, the query data can be vectorized by using the preset deep learning representation model to obtain the query vector, and at the same time the query data can be normalized by using the query requirement recognition technology to obtain the query text.
The information retrieval module 30 is configured to perform keyword retrieval according to the query text, and perform vector retrieval according to the query vector.
It should be noted that, a preset keyword index and a preset vector index may be generated in advance according to the candidate set to be matched, after the query text and the query vector are determined in the above manner, keyword retrieval may be performed according to the query text and the preset keyword index, so as to select data related to the query text from the candidate set to be matched, and obtain a keyword retrieval result. And vector retrieval can be performed according to the query vector and a preset vector index, so that data related to the query vector is selected from the candidate set to be matched, and a vector retrieval result is obtained.
The query result module 40 is configured to determine an information query result by combining the keyword search result and the vector search result.
It should be understood that the first search information corresponding to the query data may be determined according to the keyword search result, and the second search information corresponding to the query data may be determined according to the vector search result, where the first search information refers to selecting data related to the query text from the candidate set to be matched, and the second search information refers to selecting data related to the query vector from the candidate set to be matched.
It is understood that after the first search information and the second search information are obtained, the target search information may be generated according to the first search information and the second search information, and then the information query result may be generated according to the target search information.
It can be understood that, in order to improve the generation efficiency of the target search information, the first search information and the second search information may be merged to determine the union of the first search information and the second search information, and the target search information is then generated from this union; that is, the target search information includes the search information from both the first search information and the second search information.
In a specific implementation, for example, assuming that the first search information determined by the keyword search includes information A, information B, and information C, and assuming that the second search information determined by the vector search includes information D and information E, the target search information may include information A, information B, information C, information D, and information E.
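The merging in the example above can be sketched as an order-preserving union. The function name and the choice to drop items returned by both retrieval paths are assumptions; the embodiment only states that the two sets of search information are combined.

```python
def merge_retrieval_info(first_info, second_info):
    """Combine keyword and vector retrieval information, keeping first-seen order."""
    merged, seen = [], set()
    for item in list(first_info) + list(second_info):
        if item not in seen:  # Assumption: an item found by both paths is kept once.
            seen.add(item)
            merged.append(item)
    return merged


target_info = merge_retrieval_info(["A", "B", "C"], ["D", "E"])
overlapping = merge_retrieval_info(["A", "B"], ["B", "C"])
```

With the inputs from the example, `target_info` contains A, B, C, D, and E; the second call shows how an overlapping item would be handled under the stated assumption.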
It can be understood that after the information query result is determined, the information query result can be displayed in the query interface, so that the user can conveniently know the information query result, the generalized query result can be provided for the user, and the query requirement of the user is met.
Further, after the generalized query is performed, some data in the query result may deviate considerably from the query data. To avoid this and to improve the accuracy of the information query result on the basis of the generalized query, the information query device further comprises a semantic relevance analysis module, which is used for performing semantic relevance analysis on the information query result through a preset deep learning interactive model; detecting semantic consistency according to the semantic relevance analysis result; and obtaining a target information query result according to the detection result, and displaying the target information query result.
It can be understood that after the information query result is obtained in the above manner, semantic relevance analysis can be performed on the information query result through the preset deep learning interactive model, and the semantic consistency of each piece of data in the information query result is then judged according to the semantic relevance analysis result. If the semantic consistency detection is passed, the information query result is used directly as the target information query result. If the semantic consistency detection is not passed, the data with a larger semantic difference is removed from the information query result to obtain the target information query result, or the process returns to the initial step to perform the information query again, which is not limited in this embodiment.
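The consistency check described above might look like the following sketch. A Jaccard token overlap is used here purely as a placeholder for the preset deep learning interactive model, and the threshold value and function names are assumptions, not values taken from this embodiment.

```python
def relevance_score(query, candidate):
    """Placeholder for the interactive model: Jaccard overlap of the token sets."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    return len(q & c) / len(q | c) if (q | c) else 0.0


def filter_by_semantic_consistency(query, results, threshold=0.3):
    """Keep results whose relevance score reaches the (assumed) threshold.

    If every result passes, the list is returned unchanged; otherwise the
    entries with a larger semantic difference are removed.
    """
    return [r for r in results if relevance_score(query, r) >= threshold]


query = "how to open the box surrounded by wind in xx game"
results = [
    "box surrounded by a circle of wind in xx game",
    "weather tomorrow",
]
target_results = filter_by_semantic_consistency(query, results)
```

In this sketch the unrelated entry is dropped while the paraphrased candidate is kept; a trained interactive model would replace the token-overlap score.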
It can be appreciated that after the target information query result is determined, it can be displayed in the query interface so that the user can conveniently view it. Referring to fig. 3, which is an overall flowchart of the generalized query: in this scheme, the query text and the query vector are used to query the previously established indexes through a search engine, and the semantic consistency of the returned results is judged by the deep learning interactive model, so that the obtained matching results perform well in both generalization and accuracy.
In a specific implementation, for example, assume that the query data is "how to open the box surrounded by wind in XX game" and that the candidate set to be matched contains "box surrounded by a circle of wind in XX game". Under the original rule-matching scheme, these two texts are very difficult to recall as a match, and it is even difficult to know that the candidate set to be matched contains data that can satisfy the query. After processing by this scheme, the query "how to open the box surrounded by wind in XX game" can retrieve "box surrounded by a circle of wind in XX game", and the computed semantic relevance is very high, so that the requirements of users are better met.
It should be noted that, for recall scenarios in which certain special results must be matched exactly, the scheme improves on exact matching and rule matching by drawing on the ideas of retrieval and semantic computation: keyword retrieval and vector retrieval are used together with a semantic model to realize accurate synonymous matching online. The scheme makes maximal use of the data, is applicable to all similar scenarios, and greatly improves the online effect.
It can be understood that the scheme can be applied to the onebox recall system of a search engine. After the generalization method is used, onebox recall within the top 3 search results increases by 3.5%, the net gain is about 1% or more, and for key types such as selected abstracts, 13% of the recalled queries come from the generalized query method, so that the generalized query effect is greatly improved.
Further, in order to achieve a better semantic relevance analysis effect and make the target information query result more accurate, the model training module is further used for acquiring second training data; and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
It should be appreciated that the second training data may be sample training data related to semantics, and the training data may be mined to train the initial deep learning model to obtain a deep learning interactive model with semantic relevance analysis functionality.
In the embodiment, when a query instruction is received, query data is determined according to the query instruction; determining a query text and a query vector according to the query data; keyword retrieval is carried out according to the query text, and vector retrieval is carried out according to the query vector; and determining an information query result by combining the keyword search result and the vector search result. According to the scheme, after the search results are obtained by utilizing the keyword search and the vector search respectively, the information query results are obtained by synthesizing the search results, so that the query results can be generalized, and the purpose of generalized query is realized, so that the data is fully utilized, the queried information is more efficiently generalized, the information query effect is improved, and the requirements of users are better met.
In an embodiment, the information retrieval module 30 is further configured to perform keyword retrieval according to the query text and a preset keyword index, and perform vector retrieval according to the query vector and a preset vector index.
In an embodiment, the information query device further includes an index generating module, configured to obtain sample data of multiple service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
In an embodiment, the index generating module is further configured to generate an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model; and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
In an embodiment, the information query device further includes a model training module, configured to obtain first training data; and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
In one embodiment, the information retrieval module 30 is further configured to configure a retrieval engine according to a preset keyword index and a preset vector index; inputting the query text and the query vector into the search engine; and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
In an embodiment, the text vector module 20 is further configured to perform a requirement recognition process and a vectorization process according to the query data, respectively; and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
In one embodiment, the text vector module 20 is further configured to perform a requirement recognition process on the query data through a requirement recognition technology; and vectorizing the query data through a preset deep learning representation model.
In an embodiment, the query result module 40 is further configured to determine first search information corresponding to the query data according to the keyword search result; determining second retrieval information corresponding to the query data according to the vector retrieval result; generating target retrieval information according to the first retrieval information and the second retrieval information; and generating an information inquiry result according to the target retrieval information.
In an embodiment, the query result module 40 is further configured to combine the first search information and the second search information to determine a set of the first search information and the second search information; and to generate target retrieval information according to the set of the first retrieval information and the second retrieval information.
In an embodiment, the information query device further includes a semantic relevance analysis module, configured to perform semantic relevance analysis on the information query result through a preset deep learning interactive model; detecting semantic consistency according to semantic relevance analysis results; and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
In an embodiment, the model training module is further configured to obtain second training data; and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
Other embodiments or specific implementation methods of the information query apparatus of the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, and of course may also be implemented by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product stored in a computer-readable storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a smart device (which may be a mobile phone, a computer, an information query device, or a network information query device, etc.) to perform the methods according to the embodiments of the present invention.
The foregoing description covers only the preferred embodiments of the present invention and is not intended to limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of protection of the present invention.
The invention discloses A1, an information query method, which comprises the following steps:
when a query instruction is received, determining query data according to the query instruction;
determining a query text and a query vector according to the query data;
keyword retrieval is carried out according to the query text, and vector retrieval is carried out according to the query vector;
and determining an information query result by combining the keyword search result and the vector search result.
A2, the information query method according to A1, wherein the keyword retrieval is performed according to the query text, and the vector retrieval is performed according to the query vector, and the method comprises the following steps:
and carrying out keyword retrieval according to the query text and a preset keyword index, and carrying out vector retrieval according to the query vector and a preset vector index.
A3, the information query method as described in A2, wherein before the keyword retrieval is performed according to the query text and the preset keyword index and the vector retrieval is performed according to the query vector and the preset vector index, the method further comprises:
Acquiring sample data of various service types;
generating a candidate set to be matched according to the sample data;
and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
A4, the information query method as described in A3, wherein the generating a preset keyword index and a preset vector index according to the candidate set to be matched includes:
generating an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model;
and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
A5, the information query method as described in A4, before generating the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, further includes:
acquiring first training data;
and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
A6, the information query method as set forth in A2, wherein the keyword search is performed according to the query text and a preset keyword index, and the vector search is performed according to the query vector and a preset vector index, and the method comprises:
configuring a search engine according to a preset keyword index and a preset vector index;
Inputting the query text and the query vector into the search engine;
and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
A7, the information query method of any one of A1 to A6, wherein the determining a query text and a query vector according to the query data comprises:
carrying out demand identification processing and vectorization processing according to the query data respectively;
and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
A8, the information query method as described in A7, wherein the performing the requirement identification process and the vectorization process according to the query data respectively includes:
performing demand recognition processing on the query data through a demand recognition technology;
and vectorizing the query data through a preset deep learning representation model.
A9, the information query method according to any one of A1 to A6, wherein the determining the information query result by combining the keyword search result and the vector search result includes:
determining first retrieval information corresponding to the query data according to the keyword retrieval result;
Determining second retrieval information corresponding to the query data according to the vector retrieval result;
generating target retrieval information according to the first retrieval information and the second retrieval information;
and generating an information inquiry result according to the target retrieval information.
A10, the information query method of A9, the generating target search information according to the first search information and the second search information, includes:
combining the first search information and the second search information to determine a set of the first search information and the second search information;
generating target retrieval information according to the set of the first retrieval information and the second retrieval information.
A11, the information query method according to any one of A1 to A6, wherein after the information query result is determined by combining the keyword search result and the vector search result, the method further comprises:
carrying out semantic relevance analysis on the information inquiry result through a preset deep learning interactive model;
detecting semantic consistency according to semantic relevance analysis results;
and obtaining a target information inquiry result according to the detection result, and displaying the target information inquiry result.
A12, the information query method as described in A11, before the semantic relevance analysis is performed on the information query result through a preset deep learning interactive model, further comprises:
Acquiring second training data;
and training the initial deep learning model according to the second training data to obtain a preset deep learning interactive model.
The invention also discloses B13, an information query device, wherein the information query device comprises:
the query data module is used for determining query data according to the query instruction when the query instruction is received;
the text vector module is used for determining a query text and a query vector according to the query data;
the information retrieval module is used for carrying out keyword retrieval according to the query text and carrying out vector retrieval according to the query vector;
and the query result module is used for determining information query results by combining the keyword search results and the vector search results.
B14, the information query device as described in B13, wherein the information retrieval module is further configured to perform keyword retrieval according to the query text and a preset keyword index, and perform vector retrieval according to the query vector and a preset vector index.
B15, the information inquiry apparatus of B14, the information inquiry apparatus further comprising:
the index generation module is used for acquiring sample data of various service types; generating a candidate set to be matched according to the sample data; and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
B16, the information query device as described in B15, wherein the index generation module is further configured to generate an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model; and to generate a preset keyword index according to the candidate set to be matched, and a preset vector index according to the offline vector.
B17, the information inquiry apparatus of B16, the information inquiry apparatus further comprising:
the model training module is used for acquiring first training data; and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
B18, the information query device as described in B14, wherein the information retrieval module is further configured to configure a retrieval engine according to a preset keyword index and a preset vector index; inputting the query text and the query vector into the search engine; and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
The invention also discloses C19, information query equipment comprising: a memory, a processor, and an information query program stored in the memory and executable on the processor, wherein the information query program, when executed by the processor, implements the information query method described above.
The invention also discloses D20, a storage medium storing an information query program, wherein the information query program, when executed by a processor, implements the information query method described above.
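The index generation and retrieval modules of B14 to B18 can be sketched as follows, under stated assumptions. The function names (`embed`, `build_indexes`, `keyword_search`, `vector_search`), the 16-dimensional hashed character-bigram vector, and the in-memory index structures are hypothetical; `embed` only stands in for the preset deep learning representation model, and a production search engine would use its own index formats.

```python
import math
from collections import defaultdict


def embed(text, dim=16):
    """Hypothetical stand-in for the representation model.

    Counts character bigrams into `dim` buckets with a deterministic hash
    (Python's built-in str hash is randomised per process) and L2-normalises.
    """
    vec = [0.0] * dim
    padded = f" {text.lower()} "
    for i in range(len(padded) - 1):
        bigram = padded[i:i + 2]
        idx = sum(ord(ch) * (31 ** k) for k, ch in enumerate(bigram)) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_indexes(candidates):
    """Build a keyword (inverted) index and a vector index from the candidate set."""
    keyword_index = defaultdict(set)
    vector_index = []
    for doc_id, text in enumerate(candidates):
        for token in text.lower().split():
            keyword_index[token].add(doc_id)
        vector_index.append(embed(text))  # offline vector for this candidate
    return dict(keyword_index), vector_index


def keyword_search(keyword_index, query_text):
    """Rank candidates by the number of query tokens they contain."""
    hits = defaultdict(int)
    for token in query_text.lower().split():
        for doc_id in keyword_index.get(token, ()):
            hits[doc_id] += 1
    return sorted(hits, key=lambda d: (-hits[d], d))


def vector_search(vector_index, query_vector, top_k=3):
    """Rank candidates by cosine similarity (vectors are already normalised)."""
    scores = [
        (sum(a * b for a, b in zip(query_vector, vec)), doc_id)
        for doc_id, vec in enumerate(vector_index)
    ]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scores[:top_k]]


if __name__ == "__main__":
    docs = ["reset router password", "weather forecast today", "change wifi password"]
    kw_index, vec_index = build_indexes(docs)
    print(keyword_search(kw_index, "router password"))
    print(vector_search(vec_index, embed("reset router password"), top_k=1))
```

Both indexes are built once, offline, from the same candidate set, so that at query time the retrieval module only needs the query text and the query vector.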

Claims (10)

1. An information query method, characterized in that the information query method comprises:
when a query instruction is received, determining query data according to the query instruction;
determining a query text and a query vector according to the query data;
performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector;
and determining an information query result by combining the keyword search result and the vector search result.
2. The information query method as claimed in claim 1, wherein said performing keyword search based on said query text and performing vector search based on said query vector comprises:
and carrying out keyword retrieval according to the query text and a preset keyword index, and carrying out vector retrieval according to the query vector and a preset vector index.
3. The information query method of claim 2, wherein before the keyword retrieval is performed according to the query text and the preset keyword index, and the vector retrieval is performed according to the query vector and the preset vector index, the method further comprises:
acquiring sample data of various service types;
generating a candidate set to be matched according to the sample data;
and generating a preset keyword index and a preset vector index according to the candidate set to be matched.
4. The information query method of claim 3, wherein the generating a preset keyword index and a preset vector index according to the candidate set to be matched comprises:
generating an offline vector corresponding to the candidate set to be matched through a preset deep learning representation model;
and generating a preset keyword index according to the candidate set to be matched, and generating a preset vector index according to the offline vector.
5. The information query method of claim 4, wherein before generating the offline vector corresponding to the candidate set to be matched through the preset deep learning representation model, the method further comprises:
acquiring first training data;
and training the initial deep learning model according to the first training data to obtain a preset deep learning representation model.
6. The information query method of claim 2, wherein the performing keyword retrieval according to the query text and a preset keyword index and performing vector retrieval according to the query vector and a preset vector index comprises:
configuring a search engine according to a preset keyword index and a preset vector index;
inputting the query text and the query vector into the search engine;
and calling a preset keyword index through the search engine to perform keyword search on the query text, and calling a preset vector index through the search engine to perform vector search on the query vector.
7. The information query method of any one of claims 1 to 6, wherein said determining query text and query vectors from said query data comprises:
carrying out demand identification processing and vectorization processing according to the query data respectively;
and determining a query text according to the demand recognition processing result, and determining a query vector according to the vectorization processing result.
8. An information inquiry apparatus, characterized in that the information inquiry apparatus includes:
the query data module is used for determining query data according to the query instruction when the query instruction is received;
the text vector module is used for determining a query text and a query vector according to the query data;
the information retrieval module is used for carrying out keyword retrieval according to the query text and carrying out vector retrieval according to the query vector;
and the query result module is used for determining information query results by combining the keyword search results and the vector search results.
9. An information inquiry apparatus, characterized in that the information inquiry apparatus includes: a memory, a processor and an information query program stored on the memory and executable on the processor, which when executed by the processor implements the information query method of any one of claims 1 to 7.
10. A storage medium having stored thereon an information inquiry program which, when executed by a processor, implements the information inquiry method according to any one of claims 1 to 7.
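The overall flow of claims 1 and 7 can be sketched as below. The claims do not state how the keyword search result and the vector search result are combined, so reciprocal rank fusion is used here purely as an assumed merging strategy; the function names, the constant `k=60`, and the callables passed to `answer_query` are likewise hypothetical.

```python
def reciprocal_rank_fusion(keyword_hits, vector_hits, k=60):
    """Merge two ranked lists of document ids.

    Each list contributes 1 / (k + rank) per document; documents found by
    both retrievals accumulate both contributions. This is an assumed
    combination rule, not one taken from the claims.
    """
    scores = {}
    for hits in (keyword_hits, vector_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


def answer_query(query_data, to_text, to_vector, keyword_search, vector_search):
    """Run the four steps of claim 1 with caller-supplied components."""
    query_text = to_text(query_data)      # demand recognition result (claim 7)
    query_vector = to_vector(query_data)  # vectorization result (claim 7)
    keyword_hits = keyword_search(query_text)
    vector_hits = vector_search(query_vector)
    return reciprocal_rank_fusion(keyword_hits, vector_hits)


if __name__ == "__main__":
    merged = answer_query(
        "  Reset Router Password ",
        to_text=lambda q: q.strip().lower(),
        to_vector=lambda q: [float(len(q))],
        keyword_search=lambda text: ["d1", "d2", "d3"],
        vector_search=lambda vec: ["d2", "d4", "d1"],
    )
    print(merged)
```

Passing the four components in as callables keeps the sketch independent of any particular model or search engine; only the order of the steps follows the claims.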
CN202111647549.6A 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium Pending CN116414941A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111647549.6A CN116414941A (en) 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111647549.6A CN116414941A (en) 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116414941A 2023-07-11

Family

ID=87054874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111647549.6A Pending CN116414941A (en) 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116414941A (en)

Similar Documents

Publication Publication Date Title
CN109086303B (en) Intelligent conversation method, device and terminal based on machine reading understanding
CN109635273B (en) Text keyword extraction method, device, equipment and storage medium
CN108170859B (en) Voice query method, device, storage medium and terminal equipment
CN110457431A (en) Question answering method, device, computer equipment and storage medium based on knowledge graph
US9268767B2 (en) Semantic-based search system and search method thereof
CN111666399A (en) Intelligent question and answer method and device based on knowledge graph and computer equipment
CN116991977B (en) Domain vector knowledge accurate retrieval method and device based on large language model
CN110795541B (en) Text query method, text query device, electronic equipment and computer readable storage medium
CN111274822A (en) Semantic matching method, device, equipment and storage medium
CN111625638B (en) Question processing method, device, equipment and readable storage medium
CN111400340A (en) Natural language processing method and device, computer equipment and storage medium
CN118113815B (en) Content searching method, related device and medium
CN117521625A (en) Question answering method, question answering device, electronic equipment and medium
CN110795547A (en) Text recognition method and related product
CN113822039A (en) Method and related equipment for mining similar meaning words
CN113343692A (en) Search intention recognition method, model training method, device, medium and equipment
CN115203378B (en) Retrieval enhancement method, system and storage medium based on pre-training language model
CN107368525B (en) Method and device for searching related words, storage medium and terminal equipment
CN116414941A (en) Information query method, device, equipment and storage medium
JP6495206B2 (en) Document concept base generation device, document concept search device, method, and program
CN112308016B (en) Expression image acquisition method and device, electronic equipment and storage medium
CN117931858B (en) Data query method, device, computer equipment and storage medium
CN114242047B (en) Voice processing method and device, electronic equipment and storage medium
CN118132791A (en) Image retrieval method, device, equipment, readable storage medium and product
JP6334491B2 (en) Concept base generation device, concept search device, method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination