CN116414941A - Information query method, device, equipment and storage medium - Google Patents

Information query method, device, equipment and storage medium

Publication number
CN116414941A
Authority
CN
China
Prior art keywords
query
vector
information
retrieval
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111647549.6A
Other languages
Chinese (zh)
Inventor
李灏舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd
Priority to CN202111647549.6A
Publication of CN116414941A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an information query method, device, equipment and storage medium, belonging to the technical field of the Internet. The method includes: when a query instruction is received, determining query data according to the query instruction; determining a query text and a query vector according to the query data; performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector; and combining the keyword retrieval result and the vector retrieval result to determine an information query result. The scheme obtains retrieval results separately through keyword retrieval and vector retrieval and then combines them into the information query result. This generalizes the query result, so that the data is fully used, the queried information is generalized more efficiently, the effect of information query is improved, and user needs are better met.

Description

Information query method, device, equipment and storage medium

Technical Field

The present invention relates to the technical field of the Internet, and in particular to an information query method, device, equipment and storage medium.

Background

In search, recommendation and advertising systems, there are often scenarios in which specific content must be displayed when certain text is precisely hit. Such scenarios usually prioritize accuracy, and the content generally takes effect only on an exact hit. How to generalize the hit text (usually the query in search; referred to collectively as the query below) is a difficult problem.

Existing techniques generally divide queries into types and apply a different method to each type. The offline mining methods commonly used for expansion still take effect online only through an exact hit or an exact rule hit. For example, strategies such as translation models or co-clicks are used to mine a candidate set for a query, algorithms such as semantic matching are then used to control relevance, and finally the mined queries are put online. This approach suffers from poor timeliness, poor scalability and limited generalization ability: the query mapping relationships must be prepared offline, so unseen data still cannot be matched, and different mining means and methods are often adopted for different types, so the approach cannot be applied to the full volume of data.

The above content is only intended to assist in understanding the technical solution of the present invention and does not constitute an admission that it is prior art.

Summary of the Invention

The main purpose of the present invention is to provide an information query method, device, equipment and storage medium, aiming to solve the technical problem of how to make full use of data, generalize the queried information more efficiently, and improve the effect of information query.

To achieve the above purpose, the present invention provides an information query method, which comprises:

when a query instruction is received, determining query data according to the query instruction;

determining a query text and a query vector according to the query data;

performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector;

combining the keyword retrieval result and the vector retrieval result to determine an information query result.
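The four steps above can be sketched as a small orchestration function. This is only an illustrative sketch, not the patented implementation: the function name `hybrid_query` and all five callables passed to it are hypothetical placeholders for the components the text describes.

```python
from typing import Callable, List, Sequence


def hybrid_query(
    query_data: str,
    to_text: Callable[[str], str],
    to_vector: Callable[[str], Sequence[float]],
    keyword_search: Callable[[str], List[str]],
    vector_search: Callable[[Sequence[float]], List[str]],
    combine: Callable[[List[str], List[str]], List[str]],
) -> List[str]:
    """Run the four described steps for one query (all callables are placeholders)."""
    # Step 2: derive both representations from the same query data.
    query_text = to_text(query_data)
    query_vector = to_vector(query_data)
    # Step 3: run the two retrievals independently.
    keyword_hits = keyword_search(query_text)
    vector_hits = vector_search(query_vector)
    # Step 4: combine both result lists into the information query result.
    return combine(keyword_hits, vector_hits)


# Toy usage with stand-in components.
result = hybrid_query(
    "Weather Tomorrow",
    to_text=str.lower,
    to_vector=lambda s: [float(len(s))],
    keyword_search=lambda t: ["doc-1"] if "weather" in t else [],
    vector_search=lambda v: ["doc-1", "doc-2"],
    combine=lambda a, b: a + [x for x in b if x not in a],
)
```

Passing the components in as callables keeps the sketch independent of any particular index or model.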

Optionally, performing keyword retrieval according to the query text and performing vector retrieval according to the query vector comprises:

performing keyword retrieval according to the query text and a preset keyword index, and performing vector retrieval according to the query vector and a preset vector index.

Optionally, before performing keyword retrieval according to the query text and the preset keyword index and performing vector retrieval according to the query vector and the preset vector index, the method further comprises:

obtaining sample data of multiple business types;

generating a candidate set to be matched according to the sample data;

generating the preset keyword index and the preset vector index according to the candidate set to be matched.

Optionally, generating the preset keyword index and the preset vector index according to the candidate set to be matched comprises:

generating, through a preset deep learning representation model, offline vectors corresponding to the candidate set to be matched;

generating the preset keyword index according to the candidate set to be matched, and generating the preset vector index according to the offline vectors.
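As a rough illustration of the offline indexing described above, the following sketch builds an inverted keyword index and a vector index from a small candidate set. Several things here are assumptions rather than anything the source specifies: the `embed` function is a toy hashed bag-of-words stand-in for the preset deep learning representation model, `DIM` is an arbitrary size, and whitespace tokenization is used only to keep the example short (Chinese text would need a proper word segmenter).

```python
import math
import zlib
from collections import defaultdict
from typing import Dict, List, Set, Tuple

DIM = 16  # hypothetical embedding size


def embed(text: str) -> List[float]:
    """Toy stand-in for the representation model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def build_indexes(
    candidates: Dict[str, str],
) -> Tuple[Dict[str, Set[str]], Dict[str, List[float]]]:
    """Return (keyword index: token -> candidate ids, vector index: id -> offline vector)."""
    keyword_index: Dict[str, Set[str]] = defaultdict(set)
    vector_index: Dict[str, List[float]] = {}
    for doc_id, text in candidates.items():
        for token in set(text.lower().split()):
            keyword_index[token].add(doc_id)
        vector_index[doc_id] = embed(text)  # computed offline, once per candidate
    return dict(keyword_index), vector_index


candidates = {
    "w1": "weather tomorrow",
    "w2": "weather forecast this week",
    "n1": "stock news today",
}
keyword_index, vector_index = build_indexes(candidates)
```

Both indexes are derived from the same candidate set, so an id returned by either retrieval path refers to the same underlying candidate.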

Optionally, before generating the offline vectors corresponding to the candidate set to be matched through the preset deep learning representation model, the method further comprises:

obtaining first training data;

training an initial deep learning model according to the first training data to obtain the preset deep learning representation model.

Optionally, performing keyword retrieval according to the query text and the preset keyword index and performing vector retrieval according to the query vector and the preset vector index comprises:

configuring a retrieval engine according to the preset keyword index and the preset vector index;

inputting the query text and the query vector into the retrieval engine;

invoking, through the retrieval engine, the preset keyword index to perform keyword retrieval on the query text, and invoking, through the retrieval engine, the preset vector index to perform vector retrieval on the query vector.

Optionally, determining the query text and the query vector according to the query data comprises:

performing requirement recognition processing and vectorization processing on the query data respectively;

determining the query text according to the requirement recognition processing result, and determining the query vector according to the vectorization processing result.

Optionally, performing requirement recognition processing and vectorization processing on the query data respectively comprises:

performing requirement recognition processing on the query data through a requirement recognition technique;

performing vectorization processing on the query data through the preset deep learning representation model.

Optionally, combining the keyword retrieval result and the vector retrieval result to determine the information query result comprises:

determining first retrieval information corresponding to the query data according to the keyword retrieval result;

determining second retrieval information corresponding to the query data according to the vector retrieval result;

generating target retrieval information according to the first retrieval information and the second retrieval information;

generating the information query result according to the target retrieval information.

Optionally, generating the target retrieval information according to the first retrieval information and the second retrieval information comprises:

combining the first retrieval information and the second retrieval information to determine a set of the first retrieval information and the second retrieval information;

generating the target retrieval information according to the set of the first retrieval information and the second retrieval information.

Optionally, after combining the keyword retrieval result and the vector retrieval result to determine the information query result, the method further comprises:

performing semantic relevance analysis on the information query result through a preset deep learning interactivity model;

checking semantic consistency according to the semantic relevance analysis result;

obtaining a target information query result according to the check result, and displaying the target information query result.

Optionally, before performing semantic relevance analysis on the information query result through the preset deep learning interactivity model, the method further comprises:

obtaining second training data;

training an initial deep learning model according to the second training data to obtain the preset deep learning interactivity model.

In addition, to achieve the above purpose, the present invention further provides an information query device, which comprises:

a query data module, configured to determine query data according to a query instruction when the query instruction is received;

a text vector module, configured to determine a query text and a query vector according to the query data;

an information retrieval module, configured to perform keyword retrieval according to the query text and perform vector retrieval according to the query vector;

a query result module, configured to combine the keyword retrieval result and the vector retrieval result to determine an information query result.

Optionally, the information retrieval module is further configured to perform keyword retrieval according to the query text and a preset keyword index, and perform vector retrieval according to the query vector and a preset vector index.

Optionally, the information query device further comprises:

an index generation module, configured to obtain sample data of multiple business types; generate a candidate set to be matched according to the sample data; and generate the preset keyword index and the preset vector index according to the candidate set to be matched.

Optionally, the index generation module is further configured to generate, through a preset deep learning representation model, offline vectors corresponding to the candidate set to be matched; generate the preset keyword index according to the candidate set to be matched; and generate the preset vector index according to the offline vectors.

Optionally, the information query device further comprises:

a model training module, configured to obtain first training data, and train an initial deep learning model according to the first training data to obtain the preset deep learning representation model.

Optionally, the information retrieval module is further configured to configure a retrieval engine according to the preset keyword index and the preset vector index; input the query text and the query vector into the retrieval engine; invoke, through the retrieval engine, the preset keyword index to perform keyword retrieval on the query text; and invoke, through the retrieval engine, the preset vector index to perform vector retrieval on the query vector.

In addition, to achieve the above purpose, the present invention further provides information query equipment, which comprises a memory, a processor, and an information query program stored in the memory and executable on the processor. When executed by the processor, the information query program implements the information query method described above.

In addition, to achieve the above purpose, the present invention further provides a storage medium on which an information query program is stored. When executed by a processor, the information query program implements the information query method described above.

In the information query method provided by the present invention, when a query instruction is received, query data is determined according to the query instruction; a query text and a query vector are determined according to the query data; keyword retrieval is performed according to the query text, and vector retrieval is performed according to the query vector; and the keyword retrieval result and the vector retrieval result are combined to determine an information query result. The scheme obtains retrieval results separately through keyword retrieval and vector retrieval and then combines them into the information query result. This generalizes the query result, so that the data is fully used, the queried information is generalized more efficiently, the effect of information query is improved, and user needs are better met.

Brief Description of the Drawings

Fig. 1 is a schematic structural diagram of the information query equipment in the hardware operating environment involved in an embodiment of the present invention;

Fig. 2 is a schematic flowchart of the first embodiment of the information query method of the present invention;

Fig. 3 is an overall flowchart of the generalized query in an embodiment of the information query method of the present invention;

Fig. 4 is a schematic flowchart of the second embodiment of the information query method of the present invention;

Fig. 5 is a schematic flowchart of the third embodiment of the information query method of the present invention;

Fig. 6 is a schematic diagram of the functional modules of the first embodiment of the information query device of the present invention.

The realization of the purpose, the functional features and the advantages of the present invention are further described below in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.

Referring to Fig. 1, Fig. 1 is a schematic structural diagram of the information query equipment in the hardware operating environment involved in an embodiment of the present invention.

As shown in Fig. 1, the information query equipment may include a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 is used for connection and communication among these components. The user interface 1003 may include a display and an input unit such as keys; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.

Those skilled in the art will understand that the structure shown in Fig. 1 does not limit the information query equipment, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

As shown in Fig. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module and an information query program.

In the information query equipment shown in Fig. 1, the network interface 1004 is mainly used to connect to an external network and exchange data with other network devices; the user interface 1003 is mainly used to connect to user equipment and exchange data with it; and the equipment of the present invention invokes, through the processor 1001, the information query program stored in the memory 1005 and executes the information query method provided by the embodiments of the present invention.

Based on the above hardware structure, embodiments of the information query method of the present invention are proposed.

Referring to Fig. 2, Fig. 2 is a schematic flowchart of the first embodiment of the information query method of the present invention.

In the first embodiment, the information query method includes:

Step S10: when a query instruction is received, determine query data according to the query instruction.

It should be noted that the method of this embodiment may be executed by information query equipment, which may be a computer device with data processing capability or any other device that can realize the same or similar functions. This embodiment places no limitation on this; information query equipment is used as the example here.

It should be noted that the information query in this embodiment may include, but is not limited to, text query, and may also include other types of information query. This embodiment places no limitation on this; text query is used as the example here.

It should be noted that the query data in this embodiment refers to the query condition data entered by the user. The query data may include, but is not limited to, text data, and may also include other types of data. This embodiment places no limitation on this; text data is used as the example here.

It should be understood that a user who needs to query information can enter the query data in a query interface and then click the query button to send a query instruction to the computer device. On receiving the query instruction, the computer device can determine the corresponding query data according to the query instruction and then determine the information query result according to the query data.

In a specific implementation, for example, if the user wants to know what the weather will be like tomorrow, the query data is the user's input "What will the weather be like tomorrow", and the information query result finally obtained may be "Sunny turning cloudy tomorrow".

Step S20: determine a query text and a query vector according to the query data.

It should be noted that, unlike the prior art, this scheme obtains two retrieval results in two different ways, one from the query text and one from the query vector, and then combines the two retrieval results into the information query result. The information query result is thereby generalized, yielding a broader result that better meets user needs.

It should be understood that requirement recognition processing and vectorization processing can each be performed on the query data to obtain the query text and the query vector respectively.

It can be understood that, when retrieval is needed, the preset deep learning representation model can be used to vectorize the query data to obtain the query vector, while a query requirement recognition technique can be used to normalize the query data to obtain the query text.
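The split into a query text and a query vector might be sketched as follows. The source does not say what the normalization performed by requirement recognition consists of, so the rules in `normalize_query` (Unicode folding, lowercasing, punctuation removal, whitespace collapsing) are assumptions chosen for illustration, and `encoder` is a hypothetical placeholder for the preset deep learning representation model.

```python
import re
import unicodedata
from typing import Callable, List, Tuple


def normalize_query(raw: str) -> str:
    """Assumed normalization: fold width variants, lowercase, drop punctuation."""
    text = unicodedata.normalize("NFKC", raw)  # e.g. full-width forms to ASCII
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)       # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


def prepare_query(
    raw: str, encoder: Callable[[str], List[float]]
) -> Tuple[str, List[float]]:
    """Derive the query text and the query vector from the same query data."""
    return normalize_query(raw), encoder(raw)


# Toy usage: the lambda stands in for a real sentence encoder.
query_text, query_vector = prepare_query(
    "  WEATHER\u3000Tomorrow\uff1f ", encoder=lambda s: [float(len(s))]
)
```

Here the vector is computed from the raw query data rather than from the normalized text, matching the description that both processing steps operate on the query data.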

Step S30: perform keyword retrieval according to the query text, and perform vector retrieval according to the query vector.

It should be noted that the preset keyword index and the preset vector index can be generated in advance from the candidate set to be matched. After the query text and the query vector are determined as described above, keyword retrieval can be performed according to the query text and the preset keyword index to select, from the candidate set to be matched, the data related to the query text, which gives the keyword retrieval result. At the same time, vector retrieval can be performed according to the query vector and the preset vector index to select, from the candidate set to be matched, the data related to the query vector, which gives the vector retrieval result.
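A minimal sketch of the two retrieval paths is given below, assuming an inverted index for keywords and brute-force cosine similarity for vectors. The source names neither the matching rule nor the similarity measure, so the AND semantics, the cosine score, the `top_k` parameter and the tiny hand-written indexes are all illustrative assumptions; a production system would typically use an approximate nearest-neighbor index instead of a full scan.

```python
import math
from typing import Dict, List, Sequence, Set


def keyword_retrieve(query_text: str, keyword_index: Dict[str, Set[str]]) -> List[str]:
    """Return ids of candidates containing every query token (assumed AND semantics)."""
    tokens = query_text.split()
    if not tokens:
        return []
    postings = [keyword_index.get(token, set()) for token in tokens]
    return sorted(set.intersection(*postings))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_retrieve(
    query_vector: Sequence[float],
    vector_index: Dict[str, Sequence[float]],
    top_k: int = 2,
) -> List[str]:
    """Return the ids of the top_k candidates by cosine similarity (full scan)."""
    ranked = sorted(
        vector_index.items(),
        key=lambda item: cosine(query_vector, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Hand-written toy indexes over three candidates.
keyword_index = {"weather": {"w1", "w2"}, "tomorrow": {"w1"}, "forecast": {"w2"}}
vector_index = {"w1": [1.0, 0.0], "w2": [0.8, 0.6], "n1": [0.0, 1.0]}

keyword_hits = keyword_retrieve("weather tomorrow", keyword_index)
vector_hits = vector_retrieve([1.0, 0.1], vector_index, top_k=2)
```

In this toy data the keyword path returns only the exact match, while the vector path also surfaces a nearby candidate, which is the generalization effect the text describes.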

Step S40: combine the keyword retrieval result and the vector retrieval result to determine the information query result.

It should be understood that first retrieval information corresponding to the query data can be determined according to the keyword retrieval result, and second retrieval information corresponding to the query data can be determined according to the vector retrieval result. The first retrieval information is the data selected from the candidate set to be matched that is related to the query text, and the second retrieval information is the data selected from the candidate set to be matched that is related to the query vector.

It can be understood that, after the first retrieval information and the second retrieval information are obtained, target retrieval information can be generated according to them, and the information query result can then be generated according to the target retrieval information.

It can be understood that, to generate the target retrieval information efficiently, the first retrieval information and the second retrieval information can be combined to determine their set, and the target retrieval information can then be generated from this set. The target retrieval information thus contains the retrieval information obtained in both ways.

In a specific implementation, for example, if the first retrieval information determined through keyword retrieval includes information A, information B and information C, and the second retrieval information determined through vector retrieval includes information D and information E, the target retrieval information may include information A, information B, information C, information D and information E.
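The combination in this example amounts to taking the union of the two result lists. A minimal sketch follows; the function name `merge_results` is hypothetical, and keeping keyword hits first and dropping repeats are assumptions, since the source only states that the two sets are combined.

```python
from typing import Iterable, List


def merge_results(first: Iterable[str], second: Iterable[str]) -> List[str]:
    """Union of both result lists, keeping first-seen order and dropping duplicates."""
    seen = set()
    merged: List[str] = []
    for item in list(first) + list(second):
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


# The example from the text: keyword hits A, B, C and vector hits D, E.
target = merge_results(["A", "B", "C"], ["D", "E"])
# An overlapping case: a candidate found by both paths appears once.
overlap = merge_results(["A", "B"], ["B", "C"])
```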

可以理解的是,在确定信息查询结果之后,可在查询界面中对信息查询结果进行展示,使用户方便知晓信息查询结果,能够为用户提供泛化的查询结果,符合用户的查询需求。It can be understood that after the information query result is determined, the information query result can be displayed in the query interface, so that the user can easily know the information query result, and can provide the user with generalized query results to meet the user's query needs.

Further, after a generalized query is performed, the query result may contain some data that differs considerably from the query data. Therefore, to avoid this and to improve the accuracy of the information query result on top of the generalized query, the method further includes, after step S40:

Performing semantic relevance analysis on the information query result through a preset deep learning interaction model; checking semantic consistency according to the semantic relevance analysis result; and obtaining a target information query result according to the check result, and displaying the target information query result.

It can be understood that, after the information query result is obtained in the above manner, semantic relevance analysis can be performed on it through the preset deep learning interaction model, and the semantic consistency of each item of data in the information query result is then judged according to the semantic relevance analysis result. If the semantic consistency check passes, the information query result is used directly as the target information query result. If the check fails, the data in the information query result whose semantics differ considerably are removed to obtain the target information query result, or the process returns to the initial step and the information query is performed again; this embodiment imposes no limitation on this.
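The consistency check above can be sketched as threshold filtering over a relevance score. The sketch below is an assumption-laden illustration: `toy_relevance` (character-bigram Jaccard overlap) is only a stand-in for the preset deep learning interaction model, and the `threshold` value and function names are hypothetical.

```python
from typing import Callable, List, Set, Tuple


def char_bigrams(text: str) -> Set[str]:
    """Return the set of adjacent character pairs in the text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def toy_relevance(query: str, candidate: str) -> float:
    """Stand-in scorer in [0, 1]; a trained interaction model would go here."""
    a, b = char_bigrams(query), char_bigrams(candidate)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_by_semantics(
    query: str,
    results: List[str],
    scorer: Callable[[str, str], float] = toy_relevance,
    threshold: float = 0.3,  # hypothetical cut-off, not taken from the source
) -> Tuple[List[str], List[str]]:
    """Split results into (kept, removed) according to the relevance score."""
    kept: List[str] = []
    removed: List[str] = []
    for item in results:
        (kept if scorer(query, item) >= threshold else removed).append(item)
    return kept, removed


kept, removed = filter_by_semantics("abcd", ["abcd", "wxyz"])
print(kept, removed)
```

If `removed` is empty, the check has passed and the information query result can be used unchanged; otherwise `kept` plays the role of the target information query result.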

It can be understood that, after the target information query result is determined, it can be displayed in the query interface so that the user can readily see it. Referring to FIG. 3, which is the overall flowchart of the generalized query, this solution uses the query text and the query vector to search, through the retrieval engine, the indexes built beforehand, and then uses the deep learning interaction model to judge the semantic consistency of the returned results. The resulting matches perform well in both generalization and accuracy.

In a specific implementation, for example, suppose the query data is "how to open the treasure chest surrounded by wind in game XX" and the candidate set to be matched contains the data "the treasure chest surrounded by a ring of wind in game XX". Under the original rule-matching approach, it is very difficult to recall this candidate for this query, or even to know that the candidate set contains data that could satisfy it. With this solution, the query "how to open the treasure chest surrounded by wind in game XX" can retrieve "the treasure chest surrounded by a ring of wind in game XX", and the computed semantic relevance is very high, which better meets the user's needs.

It should be noted that this solution targets scenarios in which certain special-type results must be recalled through precise matching, and replaces the approach of exact matching plus rule matching with an approach of retrieval plus semantic computation. Using keyword retrieval and vector retrieval together with a semantic model, it achieves accurate synonymous matching online, makes maximal use of the data, is applicable to all similar scenarios, and considerably improves online results.

It can be understood that this solution can be applied in the onebox recall system of a search engine. With this generalization method, the top 3 search results recall 3.5% more oneboxes, with a net gain of about 1% or more. For key types such as featured snippets, 13% of the recalled queries come from this generalized query method, which greatly improves the generalized query effect.

Further, to achieve a better semantic relevance analysis effect and make the target information query result more accurate, before the semantic relevance analysis is performed on the information query result through the preset deep learning interaction model, the method further includes:

Acquiring second training data; and training an initial deep learning model on the second training data to obtain the preset deep learning interaction model.

It should be understood that the second training data may be sample training data related to semantics. Training data can be mined and used to train the initial deep learning model, yielding a deep learning interaction model capable of semantic relevance analysis.

In this embodiment, when a query instruction is received, query data is determined according to the query instruction; a query text and a query vector are determined according to the query data; keyword retrieval is performed according to the query text, and vector retrieval is performed according to the query vector; and the information query result is determined by combining the keyword retrieval result and the vector retrieval result. Because this solution obtains retrieval results separately through keyword retrieval and vector retrieval and then combines them into the information query result, the query result is generalized. The data is thus fully used, the retrieved information is generalized relatively efficiently, the information query effect is improved, and the user's needs are better met.

In one embodiment, as shown in FIG. 4, a second embodiment of the information query method of the present invention is proposed based on the first embodiment, in which step S30 includes:

Step S301: performing keyword retrieval according to the query text and a preset keyword index, and performing vector retrieval according to the query vector and a preset vector index.

It should be understood that the preset keyword index and the preset vector index can be generated in advance, a database can be built from them, and the retrieval engine can then be configured. During an information query, once the query text and the query vector are determined, they can be input into the retrieval engine, which performs keyword retrieval on the query text through the preset keyword index and, at the same time, performs vector retrieval on the query vector through the preset vector index.

There is no fixed order between keyword retrieval and vector retrieval in this embodiment: the two may be performed simultaneously, keyword retrieval may be performed first and vector retrieval second, or vector retrieval first and keyword retrieval second. This embodiment imposes no limitation on this.
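Since the two retrievals are independent, one way to run them simultaneously is to submit both to a small thread pool. The sketch below assumes the retrieval engine exposes two callables; `demo_keyword_search` and `demo_vector_search` are hypothetical stand-ins, not an interface defined in the source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def retrieve_concurrently(
    query_text: str,
    query_vector: Sequence[float],
    keyword_search: Callable[[str], List[str]],
    vector_search: Callable[[Sequence[float]], List[str]],
) -> Tuple[List[str], List[str]]:
    """Run keyword retrieval and vector retrieval side by side."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        keyword_future = pool.submit(keyword_search, query_text)
        vector_future = pool.submit(vector_search, query_vector)
        # The two lookups are independent, so completion order does not matter.
        return keyword_future.result(), vector_future.result()


# Hypothetical stand-ins for the retrieval engine's two lookups.
def demo_keyword_search(text: str) -> List[str]:
    return ["A", "B", "C"] if "chest" in text else []


def demo_vector_search(vector: Sequence[float]) -> List[str]:
    return ["D", "E"] if vector else []


keyword_hits, vector_hits = retrieve_concurrently(
    "treasure chest", [0.1, 0.9], demo_keyword_search, demo_vector_search
)
print(keyword_hits, vector_hits)
```

Calling the two functions one after the other, in either order, would yield the same pair of results, which matches the statement that the order is not restricted.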

Further, to generate the preset keyword index and the preset vector index more efficiently and to meet the needs of keyword matching and vector matching, the method further includes, before step S301:

Acquiring sample data of multiple business types; generating a candidate set to be matched from the sample data; and generating the preset keyword index and the preset vector index from the candidate set to be matched.

It should be understood that sample data of multiple business types can be acquired and used to generate the candidate set to be matched, and offline vectors corresponding to the candidate set are then generated through a preset deep learning representation model. The preset keyword index is generated from the candidate set to be matched, the preset vector index is generated from the offline vectors, and a retrieval engine is then set up to call the preset keyword index and the preset vector index.
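The offline indexing step can be illustrated with a toy inverted index and a list of unit-length vectors. Everything here is a sketch under stated assumptions: `toy_embed` (hashed character bigrams) merely stands in for the preset deep learning representation model, whitespace tokenization is a simplification that would not suit Chinese text, and the function names are hypothetical.

```python
import math
import zlib
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def toy_embed(text: str, dim: int = 64) -> List[float]:
    """Stand-in for the representation model: hashed bigram counts, L2-normalized."""
    vec = [0.0] * dim
    for i in range(len(text) - 1):
        # crc32 is deterministic across runs, unlike Python's built-in str hash.
        vec[zlib.crc32(text[i:i + 2].encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def build_indexes(
    candidates: List[str],
) -> Tuple[Dict[str, Set[int]], List[Tuple[int, List[float]]]]:
    """Build a keyword (inverted) index and a vector index over the candidate set."""
    keyword_index: Dict[str, Set[int]] = defaultdict(set)
    vector_index: List[Tuple[int, List[float]]] = []
    for doc_id, text in enumerate(candidates):
        for token in text.lower().split():  # simplistic tokenizer (assumption)
            keyword_index[token].add(doc_id)
        vector_index.append((doc_id, toy_embed(text)))  # offline vector
    return dict(keyword_index), vector_index


candidate_set = ["open the treasure chest", "weather tomorrow"]
keyword_index, vector_index = build_indexes(candidate_set)
print(sorted(keyword_index), len(vector_index))
```

A production system would typically delegate both structures to a dedicated retrieval engine; the point of the sketch is only that both indexes are derived from the same candidate set.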

Further, to achieve a better vectorization effect, before the offline vectors corresponding to the candidate set to be matched are generated through the preset deep learning representation model, the method further includes:

Acquiring first training data; and training an initial deep learning model on the first training data to obtain the preset deep learning representation model.

It should be understood that the first training data may be sample training data related to vectors. Training data can be mined and used to train the initial deep learning model, yielding a deep learning representation model with a vectorization function. The first training data and the second training data may be the same, different, or partly the same and partly different; this embodiment imposes no limitation on this.

In this embodiment, the preset keyword index and the preset vector index are generated in advance and the retrieval engine is configured; keyword retrieval is performed according to the query text and the preset keyword index, and vector retrieval is performed according to the query vector and the preset vector index. A text index and a vector index are thus built from the candidate set to be matched, and exact matching and rule matching are replaced by retrieval. When a scenario that requires matching arises, the query data is used to search the system so built, applying the approach of a search engine to generalize the query.

In one embodiment, as shown in FIG. 5, a third embodiment of the information query method of the present invention is proposed based on the first embodiment or the second embodiment. In this embodiment, the description is based on the first embodiment, and step S20 includes:

Step S201: performing requirement recognition processing and vectorization processing separately on the query data.

It should be understood that, after the query data corresponding to the query instruction is determined, requirement recognition processing and vectorization processing can each be performed on the query data, and the query text and the query vector are then determined from the respective processing results.

It can be understood that, to achieve a better processing effect, requirement recognition processing can be performed on the query data through a requirement recognition technique, and vectorization processing can be performed on the query data through the preset deep learning representation model. The preset deep learning representation model has a vectorization function, where vectorization means converting text into a vector; after the query data is determined, it can be vectorized through this model to obtain the query vector. The requirement recognition technique has text recognition and text normalization functions, and the specific technique can be chosen and configured according to actual needs; this embodiment imposes no limitation on this. After the query data is determined, text recognition and text normalization can be performed on it through the requirement recognition technique to obtain the query text.

Step S202: determining the query text according to the requirement recognition processing result, and determining the query vector according to the vectorization processing result.

It can be understood that, after requirement recognition processing and vectorization processing are each performed on the query data, the query text can be determined from the requirement recognition processing result and the query vector from the vectorization processing result. Keyword retrieval is then performed according to the query text and the preset keyword index, and vector retrieval according to the query vector and the preset vector index, yielding a keyword retrieval result and a vector retrieval result. The first retrieval information corresponding to the keyword retrieval result and the second retrieval information corresponding to the vector retrieval result are combined to generate the target retrieval information, from which the information query result is generated. To improve semantic relevance and to keep the query accurate while generalizing it, semantic relevance analysis can be performed on the information query result to obtain the target information query result. The approach is therefore suitable for all scenarios that require precise text matching, makes full use of the data, and generalizes queries relatively efficiently and accurately.

In this embodiment, requirement recognition processing and vectorization processing are each performed on the query data, the query text is determined according to the requirement recognition processing result, and the query vector is determined according to the vectorization processing result. The query text and the query vector can thus be determined separately, keyword retrieval and vector retrieval can then be performed simultaneously, and, together with the semantic model, the accuracy of the query result is improved on top of query generalization.

In addition, an embodiment of the present invention further provides a storage medium storing an information query program which, when executed by a processor, implements the steps of the information query method described above.

Because the storage medium adopts all the technical solutions of all the above embodiments, it has at least all the beneficial effects brought by those technical solutions, which are not repeated here.

In addition, referring to FIG. 6, an embodiment of the present invention further provides an information query device, which includes:

A query data module 10, configured to determine query data according to a query instruction when the query instruction is received.

It should be noted that the information query in this embodiment may include, but is not limited to, a text query, and may also include other types of information query; this embodiment imposes no limitation on this. In this embodiment, a text query is taken as an example.

It should be noted that the query data in this embodiment refers to the query condition data input by the user. The query data may include, but is not limited to, text data, and may also include other types of data; this embodiment imposes no limitation on this. In this embodiment, text data is taken as an example.

It should be understood that, when a user needs to query information, the user can enter the query data in the query interface and then click the query button to send a query instruction to the computer device. On receiving the query instruction, the computer device can determine the corresponding query data according to the instruction and then determine the information query result according to the query data.

In a specific implementation, for example, if the user wants to know what the weather will be like tomorrow, the query data is the user's input "what will the weather be like tomorrow", and the information query result finally obtained may be "sunny turning cloudy tomorrow".

A text vector module 20, configured to determine a query text and a query vector according to the query data.

It should be noted that, unlike the prior art, this solution obtains two retrieval results in two different ways, one from the query text and one from the query vector, and then combines the two retrieval results to obtain the information query result. The information query result is thereby generalized, yielding broader results that better meet the user's needs.

It should be understood that requirement recognition processing and vectorization processing can each be performed on the query data to obtain the query text and the query vector.

It can be understood that, when retrieval is needed, the query data can be vectorized with the preset deep learning representation model to obtain the query vector, and, at the same time, the query data can be normalized with a query requirement recognition technique to obtain the query text.

An information retrieval module 30, configured to perform keyword retrieval according to the query text and to perform vector retrieval according to the query vector.

It should be noted that the preset keyword index and the preset vector index can be generated in advance from the candidate set to be matched. After the query text and the query vector are determined in the above manner, keyword retrieval can be performed according to the query text and the preset keyword index to select, from the candidate set to be matched, the data related to the query text, giving the keyword retrieval result. Likewise, vector retrieval can be performed according to the query vector and the preset vector index to select, from the candidate set to be matched, the data related to the query vector, giving the vector retrieval result.
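The two lookups can be sketched against small in-memory indexes. This is only an illustration: the index layouts (a token-to-document-ids mapping and a list of id/vector pairs), the whitespace tokenizer, the brute-force cosine scan, and the `top_k` parameter are all assumptions, not details given in the source.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def keyword_retrieve(query_text: str, keyword_index: Dict[str, Set[int]]) -> List[int]:
    """Return ids of candidates sharing at least one token with the query text."""
    hits: Set[int] = set()
    for token in query_text.lower().split():
        hits |= keyword_index.get(token, set())
    return sorted(hits)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_retrieve(
    query_vector: Sequence[float],
    vector_index: List[Tuple[int, Sequence[float]]],
    top_k: int = 2,  # hypothetical result size
) -> List[int]:
    """Return ids of the top_k candidates by cosine similarity (brute-force scan)."""
    ranked = sorted(vector_index, key=lambda e: cosine(query_vector, e[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]


demo_keyword_index = {"chest": {0, 1}, "weather": {2}}
demo_vector_index = [(0, [1.0, 0.0]), (1, [0.9, 0.1]), (2, [0.0, 1.0])]
print(keyword_retrieve("open chest", demo_keyword_index))
print(vector_retrieve([1.0, 0.0], demo_vector_index))
```

At larger scale the brute-force scan would normally be replaced by an approximate nearest-neighbor index, but the selection logic stays the same.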

A query result module 40, configured to determine the information query result by combining the keyword retrieval result and the vector retrieval result.

It should be understood that the first retrieval information corresponding to the query data can be determined according to the keyword retrieval result, and the second retrieval information corresponding to the query data can be determined according to the vector retrieval result. The first retrieval information refers to the data selected from the candidate set to be matched that is related to the query text, and the second retrieval information refers to the data selected from the candidate set to be matched that is related to the query vector.

It can be understood that, after the first retrieval information and the second retrieval information are obtained, target retrieval information can be generated from them, and the information query result can then be generated from the target retrieval information.

It can be understood that, to generate the target retrieval information more efficiently, the first retrieval information and the second retrieval information can be combined into a single set, and the target retrieval information is generated from that set. That is, the target retrieval information contains the retrieval information obtained in both ways, through keyword retrieval and through vector retrieval.

In a specific implementation, for example, if the first retrieval information determined through keyword retrieval includes information A, information B, and information C, and the second retrieval information determined through vector retrieval includes information D and information E, then the target retrieval information may include information A, B, C, D, and E.

It can be understood that, after the information query result is determined, it can be displayed in the query interface so that the user can readily see it. The user is thereby provided with generalized query results that meet the user's query needs.

Further, after a generalized query is performed, the query result may contain some data that differs considerably from the query data. Therefore, to avoid this and to improve the accuracy of the information query result on top of the generalized query, the information query device further includes a semantic relevance analysis module, configured to perform semantic relevance analysis on the information query result through a preset deep learning interaction model; check semantic consistency according to the semantic relevance analysis result; and obtain a target information query result according to the check result and display the target information query result.

It can be understood that, after the information query result is obtained in the above manner, semantic relevance analysis can be performed on it through the preset deep learning interaction model, and the semantic consistency of each item of data in the information query result is then judged according to the semantic relevance analysis result. If the semantic consistency check passes, the information query result is used directly as the target information query result. If the check fails, the data in the information query result whose semantics differ considerably are removed to obtain the target information query result, or the process returns to the initial step and the information query is performed again; this embodiment imposes no limitation on this.

It can be understood that, after the target information query result is determined, it can be displayed in the query interface so that the user can readily see it. Referring to FIG. 3, which is the overall flowchart of the generalized query, this solution uses the query text and the query vector to search, through the retrieval engine, the indexes built beforehand, and then uses the deep learning interaction model to judge the semantic consistency of the returned results. The resulting matches perform well in both generalization and accuracy.

In a specific implementation, for example, suppose the query data is "how to open the treasure chest surrounded by wind in game XX" and the candidate set to be matched contains the data "the treasure chest surrounded by a ring of wind in game XX". Under the original rule-matching approach, it is very difficult to recall this candidate for this query, or even to know that the candidate set contains data that could satisfy it. With this solution, the query "how to open the treasure chest surrounded by wind in game XX" can retrieve "the treasure chest surrounded by a ring of wind in game XX", and the computed semantic relevance is very high, which better meets the user's needs.

It should be noted that this solution targets scenarios in which certain special-type results must be recalled through precise matching, and replaces the approach of exact matching plus rule matching with an approach of retrieval plus semantic computation. Using keyword retrieval and vector retrieval together with a semantic model, it achieves accurate synonymous matching online, makes maximal use of the data, is applicable to all similar scenarios, and considerably improves online results.

It can be understood that this solution can be applied in the onebox recall system of a search engine. With this generalization method, the top 3 search results recall 3.5% more oneboxes, with a net gain of about 1% or more. For key types such as featured snippets, 13% of the recalled queries come from this generalized query method, which greatly improves the generalized query effect.

Further, to achieve a better semantic relevance analysis effect and make the target information query result more accurate, the model training module is further configured to acquire second training data, and to train an initial deep learning model on the second training data to obtain the preset deep learning interaction model.

It should be understood that the second training data may be sample training data related to semantics. Training data can be mined and used to train the initial deep learning model, yielding a deep learning interaction model capable of semantic relevance analysis.

In this embodiment, when a query instruction is received, query data is determined according to the query instruction; a query text and a query vector are determined according to the query data; keyword retrieval is performed according to the query text, and vector retrieval is performed according to the query vector; and the information query result is determined by combining the keyword retrieval result and the vector retrieval result. Because this solution obtains retrieval results separately through keyword retrieval and vector retrieval and then combines them into the information query result, the query result is generalized. The data is thus fully used, the retrieved information is generalized relatively efficiently, the information query effect is improved, and the user's needs are better met.
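The division of the device into modules 10, 20, 30, and 40 can be sketched as a small pipeline object. The class name `InformationQueryDevice`, its field names, and the lambda stand-ins below are hypothetical; the sketch only mirrors the order in which the modules hand data to one another.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class InformationQueryDevice:
    # Each field stands in for one module of the device (names are illustrative).
    determine_query_data: Callable[[Dict[str, str]], str]                     # module 10
    to_text_and_vector: Callable[[str], Tuple[str, Sequence[float]]]          # module 20
    retrieve: Callable[[str, Sequence[float]], Tuple[List[str], List[str]]]   # module 30
    combine: Callable[[List[str], List[str]], List[str]]                      # module 40

    def handle(self, instruction: Dict[str, str]) -> List[str]:
        """Pass a query instruction through the four modules in order."""
        query_data = self.determine_query_data(instruction)
        query_text, query_vector = self.to_text_and_vector(query_data)
        keyword_hits, vector_hits = self.retrieve(query_text, query_vector)
        return self.combine(keyword_hits, vector_hits)


# Trivial stand-ins so the sketch runs end to end.
device = InformationQueryDevice(
    determine_query_data=lambda instruction: instruction["input"],
    to_text_and_vector=lambda data: (data.strip().lower(), [float(len(data))]),
    retrieve=lambda text, vector: (["A", "B"], ["B", "D"]),
    combine=lambda kw, vec: list(dict.fromkeys(kw + vec)),  # ordered union
)
result = device.handle({"input": "  Treasure Chest "})
print(result)
```

Keeping each module behind a callable makes it straightforward to swap in an optional semantic relevance analysis step after `combine`.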

In one embodiment, the information retrieval module 30 is further configured to perform keyword retrieval according to the query text and a preset keyword index, and to perform vector retrieval according to the query vector and a preset vector index.

In one embodiment, the information query device further includes an index generation module, configured to acquire sample data of multiple business types; generate a candidate set to be matched from the sample data; and generate the preset keyword index and the preset vector index from the candidate set to be matched.

In one embodiment, the index generation module is further configured to generate offline vectors corresponding to the candidate set to be matched through a preset deep learning representation model; generate the preset keyword index from the candidate set to be matched; and generate the preset vector index from the offline vectors.

In one embodiment, the information query device further includes a model training module, configured to acquire first training data, and to train an initial deep learning model on the first training data to obtain the preset deep learning representation model.

在一实施例中,所述信息检索模块30,还用于根据预设关键词索引和预设向量索引配置检索引擎;将所述查询文本和所述查询向量输入所述检索引擎;通过所述检索引擎调用预设关键词索引对所述查询文本进行关键词检索,并通过所述检索引擎调用预设向量索引对所述查询向量进行向量检索。In one embodiment, the information retrieval module 30 is further configured to configure a retrieval engine according to a preset keyword index and a preset vector index; input the query text and the query vector into the retrieval engine; through the The retrieval engine invokes a preset keyword index to perform keyword retrieval on the query text, and the retrieval engine invokes a preset vector index to perform vector retrieval on the query vector.
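A retrieval engine configured with both indexes might look like the following sketch. The class name `RetrievalEngine`, the match-count ranking for keywords, the cosine-similarity ranking for vectors, and the `top_k` parameter are all assumptions made for illustration; the patent does not specify a scoring scheme.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class RetrievalEngine:
    """Toy engine configured with a keyword index and a vector index."""

    def __init__(self, keyword_index, vector_index):
        self.keyword_index = keyword_index  # token -> set of candidate ids
        self.vector_index = vector_index    # candidate id -> vector

    def keyword_search(self, query_text):
        hits = {}
        for token in query_text.lower().split():
            for doc_id in self.keyword_index.get(token, ()):
                hits[doc_id] = hits.get(doc_id, 0) + 1
        # More matched tokens first; ties broken by id for determinism.
        return sorted(hits, key=lambda d: (-hits[d], d))

    def vector_search(self, query_vector, top_k=2):
        ranked = sorted(
            self.vector_index,
            key=lambda d: (-cosine(query_vector, self.vector_index[d]), d),
        )
        return ranked[:top_k]

    def search(self, query_text, query_vector, top_k=2):
        """Run both retrievals and return the two result lists separately."""
        return self.keyword_search(query_text), self.vector_search(query_vector, top_k)


# Hypothetical hand-written indexes (2-dimensional vectors for readability).
engine = RetrievalEngine(
    keyword_index={"beijing": {"d1", "d2"}, "weather": {"d1"}, "stock": {"d3"}},
    vector_index={"d1": [1.0, 0.0], "d2": [0.8, 0.6], "d3": [0.0, 1.0]},
)
kw, vec = engine.search("beijing weather", [0.6, 0.8])
```

In this toy data the vector search surfaces `d3`, which shares no keyword with the query, which is the kind of generalization the scheme aims for.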

In an embodiment, the text vector module 20 is further configured to perform requirement recognition processing and vectorization processing separately on the query data; determine the query text according to the requirement recognition result; and determine the query vector according to the vectorization result.

In an embodiment, the text vector module 20 is further configured to perform requirement recognition processing on the query data through a requirement recognition technique, and to perform vectorization processing on the query data through the preset deep learning representation model.
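The two query-side steps can be sketched as below. The patent does not say which requirement recognition technique is used, so this sketch assumes a simple filler-word filter (`FILLER_WORDS` is a hypothetical list), and it takes the vectorizer as an injected callable so that any representation model could be plugged in; `toy_vectorize` is only a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical filler words removed during requirement recognition.
FILLER_WORDS = {"please", "show", "me", "the", "a", "for", "i", "want", "to", "find"}


@dataclass
class ProcessedQuery:
    query_text: str            # output of requirement recognition
    query_vector: List[float]  # output of vectorization


def recognize_requirement(query_data: str) -> str:
    """Keep only the tokens that express what the user is looking for."""
    tokens = [t for t in query_data.lower().split() if t not in FILLER_WORDS]
    return " ".join(tokens)


def process_query(query_data: str, vectorize: Callable[[str], List[float]]) -> ProcessedQuery:
    """Run requirement recognition and vectorization on the same query data."""
    return ProcessedQuery(recognize_requirement(query_data), vectorize(query_data))


def toy_vectorize(text: str) -> List[float]:
    """Placeholder for the representation model: word count and character count."""
    return [float(len(text.split())), float(len(text))]


processed = process_query("Please show me the Beijing weather", toy_vectorize)
```

The two branches are independent of each other, so in a real system they could run in parallel before the retrieval engine is called.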

In an embodiment, the query result module 40 is further configured to determine first retrieval information corresponding to the query data according to the keyword retrieval result; determine second retrieval information corresponding to the query data according to the vector retrieval result; generate target retrieval information according to the first retrieval information and the second retrieval information; and generate the information query result according to the target retrieval information.

In an embodiment, the query result module 40 is further configured to combine the first retrieval information and the second retrieval information to determine the set formed by the first retrieval information and the second retrieval information, and to generate the target retrieval information according to that set.
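One reading of this combination step is a union of the two result lists. The sketch below assumes that reading, together with an order-preserving deduplication that lists keyword hits first; the function name `merge_results` and the ordering policy are illustrative assumptions.

```python
def merge_results(first_info, second_info):
    """Order-preserving union of keyword hits and vector hits."""
    merged = []
    seen = set()
    for doc_id in list(first_info) + list(second_info):
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


# Hypothetical retrieval outputs: "d2" is found by both retrievals.
target = merge_results(["d1", "d2"], ["d2", "d3"])
```

Taking the union rather than the intersection keeps candidates that only one of the two retrievals found, which fits the stated goal of generalizing the query results.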

In an embodiment, the information query device further includes a semantic relevance analysis module configured to perform semantic relevance analysis on the information query result through a preset deep learning interaction model; check semantic consistency according to the semantic relevance analysis result; and obtain a target information query result according to the check result and display the target information query result.
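The post-retrieval relevance check can be sketched as scoring each (query, candidate) pair jointly and keeping only candidates that pass a threshold. Here a Jaccard token overlap stands in for the preset deep learning interaction model, and the `threshold` value is an arbitrary assumption; the patent gives neither a scoring function nor a cutoff.

```python
def interaction_score(query, candidate):
    """Stand-in for the interaction model: Jaccard overlap of the token sets."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def filter_by_relevance(query, results, score_fn=interaction_score, threshold=0.2):
    """Keep candidates judged semantically consistent with the query, best first."""
    scored = [(score_fn(query, text), doc_id) for doc_id, text in results.items()]
    kept = [(score, doc_id) for score, doc_id in scored if score >= threshold]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in kept]


# Hypothetical merged query results mapped to their texts.
results = {
    "d1": "weather forecast for beijing",
    "d2": "beijing subway map",
    "d3": "stock market news",
}
final = filter_by_relevance("beijing weather", results)
```

Because `score_fn` is a parameter, a trained model that reads the query and the candidate together could replace the toy scorer without changing the filtering logic.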

In an embodiment, the model training module is further configured to acquire second training data, and to train an initial deep learning model according to the second training data to obtain the preset deep learning interaction model.

For other embodiments or specific implementations of the information query device of the present invention, reference may be made to the method embodiments above; details are not repeated here.

It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a computer-readable storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a smart device (which may be a mobile phone, a computer, information query equipment, network information query equipment, or the like) to execute the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

The invention discloses A1. An information query method, the information query method comprising:

when a query instruction is received, determining query data according to the query instruction;

determining a query text and a query vector according to the query data;

performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector; and

determining an information query result by combining the keyword retrieval result and the vector retrieval result.

A2. The information query method according to A1, wherein performing keyword retrieval according to the query text and performing vector retrieval according to the query vector comprises:

performing keyword retrieval according to the query text and a preset keyword index, and performing vector retrieval according to the query vector and a preset vector index.

A3. The information query method according to A2, further comprising, before performing keyword retrieval according to the query text and the preset keyword index and performing vector retrieval according to the query vector and the preset vector index:

acquiring sample data of multiple business types;

generating a candidate set to be matched according to the sample data; and

generating the preset keyword index and the preset vector index according to the candidate set to be matched.

A4. The information query method according to A3, wherein generating the preset keyword index and the preset vector index according to the candidate set to be matched comprises:

generating offline vectors corresponding to the candidate set to be matched through a preset deep learning representation model; and

generating the preset keyword index according to the candidate set to be matched, and generating the preset vector index according to the offline vectors.

A5. The information query method according to A4, further comprising, before generating the offline vectors corresponding to the candidate set to be matched through the preset deep learning representation model:

acquiring first training data; and

training an initial deep learning model according to the first training data to obtain the preset deep learning representation model.

A6. The information query method according to A2, wherein performing keyword retrieval according to the query text and the preset keyword index and performing vector retrieval according to the query vector and the preset vector index comprises:

configuring a retrieval engine according to the preset keyword index and the preset vector index;

inputting the query text and the query vector into the retrieval engine; and

invoking, through the retrieval engine, the preset keyword index to perform keyword retrieval on the query text, and invoking, through the retrieval engine, the preset vector index to perform vector retrieval on the query vector.

A7. The information query method according to any one of A1 to A6, wherein determining the query text and the query vector according to the query data comprises:

performing requirement recognition processing and vectorization processing separately on the query data; and

determining the query text according to the requirement recognition result, and determining the query vector according to the vectorization result.

A8. The information query method according to A7, wherein performing requirement recognition processing and vectorization processing separately on the query data comprises:

performing requirement recognition processing on the query data through a requirement recognition technique; and

performing vectorization processing on the query data through the preset deep learning representation model.

A9. The information query method according to any one of A1 to A6, wherein determining the information query result by combining the keyword retrieval result and the vector retrieval result comprises:

determining first retrieval information corresponding to the query data according to the keyword retrieval result;

determining second retrieval information corresponding to the query data according to the vector retrieval result;

generating target retrieval information according to the first retrieval information and the second retrieval information; and

generating the information query result according to the target retrieval information.

A10. The information query method according to A9, wherein generating the target retrieval information according to the first retrieval information and the second retrieval information comprises:

combining the first retrieval information and the second retrieval information to determine the set formed by the first retrieval information and the second retrieval information; and

generating the target retrieval information according to the set of the first retrieval information and the second retrieval information.

A11. The information query method according to any one of A1 to A6, further comprising, after determining the information query result by combining the keyword retrieval result and the vector retrieval result:

performing semantic relevance analysis on the information query result through a preset deep learning interaction model;

checking semantic consistency according to the semantic relevance analysis result; and

obtaining a target information query result according to the check result, and displaying the target information query result.

A12. The information query method according to A11, further comprising, before performing semantic relevance analysis on the information query result through the preset deep learning interaction model:

acquiring second training data; and

training an initial deep learning model according to the second training data to obtain the preset deep learning interaction model.

The invention also discloses B13. An information query device, the information query device comprising:

a query data module configured to determine query data according to a query instruction when the query instruction is received;

a text vector module configured to determine a query text and a query vector according to the query data;

an information retrieval module configured to perform keyword retrieval according to the query text and to perform vector retrieval according to the query vector; and

a query result module configured to determine an information query result by combining the keyword retrieval result and the vector retrieval result.

B14. The information query device according to B13, wherein the information retrieval module is further configured to perform keyword retrieval according to the query text and a preset keyword index, and to perform vector retrieval according to the query vector and a preset vector index.

B15. The information query device according to B14, the information query device further comprising:

an index generation module configured to acquire sample data of multiple business types; generate a candidate set to be matched according to the sample data; and generate the preset keyword index and the preset vector index according to the candidate set to be matched.

B16. The information query device according to B15, wherein the index generation module is further configured to generate offline vectors corresponding to the candidate set to be matched through a preset deep learning representation model; generate the preset keyword index according to the candidate set to be matched; and generate the preset vector index according to the offline vectors.

B17. The information query device according to B16, the information query device further comprising:

a model training module configured to acquire first training data, and to train an initial deep learning model according to the first training data to obtain the preset deep learning representation model.

B18. The information query device according to B14, wherein the information retrieval module is further configured to configure a retrieval engine according to the preset keyword index and the preset vector index; input the query text and the query vector into the retrieval engine; and, through the retrieval engine, invoke the preset keyword index to perform keyword retrieval on the query text and invoke the preset vector index to perform vector retrieval on the query vector.

The invention also discloses C19. Information query equipment, the information query equipment comprising: a memory, a processor, and an information query program stored in the memory and executable on the processor, wherein the information query program, when executed by the processor, implements the information query method described above.

The invention also discloses D20. A storage medium storing an information query program, wherein the information query program, when executed by a processor, implements the information query method described above.

Claims (10)

1. An information query method, characterized in that the information query method comprises:
when a query instruction is received, determining query data according to the query instruction;
determining a query text and a query vector according to the query data;
performing keyword retrieval according to the query text, and performing vector retrieval according to the query vector; and
determining an information query result by combining the keyword retrieval result and the vector retrieval result.
2. The information query method of claim 1, wherein performing keyword retrieval according to the query text and performing vector retrieval according to the query vector comprises:
performing keyword retrieval according to the query text and a preset keyword index, and performing vector retrieval according to the query vector and a preset vector index.
3. The information query method of claim 2, further comprising, before performing keyword retrieval according to the query text and the preset keyword index and performing vector retrieval according to the query vector and the preset vector index:
acquiring sample data of multiple business types;
generating a candidate set to be matched according to the sample data; and
generating the preset keyword index and the preset vector index according to the candidate set to be matched.
4. The information query method of claim 3, wherein generating the preset keyword index and the preset vector index according to the candidate set to be matched comprises:
generating offline vectors corresponding to the candidate set to be matched through a preset deep learning representation model; and
generating the preset keyword index according to the candidate set to be matched, and generating the preset vector index according to the offline vectors.
5. The information query method of claim 4, further comprising, before generating the offline vectors corresponding to the candidate set to be matched through the preset deep learning representation model:
acquiring first training data; and
training an initial deep learning model according to the first training data to obtain the preset deep learning representation model.
6. The information query method of claim 2, wherein performing keyword retrieval according to the query text and the preset keyword index and performing vector retrieval according to the query vector and the preset vector index comprises:
configuring a retrieval engine according to the preset keyword index and the preset vector index;
inputting the query text and the query vector into the retrieval engine; and
invoking, through the retrieval engine, the preset keyword index to perform keyword retrieval on the query text, and invoking, through the retrieval engine, the preset vector index to perform vector retrieval on the query vector.
7. The information query method of any one of claims 1 to 6, wherein determining the query text and the query vector according to the query data comprises:
performing requirement recognition processing and vectorization processing separately on the query data; and
determining the query text according to the requirement recognition result, and determining the query vector according to the vectorization result.
8. An information query device, characterized in that the information query device comprises:
a query data module configured to determine query data according to a query instruction when the query instruction is received;
a text vector module configured to determine a query text and a query vector according to the query data;
an information retrieval module configured to perform keyword retrieval according to the query text and to perform vector retrieval according to the query vector; and
a query result module configured to determine an information query result by combining the keyword retrieval result and the vector retrieval result.
9. Information query equipment, characterized in that the information query equipment comprises: a memory, a processor, and an information query program stored in the memory and executable on the processor, wherein the information query program, when executed by the processor, implements the information query method of any one of claims 1 to 7.
10. A storage medium storing an information query program which, when executed by a processor, implements the information query method of any one of claims 1 to 7.
CN202111647549.6A 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium Pending CN116414941A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111647549.6A CN116414941A (en) 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111647549.6A CN116414941A (en) 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116414941A true CN116414941A (en) 2023-07-11

Family

ID=87054874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111647549.6A Pending CN116414941A (en) 2021-12-29 2021-12-29 Information query method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116414941A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination