CN109710732B - Information query method, device, storage medium and electronic equipment - Google Patents

Information query method, device, storage medium and electronic equipment Download PDF

Info

Publication number
CN109710732B
CN109710732B
Authority
CN
China
Prior art keywords
record
knowledge base
preset
vocabulary set
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811379175.2A
Other languages
Chinese (zh)
Other versions
CN109710732A (en)
Inventor
刘嘉伟
董超
崔朝辉
赵立军
张霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp filed Critical Neusoft Corp
Priority to CN201811379175.2A priority Critical patent/CN109710732B/en
Publication of CN109710732A publication Critical patent/CN109710732A/en
Application granted granted Critical
Publication of CN109710732B publication Critical patent/CN109710732B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to an information query method, an information query device, a storage medium and electronic equipment, and belongs to the technical field of information. The method includes: performing word segmentation on an obtained target question to obtain a first vocabulary set, where the first vocabulary set includes the word segmentation result of the target question; performing synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, where the word vector is obtained by training a preset corpus with a preset model; obtaining, according to the second vocabulary set and a preset knowledge base, a matching score between the target question and each record in the knowledge base according to a preset algorithm, where the knowledge base includes at least one record and each record includes a question and the answer corresponding to that question; and determining the answer matched with the target question according to the matching score of each record. The method can effectively utilize an existing knowledge base to provide information query services at the semantic level, and improves the accuracy and coverage of information query.

Description

Information query method, device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of information technologies, and in particular, to an information query method, an information query device, a storage medium, and an electronic device.
Background
With the rapid development of information technologies such as the internet, cloud computing and language processing, artificial intelligence increasingly affects people's daily life. In particular, intelligent question-answering systems can query an existing knowledge base for questions raised by users and provide the corresponding answers. In the prior art, a search engine typically performs single-keyword retrieval over the existing knowledge base and feeds back records that match the keywords as answers, so the query accuracy is not high. Moreover, in many technical fields the existing knowledge base has accumulated a large amount of unstructured historical data that cannot be retrieved directly, so the query coverage is low.
Disclosure of Invention
The present disclosure aims to provide an information query method, an information query device, a storage medium and an electronic device, so as to solve the problems of low information query accuracy and low coverage in the prior art.
In order to achieve the above object, according to a first aspect of embodiments of the present disclosure, there is provided an information query method, including:
the method comprises the steps of obtaining a first vocabulary set by segmenting words of an obtained target problem, wherein the first vocabulary set comprises word segmentation results of the target problem;
performing synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, wherein the word vector is obtained by training a preset corpus by using a preset model;
according to the second vocabulary set and a preset knowledge base, obtaining a matching score of the target question and each record in the knowledge base according to a preset algorithm, wherein the knowledge base comprises at least one record, and each record comprises a question and an answer corresponding to the question;
and determining answers matched with the target questions according to the matching scores of each record.
Optionally, the performing synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set includes:
training the corpus by using a preset word vector generation model to obtain the word vectors;
and performing synonym expansion on the first vocabulary set according to the word vector, preset stop words and professional words in the target field to which the target problem belongs to obtain the second vocabulary set.
Optionally, the obtaining, according to the second vocabulary set and a preset knowledge base, a matching score between the target problem and each record in the knowledge base according to a preset algorithm includes:
training the corpus by using a preset word vector generation model to obtain the word vectors;
performing synonym expansion on each record in the knowledge base according to the word vector, preset stop words and professional words of a target field to which the target problem belongs;
carrying out synonymy sentence expansion on each record in the knowledge base by utilizing a Neural Machine Translation (NMT) algorithm;
and according to the second vocabulary set and the knowledge base, obtaining the matching score of the target problem and each record in the knowledge base according to a preset algorithm.
Optionally, the performing synonymous sentence expansion on each record in the knowledge base by using a neural machine translation (NMT) algorithm includes:
translating a first record in a first language in the knowledge base into an intermediate record in a second language using the NMT algorithm;
translating the intermediate record into a synonymous record expressed in the first language using the NMT algorithm;
and storing the synonymous records into the knowledge base, wherein the first record is any one record in the knowledge base.
Optionally, the determining an answer matched with the target question according to the matching score of each record includes:
arranging the matching scores of each record in descending order from high to low to obtain a score ordering;
selecting answers in the top n records with the highest ranking in the score ranking as answers matched with the target question; or,
when the ratio of the first-ranked matching score to the second-ranked matching score in the score ranking is larger than a preset threshold value, taking an answer in a record corresponding to the first-ranked matching score as an answer matched with the target question;
and when the ratio of the first-ranked matching score to the second-ranked matching score is less than or equal to a preset threshold value, selecting the answer in the top m records with the highest ranking in the score ranking as the answer matched with the target question.
Optionally, the obtaining, according to the second vocabulary set and a preset knowledge base, a matching score between the target problem and each record in the knowledge base according to a preset algorithm includes:
calculating the matching score of the target problem and each record in a knowledge base by using a first calculation formula according to the second vocabulary set and a preset knowledge base;
the first calculation formula includes:
Figure BDA0001871483450000031
wherein d_j is the j-th record in the knowledge base, Score_j is the matching score of d_j, s is the number of words shared by the second vocabulary set and the word segmentation result of d_j, Q is the number of words in the first vocabulary set, t_i is the i-th word in the second vocabulary set, num(d_j) is the number of words in the word segmentation result of d_j, num(t_i) is the number of times t_i appears in d_j, D is the number of records in the knowledge base, and N_i is the number of records in the knowledge base that contain t_i.
According to a second aspect of the embodiments of the present disclosure, there is provided an information query apparatus, the apparatus including:
the word segmentation module is used for segmenting the acquired target problem to acquire a first word set, wherein the first word set comprises a word segmentation result of the target problem;
the expansion module is used for carrying out synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, wherein the word vector is obtained by utilizing a preset model to train a preset corpus;
the scoring module is used for acquiring a matching score between the target question and each record in a knowledge base according to a preset algorithm and the second vocabulary set and the preset knowledge base, wherein the knowledge base comprises at least one record, and each record comprises a question and an answer corresponding to the question;
and the determining module is used for determining answers matched with the target questions according to the matching scores of each record.
Optionally, the expansion module includes:
the first training submodule is used for training the corpus by utilizing a preset word vector generation model so as to obtain the word vectors;
and the first expansion submodule is used for carrying out synonym expansion on the first vocabulary set according to the word vector, preset stop words and professional words in the target field to which the target problem belongs so as to obtain the second vocabulary set.
Optionally, the scoring module includes:
the second training submodule is used for training the corpus by utilizing a preset word vector generation model so as to obtain the word vectors;
a second expansion submodule, configured to perform synonym expansion on each record in the knowledge base according to the word vector, a preset stop word, and a professional word in a target field to which the target problem belongs;
the synonymous sentence expansion submodule is used for performing synonymous sentence expansion on each record in the knowledge base by using a neural machine translation (NMT) algorithm;
and the scoring submodule is used for acquiring the matching score of the target problem and each record in the knowledge base according to a preset algorithm according to the second vocabulary set and the knowledge base.
Optionally, the synonymous sentence expansion submodule is configured to:
translating a first record in a first language in the knowledge base into an intermediate record in a second language using the NMT algorithm;
translating the intermediate record into a synonymous record expressed in the first language using the NMT algorithm;
and storing the synonymous records into the knowledge base, wherein the first record is any one record in the knowledge base.
Optionally, the determining module includes:
the sorting submodule is used for sorting the matching scores of each record in a descending order from high to low so as to obtain a score sorting;
a determining submodule, configured to select answers in the top n records in the ranking order as answers matching the target question; or,
the determining submodule is used for taking an answer in a record corresponding to the first ranked matching score as an answer matched with the target question when the ratio of the first ranked matching score to the second ranked matching score in the score ranking is larger than a preset threshold value;
the determining sub-module is further configured to select answers in the top m records with the highest ranking in the ranking order of scores as answers matching the target question when a ratio of the first ranked matching score to the second ranked matching score is less than or equal to a preset threshold.
Optionally, the scoring module is configured to:
calculating the matching score of the target problem and each record in a knowledge base by using a first calculation formula according to the second vocabulary set and a preset knowledge base;
the first calculation formula includes:
Figure BDA0001871483450000051
wherein d_j is the j-th record in the knowledge base, Score_j is the matching score of d_j, s is the number of words shared by the second vocabulary set and the word segmentation result of d_j, Q is the number of words in the first vocabulary set, t_i is the i-th word in the second vocabulary set, num(d_j) is the number of words in the word segmentation result of d_j, num(t_i) is the number of times t_i appears in d_j, D is the number of records in the knowledge base, and N_i is the number of records in the knowledge base that contain t_i.
According to a third aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements the steps of the information query method provided by the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the information query method provided by the first aspect.
According to the technical scheme, word segmentation is first performed on the obtained target question to obtain a first vocabulary set containing the word segmentation result of the target question. Synonym expansion is then performed on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, where the word vector is obtained by training a preset corpus with a preset model. Next, according to the second vocabulary set and a preset knowledge base, a matching score between the target question and each record in the knowledge base is determined according to a preset algorithm, where each record contains a question and the answer corresponding to that question. Finally, the answer matched with the target question is determined according to the matching score of each record in the knowledge base.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flow diagram illustrating a method of querying information in accordance with an exemplary embodiment;
FIG. 2 is a flow diagram illustrating another method of information querying, according to an example embodiment;
FIG. 3 is a flow diagram illustrating another method of information querying, according to an example embodiment;
FIG. 4 is a flow diagram illustrating another method of information querying, according to an example embodiment;
FIG. 5 is a block diagram illustrating an information query device in accordance with an exemplary embodiment;
FIG. 6 is a block diagram illustrating another information querying device, according to an example embodiment;
FIG. 7 is a block diagram illustrating another information querying device, according to an example embodiment;
FIG. 8 is a block diagram illustrating another information querying device, according to an example embodiment;
FIG. 9 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Before introducing the information query method, the information query device, the storage medium and the electronic device provided by the present disclosure, an application scenario related to the embodiments of the present disclosure is first introduced. The application scenario may be a human-computer interactive intelligent question-answering system, through which a user may input a target question to be queried so as to obtain the corresponding answer. The intelligent question-answering system may be any kind of terminal, for example, a mobile terminal such as a smart phone, a tablet computer, a smart television, a smart watch, a PDA (Personal Digital Assistant) or a portable computer, or a fixed terminal such as a desktop computer.
Fig. 1 is a flow chart illustrating a method of querying information, as shown in fig. 1, according to an exemplary embodiment, the method comprising the steps of:
step 101, performing word segmentation on the obtained target problem to obtain a first word set, wherein the first word set comprises word segmentation results of the target problem.
For example, a target question input by a user is first obtained, the target question is segmented according to a preset word segmentation method, and the word segmentation result of the target question is stored in a first vocabulary set. The word segmentation method may be, for example, a Maximum Matching (MM) algorithm, a semantics-based word segmentation method or a statistics-based word segmentation method. For instance, the words contained in the target question may be identified from left to right according to a pre-stored dictionary, and words that do not conform to language habits may be removed through disambiguation to obtain the word segmentation result of the target question. Taking the target question "Where can I get the social insurance card?" as an example, word segmentation of the target question yields the first vocabulary set: {where, get, social security card}.
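As an illustration of step 101, the following sketch segments a question into a first vocabulary set. The patent does not prescribe a particular segmenter; the jieba library is used here purely as a stand-in for "a preset word segmentation method", so the exact segmentation output is an assumption.

```python
# Illustrative sketch of step 101. jieba is only a stand-in for
# "a preset word segmentation method"; the patent does not name a segmenter.
import jieba

def build_first_vocabulary_set(target_question: str) -> list[str]:
    # jieba.lcut returns the word segmentation result as a list of words
    return jieba.lcut(target_question)

first_set = build_first_vocabulary_set("社保卡在哪里领")  # "Where can I get the social security card?"
print(first_set)  # output depends on jieba's dictionary, e.g. ['社保卡', '在', '哪里', '领']
```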
And 102, performing synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, wherein the word vector is obtained by training a preset corpus by using a preset model.
For example, a preset model is first used to train a preset corpus to obtain word vectors (word embeddings). Word vectors have good semantic characteristics and can effectively express semantic and grammatical features. The preset model may be a word vector generation model such as Word2vec (Word to Vector). The corpus may be a large amount of existing semantic data, and an information acquisition tool (e.g., a web crawler) may also be used to collect semantic data of various technical fields from the internet, such as news, microblogs and forums. Synonym expansion is then performed on the first vocabulary set according to the word vectors, and the synonym-expanded first vocabulary set is used as the second vocabulary set. Taking the word "social security card" in the first vocabulary set as an example, synonyms of "social security card" (for example, "social insurance card") can be put into the second vocabulary set after synonym expansion. When synonym expansion is performed on the first vocabulary set by using the word vectors, an administrator may additionally confirm the words in the second vocabulary set, so as to improve the accuracy of synonym expansion.
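A minimal sketch of step 102 follows, assuming gensim's Word2Vec as the "preset word vector generation model" and a cosine-similarity cutoff for deciding which neighbours count as synonyms; both the library choice and the threshold are assumptions, not requirements of the method.

```python
# Minimal sketch of step 102, assuming gensim's Word2Vec as the preset word
# vector generation model; the similarity threshold is an assumed parameter.
from gensim.models import Word2Vec

def train_word_vectors(tokenized_corpus: list[list[str]]) -> Word2Vec:
    # tokenized_corpus: sentences from the preset corpus, each already segmented into words
    return Word2Vec(sentences=tokenized_corpus, vector_size=100, window=5, min_count=1)

def expand_synonyms(model: Word2Vec, first_set: list[str], threshold: float = 0.7) -> set[str]:
    # Put each word of the first vocabulary set, plus its close neighbours in
    # the word-vector space, into the second vocabulary set.
    second_set = set(first_set)
    for word in first_set:
        if word in model.wv:
            for candidate, similarity in model.wv.most_similar(word, topn=5):
                if similarity >= threshold:
                    second_set.add(candidate)
    return second_set
```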
And 103, acquiring a matching score of the target question and each record in the knowledge base according to a preset algorithm according to the second vocabulary set and a preset knowledge base, wherein the knowledge base comprises at least one record, and each record comprises a question and an answer corresponding to the question.
For example, the predetermined knowledge base may include a plurality of records, where each record includes a question and an answer corresponding to the question, and may be understood as a question-answer pair, where the question in each record is not repeated. And according to the words and the knowledge base in the second word set, sequentially acquiring the matching scores of the target problem and each record in the knowledge base, wherein the matching scores can reflect the matching degree of each record and the second word set corresponding to the target problem. For example, the matching degree of each vocabulary in the second vocabulary set with each record in the knowledge base may be calculated respectively, and then the matching degree of each vocabulary is summed according to different weights to obtain the matching score of the target problem with each record in the knowledge base, wherein the weight corresponding to each vocabulary may be determined according to the number of times that each vocabulary appears in a record, for example, the more times that a vocabulary appears in a record, the higher the matching degree of the vocabulary with the record is, the higher the corresponding weight is.
And step 104, determining answers matched with the target questions according to the matching scores of each record.
For example, after the matching score between the target question and each record is determined, the record matching the target question is determined according to the size of the matching scores, and the answer in that record is used as the answer matched with the target question. For example, a preset number (e.g., 3) of records with the highest matching scores may be determined, and the answers in these records may be recommended to the user as answers matched with the target question, so that the user can select the answer most needed. Alternatively, the answer in the record with the highest matching score may be used directly as the answer matched with the target question, or the answer in a record whose matching score satisfies a preset condition may be used as the answer matched with the target question.
In summary, the present disclosure first performs word segmentation on an obtained target question to obtain a first vocabulary set containing the word segmentation result of the target question, and then performs synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, where the word vector is obtained by training a preset corpus with a preset model. Next, according to the second vocabulary set and a preset knowledge base, a matching score between the target question and each record in the knowledge base is determined according to a preset algorithm, where each record contains a question and the answer corresponding to that question. Finally, an answer matched with the target question is determined according to the matching score of each record in the knowledge base. In this way, the existing knowledge base can be effectively utilized to provide information query services at the semantic level, and the accuracy and coverage of information query are improved.
Fig. 2 is a flow chart illustrating another information query method according to an exemplary embodiment, and as shown in fig. 2, step 102 may be implemented by:
step 1021, training the corpus by using a preset word vector generation model to obtain a word vector.
And 1022, performing synonym expansion on the first vocabulary set according to the word vector, the preset stop word and the professional word in the target field to which the target problem belongs to obtain a second vocabulary set.
For example, the corpus is trained by using a preset word vector generation model, where the word vector generation model is a neural network model that can extract semantic and grammatical features from the semantic data in the corpus and be trained to obtain word vectors. Synonym expansion is performed on the first vocabulary set according to the word vectors, preset stop words and the professional words of the target field to which the target question belongs, and the synonym-expanded first vocabulary set is used as the second vocabulary set. Stop words are words that can be filtered out when processing natural language data (such as text) during retrieval, for example some prepositions, conjunctions and adverbs in Chinese, such as "of", "and", "to", "at" and the like. The professional words of the target field to which the target question belongs may be determined first; for example, the user may select the target field in advance when inputting the target question, or semantic analysis may be performed on the target question to determine the target field, and the corresponding professional words are then obtained according to the target field. Taking the target question "Where can I get the social insurance card?" as an example, word segmentation of the target question yields the first vocabulary set {where, get, social security card}, and synonym expansion of the first vocabulary set yields the second vocabulary set {where, receive, get, social security card}. When synonym expansion is performed on the first vocabulary set, an administrator may additionally confirm the words in the second vocabulary set, so as to improve the accuracy of synonym expansion.
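The stop-word filtering and domain-vocabulary handling described above can be sketched as follows. The stop-word list, the professional-word list and the rule of taking domain-term synonyms from a curated dictionary are all placeholders for whatever resources the target field provides; none of them is specified by the patent.

```python
# Sketch of step 1022. STOP_WORDS and DOMAIN_TERMS are placeholder sets; in a
# real deployment they would be loaded from resources maintained for the
# target field, and the treatment of domain terms is only one possible choice.
STOP_WORDS = {"的", "和", "在", "了"}        # assumed Chinese stop words
DOMAIN_TERMS = {"社保卡", "医保", "公积金"}   # assumed professional words of the target field

def build_second_vocabulary_set(model, first_set, threshold=0.7):
    filtered = [w for w in first_set if w not in STOP_WORDS]   # drop stop words
    second_set = set(filtered)
    for word in filtered:
        if word in DOMAIN_TERMS:
            # One possible treatment: keep domain terms as-is and take their
            # synonyms from a curated domain dictionary instead of the word vectors.
            continue
        if word in model.wv:
            second_set.update(
                w for w, s in model.wv.most_similar(word, topn=3) if s >= threshold
            )
    return second_set
```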
Fig. 3 is a flowchart illustrating another information query method according to an exemplary embodiment, and as shown in fig. 3, step 103 may include:
and step 1031, training the corpus by using a preset word vector generation model to obtain word vectors.
And 1032, performing synonym expansion on each record in the knowledge base according to the word vector, the preset stop word and the professional word of the target field to which the target problem belongs.
For example, the corpus is trained by using a preset word vector generation model to obtain word vectors, and synonym expansion is performed on each record in the knowledge base according to the word vectors, preset stop words and the professional words of the target field to which the target question belongs. For example, if a record in the knowledge base is "How to receive the social security card", segmenting the record yields the vocabulary set {social security card, how, receive}, and synonym expansion of this vocabulary set expands the word segmentation result corresponding to the record to {social security card, how, receive, get}.
And 1033, performing synonymous sentence expansion on each record in the knowledge base by using a neural machine translation (NMT) algorithm.
Illustratively, an NMT (Neural Machine Translation) algorithm is used to perform synonymous sentence expansion on each record in the knowledge base, and the synonymous sentence of each record can be stored in the knowledge base as an expansion of that record. For example, the NMT algorithm may be used to translate any Chinese record in the knowledge base into English, and the English translation result is then translated back into Chinese; the resulting Chinese sentence is used as a synonymous sentence of the original Chinese record, and the two translation passes expand the vocabulary and sentence patterns of the record. It should be noted that, when synonymous sentence expansion is performed on each record in the knowledge base, an administrator may additionally confirm the expansion, so as to improve the accuracy of the synonymous sentence expansion.
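A sketch of this back-translation step follows. The `translate` callable is a placeholder signature for any NMT system (the patent only requires an NMT algorithm, not a particular model or API), so the interface shown is an assumption rather than a real library call.

```python
# Sketch of step 1033: synonymous-sentence expansion by back-translation.
# `translate(text, source_lang, target_lang)` is an assumed interface for
# whatever NMT system is available; it is not a specific library's API.
from typing import Callable

Translate = Callable[[str, str, str], str]

def expand_synonymous_sentences(records: list[dict], translate: Translate) -> list[dict]:
    expanded = list(records)
    for record in records:
        # First record (first language, e.g. Chinese) -> intermediate record (second language)
        intermediate = translate(record["question"], "zh", "en")
        # Intermediate record -> synonymous record expressed in the first language again
        synonymous_question = translate(intermediate, "en", "zh")
        if synonymous_question != record["question"]:
            # Store the synonymous sentence in the knowledge base with the same answer
            expanded.append({"question": synonymous_question, "answer": record["answer"]})
    return expanded
```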
And 1034, acquiring a matching score between the target problem and each record in the knowledge base according to a preset algorithm according to the second vocabulary set and the knowledge base.
Illustratively, the matching scores between the target question and each record in the knowledge base are obtained in turn according to the words in the second vocabulary set and the knowledge base that has undergone synonym expansion and synonymous sentence expansion.
Alternatively, step 1033 may be implemented by:
1) a first record in a knowledge base expressed in a first language is translated into an intermediate record in a second language using an NMT algorithm.
2) The intermediate records are translated into synonymous records expressed in the first language using the NMT algorithm.
3) And storing the synonymous records into a knowledge base, wherein the first record is any record in the knowledge base.
Fig. 4 is a flowchart illustrating another information query method according to an exemplary embodiment, and as shown in fig. 4, step 104 includes:
step 1041, arranging the matching scores of each record in descending order from high to low to obtain a score ranking.
Step 1042, the answer in the top n records with the highest ranking in the ranking order of scores is selected as the answer matching the target question.
Or, in step 1043, when the ratio of the first ranked matching score to the second ranked matching score in the score ranking is greater than the preset threshold, the answer in the record corresponding to the first ranked matching score is taken as the answer matched with the target question.
Step 1044 is to select the answer in the top m records with the highest ranking in the ranking order as the answer matched with the target question when the ratio of the first ranked matching score to the second ranked matching score is less than or equal to a preset threshold.
For example, the matching scores of the records in the knowledge base are arranged in descending order to obtain a score ranking. According to the score ranking, the answers in the records satisfying a preset condition are used as answers matched with the target question. The preset condition may be, for example: the answers in the top n (e.g., 3) records of the score ranking are selected as answers matched with the target question, and these n answers are recommended to the user so that the user can select the answer most needed. The answer in the highest-ranked record of the score ranking may also be recommended directly as the answer matched with the target question. Alternatively, the ratio of the first-ranked matching score to the second-ranked matching score may first be calculated; when the ratio is greater than a preset threshold (indicating that the first-ranked matching score is much larger than the following ones), the answer in the record corresponding to the first-ranked matching score is used as the answer matched with the target question, and when the ratio is less than or equal to the preset threshold, the answers in the top m (e.g., 5) records of the score ranking are selected as answers matched with the target question.
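The answer-selection logic of steps 1041 to 1044 can be sketched as below. The values of m and the ratio threshold are assumptions; the simpler top-n alternative of step 1042 is just the slice `ranking[:n]` of the same ranking.

```python
# Sketch of steps 1041-1044. The values of m and the ratio threshold are
# assumptions; the simpler top-n strategy of step 1042 is just ranking[:n].
def select_answers(scored_records, m=5, ratio_threshold=2.0):
    # scored_records: iterable of (matching_score, answer) pairs, one per record
    ranking = sorted(scored_records, key=lambda pair: pair[0], reverse=True)
    if len(ranking) < 2:
        return [answer for _, answer in ranking]
    top_score, second_score = ranking[0][0], ranking[1][0]
    if second_score > 0 and top_score / second_score > ratio_threshold:
        # The top score clearly dominates: return only its answer
        return [ranking[0][1]]
    # Otherwise recommend the answers of the top-m records
    return [answer for _, answer in ranking[:m]]
```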
Optionally, step 103 may be implemented by:
and calculating the matching score of the target problem and each record in the knowledge base by using a first calculation formula according to the second vocabulary set and a preset knowledge base.
The first calculation formula includes:
Figure BDA0001871483450000131
wherein d_j is the j-th record in the knowledge base, Score_j is the matching score of d_j, s is the number of words shared by the second vocabulary set and the word segmentation result of d_j, Q is the number of words in the first vocabulary set, t_i is the i-th word in the second vocabulary set, num(d_j) is the number of words in the word segmentation result of d_j, num(t_i) is the number of times t_i appears in d_j, D is the number of records in the knowledge base, and N_i is the number of records in the knowledge base that contain t_i.
It should be noted that the preset knowledge base (containing D records, the j-th record being d_j) may be the original knowledge base, in which the question of each record is not repeated, or may be a knowledge base that has undergone synonym expansion and synonymous sentence expansion, that is, the original knowledge base expanded by performing steps 1031 to 1033; in the latter case, the questions of the records in the knowledge base may be repeated (for example, as synonymous sentences). For example, taking the original knowledge base containing 20 records as an example, d_j is the j-th record of the 20 records, and if the word segmentation result of the j-th record contains 3 words, then num(d_j)=3. Alternatively, the knowledge base may be the expanded knowledge base: assuming that synonym expansion is performed on d_j by performing step 1032 and synonymous sentence expansion is performed on each of the 20 records by performing step 1033, so that the knowledge base is expanded to 30 records, then d_j is the j-th record of the 30 records, and if the word segmentation result of d_j contains 5 words, then num(d_j)=5. Since the second vocabulary set is obtained by expanding the first vocabulary set, a situation where s is larger than Q may occur; in this case, s/Q may be set to 1, that is, s/Q is a positive number less than or equal to 1.
Taking the target question "Where can I get the social security card?" as an example, suppose the knowledge base contains 30 records (i.e. D=30) and d_j in the knowledge base is "How to receive the social security card". Word segmentation of the target question yields the first vocabulary set {where, get, social security card}, and synonym expansion of the first vocabulary set yields the second vocabulary set {where, receive, get, social security card} (corresponding to {t_1, t_2, t_3, t_4}). Word segmentation of d_j yields {social security card, how, receive}, so s is 2 (the shared words being "receive" and "social security card") and Q is 3. The terms corresponding to t_1 and t_3 are therefore 0, the terms corresponding to t_2 and t_4 are non-zero, and Score_j is obtained by combining these terms according to the first calculation formula (the specific expressions appear as formula images in the original publication).
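For reference, the following sketch implements one scoring function that is consistent with the variable definitions above: an s/Q coverage factor (capped at 1) multiplying a TF-IDF-style sum over the words of the second vocabulary set. The patent's first calculation formula itself is given only as an image, so the exact expression, in particular the log(D / N_i) weighting, is an assumption rather than a reproduction of the patented formula.

```python
# One scoring function consistent with the variable definitions above. The
# exact patented formula is given only as an image; the log(D / N_i) weighting
# and the multiplicative s/Q coverage factor are assumptions.
import math

def match_score(second_set, first_set_size, record_words, segmented_knowledge_base):
    # record_words: word segmentation result of record d_j
    # segmented_knowledge_base: list of segmented records, used to derive D and N_i
    D = len(segmented_knowledge_base)
    num_dj = len(record_words)
    s = len(set(second_set) & set(record_words))
    coverage = min(s / first_set_size, 1.0)      # s/Q, capped at 1 as noted above
    score = 0.0
    for t in set(second_set):
        num_t = record_words.count(t)            # occurrences of t_i in d_j
        if num_t == 0:
            continue                             # words absent from d_j contribute nothing
        N_i = sum(1 for rec in segmented_knowledge_base if t in rec)
        score += (num_t / num_dj) * math.log(D / N_i)   # TF-IDF-style term (assumed form)
    return coverage * score
```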
In summary, the present disclosure first performs word segmentation on an obtained target question to obtain a first vocabulary set containing the word segmentation result of the target question, and then performs synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, where the word vector is obtained by training a preset corpus with a preset model. Next, according to the second vocabulary set and a preset knowledge base, a matching score between the target question and each record in the knowledge base is determined according to a preset algorithm, where each record contains a question and the answer corresponding to that question. Finally, an answer matched with the target question is determined according to the matching score of each record in the knowledge base. In this way, the existing knowledge base can be effectively utilized to provide information query services at the semantic level, and the accuracy and coverage of information query are improved.
Fig. 5 is a block diagram illustrating an information inquiry apparatus according to an exemplary embodiment, and as shown in fig. 5, the apparatus 200 includes:
the word segmentation module 201 is configured to obtain a first word set by performing word segmentation on the obtained target problem, where the first word set includes a word segmentation result of the target problem.
The expansion module 202 is configured to perform synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, where the word vector is obtained by training a preset corpus using a preset model.
And the scoring module 203 is configured to obtain a matching score between the target question and each record in the knowledge base according to the second vocabulary set and a preset knowledge base according to a preset algorithm, where the knowledge base includes at least one record, and each record includes a question and an answer corresponding to the question.
A determining module 204, configured to determine an answer matching the target question according to the matching score of each record.
Fig. 6 is a block diagram illustrating another information query apparatus according to an exemplary embodiment, and as shown in fig. 6, the expansion module 202 includes:
the first training sub-module 2021 is configured to train the corpus using a preset word vector generation model to obtain a word vector.
The first expansion sub-module 2022 is configured to perform synonym expansion on the first vocabulary set according to the word vector, the preset stop word, and the professional word in the target field to which the target problem belongs, so as to obtain a second vocabulary set.
Fig. 7 is a block diagram illustrating another information query apparatus according to an exemplary embodiment, and as shown in fig. 7, the scoring module 203 includes:
the second training submodule 2031 is configured to train the corpus using a preset word vector generation model to obtain a word vector.
The second expansion sub-module 2032 is configured to expand synonyms for each record in the knowledge base according to the word vector, the preset stop word, and the professional word in the target field to which the target question belongs.
The synonymous sentence expansion submodule 2033 is configured to perform synonymous sentence expansion on each record in the knowledge base by using a neural machine translation (NMT) algorithm.
And the scoring submodule 2034 is configured to obtain, according to the second vocabulary set and the knowledge base, a matching score between the target problem and each record in the knowledge base according to a preset algorithm.
Optionally, the synonymous sentence expansion submodule 2033 may be implemented by:
1) a first record in a knowledge base expressed in a first language is translated into an intermediate record in a second language using an NMT algorithm.
2) The intermediate records are translated into synonymous records expressed in the first language using the NMT algorithm.
3) And storing the synonymous records into a knowledge base, wherein the first record is any record in the knowledge base.
Fig. 8 is a block diagram illustrating another information querying device according to an exemplary embodiment, and as shown in fig. 8, the determining module 204 includes:
the sorting sub-module 2041 is configured to sort the matching scores of each record in descending order from high to low, so as to obtain a score sorting.
The determining submodule 2042 is configured to select answers in the top n records in the ranking order of scores as answers matching the target question. Or,
the determining sub-module 2042 is configured to, when a ratio of a first ranked matching score to a second ranked matching score in the score ranking is greater than a preset threshold, take an answer in a record corresponding to the first ranked matching score as an answer matched with the target question.
The determining sub-module 2042 is further configured to select answers in the top m records with the highest ranking in the ranking order of scores as answers matching the target question when the ratio of the first-ranked matching score to the second-ranked matching score is less than or equal to a preset threshold.
Optionally, the scoring module 203 may be implemented by:
and calculating the matching score of the target problem and each record in the knowledge base by using a first calculation formula according to the second vocabulary set and a preset knowledge base.
The first calculation formula includes:
Figure BDA0001871483450000171
wherein d_j is the j-th record in the knowledge base, Score_j is the matching score of d_j, s is the number of words shared by the second vocabulary set and the word segmentation result of d_j, Q is the number of words in the first vocabulary set, t_i is the i-th word in the second vocabulary set, num(d_j) is the number of words in the word segmentation result of d_j, num(t_i) is the number of times t_i appears in d_j, D is the number of records in the knowledge base, and N_i is the number of records in the knowledge base that contain t_i.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
In summary, the present disclosure first performs word segmentation on an obtained target question to obtain a first vocabulary set containing the word segmentation result of the target question, and then performs synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, where the word vector is obtained by training a preset corpus with a preset model. Next, according to the second vocabulary set and a preset knowledge base, a matching score between the target question and each record in the knowledge base is determined according to a preset algorithm, where each record contains a question and the answer corresponding to that question. Finally, an answer matched with the target question is determined according to the matching score of each record in the knowledge base. In this way, the existing knowledge base can be effectively utilized to provide information query services at the semantic level, and the accuracy and coverage of information query are improved.
Fig. 9 is a block diagram illustrating an electronic device 300 in accordance with an example embodiment. As shown in fig. 9, the electronic device 300 may include: a processor 301 and a memory 302. The electronic device 300 may also include one or more of a multimedia component 303, an input/output (I/O) interface 304, and a communication component 305.
The processor 301 is configured to control the overall operation of the electronic device 300, so as to complete all or part of the steps in the information query method. The memory 302 is used to store various types of data to support operation at the electronic device 300, such as instructions for any application or method operating on the electronic device 300 and application-related data, such as contact data, transmitted and received messages, pictures, audio, video, and the like. The Memory 302 may be implemented by any type of volatile or non-volatile Memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk or optical disk. The multimedia components 303 may include a screen and an audio component. Wherein the screen may be, for example, a touch screen and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory 302 or transmitted through the communication component 305. The audio assembly also includes at least one speaker for outputting audio signals. The I/O interface 304 provides an interface between the processor 301 and other interface modules, such as a keyboard, mouse, buttons, etc. These buttons may be virtual buttons or physical buttons. The communication component 305 is used for wired or wireless communication between the electronic device 300 and other devices. Wireless Communication, such as Wi-Fi, bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them, so that the corresponding Communication component 305 may include: Wi-Fi module, bluetooth module, NFC module.
In an exemplary embodiment, the electronic Device 300 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above-described information query method.
In another exemplary embodiment, there is also provided a computer readable storage medium including program instructions, which when executed by a processor, implement the steps of the information query method described above. For example, the computer readable storage medium may be the memory 302 including program instructions executable by the processor 301 of the electronic device 300 to perform the information query method described above.
In summary, the present disclosure first performs word segmentation on an obtained target question to obtain a first vocabulary set containing the word segmentation result of the target question, and then performs synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, where the word vector is obtained by training a preset corpus with a preset model. Next, according to the second vocabulary set and a preset knowledge base, a matching score between the target question and each record in the knowledge base is determined according to a preset algorithm, where each record contains a question and the answer corresponding to that question. Finally, an answer matched with the target question is determined according to the matching score of each record in the knowledge base. In this way, the existing knowledge base can be effectively utilized to provide information query services at the semantic level, and the accuracy and coverage of information query are improved.
Preferred embodiments of the present disclosure are described in detail above with reference to the accompanying drawings, however, the present disclosure is not limited to the specific details of the above embodiments, and other embodiments of the present disclosure may be easily conceived by those skilled in the art within the technical spirit of the present disclosure after considering the description and practicing the present disclosure, and all fall within the protection scope of the present disclosure.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. Meanwhile, arbitrary combinations may also be made between the different embodiments of the present disclosure, and such combinations should likewise be regarded as content disclosed by the present disclosure as long as they do not depart from the idea of the present disclosure. The present disclosure is not limited to the precise structures that have been described above, and the scope of the present disclosure is limited only by the appended claims.

Claims (8)

1. An information query method, the method comprising:
the method comprises the steps of obtaining a first vocabulary set by segmenting words of an obtained target problem, wherein the first vocabulary set comprises word segmentation results of the target problem;
performing synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, wherein the word vector is obtained by training a preset corpus by using a preset model;
according to the second vocabulary set and a preset knowledge base, obtaining a matching score of the target question and each record in the knowledge base according to a preset algorithm, wherein the knowledge base comprises at least one record, and each record comprises a question and an answer corresponding to the question;
determining answers matched with the target questions according to the matching scores of each record;
the obtaining of the matching score between the target problem and each record in the knowledge base according to the second vocabulary set and a preset knowledge base and a preset algorithm includes:
training the corpus by using a preset word vector generation model to obtain the word vectors;
performing synonym expansion on each record in the knowledge base according to the word vector, preset stop words and professional words of a target field to which the target problem belongs;
carrying out synonymy sentence expansion on each record in the knowledge base by utilizing a Neural Machine Translation (NMT) algorithm;
according to the second vocabulary set and the knowledge base, obtaining a matching score of the target problem and each record in the knowledge base according to a preset algorithm;
the obtaining of the matching score between the target problem and each record in the knowledge base according to the second vocabulary set and a preset knowledge base and a preset algorithm includes:
calculating the matching score of the target problem and each record in a knowledge base by using a first calculation formula according to the second vocabulary set and a preset knowledge base;
the first calculation formula includes:
Figure FDA0002903786180000011
wherein d_j is the j-th record in the knowledge base, Score_j is the matching score of d_j, s is the number of words shared by the second vocabulary set and the word segmentation result of d_j, Q is the number of words in the first vocabulary set, t_i is the i-th word in the second vocabulary set, num(d_j) is the number of words in the word segmentation result of d_j, num(t_i) is the number of times t_i appears in d_j, D is the number of records in the knowledge base, and N_i is the number of records in the knowledge base that contain t_i.
2. The method of claim 1, wherein synonymously expanding the first vocabulary set according to a predetermined word vector to obtain a second vocabulary set comprises:
training the corpus by using a preset word vector generation model to obtain the word vectors;
and performing synonym expansion on the first vocabulary set according to the word vector, preset stop words and professional words in the target field to which the target problem belongs to obtain the second vocabulary set.
3. The method according to claim 1, wherein said performing synonymous sentence expansion on each of said records in said knowledge base using a neural machine translation (NMT) algorithm comprises:
translating a first record in a first language in the knowledge base into an intermediate record in a second language using the NMT algorithm;
translating the intermediate record into a synonymous record expressed in the first language using the NMT algorithm;
and storing the synonymous records into the knowledge base, wherein the first record is any one record in the knowledge base.
4. The method of claim 1, wherein determining an answer that matches the target question based on the match score for each record comprises:
arranging the matching scores of each record in descending order from high to low to obtain a score ordering;
selecting answers in the top n records with the highest ranking in the score ranking as answers matched with the target question; or,
when the ratio of the first-ranked matching score to the second-ranked matching score in the score ranking is larger than a preset threshold value, taking an answer in a record corresponding to the first-ranked matching score as an answer matched with the target question;
and when the ratio of the first-ranked matching score to the second-ranked matching score is less than or equal to a preset threshold value, selecting the answer in the top m records with the highest ranking in the score ranking as the answer matched with the target question.
5. An information query apparatus, comprising:
the word segmentation module is used for segmenting the acquired target problem to acquire a first word set, wherein the first word set comprises a word segmentation result of the target problem;
the expansion module is used for carrying out synonym expansion on the first vocabulary set according to a preset word vector to obtain a second vocabulary set, wherein the word vector is obtained by utilizing a preset model to train a preset corpus;
the scoring module is used for acquiring a matching score between the target question and each record in a knowledge base according to a preset algorithm and the second vocabulary set and the preset knowledge base, wherein the knowledge base comprises at least one record, and each record comprises a question and an answer corresponding to the question;
the determining module is used for determining answers matched with the target questions according to the matching scores of each record;
the scoring module comprises:
the second training submodule is used for training the corpus by utilizing a preset word vector generation model so as to obtain the word vectors;
the second expansion submodule is used for carrying out synonym expansion on each record in the knowledge base according to the word vector, preset stop words and professional words of the target field to which the target problem belongs;
the synonymous sentence expansion submodule is used for performing synonymous sentence expansion on each record in the knowledge base by using a neural machine translation (NMT) algorithm;
the scoring submodule is used for acquiring the matching score of the target problem and each record in the knowledge base according to a preset algorithm according to the second vocabulary set and the knowledge base;
the scoring module is configured to:
calculating the matching score of the target problem and each record in a knowledge base by using a first calculation formula according to the second vocabulary set and a preset knowledge base;
the first calculation formula includes:
Figure FDA0002903786180000041
wherein d_j is the j-th record in the knowledge base, Score_j is the matching score of d_j, s is the number of words shared by the second vocabulary set and the word segmentation result of d_j, Q is the number of words in the first vocabulary set, t_i is the i-th word in the second vocabulary set, num(d_j) is the number of words in the word segmentation result of d_j, num(t_i) is the number of times t_i appears in d_j, D is the number of records in the knowledge base, and N_i is the number of records in the knowledge base that contain t_i.
6. The apparatus of claim 5, wherein the expansion module comprises:
the first training submodule is used for training the corpus by utilizing a preset word vector generation model so as to obtain the word vectors;
and the first expansion submodule is used for carrying out synonym expansion on the first vocabulary set according to the word vector, preset stop words and professional words in the target field to which the target problem belongs so as to obtain the second vocabulary set.
7. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
8. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to carry out the steps of the method of any one of claims 1 to 4.
CN201811379175.2A 2018-11-19 2018-11-19 Information query method, device, storage medium and electronic equipment Active CN109710732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811379175.2A CN109710732B (en) 2018-11-19 2018-11-19 Information query method, device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811379175.2A CN109710732B (en) 2018-11-19 2018-11-19 Information query method, device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN109710732A CN109710732A (en) 2019-05-03
CN109710732B true CN109710732B (en) 2021-03-05

Family

ID=66254959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811379175.2A Active CN109710732B (en) 2018-11-19 2018-11-19 Information query method, device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN109710732B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210952B (en) * 2019-06-13 2022-06-10 讯飞智元信息科技有限公司 Bidding evaluation method and device
CN110750632B (en) * 2019-10-21 2022-09-09 闽江学院 Improved Chinese ALICE intelligent question-answering method and system
CN111488735B (en) * 2020-04-09 2023-10-27 中国银行股份有限公司 Test corpus generation method and device and electronic equipment
CN111858851A (en) * 2020-06-30 2020-10-30 银盛支付服务股份有限公司 Intelligent customer service knowledge base multidimensional training method and device
CN113032677A (en) * 2021-04-01 2021-06-25 李旻达 Query information processing method and device based on artificial intelligence
CN113780561B (en) * 2021-09-07 2024-07-30 国网北京市电力公司 Construction method and device of power grid regulation operation knowledge base

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049447A (en) * 2011-10-12 2013-04-17 英业达股份有限公司 System for memorizing bilingual synonymy words in assisting mode and method thereof
CN105843897A (en) * 2016-03-23 2016-08-10 青岛海尔软件有限公司 Vertical domain-oriented intelligent question and answer system
CN105955976A (en) * 2016-04-15 2016-09-21 中国工商银行股份有限公司 Automatic answering system and method
CN106202372A (en) * 2016-07-08 2016-12-07 中国电子科技网络信息安全有限公司 A kind of method of network text information emotional semantic classification
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment
CN107391614A (en) * 2017-07-04 2017-11-24 重庆智慧思特大数据有限公司 A kind of Chinese question and answer matching process based on WMD
CN107832291A (en) * 2017-10-26 2018-03-23 平安科技(深圳)有限公司 Client service method, electronic installation and the storage medium of man-machine collaboration
CN108536708A (en) * 2017-03-03 2018-09-14 腾讯科技(深圳)有限公司 A kind of automatic question answering processing method and automatically request-answering system
CN108595619A (en) * 2018-04-23 2018-09-28 海信集团有限公司 A kind of answering method and equipment
WO2018195875A1 (en) * 2017-04-27 2018-11-01 Microsoft Technology Licensing, Llc Generating question-answer pairs for automated chatting

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10528612B2 (en) * 2017-02-21 2020-01-07 International Business Machines Corporation Processing request documents

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049447A (en) * 2011-10-12 2013-04-17 英业达股份有限公司 System for memorizing bilingual synonymy words in assisting mode and method thereof
CN105843897A (en) * 2016-03-23 2016-08-10 青岛海尔软件有限公司 Vertical domain-oriented intelligent question and answer system
CN105955976A (en) * 2016-04-15 2016-09-21 中国工商银行股份有限公司 Automatic answering system and method
CN106202372A (en) * 2016-07-08 2016-12-07 中国电子科技网络信息安全有限公司 A kind of method of network text information emotional semantic classification
CN108536708A (en) * 2017-03-03 2018-09-14 腾讯科技(深圳)有限公司 A kind of automatic question answering processing method and automatically request-answering system
WO2018195875A1 (en) * 2017-04-27 2018-11-01 Microsoft Technology Licensing, Llc Generating question-answer pairs for automated chatting
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment
CN107391614A (en) * 2017-07-04 2017-11-24 重庆智慧思特大数据有限公司 A kind of Chinese question and answer matching process based on WMD
CN107832291A (en) * 2017-10-26 2018-03-23 平安科技(深圳)有限公司 Client service method, electronic installation and the storage medium of man-machine collaboration
CN108595619A (en) * 2018-04-23 2018-09-28 海信集团有限公司 A kind of answering method and equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research and Implementation of a Question Answering System for the IT Field; Li Jianan; China Master's Theses Full-text Database, Information Science and Technology; 2016-04-15 (No. 4); Chapters 2 to 4 *
Research on an Intelligent Question Answering System Based on Semantic Relevance Computation of Chinese Questions in a Restricted Domain; Wang Xinlei; China Master's Theses Full-text Database, Information Science and Technology; 2014-08-15 (No. 8); full text *
Design of an Automatic Question Answering System in a Restricted Domain for Distance Education; Liu Xudong et al.; Journal of Taishan University; 2017-11-30 (No. 6); full text *

Also Published As

Publication number Publication date
CN109710732A (en) 2019-05-03

Similar Documents

Publication Publication Date Title
CN109710732B (en) Information query method, device, storage medium and electronic equipment
CN106649818B (en) Application search intention identification method and device, application search method and server
WO2021159632A1 (en) Intelligent questioning and answering method and apparatus, computer device, and computer storage medium
CN107608532B (en) Association input method and device and electronic equipment
US20180341871A1 (en) Utilizing deep learning with an information retrieval mechanism to provide question answering in restricted domains
US8019748B1 (en) Web search refinement
CN111241237B (en) Intelligent question-answer data processing method and device based on operation and maintenance service
WO2020233380A1 (en) Missing semantic completion method and apparatus
CN108345612B (en) Problem processing method and device for problem processing
US9805120B2 (en) Query selection and results merging
CN110162768B (en) Method and device for acquiring entity relationship, computer readable medium and electronic equipment
CN110147494B (en) Information searching method and device, storage medium and electronic equipment
US11954097B2 (en) Intelligent knowledge-learning and question-answering
US11379527B2 (en) Sibling search queries
CN114880447A (en) Information retrieval method, device, equipment and storage medium
CN116882372A (en) Text generation method, device, electronic equipment and storage medium
CN111813993A (en) Video content expanding method and device, terminal equipment and storage medium
CN116685966A (en) Adjusting query generation patterns
De Boni et al. An analysis of clarification dialogue for question answering
CN111444321B (en) Question answering method, device, electronic equipment and storage medium
CN113505196B (en) Text retrieval method and device based on parts of speech, electronic equipment and storage medium
CN113822038B (en) Abstract generation method and related device
CN113342944B (en) Corpus generalization method, apparatus, device and storage medium
CN108268443B (en) Method and device for determining topic point transfer and acquiring reply text
KR101955920B1 (en) Search method and apparatus using property language

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant