WO2021017721A1 - Intelligent question answering method and apparatus, medium and electronic device - Google Patents

Intelligent question answering method and apparatus, medium and electronic device

Info

Publication number
WO2021017721A1
Authority
WO
WIPO (PCT)
Prior art keywords
question
template
keyword
similarity
question sentence
Prior art date
Application number
PCT/CN2020/098948
Other languages
French (fr)
Chinese (zh)
Inventor
张国辉
钱柏丞
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021017721A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Definitions

  • This application relates to the field of artificial intelligence, and is specifically applied to the technical fields of natural language processing and semantic analysis, and in particular to an intelligent question answering method, device, medium and electronic equipment.
  • Intelligent question answering systems and chatbots have long been research hotspots in the field of natural language processing. This research mainly seeks ways to make computers give appropriate answers to user questions.
  • The purpose of this application is to provide an intelligent question answering method, device, medium and electronic equipment.
  • an intelligent question answering method which includes:
  • an intelligent question answering device comprising:
  • a keyword acquisition module configured to preprocess the question sentence input by the user and a plurality of question sentence templates in a preset template library, to obtain the keywords of the question sentence and of each question sentence template respectively;
  • a similarity acquisition module configured to input the question sentence and template keyword group pairs, each composed of the keywords of the question sentence and the keywords of one of the question sentence templates, into an established semantic parser that includes a multi-layer word similarity judgment model, to obtain the similarity, output by the semantic parser, between the question sentence and each question sentence template;
  • a template obtaining module configured to obtain a standard question template from all question templates in the preset template library according to the similarity; and
  • an output module configured to determine the question answer corresponding to the standard question template and output the question answer.
  • an electronic device including:
  • a processor; and a memory storing computer-readable instructions which, when executed by the processor, implement the method described above.
  • a computer-readable storage medium which stores computer program instructions that, when executed by a computer, cause the computer to execute the aforementioned method.
  • The intelligent question answering method includes the following steps: preprocessing the question sentence input by the user and a plurality of question sentence templates in a preset template library to obtain the keywords of the question sentence and of each question sentence template respectively; inputting the question sentence and template keyword group pairs, each composed of the keywords of the question sentence and the keywords of one of the question sentence templates, into an established semantic parser that includes a multi-layer word similarity judgment model, to obtain the similarity, output by the semantic parser, between the question sentence and each of the question sentence templates; obtaining a standard question template from all the question sentence templates in the preset template library according to the similarity; and determining the question answer corresponding to the standard question template and outputting the question answer.
  • Fig. 1 is a schematic diagram showing an application scenario of an intelligent question answering method according to an exemplary embodiment;
  • Fig. 2 is a flowchart showing an intelligent question answering method according to an exemplary embodiment;
  • Fig. 3 is a flowchart showing how the semantic parser outputs the similarity between the question sentence and each question sentence template, according to an embodiment based on the embodiment corresponding to Fig. 2;
  • Fig. 4 is a flowchart showing details of step 330 according to an embodiment based on the embodiment corresponding to Fig. 3;
  • Fig. 5 is a schematic diagram showing a logical framework of a semantic parser outputting an inferred similarity matrix according to an exemplary embodiment;
  • Fig. 6 is a block diagram showing an intelligent question answering device according to an exemplary embodiment;
  • Fig. 7 is a block diagram showing an example of an electronic device implementing the above intelligent question answering method according to an exemplary embodiment;
  • Fig. 8 shows a computer-readable storage medium for implementing the above intelligent question answering method according to an exemplary embodiment.
  • Intelligent Q&A is the process of outputting corresponding answers based on questions. Both questions and answers are, in essence, symbols or text, and in the computer field they are usually processed as text. A question and its answer may be expressed in the same form or in different forms, and each can take various forms such as words, phrases, sentences, paragraphs, and articles. Typically, questions take the form of sentences, and answers take the form of sentences or paragraphs. The content of a question is generally the doubt or query to be answered, and the answer is the reply or feedback to the corresponding question.
  • The implementation terminal of this application can be any device with computing and processing functions.
  • The device can be connected to an external device to receive or send data.
  • It can be a portable mobile device, such as a smartphone, a tablet computer, a notebook computer, or a PDA (Personal Digital Assistant); a fixed device, such as computer equipment, a field terminal, a desktop computer, a server, or a workstation; or a collection of multiple devices, such as the physical infrastructure of cloud computing.
  • The implementation terminal of this application may be a server or the physical infrastructure of cloud computing.
  • Fig. 1 is a schematic diagram showing an application scenario of an intelligent question answering method according to an exemplary embodiment. As shown in Figure 1, it includes a server 110, a database 120, and a user terminal 130. The database 120 and the user terminal 130 are respectively connected to the server 110 through a communication link.
  • The server 110 is the implementation terminal of this application.
  • The user terminal 130 can also be any device with computing and processing functions. It can be the same type of terminal as the implementation terminal of this application or a different type, and it can be the same terminal as the implementation terminal of this application or a different terminal.
  • When the user needs to search for the answer to a question, the user can first enter the question through an input device (such as a keyboard, mouse, or touch screen) of the user terminal 130.
  • The question may then be sent to the server 110 through the user terminal 130, and the server 110 finds an answer that matches the question.
  • The database 120 stores multiple question templates and the question answers corresponding to the question templates.
  • When the server 110 needs to find an answer that matches the target question, it can first find, in the database 120, the question template that best matches the target question, and then use the question answer corresponding to the found question template as the answer that matches the target question.
  • The server 110 may return the answer to the user terminal 130 to provide the user of the user terminal 130 with a corresponding answer.
  • Although in this embodiment the implementation terminal of this application is a server and the question answers corresponding to the question templates are stored in the database, in other embodiments or specific applications any of various terminals can be selected as the implementation terminal of this application as needed, and the question templates and the question answers can be stored on any same or different terminals.
  • This application does not make any limitation on this, and the protection scope of this application should not be restricted in any way.
  • Fig. 2 is a flow chart showing a method for intelligent question answering according to an exemplary embodiment. As shown in Figure 2, the following steps can be included:
  • Step 210: Preprocess the question sentence input by the user and a plurality of question sentence templates in a preset template library to obtain the keywords of the question sentence and of each question sentence template respectively.
  • A question sentence can be expressed in various forms, such as a word, phrase, sentence, or paragraph.
  • A question template can take the same form as the question sentence or a different form, and the question template can be matched with the question sentence.
  • For example, if the question sentence is "Excuse me, what is the phone number of Zhang San", the question template corresponding to the question sentence can be: the mobile phone number of <name entity>.
  • The purpose of the preprocessing is to obtain the keywords of the question sentence or question template.
  • Keywords are characters, words or phrases that capture the core content of a question sentence or question template.
  • the preprocessing of the question sentence input by the user and the multiple question sentence templates in the preset template library to obtain the keywords of the question sentence and the question sentence template respectively includes:
  • word segmentation is performed on the question sentence input by the user and on the question sentence templates in the preset template library, and stop words are removed from the resulting words to obtain the keywords of the question sentence and of the question sentence templates.
  • Word segmentation of a question sentence or question sentence template is the process of dividing it into words.
  • The advantage of this embodiment is that the keywords are obtained by removing only the stop words from the segmented words, which ensures that the obtained keywords retain more of the information in the question sentence or question template.
  • The question sentence input by the user and the multiple question sentence templates in the preset template library are segmented by calling a preset word segmentation interface.
  • Stop words are characters or words that are automatically filtered out before or after processing natural language data (or text), in order to save storage space and improve search efficiency in information retrieval.
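  • The stop-word embodiment can be illustrated with a minimal Python sketch. The stop-word list and the whitespace-based `segment` function are assumptions made only for illustration; the source refers to a preset word segmentation interface without naming it, and a real system would call that interface instead.

```python
# Hypothetical stop-word list; a real system would load a much larger one.
STOP_WORDS = {"excuse", "me", "what", "is", "the", "of", "a", "an", "please"}


def segment(text):
    """Stand-in for the preset word segmentation interface (an assumption):
    split on whitespace, lowercase, and strip surrounding punctuation."""
    tokens = (token.strip(".,?!<>\"'").lower() for token in text.split())
    return [token for token in tokens if token]


def extract_keywords(text, stop_words=STOP_WORDS):
    """Keep every segmented word that is not a stop word, preserving order."""
    return [word for word in segment(text) if word not in stop_words]


question_keywords = extract_keywords("Excuse me, what is the phone number of Zhang San?")
template_keywords = extract_keywords("the mobile phone number of <name entity>")
print(question_keywords)
print(template_keywords)
```

  • Because only stop words are dropped, every other segmented word survives as a keyword, which is the property this embodiment relies on.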
  • the preprocessing of the question sentence input by the user and the multiple question sentence templates in the preset template library to obtain the keywords of the question sentence and the question sentence template respectively includes:
  • from the words segmented from the question sentence input by the user or from a question sentence template in the preset template library, the words that exist in a preset keyword library are obtained as the keywords of the question sentence or the question sentence template.
  • The advantage of this embodiment is that the range of keywords obtained for the question sentence or question template is limited by the preset keyword library, so that the keywords obtained for the question sentence and the question template remain of relatively high quality.
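  • A minimal sketch of the keyword-library embodiment follows. The contents of `KEYWORD_LIBRARY` are hypothetical, and the input is assumed to be a list of already segmented words.

```python
# Hypothetical preset keyword library.
KEYWORD_LIBRARY = {"phone", "number", "mobile", "address", "email"}


def keywords_from_library(words, library=KEYWORD_LIBRARY):
    """Keep only the segmented words that exist in the preset keyword library."""
    return [word for word in words if word in library]


print(keywords_from_library(["what", "phone", "number", "zhang", "san"]))
```

  • Compared with stop-word removal, this whitelist approach discards any word the library does not know, trading coverage for keyword quality.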
  • Obtaining, from the words segmented from the question sentence input by the user or from a question sentence template in the preset template library, the words that exist in the preset keyword library, so as to obtain the keywords of the question sentence or the question sentence template, may include:
  • Step 220: Input the question sentence and template keyword group pairs, each composed of the keywords of the question sentence and the keywords of one question template, into an established semantic parser that includes a multi-layer word similarity judgment model, to obtain the similarity, output by the semantic parser, between the question sentence and each question template.
  • The multi-layer word similarity judgment model in the semantic parser includes:
  • a Jaccard index model layer, a word2vec model layer, a GloVe model layer, and a C&W model layer.
  • Each of the above word similarity judgment models can output the similarity for a question sentence and template keyword group pair, and each model layer is one word similarity judgment model.
  • The Jaccard index model contained in the Jaccard index model layer can output the similarity directly, whereas the models in the word2vec model layer, the GloVe model layer and the C&W model layer may not output the similarity directly but instead output word vectors.
  • These model layers can therefore also include a similarity calculation layer that calculates the similarity of word vectors, so that the model layer can output the similarity for the question sentence and template keyword group pair.
  • The similarity calculation layer can calculate the similarity of word vectors based on cosine distance or Euclidean distance.
  • Strictly speaking, each word similarity judgment model is a model layer, and some model layers include both a model and a similarity calculation layer; for ease of understanding and description, the model and the similarity calculation layer are here collectively called the word similarity judgment model.
  • The word similarity judgment model need not be a separate model; it can be a collection of units or components that together perform word similarity judgment.
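  • The two kinds of layer described above can be sketched as follows: one that outputs a similarity directly (Jaccard index) and one that wraps a word-vector model with a cosine similarity calculation layer. Treating each word as a set of characters for the Jaccard index, returning 0.0 for out-of-vocabulary words, and the toy vectors are all assumptions; a real layer would use vectors from a trained word2vec, GloVe or C&W model.

```python
import math


def jaccard_similarity(word_a, word_b):
    """Jaccard index |A ∩ B| / |A ∪ B| over the character sets of two words
    (the choice of character sets is an assumption)."""
    set_a, set_b = set(word_a), set(word_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two word vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def make_embedding_layer(vectors):
    """Wrap a word-to-vector table with a similarity calculation layer."""
    def layer(word_a, word_b):
        if word_a not in vectors or word_b not in vectors:
            return 0.0  # assumption: unknown words are treated as dissimilar
        return cosine_similarity(vectors[word_a], vectors[word_b])
    return layer


# Toy vectors for illustration only.
TOY_VECTORS = {"phone": [1.0, 0.0], "mobile": [1.0, 1.0], "number": [0.0, 1.0]}
toy_layer = make_embedding_layer(TOY_VECTORS)
print(jaccard_similarity("phone", "phone"), toy_layer("phone", "mobile"))
```

  • Both kinds of layer expose the same two-word interface, so the semantic parser can apply them uniformly.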
  • In addition to the Jaccard index model layer, the word2vec model layer, the GloVe model layer, and the C&W model layer, the semantic parser further includes a domain-specific word similarity judgment model layer.
  • The domain-specific word similarity judgment model can be a model built on experience in certain fields.
  • Its specific implementation can be arbitrary; for example, word similarity in a specific field can be judged by means of a pre-established word correlation database for that field.
  • The advantage of this embodiment is that adding a further word similarity judgment model layer can, to a certain extent, improve the accuracy of the similarity between the question sentence and each question sentence template output by the semantic parser.
  • Fig. 3 is a flowchart showing how the semantic parser outputs the similarity between the question sentence and each question sentence template, according to an embodiment based on the embodiment corresponding to Fig. 2. As shown in Fig. 3, the following steps are included:
  • Step 310: Obtain the keyword pair matrix corresponding to each question sentence and template keyword group pair.
  • Each element in the keyword pair matrix is a combination of one keyword of the question sentence and one keyword of the question template, and together the elements of the keyword pair matrix contain the keywords of both the question sentence and the question template.
  • The number of keywords in the question sentence is the row width or column width of the keyword pair matrix.
  • The number of keywords in the question sentence template is the column width or row width of the keyword pair matrix.
  • Obtaining the keyword pair matrix corresponding to each question sentence and template keyword group pair includes:
  • The keyword pairs are sorted in the order in which they were generated; starting from the first keyword pair, each time the stated number of not-yet-obtained keyword pairs is taken in that order to form the nth row of the keyword pair matrix, where n is the number of times keyword pairs have been taken.
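  • A minimal sketch of step 310, assuming the orientation in which rows correspond to question keywords and columns to template keywords (the source allows either orientation). The function name is hypothetical.

```python
def build_keyword_pair_matrix(question_keywords, template_keywords):
    """Return a matrix whose element [i][j] is the pair
    (i-th question keyword, j-th template keyword)."""
    return [[(q, t) for t in template_keywords] for q in question_keywords]


pair_matrix = build_keyword_pair_matrix(
    ["phone", "zhang"], ["mobile", "phone", "name"]
)
for row in pair_matrix:
    print(row)
```

  • With two question keywords and three template keywords, the matrix has two rows and three columns, matching the row and column widths described above.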
  • Step 320: Use each layer's word similarity judgment model in the semantic parser to perform word similarity judgment on each keyword pair matrix, obtaining, for each question sentence and template keyword group pair, the similarity matrix of keyword pairs output by each layer's word similarity judgment model.
  • The word similarity judgment model can be the Jaccard index model layer, the word2vec model layer, the GloVe model layer, the C&W model layer, the domain-specific word similarity judgment model layer, and so on.
  • The word similarity judgment model calculates the similarity of each keyword pair in the keyword pair matrix to obtain the similarity matrix.
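  • Step 320 can be sketched as applying each layer, viewed as a function of two words, to every element of the keyword pair matrix. The two layers below are simple stand-ins chosen so the example runs without trained models; they are assumptions, not the layers named in the source.

```python
def score_pair_matrix(pair_matrix, layer):
    """Apply one word similarity layer to every keyword pair in the matrix."""
    return [[layer(q, t) for (q, t) in row] for row in pair_matrix]


def exact_match_layer(word_a, word_b):
    """Stand-in layer: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if word_a == word_b else 0.0


def char_overlap_layer(word_a, word_b):
    """Stand-in layer: Jaccard index over the character sets of the two words."""
    set_a, set_b = set(word_a), set(word_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


pair_matrix = [
    [("phone", "mobile"), ("phone", "phone")],
    [("zhang", "mobile"), ("zhang", "phone")],
]
layers = [exact_match_layer, char_overlap_layer]

# One similarity matrix per layer, each with the shape of the pair matrix.
similarity_matrices = [score_pair_matrix(pair_matrix, layer) for layer in layers]
print(similarity_matrices[0])
```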
  • Step 330: For each question sentence and template keyword group pair, obtain the inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by each layer's word similarity judgment model.
  • The inferred similarity matrix combines the similarity matrices output by each layer's word similarity judgment model. Integrating the outputs of all layers makes the resulting inferred similarity matrix more reliable and accurate.
  • Each element in a similarity matrix corresponds to one keyword pair of a question sentence and template keyword group pair, and obtaining, for each question sentence and template keyword group pair, the inferred similarity matrix based on the similarity matrices of its keyword pairs output by each layer's word similarity judgment model includes:
  • for each keyword pair, the elements corresponding to that keyword pair in the similarity matrices output by each layer's word similarity judgment model are sorted from largest to smallest;
  • the first predetermined number of elements at the front of that order are taken as the inferred similarity matrix acquisition elements;
  • the inferred similarity matrix is obtained based on the inferred similarity matrix acquisition elements.
  • The inferred similarity matrix acquisition elements are the elements from which the inferred similarity matrix is constructed.
  • Obtaining the inferred similarity matrix based on the inferred similarity matrix acquisition elements includes:
  • any one element is selected from the first second-predetermined-number of elements corresponding to the keyword pair in that order, where the second predetermined number is less than the first predetermined number;
  • the selected elements are arranged into a matrix according to the positions of the corresponding elements in the similarity matrix, and this matrix is taken as the inferred similarity matrix of the question sentence and template keyword group pair.
  • The advantage of this embodiment is that when the variance is not greater than the predetermined variance threshold, that is, when the values of the elements corresponding to the same keyword pair are sufficiently close, the top-ranked values corresponding to the keyword pair are sufficiently reliable.
  • In that case, using any one of the top-ranked elements to construct the inferred similarity matrix of the question sentence and template keyword group pair improves the randomness and fairness of the constructed matrix, and may improve the accuracy of the inferred similarity matrix to a certain extent.
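  • The variance-gated selection for a single keyword pair can be sketched as follows. The source does not say which variance is used or what happens when the variance exceeds the threshold; this sketch assumes the population variance of the acquisition elements and falls back to the largest element, and all parameter names are hypothetical.

```python
import random
import statistics


def pick_element(layer_scores, first_n, second_n, variance_threshold, rng):
    """Choose the inferred-matrix element for one keyword pair.

    layer_scores: the scores each layer assigned to this keyword pair.
    first_n: how many top-ranked scores become acquisition elements.
    second_n: how many of those are eligible for random selection.
    """
    if not 0 < second_n < first_n:
        raise ValueError("second_n must be positive and smaller than first_n")
    ranked = sorted(layer_scores, reverse=True)
    acquisition = ranked[:first_n]
    if statistics.pvariance(acquisition) <= variance_threshold:
        # Scores agree closely, so any of the top second_n is acceptable.
        return rng.choice(acquisition[:second_n])
    # Assumption: when the layers disagree, fall back to the maximum.
    return acquisition[0]


rng = random.Random(0)
print(pick_element([0.85, 0.84, 0.88, 0.10], 3, 2, 0.01, rng))
print(pick_element([0.9, 0.1, 0.5], 3, 2, 0.01, rng))
```

  • Passing an explicit `random.Random` instance keeps the random choice reproducible when the sketch is tested.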
  • obtaining elements based on the obtained inferred similarity matrix to obtain an inferred similarity matrix includes:
  • if the ratio is greater than or equal to a predetermined ratio threshold, obtaining, for each keyword pair, the average value of the inferred similarity matrix acquisition elements corresponding to that keyword pair;
  • The average value of the inferred similarity matrix acquisition elements reflects their central tendency.
  • The difference between the average value of the inferred similarity matrix acquisition elements and the maximum value of the elements corresponding to the keyword pair varies from case to case. Constructing the matrix from the average value of the acquisition elements allows its elements to better reflect the judgment results of each layer's word similarity judgment model, which improves the fairness of constructing the inferred similarity matrix.
  • The maximum value of the elements and the difference between the average value and the maximum value are obtained, and the ratio derived from them is used as the condition for deciding whether to construct the matrix from the average values of the acquisition elements. Whether or not the maximum value of the acquisition elements is an extreme value, this condition can, to a certain extent, distinguish well whether the matrix should be constructed from the average values, which improves the generalization ability and universality of the condition.
  • Fig. 5 is a schematic diagram showing a logical framework of a semantic parser outputting an inferred similarity matrix according to an exemplary embodiment. As shown in Fig. 5, it includes a keyword pair matrix 510, a first-layer word similarity judgment model 520, a second-layer word similarity judgment model 530, a third-layer word similarity judgment model 540, similarity matrices 550, and an inferred similarity matrix 560. The three layers of word similarity judgment models belong to the semantic parser 500; the similarity matrix at the head of the arrow leaving each layer's word similarity judgment model is the similarity matrix output by that layer; and the inferred similarity matrix 560 is obtained based on the similarity matrices output by each layer's word similarity judgment model.
  • Step 340: For each question sentence and template keyword group pair, obtain the similarity between the question sentence and the question sentence template corresponding to that pair according to the similarities in the inferred similarity matrix of the pair.
  • The similarity between the question sentence and the question sentence template corresponding to the question sentence and template keyword group pair is obtained by the following formula:
  • where S1 and S2 are the question sentence and the question sentence template respectively;
  • Sim(S1, S2) is the similarity between the question sentence and the question sentence template;
  • w_i is a keyword in the question sentence;
  • w_j is a keyword in the question template;
  • maxSim(w_j) is the maximum similarity between the keywords in the question template and the keyword w_i in the question sentence;
  • maxSim(w_i) is the maximum similarity between the keywords in the question sentence and the keyword w_j in the question template;
  • idf(w_i) is the inverse document frequency of the keyword w_i in the question sentence;
  • idf(w_j) is the inverse document frequency of the keyword w_j in the question template;
  • D is the number of all question sentence templates;
  • and a further quantity, whose symbol is not reproduced here, is the number of question sentence templates that contain the keyword w_i of the question sentence.
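  • The formula itself is not reproduced in this text. One form consistent with the symbol definitions is an idf-weighted average of per-keyword maximum similarities, taken in both directions and halved; the sketch below implements that form strictly as an assumption and should not be read as the formula of the application. The smoothed idf (which avoids division by zero for keywords that occur in no template) and all function names are likewise assumptions. Rows of the inferred similarity matrix are assumed to correspond to question keywords and columns to template keywords, and both keyword lists are assumed to be non-empty.

```python
import math


def idf(keyword, template_keyword_lists):
    """Smoothed inverse document frequency over the template library
    (smoothing is an assumption; it keeps every weight at least 1.0)."""
    total = len(template_keyword_lists)
    containing = sum(1 for kws in template_keyword_lists if keyword in kws)
    return math.log((1 + total) / (1 + containing)) + 1.0


def sentence_similarity(inferred, question_keywords, template_keywords,
                        template_keyword_lists):
    """Assumed form: 0.5 * (idf-weighted mean of row maxima
    + idf-weighted mean of column maxima)."""
    row_max = [max(row) for row in inferred]        # best match per question keyword
    col_max = [max(col) for col in zip(*inferred)]  # best match per template keyword
    q_weights = [idf(w, template_keyword_lists) for w in question_keywords]
    t_weights = [idf(w, template_keyword_lists) for w in template_keywords]
    q_part = sum(m * w for m, w in zip(row_max, q_weights)) / sum(q_weights)
    t_part = sum(m * w for m, w in zip(col_max, t_weights)) / sum(t_weights)
    return 0.5 * (q_part + t_part)


library = [["mobile", "phone", "number"], ["home", "address"]]
inferred = [[0.2, 1.0], [0.4, 0.1]]
score = sentence_similarity(inferred, ["phone", "number"], ["mobile", "phone"], library)
print(round(score, 3))
```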
  • Continuing to refer to Fig. 2, the following steps are included after step 220:
  • Step 230: Obtain a standard question template from all question templates in the preset template library according to the similarity.
  • Obtaining the standard question template from all question templates in the preset template library according to the similarity includes: taking, from all question templates in the preset template library, the question template with the greatest similarity as the standard question template.
  • Obtaining a standard question template from all question templates in the preset template library according to the similarity includes: sorting all question templates in the preset template library by similarity from largest to smallest; and taking the first predetermined number of question templates in that order as the standard question templates.
  • Obtaining the standard question template from all question templates in the preset template library according to the similarity includes:
  • the question templates whose rank is less than or equal to the predetermined number threshold are used as standard question templates;
  • the question templates whose similarity is greater than the preset similarity threshold are used as standard question templates.
  • The advantage of this embodiment is that limiting the number of standard question templates obtained improves the efficiency of outputting question answers, while using the preset similarity threshold as the basis for selecting standard question templates improves the accuracy of the standard question templates obtained.
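  • A minimal sketch of step 230 that combines the rank limit and the similarity threshold. Applying both conditions together is an assumption about how the two criteria interact, and the function name, parameter names and example templates are hypothetical.

```python
def select_standard_templates(similarities, max_count, min_similarity):
    """Rank templates by similarity (largest first), keep at most max_count,
    and drop any whose similarity does not exceed min_similarity."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked[:max_count] if score > min_similarity]


similarities = {
    "phone number of <name>": 0.91,
    "address of <name>": 0.42,
    "email of <name>": 0.77,
}
print(select_standard_templates(similarities, max_count=2, min_similarity=0.5))
```

  • Setting `max_count=1` and a threshold of 0 reduces this to the simplest embodiment, which takes only the template with the greatest similarity.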
  • Step 240: Determine the question answer corresponding to the standard question template and output the question answer.
  • The question answer corresponding to each question template is stored in a preset database.
  • Determining the question answer corresponding to the standard question template and outputting the question answer includes: querying the preset database to obtain the question answer corresponding to the standard question template, and outputting the question answer.
  • Determining the question answer corresponding to the standard question template and outputting the question answer includes: obtaining the question answer corresponding to the standard question template by calling a preset question answer query interface, and outputting the question answer.
  • The question answer data is stored in the form of RDF (Resource Description Framework) data.
  • Determining the question answer corresponding to the standard question template and outputting the question answer includes:
  • querying the question answer data stored in the form of RDF data with a SPARQL statement to obtain the question answer corresponding to the standard question template, and outputting the question answer.
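  • As an illustration of the RDF embodiment, the sketch below only builds the text of a SPARQL SELECT query for a given standard question template. The `qa:` namespace and the `qa:text` and `qa:hasAnswer` predicates are hypothetical, since the source does not describe the RDF schema, and the escaping handles only backslashes and double quotes. Executing the query against an RDF store is left to whatever SPARQL endpoint or library the deployment uses.

```python
def build_answer_query(template_text):
    """Build a SPARQL query that looks up the answer linked to a template.

    The vocabulary (qa:text, qa:hasAnswer) is hypothetical.
    """
    # Escape backslashes first, then double quotes, for a SPARQL string literal.
    escaped = template_text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX qa: <http://example.org/qa#>\n"
        "SELECT ?answer WHERE {\n"
        f'  ?template qa:text "{escaped}" .\n'
        "  ?template qa:hasAnswer ?answer .\n"
        "}"
    )


print(build_answer_query("the mobile phone number of <name entity>"))
```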
  • determining the question answer corresponding to the standard question template and outputting the question answer includes:
  • By constructing the keyword group pairs and inputting them into an established semantic parser that includes a multi-layer word similarity judgment model, a more accurate similarity between the question sentence and each question template can be obtained. Based on this result, the question template that best matches the question sentence input by the user can be obtained, and thus the answer that best matches the user's question, which improves the efficiency and accuracy of question and answer matching and thereby the matching effect.
  • FIG. 4 is a flowchart showing details of step 330 in an embodiment according to the embodiment corresponding to FIG. 3. As shown in Figure 4, the following steps can be included:
  • Step 331: For each question sentence and template keyword group pair, and for each keyword pair of that group pair, obtain the maximum value among the elements corresponding to the keyword pair in the similarity matrices output by each layer's word similarity judgment model.
  • The maximum value is obtained through the first pass of a bubble sort.
  • Step 332: The maximum values obtained for the keyword pairs of the question sentence and template keyword group pair are arranged into a matrix according to the positions of the corresponding elements in the similarity matrix, and this matrix is taken as the inferred similarity matrix of the question sentence and template keyword group pair.
  • Each element in the inferred similarity matrix 560 is the maximum value among the elements corresponding to the keyword pair in the similarity matrices output by each layer's word similarity judgment model. For example, if for the keyword pair A1B1 the corresponding elements in the similarity matrices output by the three layers of word similarity judgment models are 0.85, 0.84, and 0.88 respectively, then the element corresponding to that keyword pair in the inferred similarity matrix is the maximum of the three, 0.88.
  • The advantage of this embodiment is that the elements of the inferred similarity matrix are obtained by a fair standard, which helps ensure the accuracy of the inferred similarity matrix.
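  • Steps 331 and 332 can be sketched as an element-wise maximum over the per-layer similarity matrices, with the maximum found by a single bubble-sort pass as the text describes. The function names are hypothetical, and the second column of the example matrices is made up for illustration; the first column reproduces the A1B1 example (0.85, 0.84, 0.88).

```python
def bubble_pass_max(values):
    """One pass of bubble sort moves the largest value to the end of the list."""
    items = list(values)
    for i in range(len(items) - 1):
        if items[i] > items[i + 1]:
            items[i], items[i + 1] = items[i + 1], items[i]
    return items[-1]


def infer_similarity_matrix(similarity_matrices):
    """Element-wise maximum across the matrices output by all layers,
    keeping each element at its original position."""
    rows = len(similarity_matrices[0])
    cols = len(similarity_matrices[0][0])
    return [
        [bubble_pass_max(m[r][c] for m in similarity_matrices) for c in range(cols)]
        for r in range(rows)
    ]


# Three layers, one question keyword (row) and two template keywords (columns).
layer_outputs = [
    [[0.85, 0.10]],
    [[0.84, 0.30]],
    [[0.88, 0.20]],
]
inferred = infer_similarity_matrix(layer_outputs)
print(inferred)
```

  • A full sort is unnecessary here: after one pass the last position already holds the maximum, so the cost per keyword pair is linear in the number of layers.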
  • This application also provides an intelligent question answering device.
  • the following are device embodiments of this application.
  • Fig. 6 is a block diagram showing an intelligent question answering device according to an exemplary embodiment. As shown in FIG. 6, the apparatus 600 includes:
  • a keyword acquisition module 610, configured to preprocess the question sentence input by the user and a plurality of question sentence templates in a preset template library to obtain the keywords of the question sentence and of each question sentence template respectively;
  • a similarity acquisition module 620, configured to input the question sentence and template keyword group pairs, each composed of the keywords of the question sentence and the keywords of one of the question sentence templates, into an established semantic parser that includes a multi-layer word similarity judgment model, to obtain the similarity, output by the semantic parser, between the question sentence and each question template;
  • a template obtaining module 630, configured to obtain a standard question template from all question templates in the preset template library according to the similarity; and
  • an output module 640, configured to determine the question answer corresponding to the standard question template and output the question answer.
  • an electronic device capable of implementing the above method.
  • the electronic device 700 according to this embodiment of the present application will be described below with reference to FIG. 7.
  • the electronic device 700 shown in FIG. 7 is only an example, and should not bring any limitation to the function and use scope of the embodiments of the present application.
  • the electronic device 700 is represented in the form of a general computing device.
  • the components of the electronic device 700 may include, but are not limited to: the aforementioned at least one processing unit 710, the aforementioned at least one storage unit 720, and a bus 730 connecting different system components (including the storage unit 720 and the processing unit 710).
  • the storage unit stores program code that can be executed by the processing unit 710, so that the processing unit 710 performs the steps of the various exemplary implementations described in the "Exemplary Method" section of this specification.
  • the storage unit 720 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 721 and/or a cache storage unit 722, and may also optionally include a read-only storage unit (ROM) 723.
  • the storage unit 720 may also include a program/utility tool 724 having a set of (at least one) program module 725.
  • program module 725 includes but is not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • the bus 730 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 700 can also communicate with one or more external devices 900 (such as keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable users to interact with the electronic device 700, and/or with any device (such as a router, modem, etc.) that enables the electronic device 700 to communicate with one or more other computing devices. This communication can be performed through an input/output (I/O) interface 750.
  • the electronic device 700 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 760. As shown in the figure, the network adapter 760 communicates with other modules of the electronic device 700 through the bus 730.
  • the exemplary embodiments described herein can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which may be a CD-ROM, USB flash drive, removable hard disk, etc.) or on a network, and which includes several instructions that cause a computing device (which may be a personal computer, server, terminal device, network device, etc.) to execute the method according to the embodiments of the present application.
  • a computer-readable storage medium on which is stored a program product capable of implementing the above method of this specification.
  • various aspects of the present application can also be implemented in the form of a program product, which includes program code.
  • when the program product runs on a terminal device, the program code is used to cause the terminal device to execute the steps according to the various exemplary embodiments of the present application described in the above "Exemplary Method" section of this specification.
  • a program product 800 for implementing the above method according to an embodiment of the present application is described. It can take the form of a portable compact disc read-only memory (CD-ROM) that includes program code, and can be run on a terminal device, for example a personal computer.
  • the program product of this application is not limited to this.
  • the readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the program product can use any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with the instruction execution system, apparatus, or device.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the program code contained on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the foregoing.
  • the program code used to perform the operations of this application can be written in any combination of one or more programming languages.
  • the programming languages include object-oriented programming languages, such as Java and C++, as well as conventional procedural programming languages, such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computing device, partly on the user's device, as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • the remote computing device can be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computing device (for example, via the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An intelligent question answering method and apparatus, a medium and an electronic device. The method comprises: preprocessing a question inputted by a user and a plurality of question templates in a preset template library to obtain keywords of the question and question templates respectively; inputting question and template keyword group pairs composed of the keywords of the question and the keyword of each question template to an established semantic parser that comprises a multi-layer word similarity determination model to obtain the degree of similarity between the question and each question template outputted by the semantic parser; acquiring a standard question template from all question templates in the preset template library according to the degree of similarity; and determining an answer to a question corresponding to the standard question template and outputting the answer to the question. The described method may obtain the answer that best matches the question of a user with respect to the question, which improves the efficiency and accuracy of question and answer matching, thereby improving the matching effect.

Description

Intelligent question answering method, apparatus, medium and electronic device
Cross-reference to related applications
This application claims priority to Chinese patent application No. 201910709165.9, entitled "Intelligent Question Answering Method, Apparatus, Medium and Electronic Device", filed with the Chinese Patent Office on August 1, 2019, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of artificial intelligence, is specifically applied to the technical fields of natural language processing and semantic parsing, and in particular relates to an intelligent question answering method, apparatus, medium and electronic device.
Background
Intelligent question answering systems and chatbots have long been research hotspots in the field of natural language processing. This research mainly seeks ways to have computers give appropriate answers to users' questions.
In existing implementations, to match a user's question with an answer, a single language similarity model is usually used to match the question and the answer template by similarity. However, the inventors realized that the questions users enter vary widely and that their semantics can be complex, which makes matching questions to answers difficult and often leads to low matching efficiency and accuracy and a poor matching effect. The prior art therefore urgently needs a technical solution that improves the efficiency and accuracy of question-answer matching, thereby improving the matching effect.
Summary of the invention
In the technical fields of artificial intelligence and semantic parsing, in order to solve the above technical problem, the purpose of this application is to provide an intelligent question answering method, apparatus, medium and electronic device.
According to a first aspect of the present application, an intelligent question answering method is provided, the method comprising:
preprocessing a question sentence input by a user and a plurality of question templates in a preset template library to obtain keywords of the question sentence and of the question templates, respectively;
inputting question-and-template keyword group pairs, each composed of the keywords of the question sentence and the keywords of one of the question templates, into an established semantic parser that includes a multi-layer word similarity judgment model, to obtain the similarity, output by the semantic parser, between the question sentence and each question template;
obtaining a standard question template from all the question templates in the preset template library according to the similarity;
determining the question answer corresponding to the standard question template and outputting the question answer.
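As a non-limiting illustration, the four steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the application: `extract_keywords` is a placeholder for the preprocessing, a single Jaccard index stands in for the multi-layer semantic parser, the standard question template is assumed to be the one with the highest similarity, and all function names and sample data are hypothetical.

```python
def extract_keywords(text):
    """Placeholder preprocessing: lowercase the text and split on whitespace."""
    return text.lower().split()


def jaccard(a, b):
    """Stand-in similarity: shared keywords divided by all distinct keywords."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def answer_question(question, template_answers, parser=jaccard):
    """Return (standard question template, answer) for a user question.

    template_answers maps each question template to its question answer.
    """
    # Step 1: keywords of the question sentence (templates are handled below).
    question_keywords = extract_keywords(question)
    # Step 2: score one question-and-template keyword group pair per template.
    scores = {
        template: parser(question_keywords, extract_keywords(template))
        for template in template_answers
    }
    # Step 3: assume the standard template is the highest-scoring one.
    standard = max(scores, key=scores.get)
    # Step 4: look up the answer stored for that template.
    return standard, template_answers[standard]


library = {
    "phone number of a person": "Look the person up in the staff directory.",
    "office opening hours": "The office is open on weekdays.",
}
print(answer_question("what is the phone number", library))
```

Passing the parser as a parameter keeps the selection logic independent of how the similarity is computed, so a multi-layer parser could be substituted without changing the other steps.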
According to a second aspect of the present application, an intelligent question answering apparatus is provided, the apparatus comprising:
a keyword acquisition module configured to preprocess a question sentence input by a user and a plurality of question templates in a preset template library to obtain keywords of the question sentence and of the question templates, respectively;
a similarity acquisition module configured to input question-and-template keyword group pairs, each composed of the keywords of the question sentence and the keywords of one of the question templates, into an established semantic parser that includes a multi-layer word similarity judgment model, to obtain the similarity, output by the semantic parser, between the question sentence and each question template;
a template obtaining module configured to obtain a standard question template from all the question templates in the preset template library according to the similarity;
an output module configured to determine the question answer corresponding to the standard question template and output the question answer.
According to a third aspect of the present application, an electronic device is provided, the electronic device comprising:
a processor; and a memory storing computer-readable instructions which, when executed by the processor, implement the method described above.
According to a fourth aspect of the present application, a computer-readable storage medium is provided, which stores computer program instructions that, when executed by a computer, cause the computer to perform the method described above.
The technical solutions provided by the embodiments of the present application may have the following beneficial effects:
The intelligent question answering method provided by this application includes the following steps: preprocessing a question sentence input by a user and a plurality of question templates in a preset template library to obtain keywords of the question sentence and of the question templates, respectively; inputting question-and-template keyword group pairs, each composed of the keywords of the question sentence and the keywords of one of the question templates, into an established semantic parser that includes a multi-layer word similarity judgment model, to obtain the similarity, output by the semantic parser, between the question sentence and each question template; obtaining a standard question template from all the question templates in the preset template library according to the similarity; and determining the question answer corresponding to the standard question template and outputting the question answer.
With this method, after the keywords of the question sentence input by the user and of the preset question templates are obtained, keyword group pairs are constructed and input into the already built semantic parser that includes a multi-layer word similarity judgment model, which yields a more accurate similarity between the question sentence and each question template. From this result, the question template that best matches the question sentence input by the user can be obtained, so that the answer that best matches the user's question can be found, which improves the efficiency and accuracy of question-answer matching and thus the matching effect.
It should be understood that the above general description and the following detailed description are only exemplary and do not limit the present application.
Brief description of the drawings
The drawings here are incorporated into and constitute a part of this specification, show embodiments consistent with the present application, and together with the specification serve to explain the principles of the present application.
Fig. 1 is a schematic diagram of an application scenario of an intelligent question answering method according to an exemplary embodiment;
Fig. 2 is a flowchart of an intelligent question answering method according to an exemplary embodiment;
Fig. 3 is a flowchart of how the semantic parser outputs the similarity between the question sentence and a question template, according to an embodiment based on the embodiment corresponding to Fig. 2;
Fig. 4 is a flowchart showing details of step 330 according to an embodiment based on the embodiment corresponding to Fig. 3;
Fig. 5 is a schematic diagram of the logical framework by which the semantic parser outputs an inferred similarity matrix, according to an exemplary embodiment;
Fig. 6 is a block diagram of an intelligent question answering apparatus according to an exemplary embodiment;
Fig. 7 is an example block diagram of an electronic device implementing the above intelligent question answering method according to an exemplary embodiment;
Fig. 8 shows a computer-readable storage medium implementing the above intelligent question answering method according to an exemplary embodiment.
Detailed description
Exemplary embodiments will be described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
In addition, the drawings are only schematic illustrations of the present application and are not necessarily drawn to scale. The same reference numerals in the figures denote the same or similar parts, so repeated descriptions of them are omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities.
This application first provides an intelligent question answering method. Intelligent question answering is the process of outputting the corresponding answer to a question. Both questions and answers are essentially symbols or words and are usually processed as text in the computer field; in concrete terms, questions and answers may take the same form or different forms. Both can exist as words, phrases, sentences, paragraphs, articles, and so on; typically, a question takes the form of a sentence, and an answer takes the form of a sentence or paragraph. The content of a question is generally a doubt or problem to be resolved, and the answer is the reply or feedback given to the corresponding question. In general, for most questions people encounter, relatively correct answers that have been tested scientifically or in practice already exist. People explore the unknown and want answers to their questions, so it is necessary to provide accurate answers according to their questions. Because a question and its corresponding answer are content in two entirely different dimensions, how to match an answer precisely to its question is exactly the problem that the intelligent question answering method provided by this application addresses. With the method provided by this application, answers and their corresponding questions can be matched accurately.
The implementation terminal of this application may be any device with computing and processing capabilities. The device can be connected to external devices to receive or send data. Specifically, it may be a portable mobile device, such as a smartphone, tablet computer, notebook computer, or PDA (Personal Digital Assistant); a fixed device, such as a computer device, field terminal, desktop computer, server, or workstation; or a collection of multiple devices, such as the physical infrastructure of cloud computing.
Optionally, the implementation terminal of this application may be a server or the physical infrastructure of cloud computing.
Fig. 1 is a schematic diagram of an application scenario of an intelligent question answering method according to an exemplary embodiment. As shown in Fig. 1, the scenario includes a server 110, a database 120, and a user terminal 130, where the database 120 and the user terminal 130 are each connected to the server 110 through a communication link. In this embodiment, the server 110 is the implementation terminal of this application. The user terminal 130 may also be any device with computing and processing capabilities; it may be of the same type as the implementation terminal or of a different type, and it may be the same terminal as the implementation terminal or a different one. When a user needs to find the answer to a question, the user first enters the question through an input device of the user terminal 130 (such as a keyboard, mouse, or touch screen). The user terminal 130 sends the question to the server 110, and the server 110 finds the answer that matches it. The database 120 stores multiple question templates and the question answers corresponding to them. When the server 110 looks for an answer matching a target question, it can first find, in the database 120, the question template that best matches the target question, and then take the question answer corresponding to that template as the answer matching the target question. Finally, the server 110 can return the answer to the user terminal 130, thereby providing the user of the user terminal 130 with the corresponding answer.
It is worth noting that although in this embodiment the implementation terminal of this application is a server and the question answers corresponding to the question templates are stored in a database, in other embodiments or specific applications any of various terminals may be chosen as the implementation terminal as needed, and the question templates and question answers may be stored on any terminals, the same or different. This application places no limitation on this, and its scope of protection should not be limited thereby.
Fig. 2 is a flowchart of an intelligent question answering method according to an exemplary embodiment. As shown in Fig. 2, the method may include the following steps:
Step 210: preprocess the question sentence input by the user and a plurality of question templates in a preset template library to obtain keywords of the question sentence and of the question templates, respectively.
As mentioned above, a question sentence may take various forms, such as a word, phrase, sentence, or paragraph.
A question template may take the same form as the question sentence or a different form, and a question template can be matched against a question sentence.
For example, the question sentence may be "Excuse me, what is Zhang San's phone number?", in which case the corresponding question template may be: the mobile phone number of <name entity>.
In one embodiment, preprocessing means obtaining the keywords of the question sentence or question template.
A keyword is a single character, word, or phrase that captures the core content of a question sentence or question template.
In one embodiment, preprocessing the question sentence input by the user and the plurality of question templates in the preset template library to obtain keywords of the question sentence and of the question templates, respectively, includes:
segmenting the question sentence input by the user and each of the plurality of question templates in the preset template library into words;
for the question sentence input by the user and for each question template in the preset template library, removing stop words from the words obtained by segmenting that question sentence or question template, to obtain the keywords of the question sentence and of that question template.
Segmenting a question sentence or question template is the process of dividing it into words.
The advantage of this embodiment is that, because keywords are obtained by removing only the stop words from the segmented words, the resulting keywords retain more of the information in the question sentence or question template.
In one embodiment, the question sentence input by the user and the plurality of question templates in the preset template library are segmented by calling a preset word segmentation interface.
Stop words are characters or words that, in information retrieval, are automatically filtered out before or after natural language data (or text) is processed, in order to save storage space and improve search efficiency.
For example, in English, words such as "the", "at", and "is" can be removed as stop words; in Chinese, words such as "的", "是", and "在" can be removed as stop words.
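As a non-limiting illustration of the stop-word embodiment above, the following Python sketch segments an English question sentence and removes stop words. The regular-expression tokenizer `segment` is only a hypothetical stand-in for the preset word segmentation interface, which the application does not specify, and the stop-word set contains just the three English examples given in the text.

```python
import re

# Illustrative stop-word list: the three English examples given in the text.
STOP_WORDS = {"the", "at", "is"}


def segment(text):
    """Hypothetical stand-in for the preset word segmentation interface.

    Lowercases the text and keeps runs of letters and digits as words.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def keywords_by_stop_words(text, stop_words=STOP_WORDS):
    """Segment the text, then drop every word that is a stop word."""
    return [word for word in segment(text) if word not in stop_words]


print(keywords_by_stop_words("What is the phone number of Zhang San?"))
# ['what', 'phone', 'number', 'of', 'zhang', 'san']
```

For Chinese text, a dedicated segmenter would be needed in place of `segment`, since words are not separated by spaces.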
In one embodiment, preprocessing the question sentence input by the user and the plurality of question templates in the preset template library to obtain keywords of the question sentence and of the question templates, respectively, includes:
segmenting the question sentence input by the user and each of the plurality of question templates in the preset template library into words;
for the question sentence input by the user or each question template in the preset template library, selecting, from the words obtained by segmenting that question sentence or question template, the words that exist in a preset keyword library, to obtain the keywords of the question sentence or the question template.
The advantage of this embodiment is that the preset keyword library limits the range of keywords obtained for a question sentence or question template, so that the keywords of both the question sentence and the question templates remain of relatively high quality.
In one embodiment, selecting, for the question sentence input by the user or each question template in the preset template library, the words that exist in the preset keyword library from the words obtained by segmenting that question sentence or question template, to obtain the keywords of the question sentence or the question template, may include:
for the question sentence input by the user or each question template in the preset template library, determining, for each word obtained by segmenting that question sentence or question template, whether the word exists in the preset keyword library; and if so, taking the word as a keyword of the question sentence or the question template.
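As a non-limiting illustration of the keyword-library embodiment above, the following Python sketch keeps only the segmented words that are present in a preset keyword library. The function name `keywords_from_library` and the sample library contents are hypothetical and chosen only for this example.

```python
def keywords_from_library(words, keyword_library):
    """Keep each segmented word that exists in the preset keyword library.

    words: the words obtained by segmenting a question sentence or template.
    keyword_library: a set, so each membership check is a constant-time lookup.
    """
    keywords = []
    for word in words:
        # Decide, word by word, whether the word exists in the library.
        if word in keyword_library:
            keywords.append(word)
    return keywords


library = {"phone", "number", "address"}  # hypothetical library contents
words = ["what", "phone", "number", "of", "zhang", "san"]
print(keywords_from_library(words, library))  # ['phone', 'number']
```

Compared with stop-word removal, this whitelist approach discards every word the library does not list, which is the trade-off the text describes as limiting the range of keywords.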
Step 220: input the question-and-template keyword group pairs, each composed of the keywords of the question and the keywords of one question template, into an established semantic parser that includes a multi-layer word similarity judgment model, and obtain the similarity between the question and each question template output by the semantic parser.
In one embodiment, the multi-layer word similarity judgment model in the semantic parser includes:
a Jaccard index model layer, a word2vec model layer, a GloVe model layer, and a C&W model layer.
Each of the above word similarity judgment models can output the similarity of a question-and-template keyword group pair; each model layer is one word similarity judgment model.
Specifically, the Jaccard index model in the Jaccard index model layer can output a similarity directly, whereas the models in the word2vec, GloVe, and C&W model layers may not output a similarity directly but instead output word vectors. In addition to the model, these layers may therefore include a similarity calculation layer that computes the similarity of word vectors, so that the model layer can output the similarity of the question-and-template keyword group pair. The similarity calculation layer may compute the similarity of word vectors based on cosine distance or Euclidean distance.
It should be noted that, in the above embodiment, a word similarity judgment model is described as one model layer, and some model layers may contain both a model and a similarity calculation layer. The model and the similarity calculation layer are collectively called a word similarity judgment model only for ease of understanding and description. In practice, a word similarity judgment model need not be a single model; it may be a collection of units or components that together perform word similarity judgment.
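The two kinds of layer described above can be sketched as follows: one that yields a similarity directly (Jaccard index) and one that wraps word vectors with a cosine similarity calculation. This is an illustrative sketch, not the patented implementation. Computing the Jaccard index over the character sets of two words, the class name `EmbeddingLayer`, and the toy vectors are assumptions; real word2vec, GloVe, or C&W vectors would come from trained models.

```python
import math


def jaccard(word_a, word_b):
    """Jaccard index over the character sets of two words (assumed granularity)."""
    set_a, set_b = set(word_a), set(word_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


class EmbeddingLayer:
    """A model layer = word-vector lookup + similarity calculation layer."""

    def __init__(self, vectors):
        self.vectors = vectors  # dict: word -> list of floats (toy data here)

    def similarity(self, word_a, word_b):
        return cosine(self.vectors[word_a], self.vectors[word_b])


# Toy vectors, purely hypothetical.
layer = EmbeddingLayer({"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [2.0, 0.0]})
```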
In one embodiment, in addition to the Jaccard index model layer, the word2vec model layer, the GloVe model layer, and the C&W model layer, the semantic parser further includes a domain-specific word similarity judgment model layer.
The domain-specific word similarity judgment model may be a model, configured manually on the basis of experience, that can compute word similarity in certain domains. It may be implemented in any form; for example, word similarity in a specific domain may be judged using a pre-built library of related words for that domain.
The advantage of this embodiment is that adding a further word similarity judgment model layer can, to a certain extent, improve the accuracy of the similarity between the question and each question template output by the semantic parser.
In one embodiment, the way in which the semantic parser outputs the similarity between the question and each question template may be as shown in FIG. 3. FIG. 3 is a flowchart, according to an embodiment based on the embodiment corresponding to FIG. 2, of the semantic parser outputting the similarity between the question and a question template. As shown in FIG. 3, it includes the following steps:
Step 310: obtain the keyword pair matrix corresponding to each question-and-template keyword group pair.
In one embodiment, each element of the keyword pair matrix is one combination of a keyword of the question and a keyword of the question template, and the elements of the matrix together cover all such combinations. The number of keywords of the question is the row width or column width of the keyword pair matrix, and correspondingly the number of keywords of the question template is the column width or row width.
In one embodiment, obtaining the keyword pair matrix corresponding to each question-and-template keyword group pair includes:
starting from the first question keyword in the question keyword group of the question-and-template keyword group pair and following the order of the question keywords in that group, for each question keyword, taking the template keywords one by one from the template keyword group of the pair, starting from the first template keyword and following their order in the template keyword group, and combining each template keyword taken with that question keyword to form a keyword pair;
obtaining the number of question keywords in the question keyword group of the question-and-template keyword group pair;
sorting the keyword pairs in the order in which they were generated and, starting from the first keyword pair, repeatedly taking that number of keyword pairs, in order, from the keyword pairs not yet taken, as the n-th row of the keyword pair matrix, where n is the count of how many times that number of keyword pairs has been taken.
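The construction above can be sketched as follows, assuming keywords are plain strings and a matrix is a list of rows. The function name `build_keyword_pair_matrix` and the sample keywords are hypothetical. As described, pairs are generated question keyword by question keyword and then cut into rows whose length equals the number of question keywords.

```python
def build_keyword_pair_matrix(question_keywords, template_keywords):
    """Generate all (question, template) keyword pairs, then chunk them into rows."""
    # Pairs in generation order: for each question keyword, every template keyword.
    pairs = [(q, t) for q in question_keywords for t in template_keywords]
    if not pairs:
        return []
    # Each row holds as many pairs as there are question keywords.
    row_length = len(question_keywords)
    return [pairs[i:i + row_length] for i in range(0, len(pairs), row_length)]


matrix = build_keyword_pair_matrix(["A1", "A2"], ["B1", "B2", "B3"])
```

With two question keywords and three template keywords, this yields three rows of two pairs each, so every combination appears exactly once.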
Step 320: use each layer of the word similarity judgment model in the semantic parser to perform word similarity judgment on each keyword pair matrix, obtaining, for each question-and-template keyword group pair, the similarity matrix of its keyword pairs output by each layer of the word similarity judgment model.
As described above, the word similarity judgment model may be the Jaccard index model layer, the word2vec model layer, the GloVe model layer, the C&W model layer, the domain-specific word similarity judgment model layer, and so on. A word similarity judgment model computes a similarity for each keyword pair in the keyword pair matrix, which yields a similarity matrix.
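A minimal sketch of Step 320, assuming each layer can be represented as a function that maps two words to a similarity score. The two toy layers below (`jaccard_chars` and `exact_match`) are stand-ins chosen for illustration and are not the word2vec, GloVe, or C&W layers themselves.

```python
def jaccard_chars(a, b):
    """Toy layer: Jaccard index over character sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def exact_match(a, b):
    """Toy layer: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def similarity_matrices(pair_matrix, layers):
    """Return one similarity matrix per layer, each shaped like pair_matrix."""
    return [
        [[layer(a, b) for a, b in row] for row in pair_matrix]
        for layer in layers
    ]


pair_matrix = [[("cat", "cat"), ("cat", "act")],
               [("dog", "cat"), ("dog", "act")]]
matrices = similarity_matrices(pair_matrix, [jaccard_chars, exact_match])
```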
Step 330: for each question-and-template keyword group pair, obtain the inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the word similarity judgment models of the respective layers.
The inferred similarity matrix combines the similarity matrices output by the word similarity judgment models of the respective layers. Integrating the outputs of all layers makes the resulting inferred similarity matrix more reliable and more accurate.
In one embodiment, each element of a similarity matrix corresponds to one keyword pair of the question-and-template keyword group pair, and obtaining, for each question-and-template keyword group pair, the inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the word similarity judgment models of the respective layers includes:
for each question-and-template keyword group pair and for each keyword pair corresponding to it, sorting in descending order the elements corresponding to that keyword pair in the similarity matrices output by the word similarity judgment models of the respective layers;
for each question-and-template keyword group pair and for each keyword pair corresponding to it, taking, from the elements corresponding to that keyword pair in the similarity matrices output by the respective layers, the top-ranked first predetermined number of elements as candidate elements for the inferred similarity matrix;
for each question-and-template keyword group pair, obtaining the inferred similarity matrix based on the candidate elements.
The candidate elements for the inferred similarity matrix are the elements from which the elements of the inferred similarity matrix are drawn.
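The sorting and top-ranked selection above can be sketched as follows, assuming every layer's similarity matrix has the same shape and is stored as a list of rows. The function name `top_k_candidates` and the parameter `k` (standing for the first predetermined number) are hypothetical.

```python
def top_k_candidates(matrices, k):
    """For each matrix position, collect the k largest values across all layers.

    Returns a matrix of lists, each sorted in descending order.
    """
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sorted((m[r][c] for m in matrices), reverse=True)[:k] for c in range(cols)]
        for r in range(rows)
    ]


# Three layers, one keyword pair each (values taken from the FIG. 5 example).
layer_outputs = [[[0.85]], [[0.84]], [[0.88]]]
candidates = top_k_candidates(layer_outputs, k=2)
```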
In one embodiment, obtaining, for each question-and-template keyword group pair, the inferred similarity matrix based on the candidate elements includes:
for each keyword pair corresponding to the question-and-template keyword group pair, determining the variance of the candidate elements;
when the variance is greater than a predetermined variance threshold, taking the maximum of the elements corresponding to the keyword pair;
when the variance is not greater than the predetermined variance threshold, taking any one of the top-ranked second predetermined number of elements corresponding to the keyword pair, the second predetermined number being smaller than the first predetermined number;
arranging the maxima and/or arbitrarily taken elements obtained for the keyword pairs into a similarity matrix according to the positions of the corresponding elements in the similarity matrices, as the inferred similarity matrix of the question-and-template keyword group pair.
The magnitude of the elements corresponding to a keyword pair does not necessarily determine, with full accuracy, what the corresponding element of the final inferred similarity matrix should be. In particular, when the elements corresponding to the same keyword pair differ little from one another, building the inferred similarity matrix from their maximum may not be reasonable. The advantage of this embodiment is therefore that, when the variance is not greater than the predetermined variance threshold, that is, when the elements corresponding to the same keyword pair are sufficiently similar, any one of the sufficiently high-ranked elements is chosen to build the inferred similarity matrix of the question-and-template keyword group pair. This improves the randomness and fairness of the constructed inferred similarity matrix and may, to a certain extent, improve its accuracy.
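The variance rule above can be sketched for a single keyword pair as follows. The source does not say whether population or sample variance is meant; population variance (`statistics.pvariance`) is assumed here. The function name, parameter names, and thresholds are hypothetical.

```python
import random
import statistics


def pick_by_variance(candidates, variance_threshold, second_k, rng=None):
    """Choose the inferred similarity for one keyword pair.

    `candidates` must be sorted in descending order and non-empty;
    `second_k` should be smaller than len(candidates).
    """
    rng = rng or random
    if statistics.pvariance(candidates) > variance_threshold:
        # Layers disagree strongly: take the maximum.
        return candidates[0]
    # Layers roughly agree: take any one of the top second_k elements.
    return rng.choice(candidates[:second_k])


spread_out = pick_by_variance([0.9, 0.5, 0.1], variance_threshold=0.01, second_k=2)
close_together = pick_by_variance(
    [0.88, 0.85, 0.84], variance_threshold=0.01, second_k=2, rng=random.Random(0)
)
```

Passing a seeded `random.Random` instance makes the arbitrary choice reproducible, which can help when comparing runs.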
In one embodiment, obtaining, for each question-and-template keyword group pair, the inferred similarity matrix based on the candidate elements includes:
for each keyword pair corresponding to the question-and-template keyword group pair, determining the average of the candidate elements;
for each keyword pair corresponding to the question-and-template keyword group pair, obtaining the ratio of the difference between the maximum of the candidate elements for that keyword pair and the average to the maximum;
when the ratio is less than a predetermined ratio threshold, taking the maximum of the elements corresponding to the keyword pair;
when the ratio is greater than or equal to the predetermined ratio threshold, taking the average of the candidate elements corresponding to the keyword pair;
arranging the maxima and/or averages obtained for the keyword pairs into a similarity matrix according to the positions of the corresponding elements in the similarity matrices, as the inferred similarity matrix of the question-and-template keyword group pair.
The average of the candidate elements reflects their central tendency. In this embodiment, when the average of the candidate elements differs little from the maximum of the elements corresponding to the keyword pair, the similarity matrix is constructed from the average of the candidate elements, so that its elements better reflect the results of the word similarity judgment models of the respective layers, which improves the fairness of constructing the inferred similarity matrix. In addition, because the condition for deciding whether to use the average is the ratio of the difference between the maximum and the average to the maximum, the condition can, to a certain extent, make this decision well regardless of whether the maximum is an extreme value, which improves the generality and applicability of the condition.
FIG. 5 is a schematic diagram of the logical framework by which the semantic parser outputs an inferred similarity matrix, according to an exemplary embodiment. As shown in FIG. 5, it includes a keyword pair matrix 510, a first-layer word similarity judgment model 520, a second-layer word similarity judgment model 530, a third-layer word similarity judgment model 540, similarity matrices 550, and an inferred similarity matrix 560. The three layers of word similarity judgment models all belong to the semantic parser 500. The similarity matrix pointed to by the arrow leading from each layer is the similarity matrix output by that layer, and the inferred similarity matrix 560 is obtained from the similarity matrices output by the respective layers.
Step 340: for each question-and-template keyword group pair, obtain the similarity between the question and the question template corresponding to that pair according to the similarities in the inferred similarity matrix of the pair.
In one embodiment, obtaining, for each question-and-template keyword group pair, the similarity between the corresponding question and question template according to the similarities in the inferred similarity matrix of the pair includes:
obtaining the similarity between the question and the question template corresponding to the question-and-template keyword group pair from the similarities in the inferred similarity matrix of the pair by the following formulas:
Figure PCTCN2020098948-appb-000001
Figure PCTCN2020098948-appb-000002
where S1 and S2 are the question and the question template respectively; Sim(S1, S2) is the similarity between the question and the question template; w_i is a keyword of the question and w_j is a keyword of the question template; maxSim(w_j) is the maximum of the similarities between the keywords of the question template and the keyword w_i of the question; maxSim(w_i) is the maximum of the similarities between the keywords of the question and the keyword w_j of the question template; idf(w_i) is the inverse document frequency of the keyword w_i of the question; idf(w_j) is the inverse document frequency of the keyword w_j of the question template; D is the number of all question templates; and the quantity shown in Figure PCTCN2020098948-appb-000003 is the number of question templates that contain the keyword w_i of the question.
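The formula images are not reproduced in this text, so the exact expressions cannot be confirmed here. As a hedged illustration only, the sketch below uses one common idf-weighted form that is compatible with the symbols defined above: each keyword contributes its best match on the other side, weighted by its idf, and the two directions are averaged. The form idf(w) = log(D / number of templates containing w), the zero-weight fallback, the indexing `sim[i][j]` (question keyword i, template keyword j), and all function names are assumptions, not the patent's formulas.

```python
import math


def idf(word, templates_keywords):
    """Assumed idf: log(D / number of templates containing the word)."""
    total = len(templates_keywords)
    containing = sum(1 for keywords in templates_keywords if word in keywords)
    return math.log(total / containing) if containing else 0.0


def weighted_best_match(weights, best_scores):
    """idf-weighted average of best-match scores; plain mean if all weights are 0."""
    total_weight = sum(weights)
    if total_weight == 0:
        return sum(best_scores) / len(best_scores) if best_scores else 0.0
    return sum(w * s for w, s in zip(weights, best_scores)) / total_weight


def sentence_similarity(sim, idf_question, idf_template):
    """Combine an inferred similarity matrix into one question/template score."""
    best_for_question = [max(row) for row in sim]
    best_for_template = [max(row[j] for row in sim) for j in range(len(sim[0]))]
    return 0.5 * (weighted_best_match(idf_question, best_for_question)
                  + weighted_best_match(idf_template, best_for_template))


templates_keywords = [["reset", "password"], ["account", "balance"], ["reset", "account"]]
score = sentence_similarity([[0.8, 0.2]], idf_question=[2.0], idf_template=[1.0, 1.0])
```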
Referring again to FIG. 2, the following steps are performed after Step 220:
Step 230: obtain a standard question template from all question templates in the preset template library according to the similarity.
In one embodiment, obtaining the standard question template from all question templates in the preset template library according to the similarity includes: taking the question template with the greatest similarity among all question templates in the preset template library as the standard question template.
In one embodiment, obtaining the standard question template from all question templates in the preset template library according to the similarity includes: sorting all question templates in the preset template library in descending order of similarity; and taking the top-ranked predetermined number of question templates among all question templates in the preset template library as standard question templates.
In one embodiment, obtaining the standard question template from all question templates in the preset template library according to the similarity includes:
sorting all question templates in the preset template library in descending order of similarity;
when the number of question templates whose similarity is greater than a preset similarity threshold is greater than or equal to a predetermined number threshold, taking the question templates whose rank is less than or equal to the predetermined number threshold as standard question templates;
when the number of question templates whose similarity is greater than the preset similarity threshold is less than the predetermined number threshold, taking the question templates whose similarity is greater than the preset similarity threshold as standard question templates.
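The threshold-based selection above can be sketched as follows, assuming similarities are held in a dictionary keyed by template name. The function name, parameter names, and sample scores are hypothetical.

```python
def select_standard_templates(similarities, similarity_threshold, count_threshold):
    """Pick standard question templates from {template: similarity}."""
    # Rank templates by similarity, highest first.
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    above = [name for name, score in ranked if score > similarity_threshold]
    if len(above) >= count_threshold:
        # Enough templates pass the threshold: keep only the top count_threshold.
        return [name for name, _ in ranked[:count_threshold]]
    # Otherwise keep every template that passes the threshold.
    return above


scores = {"t1": 0.9, "t2": 0.8, "t3": 0.4, "t4": 0.75}
capped = select_standard_templates(scores, similarity_threshold=0.7, count_threshold=2)
uncapped = select_standard_templates(scores, similarity_threshold=0.85, count_threshold=2)
```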
The advantage of this embodiment is that limiting the number of standard question templates obtained improves the efficiency of outputting answers, while using the preset similarity threshold as the criterion for selecting standard question templates improves the accuracy of the standard question templates obtained.
Step 240: determine the answer corresponding to the standard question template and output the answer.
In one embodiment, a preset database stores the answer corresponding to each question template, and determining the answer corresponding to the standard question template and outputting the answer includes: querying the preset database to obtain the answer corresponding to the standard question template, and outputting the answer.
In one embodiment, determining the answer corresponding to the standard question template and outputting the answer includes: calling a preset answer query interface to obtain the answer corresponding to the standard question template, and outputting the answer.
In one embodiment, the answer data is stored as RDF (Resource Description Framework) data, and determining the answer corresponding to the standard question template and outputting the answer includes:
querying the answer data stored as RDF data with a SPARQL statement to obtain the answer corresponding to the standard question template, and outputting the answer.
In one embodiment, determining the answer corresponding to the standard question template and outputting the answer includes:
determining the answer corresponding to the standard question template; obtaining an answer template; and injecting the answer into the answer template for output.
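Injecting an answer into an answer template can be sketched with Python's standard `string.Template`. The dictionary standing in for the answer store, the template wording, and the function name `render_answer` are all hypothetical; the source does not specify the format of the answer template.

```python
from string import Template

# Hypothetical answer store: standard question template -> answer text.
answers = {"reset password": "use the 'Forgot password' link on the login page"}

# Hypothetical answer template with two placeholders.
answer_template = Template("Regarding '$question': $answer.")


def render_answer(standard_template, answers, answer_template):
    """Look up the answer and inject it into the answer template."""
    return answer_template.substitute(
        question=standard_template,
        answer=answers[standard_template],
    )


rendered = render_answer("reset password", answers, answer_template)
```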
In summary, according to the intelligent question answering method provided by the embodiment of FIG. 2, after the keywords of the question input by the user and of the preset question templates are obtained, keyword group pairs are constructed and input into an established semantic parser that includes a multi-layer word similarity judgment model, which yields a more accurate similarity between the question and each question template. The question template that best matches the question input by the user can then be obtained from this result, so that the answer that best matches the user's question can be found. This improves the efficiency and accuracy of matching questions to answers and thereby improves the matching result.
FIG. 4 is a flowchart showing details of Step 330 according to an embodiment based on the embodiment corresponding to FIG. 3. As shown in FIG. 4, it may include the following steps:
Step 331: for each question-and-template keyword group pair and for each keyword pair corresponding to it, obtain the maximum of the elements corresponding to that keyword pair in the similarity matrices of keyword pairs output by the word similarity judgment models of the respective layers.
In one embodiment, the maximum is obtained by the first pass of a bubble sort.
Step 332: arrange the maxima obtained for the keyword pairs corresponding to the question-and-template keyword group pair into a similarity matrix according to the positions of the corresponding elements in the similarity matrices, as the inferred similarity matrix of the question-and-template keyword group pair.
Referring again to FIG. 5, each element of the inferred similarity matrix 560 is the maximum of the elements corresponding to a keyword pair in the similarity matrices output by the word similarity judgment models of the respective layers. For example, for the keyword pair A1B1, the elements corresponding to this keyword pair in the similarity matrices output by the three layers are 0.85, 0.84, and 0.88 respectively, and the element corresponding to this keyword pair in the inferred similarity matrix is the maximum of the three, 0.88.
The advantage of this embodiment is that every element of the inferred similarity matrix is obtained by the same fair criterion, which ensures the accuracy with which the inferred similarity matrix is obtained.
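Steps 331 and 332 can be sketched as follows, using the A1B1 values from the FIG. 5 example. A single bubble-sort pass moves the largest value to the end of the list, which is why the first pass suffices to find the maximum. The function names are hypothetical, and the input lists are assumed to be non-empty.

```python
def max_by_first_bubble_pass(values):
    """Return the maximum using one bubble-sort pass (values must be non-empty)."""
    items = list(values)  # copy so the caller's list is not modified
    for i in range(len(items) - 1):
        if items[i] > items[i + 1]:
            items[i], items[i + 1] = items[i + 1], items[i]
    return items[-1]


def inferred_matrix_by_max(matrices):
    """Element-wise maximum across the similarity matrices of all layers."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [max_by_first_bubble_pass([m[r][c] for m in matrices]) for c in range(cols)]
        for r in range(rows)
    ]


# One keyword pair (A1B1) scored by three layers, as in the FIG. 5 example.
layer_outputs = [[[0.85]], [[0.84]], [[0.88]]]
inferred = inferred_matrix_by_max(layer_outputs)
```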
The present application also provides an intelligent question answering apparatus. The following are apparatus embodiments of the present application.
FIG. 6 is a block diagram of an intelligent question answering apparatus according to an exemplary embodiment. As shown in FIG. 6, the apparatus 600 includes:
a keyword acquisition module 610, configured to preprocess the question input by the user and the multiple question templates in the preset template library to obtain the keywords of the question and of each question template;
a similarity acquisition module 620, configured to input the question-and-template keyword group pairs, each composed of the keywords of the question and the keywords of one question template, into an established semantic parser that includes a multi-layer word similarity judgment model, and to obtain the similarity between the question and each question template output by the semantic parser;
a template acquisition module 630, configured to obtain a standard question template from all question templates in the preset template library according to the similarity;
an output module 640, configured to determine the answer corresponding to the standard question template and output the answer.
据本申请的第三方面,还提供了一种能够实现上述方法的电子设备。According to the third aspect of the present application, there is also provided an electronic device capable of implementing the above method.
所属技术领域的技术人员能够理解,本申请的各个方面可以实现为系统、方法或程序产品。因此,本申请的各个方面可以具体实现为以下形式,即:完全的硬件实施方式、完全的软件实施方式(包括固件、微代码等),或硬件和软件方面结合的实施方式,这里可以统称为“电路”、“模块”或“系统”。Those skilled in the art can understand that various aspects of the present application can be implemented as a system, method, or program product. Therefore, each aspect of the present application can be specifically implemented in the following forms, namely: complete hardware implementation, complete software implementation (including firmware, microcode, etc.), or a combination of hardware and software implementations, which can be collectively referred to herein as "Circuit", "Module" or "System".
下面参照图7来描述根据本申请的这种实施方式的电子设备700。图7显示的电子设 备700仅仅是一个示例,不应对本申请实施例的功能和使用范围带来任何限制。The electronic device 700 according to this embodiment of the present application will be described below with reference to FIG. 7. The electronic device 700 shown in FIG. 7 is only an example, and should not bring any limitation to the function and use scope of the embodiments of the present application.
如图7所示,电子设备700以通用计算设备的形式表现。电子设备700的组件可以包括但不限于:上述至少一个处理单元710、上述至少一个存储单元720、连接不同系统组件(包括存储单元720和处理单元710)的总线730。As shown in FIG. 7, the electronic device 700 is represented in the form of a general computing device. The components of the electronic device 700 may include, but are not limited to: the aforementioned at least one processing unit 710, the aforementioned at least one storage unit 720, and a bus 730 connecting different system components (including the storage unit 720 and the processing unit 710).
其中,所述存储单元存储有程序代码,所述程序代码可以被所述处理单元710执行,使得所述处理单元710执行本说明书上述“实施例方法”部分中描述的根据本申请各种示例性实施方式的步骤。Wherein, the storage unit stores program code, and the program code can be executed by the processing unit 710, so that the processing unit 710 executes the various exemplary methods described in the "Embodiment Method" section of this specification. Implementation steps.
The storage unit 720 may include readable media in the form of volatile storage units, such as a random access memory (RAM) 721 and/or a cache memory 722, and may optionally further include a read-only memory (ROM) 723.
The storage unit 720 may also include a program/utility 724 having a set of (at least one) program modules 725. Such program modules 725 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 730 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 700 may also communicate with one or more external devices 900 (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 700, and/or with any device (such as a router or a modem) that enables the electronic device 700 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 750. The electronic device 700 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 760. As shown in the figure, the network adapter 760 communicates with the other modules of the electronic device 700 through the bus 730. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. The technical solution according to the embodiments of the present application may therefore be embodied as a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to perform the method according to the embodiments of the present application.
According to a fourth aspect of the present application, a computer-readable storage medium is also provided, on which is stored a program product capable of implementing the method described above in this specification. In some possible implementations, aspects of the present application may also be implemented as a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present application described in the "Exemplary Method" section above.
Referring to FIG. 8, a program product 800 for implementing the above method according to an embodiment of the present application is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. The program product of the present application is not limited to this, however. In this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. The computer-readable storage medium may be non-volatile or volatile.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of these.
Program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are only schematic illustrations of the processing included in the method according to the exemplary embodiments of the present application and are not intended to be limiting. It will be readily understood that the processing shown in the drawings does not indicate or limit the chronological order of these steps. It will also be readily understood that these steps may be performed, for example, synchronously or asynchronously in multiple modules.
It should be understood that the present application is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (20)

  1. An intelligent question answering method, wherein the method comprises:
    preprocessing a question input by a user and a plurality of question templates in a preset template library to obtain keywords of the question and keywords of each question template, respectively;
    inputting question-and-template keyword group pairs, each composed of the keywords of the question and the keywords of one of the question templates, into an established semantic parser comprising multiple layers of word similarity determination models, to obtain the similarity between the question and each question template output by the semantic parser;
    obtaining a standard question template from among all question templates in the preset template library according to the similarity;
    determining the answer corresponding to the standard question template and outputting the answer.
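The four steps of claim 1 can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `extract_keywords` is a toy English preprocessor, and `jaccard` is a plain set-overlap score standing in for the multi-layer semantic parser, whose internals the claim does not spell out.

```python
from typing import Callable, Dict, List

# Hypothetical stopword list for the toy preprocessor.
STOPWORDS = frozenset({"the", "a", "is", "my", "how", "do", "i"})


def extract_keywords(text: str) -> List[str]:
    """Toy preprocessing: lowercase, strip punctuation, drop stopwords."""
    tokens = [tok.strip("?.,!").lower() for tok in text.split()]
    return [tok for tok in tokens if tok and tok not in STOPWORDS]


def jaccard(a: List[str], b: List[str]) -> float:
    """Stand-in scorer (NOT the claimed semantic parser): set overlap."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0


def answer_question(
    question: str,
    template_answers: Dict[str, str],
    score: Callable[[List[str], List[str]], float],
) -> str:
    """Score the question against every template and return the best answer."""
    q_kw = extract_keywords(question)
    scores = {
        tpl: score(q_kw, extract_keywords(tpl)) for tpl in template_answers
    }
    best_template = max(scores, key=scores.get)
    return template_answers[best_template]


templates = {
    "How do I reset my password?": "Use the reset link.",
    "How do I close my account?": "Contact support.",
}
result = answer_question("reset password", templates, jaccard)
```

Passing the scorer as a parameter keeps the pipeline independent of how similarity is computed, so a multi-layer model could be substituted without changing the other steps.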
  2. The method according to claim 1, wherein the semantic parser outputs the similarity between the question and each question template, from the question-and-template keyword group pairs input to it, each composed of the keywords of the question and the keywords of one question template, as follows:
    obtaining a keyword pair matrix corresponding to each question-and-template keyword group pair;
    performing word similarity determination on each keyword pair matrix using each layer of word similarity determination model in the semantic parser, to obtain, for each question-and-template keyword group pair, the similarity matrix of keyword pairs output by each layer of word similarity determination model;
    for each question-and-template keyword group pair, obtaining an inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the layers of word similarity determination models;
    for each question-and-template keyword group pair, obtaining the similarity between the corresponding question and question template according to the similarities in the inferred similarity matrix of that pair.
  3. The method according to claim 2, wherein obtaining the keyword pair matrix corresponding to each question-and-template keyword group pair comprises:
    starting from the first question keyword in the question keyword group of the question-and-template keyword group pair and following the order of the question keywords in that group, for each question keyword, taking template keywords one by one from the template keyword group of the pair, starting from the first template keyword and following the order of the template keywords in that group, and forming a keyword pair from each template keyword taken and that question keyword;
    obtaining the number of question keywords in the question keyword group of the question-and-template keyword group pair;
    ordering the keyword pairs by their order of generation and, starting from the first keyword pair, each time taking that number of keyword pairs, in order, from the keyword pairs not yet taken, as the n-th row of the keyword pair matrix, where n is the count of times that number of keyword pairs has been taken.
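The matrix construction in claim 3 can be sketched in a few lines of Python. The function name `keyword_pair_matrix` and the sample keywords are illustrative assumptions; the row length follows the claim literally, that is, it equals the number of question keywords.

```python
from typing import List, Tuple

Pair = Tuple[str, str]


def keyword_pair_matrix(
    question_kw: List[str], template_kw: List[str]
) -> List[List[Pair]]:
    """Build keyword pairs in generation order, then cut them into rows.

    Pairs are generated question keyword by question keyword, each combined
    with every template keyword in order. Each row then takes
    len(question_kw) consecutive pairs that have not been taken yet.
    """
    if not question_kw:
        return []
    pairs = [(q, t) for q in question_kw for t in template_kw]
    row_len = len(question_kw)
    return [pairs[i:i + row_len] for i in range(0, len(pairs), row_len)]


matrix = keyword_pair_matrix(
    ["reset", "password"], ["change", "password", "login"]
)
```

With two question keywords and three template keywords this yields six pairs arranged as three rows of two, so a row does not necessarily correspond to a single question keyword.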
  4. The method according to claim 2, wherein each element in a similarity matrix corresponds to one keyword pair of a question-and-template keyword group pair, and obtaining, for each question-and-template keyword group pair, the inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the layers of word similarity determination models comprises:
    for each question-and-template keyword group pair and for each keyword pair corresponding to it, obtaining the maximum of the elements corresponding to that keyword pair in the similarity matrices output by the layers of word similarity determination models;
    arranging the maxima obtained for the keyword pairs of that question-and-template keyword group pair into a similarity matrix according to the positions of the corresponding elements in the similarity matrices, as the inferred similarity matrix of that pair.
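The element-wise maximum described in claim 4 can be sketched as follows, assuming each layer's output is a list of rows with identical shape. The function name `inferred_by_max` and the sample values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def inferred_by_max(layer_matrices: List[Matrix]) -> Matrix:
    """Take, for every position, the largest value across all layers."""
    return [
        [max(cells) for cells in zip(*rows)]  # same position, every layer
        for rows in zip(*layer_matrices)      # same row, every layer
    ]


layer_a = [[0.1, 0.9], [0.4, 0.2]]
layer_b = [[0.3, 0.5], [0.4, 0.8]]
inferred = inferred_by_max([layer_a, layer_b])
```

Each position keeps the value of whichever layer judged that keyword pair most similar, and the positions themselves are unchanged.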
  5. The method according to claim 2, wherein each element in a similarity matrix corresponds to one keyword pair of a question-and-template keyword group pair, and obtaining, for each question-and-template keyword group pair, the inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the layers of word similarity determination models comprises:
    for each question-and-template keyword group pair and for each keyword pair corresponding to it, sorting, in descending order, the elements corresponding to that keyword pair in the similarity matrices output by the layers of word similarity determination models;
    for each question-and-template keyword group pair and for each keyword pair corresponding to it, taking, from among the elements corresponding to that keyword pair in the similarity matrices output by the layers of word similarity determination models, the first predetermined number of elements at the top of the sorted order, as the elements for obtaining the inferred similarity matrix;
    for each question-and-template keyword group pair, obtaining the inferred similarity matrix based on the elements so obtained.
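Claim 5 keeps the top-ranked values per keyword pair but does not state how they are combined into a single matrix entry. The sketch below assumes, purely for illustration, that the retained values are averaged; `inferred_by_top_k` and its parameter `k` (the "first predetermined number") are hypothetical names.

```python
from typing import List

Matrix = List[List[float]]


def inferred_by_top_k(layer_matrices: List[Matrix], k: int) -> Matrix:
    """Per position: sort layer outputs descending, keep the top k, average.

    Averaging is an assumption; the claim only says the inferred matrix is
    obtained "based on" the retained elements.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    result: Matrix = []
    for rows in zip(*layer_matrices):
        out_row = []
        for cells in zip(*rows):
            top = sorted(cells, reverse=True)[:k]
            out_row.append(sum(top) / len(top))
        result.append(out_row)
    return result


layers = [[[0.2, 0.9]], [[0.6, 0.1]], [[0.4, 0.5]]]
top2 = inferred_by_top_k(layers, k=2)
```

With `k=1` this reduces to the element-wise maximum of claim 4.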
  6. The method according to claim 2, wherein obtaining, for each question-and-template keyword group pair, the similarity between the corresponding question and question template according to the similarities in the inferred similarity matrix of that pair comprises:
    obtaining the similarity between the question and question template corresponding to the question-and-template keyword group pair from the similarities in the inferred similarity matrix of that pair using the following formulas:
    Figure PCTCN2020098948-appb-100001
    Figure PCTCN2020098948-appb-100002
    where S1 and S2 are the question and the question template, respectively; Sim(S1, S2) is the similarity between the question and the question template; w_i is a keyword in the question and w_j is a keyword in the question template; maxSim(w_j) is the maximum of the similarities between the keywords of the question template and the keyword w_i in the question; maxSim(w_i) is the maximum of the similarities between the keywords in the question and the keyword w_j in the question template; idf(w_i) is the inverse document frequency of the keyword w_i in the question; idf(w_j) is the inverse document frequency of the keyword w_j in the question template; D is the number of all question templates; and
    Figure PCTCN2020098948-appb-100003
    is the number of question templates that contain the keyword w_i of the question.
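The formulas of claim 6 exist only as figures and are not reproduced in this text, so the following Python sketch is an assumption rather than the claimed computation. It uses a common IDF-weighted, bidirectional maximum-similarity form that is consistent with the symbols named in the claim (Sim, maxSim, idf, D). The smoothing inside `idf`, the equal weighting of the two directions, and the layout `sim[i][j]` (question keyword i against template keyword j) are all illustrative choices.

```python
import math
from typing import List

Matrix = List[List[float]]


def idf(word: str, templates_kw: List[List[str]]) -> float:
    """Smoothed inverse document frequency over the template library.

    D is the number of templates and n the number containing `word`.
    The +1 terms are an assumption that avoids division by zero.
    """
    d = len(templates_kw)
    n = sum(1 for kws in templates_kw if word in kws)
    return math.log((d + 1) / (n + 1)) + 1.0


def sentence_similarity(
    sim: Matrix, q_idf: List[float], t_idf: List[float]
) -> float:
    """Assumed form: mean of two IDF-weighted averages of best matches.

    sim[i][j] is the inferred similarity between question keyword i and
    template keyword j (an assumed layout).
    """
    row_max = [max(row) for row in sim]        # best match per question keyword
    col_max = [max(col) for col in zip(*sim)]  # best match per template keyword
    q_part = sum(m * w for m, w in zip(row_max, q_idf)) / sum(q_idf)
    t_part = sum(m * w for m, w in zip(col_max, t_idf)) / sum(t_idf)
    return 0.5 * (q_part + t_part)


library = [["reset", "password"], ["close", "account"]]
score = sentence_similarity(
    [[1.0, 0.2], [0.3, 0.5]], q_idf=[1.0, 1.0], t_idf=[1.0, 3.0]
)
```

Weighting by IDF lets rare keywords contribute more to the score than keywords that appear in most templates.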
  7. The method according to claim 1, wherein obtaining a standard question template from among all question templates in the preset template library according to the similarity comprises:
    sorting all question templates in the preset template library in descending order of similarity;
    when the number of question templates whose similarity is greater than a preset similarity threshold is greater than or equal to a predetermined number threshold, taking the question templates whose rank is less than or equal to the predetermined number threshold as the standard question templates;
    when the number of question templates whose similarity is greater than the preset similarity threshold is less than the predetermined number threshold, taking the question templates whose similarity is greater than the preset similarity threshold as the standard question templates.
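The two-branch selection rule of claim 7 can be sketched as below. The names `select_standard_templates`, `sim_threshold`, and `count_threshold` are hypothetical, and the scores are made-up values.

```python
from typing import Dict, List


def select_standard_templates(
    scores: Dict[str, float], sim_threshold: float, count_threshold: int
) -> List[str]:
    """Return the top templates, capped by count, filtered by similarity."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    above = [tpl for tpl, s in ranked if s > sim_threshold]
    if len(above) >= count_threshold:
        # Enough candidates pass the threshold: keep only the top N by rank.
        return [tpl for tpl, _ in ranked[:count_threshold]]
    # Too few candidates: keep every template that passes the threshold.
    return above


scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.7}
capped = select_standard_templates(scores, sim_threshold=0.6, count_threshold=2)
uncapped = select_standard_templates(scores, sim_threshold=0.6, count_threshold=5)
```

Both branches return only templates above the similarity threshold; the count threshold acts purely as an upper bound on how many are returned.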
  8. An intelligent question answering apparatus, wherein the apparatus comprises:
    a keyword acquisition module configured to preprocess a question input by a user and a plurality of question templates in a preset template library to obtain keywords of the question and keywords of each question template, respectively;
    a similarity acquisition module configured to input question-and-template keyword group pairs, each composed of the keywords of the question and the keywords of one of the question templates, into an established semantic parser comprising multiple layers of word similarity determination models, to obtain the similarity between the question and each question template output by the semantic parser;
    a template acquisition module configured to obtain a standard question template from among all question templates in the preset template library according to the similarity;
    an output module configured to determine the answer corresponding to the standard question template and output the answer.
  9. An electronic device, wherein the electronic device comprises:
    a processor;
    a memory storing computer-readable instructions which, when executed by the processor, implement an intelligent question answering method comprising the following steps:
    preprocessing a question input by a user and a plurality of question templates in a preset template library to obtain keywords of the question and keywords of each question template, respectively;
    inputting question-and-template keyword group pairs, each composed of the keywords of the question and the keywords of one of the question templates, into an established semantic parser comprising multiple layers of word similarity determination models, to obtain the similarity between the question and each question template output by the semantic parser;
    obtaining a standard question template from among all question templates in the preset template library according to the similarity;
    determining the answer corresponding to the standard question template and outputting the answer.
  10. The electronic device according to claim 9, wherein the semantic parser outputs the similarity between the question and each question template, from the question-and-template keyword group pairs input to it, each composed of the keywords of the question and the keywords of one question template, as follows:
    obtaining a keyword pair matrix corresponding to each question-and-template keyword group pair;
    performing word similarity determination on each keyword pair matrix using each layer of word similarity determination model in the semantic parser, to obtain, for each question-and-template keyword group pair, the similarity matrix of keyword pairs output by each layer of word similarity determination model;
    for each question-and-template keyword group pair, obtaining an inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the layers of word similarity determination models;
    for each question-and-template keyword group pair, obtaining the similarity between the corresponding question and question template according to the similarities in the inferred similarity matrix of that pair.
  11. The electronic device according to claim 10, wherein obtaining the keyword pair matrix corresponding to each question-and-template keyword group pair comprises:
    starting from the first question keyword in the question keyword group of the question-and-template keyword group pair and following the order of the question keywords in that group, for each question keyword, taking template keywords one by one from the template keyword group of the pair, starting from the first template keyword and following the order of the template keywords in that group, and forming a keyword pair from each template keyword taken and that question keyword;
    obtaining the number of question keywords in the question keyword group of the question-and-template keyword group pair;
    ordering the keyword pairs by their order of generation and, starting from the first keyword pair, each time taking that number of keyword pairs, in order, from the keyword pairs not yet taken, as the n-th row of the keyword pair matrix, where n is the count of times that number of keyword pairs has been taken.
  12. The electronic device according to claim 10, wherein each element in a similarity matrix corresponds to one keyword pair of a question-and-template keyword group pair, and obtaining, for each question-and-template keyword group pair, the inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the layers of word similarity determination models comprises:
    for each question-and-template keyword group pair and for each keyword pair corresponding to it, obtaining the maximum of the elements corresponding to that keyword pair in the similarity matrices output by the layers of word similarity determination models;
    arranging the maxima obtained for the keyword pairs of that question-and-template keyword group pair into a similarity matrix according to the positions of the corresponding elements in the similarity matrices, as the inferred similarity matrix of that pair.
  13. The electronic device according to claim 10, wherein each element in a similarity matrix corresponds to one keyword pair of a question-and-template keyword group pair, and obtaining, for each question-and-template keyword group pair, the inferred similarity matrix of that pair based on the similarity matrices of its keyword pairs output by the layers of word similarity determination models comprises:
    for each question-and-template keyword group pair and for each keyword pair corresponding to it, sorting, in descending order, the elements corresponding to that keyword pair in the similarity matrices output by the layers of word similarity determination models;
    for each question-and-template keyword group pair and for each keyword pair corresponding to it, taking, from among the elements corresponding to that keyword pair in the similarity matrices output by the layers of word similarity determination models, the first predetermined number of elements at the top of the sorted order, as the elements for obtaining the inferred similarity matrix;
    for each question-and-template keyword group pair, obtaining the inferred similarity matrix based on the elements so obtained.
  14. The electronic device according to claim 10, wherein the obtaining, for each question-template keyword group pair, the similarity between the question and the question template corresponding to that group pair according to the similarities in the inferred similarity matrix of that group pair comprises:
    Obtaining, by the following formulas, the similarity between the question and the question template corresponding to the question-template keyword group pair according to the similarities in the inferred similarity matrix of that group pair:
    Figure PCTCN2020098948-appb-100004
    Figure PCTCN2020098948-appb-100005
    where S1 and S2 are the question and the question template, respectively; Sim(S1, S2) is the similarity between the question and the question template; w_i is a keyword in the question and w_j is a keyword in the question template; maxSim(w_j) is the maximum of the similarities between the keywords of the question template and the keyword w_i in the question; maxSim(w_i) is the maximum of the similarities between the keywords in the question and the keyword w_j in the question; idf(w_i) is the inverse document frequency of the keyword w_i in the question; idf(w_j) is the inverse document frequency of the keyword w_j in the question; D is the total number of question templates; and
    Figure PCTCN2020098948-appb-100006
    is the number of question templates that contain the keyword w_i of the question.
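The formula images referenced in this claim are not reproduced in the text. As a rough illustration only, the Python sketch below assumes a common IDF-weighted, bidirectional best-match form that is consistent with the symbols defined above. The function names, the matrix orientation (`inferred[i][j]` for question keyword i and template keyword j), and the form of `idf` (log of D over the number of templates containing the keyword) are assumptions and are not taken from the filing.

```python
import math


def idf(keyword, templates):
    """Inverse document frequency of a keyword over the template library.

    Assumption: idf = log(D / n), with D the number of templates and n the
    number of templates containing the keyword (clamped to at least 1).
    """
    containing = sum(1 for t in templates if keyword in t)
    return math.log(len(templates) / max(containing, 1))


def weighted_mean(values, weights):
    """IDF-weighted average; returns 0.0 when all weights are zero."""
    total = sum(weights)
    if not total:
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / total


def sentence_similarity(inferred, question_kw, template_kw, templates):
    """Combine keyword-level similarities into one question/template score."""
    # Best template match for each question keyword (row maxima).
    q_best = [max(row) for row in inferred]
    q_idf = [idf(w, templates) for w in question_kw]
    # Best question match for each template keyword (column maxima).
    t_best = [max(col) for col in zip(*inferred)]
    t_idf = [idf(w, templates) for w in template_kw]
    return 0.5 * (weighted_mean(q_best, q_idf) + weighted_mean(t_best, t_idf))


templates = [["reset", "password"], ["open", "account"], ["close", "account"]]
score = sentence_similarity(
    [[1.0, 0.1], [0.1, 1.0]], ["reset", "password"], ["reset", "password"], templates
)
```

Weighting by IDF lets rare, discriminative keywords dominate the score, while keywords that occur in most templates contribute little.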
  15. The electronic device according to claim 9, wherein the obtaining a standard question template from all question templates in the preset template library according to the similarity comprises:
    Sorting all question templates in the preset template library in descending order of the similarity;
    In a case where the number of question templates whose similarity is greater than a preset similarity threshold is greater than or equal to a predetermined number threshold, taking the question templates whose rank is less than or equal to the predetermined number threshold as standard question templates;
    In a case where the number of question templates whose similarity is greater than the preset similarity threshold is less than the predetermined number threshold, taking the question templates whose similarity is greater than the preset similarity threshold as the standard question templates.
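The selection rule in this claim can be sketched in a few lines of Python. The function name, the `(template, similarity)` tuple layout, and the sample values are illustrative assumptions.

```python
def select_standard_templates(scored, sim_threshold, count_threshold):
    """Pick standard question templates from (template, similarity) pairs.

    If at least `count_threshold` templates exceed `sim_threshold`, keep the
    top `count_threshold` by rank; otherwise keep every template above the
    similarity threshold.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    above = [pair for pair in ranked if pair[1] > sim_threshold]
    if len(above) >= count_threshold:
        return ranked[:count_threshold]
    return above


scored = [("a", 0.9), ("b", 0.5), ("c", 0.8), ("d", 0.95)]
top_two = select_standard_templates(scored, 0.7, 2)
all_above = select_standard_templates(scored, 0.7, 5)
```

The count threshold caps how many candidates are returned, and the similarity threshold keeps weak matches out when few templates qualify.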
  16. A computer-readable storage medium storing computer program instructions which, when executed by a computer, cause the computer to perform an intelligent question answering method, the intelligent question answering method comprising the following steps:
    Preprocessing a question input by a user and a plurality of question templates in a preset template library to obtain the keywords of the question and the keywords of each question template, respectively;
    Inputting question-template keyword group pairs, each composed of the keywords of the question and the keywords of one of the question templates, into an established semantic parser comprising multiple layers of word similarity determination models, to obtain the similarity, output by the semantic parser, between the question and each question template;
    Obtaining a standard question template from all question templates in the preset template library according to the similarity;
    Determining the answer corresponding to the standard question template and outputting the answer.
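The four steps of this claim can be outlined as the following Python sketch. Everything here is a stand-in: `toy_preprocess` and `toy_parser` (a simple Jaccard overlap) are hypothetical placeholders for the preprocessing and for the multi-layer semantic parser, and the template library is assumed to be a plain dictionary from template text to answer.

```python
def toy_preprocess(text):
    """Hypothetical keyword extraction: lowercase words longer than 2 chars."""
    return [w.strip("?.,!").lower() for w in text.split() if len(w) > 2]


def toy_parser(question_kw, template_kw):
    """Hypothetical stand-in for the semantic parser: Jaccard overlap."""
    q, t = set(question_kw), set(template_kw)
    return len(q & t) / len(q | t) if q | t else 0.0


def answer_question(question, template_answers, preprocess, parser, threshold):
    """Preprocess, score every template, pick the best one, return its answer."""
    question_kw = preprocess(question)
    scored = [(tpl, parser(question_kw, preprocess(tpl))) for tpl in template_answers]
    if not scored:
        return None
    best_template, best_score = max(scored, key=lambda pair: pair[1])
    if best_score <= threshold:
        return None  # no template is similar enough to act as the standard one
    return template_answers[best_template]


library = {
    "How do I reset my password?": "Use the reset link.",
    "How do I open an account?": "Visit a branch.",
}
hit = answer_question("How can I reset my password", library, toy_preprocess, toy_parser, 0.5)
miss = answer_question("weather today", library, toy_preprocess, toy_parser, 0.5)
```

Passing the preprocessor and parser as arguments keeps the control flow of the claim visible without committing to any particular model.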
  17. The computer-readable storage medium according to claim 16, wherein the semantic parser outputs the similarity between the question and each question template, from the question-template keyword group pairs input to it, each composed of the keywords of the question and the keywords of one of the question templates, in the following manner:
    Obtaining the keyword pair matrix corresponding to each question-template keyword group pair;
    Performing word similarity determination on each keyword pair matrix using each layer of word similarity determination model in the semantic parser, to obtain the keyword-pair similarity matrix output by each layer of word similarity determination model for each question-template keyword group pair;
    For each question-template keyword group pair, obtaining the inferred similarity matrix of that group pair based on the keyword-pair similarity matrices output for that group pair by the word similarity determination models of the respective layers;
    For each question-template keyword group pair, obtaining the similarity between the question and the question template corresponding to that group pair according to the similarities in the inferred similarity matrix of that group pair.
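As a minimal sketch of the second step above, each layer can be modeled as a function that scores one keyword pair, applied to every cell of the keyword pair matrix. The two toy layers (`exact_match` and `char_jaccard`) are hypothetical and only illustrate the shape of the data; the claims do not say which models the layers use.

```python
def exact_match(a, b):
    """Hypothetical layer 1: 1.0 for identical words, else 0.0."""
    return 1.0 if a == b else 0.0


def char_jaccard(a, b):
    """Hypothetical layer 2: Jaccard overlap of the character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def layer_similarity_matrices(pair_matrix, layers):
    """Return one similarity matrix per layer, shaped like `pair_matrix`."""
    return [
        [[layer(a, b) for (a, b) in row] for row in pair_matrix]
        for layer in layers
    ]


pair_matrix = [[("reset", "reset"), ("reset", "password")]]
per_layer = layer_similarity_matrices(pair_matrix, [exact_match, char_jaccard])
```

Each output matrix has the same shape as the keyword pair matrix, so an element at a given position always refers to the same keyword pair in every layer.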
  18. The computer-readable storage medium according to claim 17, wherein the obtaining the keyword pair matrix corresponding to each question-template keyword group pair comprises:
    Starting from the first question keyword in the question keyword group of the question-template keyword group pair and following the order of the question keywords in that group, for each question keyword, obtaining template keywords one by one from the template keyword group of the group pair, starting from the first template keyword and following the order of the template keywords in that group, and combining each obtained template keyword with the question keyword to form a keyword pair;
    Obtaining the number of question keywords in the question keyword group of the question-template keyword group pair;
    Sorting the keyword pairs in the order in which they were generated and, starting from the first keyword pair, each time obtaining, in that order, said number of keyword pairs from the keyword pairs not yet obtained, as the n-th row of the keyword pair matrix, where n is the ordinal of the current acquisition of said number of keyword pairs.
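Read literally, the construction in this claim generates pairs in nested order and then cuts the list into rows whose length equals the number of question keywords. The Python sketch below follows that reading; the function name and the `(question keyword, template keyword)` ordering inside each pair are assumptions.

```python
def build_keyword_pair_matrix(question_kw, template_kw):
    """Build the keyword pair matrix from two ordered keyword groups."""
    if not question_kw:
        return []
    # Outer loop over question keywords, inner loop over template keywords,
    # so the list is already in generation order.
    pairs = [(q, t) for q in question_kw for t in template_kw]
    row_len = len(question_kw)  # the "number" referred to in the claim
    return [pairs[i:i + row_len] for i in range(0, len(pairs), row_len)]


matrix = build_keyword_pair_matrix(["a", "b"], ["x", "y", "z"])
square = build_keyword_pair_matrix(["a", "b"], ["x", "y"])
```

When the two groups are the same size, each row holds the pairs of exactly one question keyword; with different sizes, as in the first call, a row can span two question keywords.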
  19. The computer-readable storage medium according to claim 17, wherein each element in the similarity matrix corresponds to one keyword pair in the question-template keyword group pair, and the obtaining, for each question-template keyword group pair, the inferred similarity matrix of that group pair based on the keyword-pair similarity matrices output for that group pair by the word similarity determination models of the respective layers comprises:
    For each question-template keyword group pair and for each keyword pair corresponding to that group pair, obtaining the maximum of the elements corresponding to that keyword pair in the keyword-pair similarity matrices output by the word similarity determination models of the respective layers;
    Arranging the maxima obtained for the keyword pairs of that group pair into a similarity matrix according to the positions of the corresponding elements in the similarity matrix, as the inferred similarity matrix of that group pair.
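This claim amounts to an element-wise maximum over the per-layer similarity matrices. A minimal Python sketch, assuming the matrices are equally shaped lists of lists and using an illustrative function name:

```python
def inferred_by_max(layer_matrices):
    """Element-wise maximum across equally shaped per-layer matrices."""
    first = layer_matrices[0]
    return [
        [max(m[r][c] for m in layer_matrices) for c in range(len(first[r]))]
        for r in range(len(first))
    ]


layer_a = [[0.2, 0.9], [0.5, 0.1]]
layer_b = [[0.6, 0.3], [0.4, 0.8]]
inferred = inferred_by_max([layer_a, layer_b])
```

Taking the maximum means a keyword pair counts as similar if any one layer judges it similar.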
  20. The computer-readable storage medium according to claim 17, wherein each element in the similarity matrix corresponds to one keyword pair in the question-template keyword group pair, and the obtaining, for each question-template keyword group pair, the inferred similarity matrix of that group pair based on the keyword-pair similarity matrices output for that group pair by the word similarity determination models of the respective layers comprises:
    For each question-template keyword group pair and for each keyword pair corresponding to that group pair, sorting in descending order the elements corresponding to that keyword pair in the keyword-pair similarity matrices output by the word similarity determination models of the respective layers;
    For each question-template keyword group pair and for each keyword pair corresponding to that group pair, among the elements corresponding to that keyword pair in the keyword-pair similarity matrices output by the word similarity determination models of the respective layers, obtaining the elements ranked within the first predetermined number, as elements for obtaining the inferred similarity matrix;
    For each question-template keyword group pair, obtaining the inferred similarity matrix based on the obtained elements for obtaining the inferred similarity matrix.
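The claim selects the top-ranked elements per keyword pair but does not state how they are combined into a single matrix entry. The Python sketch below assumes, purely for illustration, that the selected values are averaged; the `combine` parameter and the function name are hypothetical.

```python
def inferred_by_top_k(layer_matrices, k, combine=None):
    """For each cell, keep the k largest per-layer values and combine them.

    Assumption: the combination step defaults to the arithmetic mean, which
    the claim itself does not specify.
    """
    if combine is None:
        combine = lambda values: sum(values) / len(values)
    first = layer_matrices[0]
    result = []
    for r in range(len(first)):
        row = []
        for c in range(len(first[r])):
            # Sort this keyword pair's scores across layers, largest first.
            top = sorted((m[r][c] for m in layer_matrices), reverse=True)[:k]
            row.append(combine(top))
        result.append(row)
    return result


layers = [[[0.2]], [[0.8]], [[0.6]]]
averaged = inferred_by_top_k(layers, 2)
peak = inferred_by_top_k(layers, 2, combine=max)
```

With `k` equal to 1 this reduces to the element-wise maximum, so the top-k variant can be seen as a softer version of that rule.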
PCT/CN2020/098948 2019-08-01 2020-06-29 Intelligent question answering method and apparatus, medium and electronic device WO2021017721A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910709165.9 2019-08-01
CN201910709165.9A CN110647614B (en) 2019-08-01 2019-08-01 Intelligent question-answering method, device, medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2021017721A1 true WO2021017721A1 (en) 2021-02-04

Family

ID=68990006

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098948 WO2021017721A1 (en) 2019-08-01 2020-06-29 Intelligent question answering method and apparatus, medium and electronic device

Country Status (2)

Country Link
CN (1) CN110647614B (en)
WO (1) WO2021017721A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966087A (en) * 2021-03-15 2021-06-15 中国美术学院 Intelligent question-answering system and method for inspiration materials
CN113204622A (en) * 2021-05-25 2021-08-03 广州三星通信技术研究有限公司 Electronic device and information processing method thereof
CN113282733A (en) * 2021-06-11 2021-08-20 上海寻梦信息技术有限公司 Customer service problem matching method, system, device and storage medium
CN117708304A (en) * 2024-02-01 2024-03-15 浙江大华技术股份有限公司 Database question-answering method, equipment and storage medium
CN117708304B (en) * 2024-02-01 2024-05-28 浙江大华技术股份有限公司 Database question-answering method, equipment and storage medium

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647614B (en) * 2019-08-01 2023-05-23 平安科技(深圳)有限公司 Intelligent question-answering method, device, medium and electronic equipment
CN110895652A (en) * 2019-09-27 2020-03-20 广州视源电子科技股份有限公司 Comment information processing method, device, system, equipment and storage medium
CN111274371B (en) * 2020-01-14 2023-09-29 东莞证券股份有限公司 Intelligent man-machine conversation method and equipment based on knowledge graph
CN111428002A (en) * 2020-03-23 2020-07-17 南京烽火星空通信发展有限公司 Natural language man-machine interactive intelligent question-answering implementation method
CN111581364B (en) * 2020-05-06 2022-05-03 厦门理工学院 Chinese intelligent question-answer short text similarity calculation method oriented to medical field
CN111782785B (en) * 2020-06-30 2024-04-19 北京百度网讯科技有限公司 Automatic question and answer method, device, equipment and storage medium
CN111898643B (en) * 2020-07-01 2024-02-23 上海依图信息技术有限公司 Semantic matching method and device
CN111831810B (en) * 2020-07-23 2024-02-09 中国平安人寿保险股份有限公司 Intelligent question-answering method, device, equipment and storage medium
CN111858900B (en) * 2020-09-21 2020-12-25 杭州摸象大数据科技有限公司 Method, device, equipment and storage medium for generating question semantic parsing rule template
CN112487165A (en) * 2020-12-02 2021-03-12 税友软件集团股份有限公司 Question and answer method, device and medium based on keywords
CN113157868B (en) * 2021-04-29 2022-11-11 青岛海信网络科技股份有限公司 Method and device for matching answers to questions based on structured database
CN113515605B (en) * 2021-05-20 2023-12-19 中晨田润实业有限公司 Intelligent robot question-answering method based on artificial intelligence and intelligent robot
CN113268563B (en) * 2021-05-24 2022-06-17 平安科技(深圳)有限公司 Semantic recall method, device, equipment and medium based on graph neural network
CN116089589B (en) * 2023-02-10 2023-08-29 阿里巴巴达摩院(杭州)科技有限公司 Question generation method and device
CN116739003A (en) * 2023-06-01 2023-09-12 中国南方电网有限责任公司 Intelligent question-answering implementation method and device for power grid management, electronic equipment and storage medium
CN117708283A (en) * 2023-11-29 2024-03-15 北京中关村科金技术有限公司 Recall content determining method, recall content determining device and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170018620A (en) * 2015-08-10 2017-02-20 삼성전자주식회사 similar meaning detection method and detection device using same
CN107436916A (en) * 2017-06-15 2017-12-05 百度在线网络技术(北京)有限公司 The method and device of intelligent prompt answer
CN107609101A (en) * 2017-09-11 2018-01-19 远光软件股份有限公司 Intelligent interactive method, equipment and storage medium
CN109871437A (en) * 2018-11-30 2019-06-11 阿里巴巴集团控股有限公司 Method and device for the processing of customer problem sentence
CN109918624A (en) * 2019-03-18 2019-06-21 北京搜狗科技发展有限公司 A kind of calculation method and device of web page text similarity
CN110647614A (en) * 2019-08-01 2020-01-03 平安科技(深圳)有限公司 Intelligent question and answer method, device, medium and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10275515B2 (en) * 2017-02-21 2019-04-30 International Business Machines Corporation Question-answer pair generation
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment
CN108595619A (en) * 2018-04-23 2018-09-28 海信集团有限公司 A kind of answering method and equipment
CN109815318A (en) * 2018-12-24 2019-05-28 平安科技(深圳)有限公司 The problems in question answering system answer querying method, system and computer equipment
CN109948143B (en) * 2019-01-25 2023-04-07 网经科技(苏州)有限公司 Answer extraction method of community question-answering system
CN110032632A (en) * 2019-04-04 2019-07-19 平安科技(深圳)有限公司 Intelligent customer service answering method, device and storage medium based on text similarity

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966087A (en) * 2021-03-15 2021-06-15 中国美术学院 Intelligent question-answering system and method for inspiration materials
CN112966087B (en) * 2021-03-15 2023-10-13 中国美术学院 Intelligent question-answering system and method for inspiration materials
CN113204622A (en) * 2021-05-25 2021-08-03 广州三星通信技术研究有限公司 Electronic device and information processing method thereof
CN113282733A (en) * 2021-06-11 2021-08-20 上海寻梦信息技术有限公司 Customer service problem matching method, system, device and storage medium
CN113282733B (en) * 2021-06-11 2024-04-09 上海寻梦信息技术有限公司 Customer service problem matching method, system, equipment and storage medium
CN117708304A (en) * 2024-02-01 2024-03-15 浙江大华技术股份有限公司 Database question-answering method, equipment and storage medium
CN117708304B (en) * 2024-02-01 2024-05-28 浙江大华技术股份有限公司 Database question-answering method, equipment and storage medium

Also Published As

Publication number Publication date
CN110647614A (en) 2020-01-03
CN110647614B (en) 2023-05-23

Similar Documents

Publication Publication Date Title
WO2021017721A1 (en) Intelligent question answering method and apparatus, medium and electronic device
US10586155B2 (en) Clarification of submitted questions in a question and answer system
US10713323B2 (en) Analyzing concepts over time
US9558264B2 (en) Identifying and displaying relationships between candidate answers
US11227118B2 (en) Methods, devices, and systems for constructing intelligent knowledge base
WO2020042925A1 (en) Man-machine conversation method and apparatus, electronic device, and computer readable medium
US9311823B2 (en) Caching natural language questions and results in a question and answer system
CN109670163B (en) Information identification method, information recommendation method, template construction method and computing device
US20150178623A1 (en) Automatically Generating Test/Training Questions and Answers Through Pattern Based Analysis and Natural Language Processing Techniques on the Given Corpus for Quick Domain Adaptation
EP3933657A1 (en) Conference minutes generation method and apparatus, electronic device, and computer-readable storage medium
EP3958145A1 (en) Method and apparatus for semantic retrieval, device and storage medium
US11188819B2 (en) Entity model establishment
CN110162771B (en) Event trigger word recognition method and device and electronic equipment
CN109325108B (en) Query processing method, device, server and storage medium
WO2020232898A1 (en) Text classification method and apparatus, electronic device and computer non-volatile readable storage medium
US11977567B2 (en) Method of retrieving query, electronic device and medium
WO2021047373A1 (en) Big data-based column data processing method, apparatus, and medium
WO2018121198A1 (en) Topic based intelligent electronic file searching
CN110727769B (en) Corpus generation method and device and man-machine interaction processing method and device
WO2023273598A1 (en) Text search method and apparatus, and readable medium and electronic device
KR20220123187A (en) Multi system based intelligent question answering method, apparatus and device
CN114116997A (en) Knowledge question answering method, knowledge question answering device, electronic equipment and storage medium
Kalashnikov et al. A semantics-based approach for speech annotation of images
CN114547233A (en) Data duplicate checking method and device and electronic equipment
CN117591511A (en) Method and device for constructing search database, and search method and device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20848414; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20848414; Country of ref document: EP; Kind code of ref document: A1)