CN106407333B - Spoken language query identification method and device based on artificial intelligence

Info

Publication number: CN106407333B
Application number: CN201610801495.7A
Authority: CN
Inventors: 孙宇; 王硕寰
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2016-09-05
Filing date: 2016-09-05
Publication date: 2020-03-03
Anticipated expiration: 2036-09-05
Also published as: CN106407333A

Abstract

The invention discloses a spoken language query identification method and a spoken language query identification device based on artificial intelligence, wherein the method comprises the following steps: training the convolutional neural network according to the query field labeled by the spoken language retrieval corpus to generate a retrieval field identification model; and training the recurrent neural network to generate a retrieval intention identification model corresponding to the query field according to the query intention and the parameter information which are marked by the spoken retrieval corpus and correspond to the query field. According to the embodiment of the invention, the retrieval field identification model and the retrieval intention identification model with high applicability and high automation are generated through training, so that the intention of the spoken language query of the user and the corresponding parameter information thereof can be accurately acquired, and the efficiency and the accuracy of the spoken language query identification are improved.

Description

Spoken language query identification method and device based on artificial intelligence

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a spoken language query recognition method and device based on artificial intelligence.

Background

Artificial Intelligence, abbreviated AI, is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. It is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems.

With the development of artificial intelligence, natural interaction between computers and people has become a core problem of the field. How to make a machine accurately acquire the intention of a user's spoken query and the corresponding parameter information is an important technical problem.

In the traditional template matching method, a number of cases are summarized manually, their fixed patterns are extracted and stored in a dictionary, and when a spoken query is received it is matched against the existing templates so that the query intention and the corresponding parameter information can be parsed.

However, spoken expressions are complex and varied while templates are limited, so template matching cannot cover every way of expressing a query. Template matching also cannot use cross-domain information: each domain is independent, and templates cannot be migrated between domains. For example, if query templates for ticket booking have been summarized and a taxi-hailing query must then be processed, the ticket-booking templates do not fully apply, so templates for the new category have to be summarized manually. In addition, template matching requires manual summarization, so its degree of automation is low.

Disclosure of Invention

The object of the present invention is to solve at least to some extent one of the above mentioned technical problems.

Therefore, the first objective of the present invention is to provide an artificial intelligence-based spoken language query recognition method, which generates a high-applicability and high-automation retrieval domain recognition model and a retrieval intention recognition model through training, can accurately obtain the intention of the spoken language query of the user and the corresponding parameter information thereof, and improves the efficiency and accuracy of spoken language query recognition.

The second purpose of the invention is to provide a spoken language query recognition device based on artificial intelligence.

To achieve the above object, an artificial intelligence based spoken language query recognition method according to an embodiment of a first aspect of the present invention includes the following steps:

training the convolutional neural network according to the query field labeled by the spoken language retrieval corpus to generate a retrieval field identification model;

and training a recurrent neural network to generate a retrieval intention identification model corresponding to the query field according to the query intention and the parameter information, which are marked by the spoken language retrieval corpus and correspond to the query field.

In the artificial intelligence based spoken language query recognition method of the embodiment of the invention, a convolutional neural network is first trained according to the query domain labeled in the spoken retrieval corpus to generate a retrieval field recognition model, and a recurrent neural network is then trained according to the query intention and parameter information labeled in the spoken retrieval corpus for that query domain to generate the retrieval intention recognition model corresponding to the query domain. The retrieval field recognition model and the retrieval intention recognition model generated through training are thus highly applicable and highly automated, so the intention of the user's spoken query and its corresponding parameter information can be acquired accurately, which improves the efficiency and accuracy of spoken query recognition.

To achieve the above object, an artificial intelligence-based spoken query recognition apparatus according to an embodiment of a second aspect of the present invention includes: a first generation module, configured to train the convolutional neural network according to the query field labeled by the spoken language retrieval corpus to generate a retrieval field identification model;

and a second generation module, configured to train a recurrent neural network to generate a retrieval intention identification model corresponding to the query field according to the query intention and the parameter information which are marked by the spoken language retrieval corpus and correspond to the query field.

In the artificial intelligence based spoken language query recognition device of the embodiment of the invention, a convolutional neural network is first trained according to the query domain labeled in the spoken retrieval corpus to generate a retrieval field recognition model, and a recurrent neural network is then trained according to the query intention and parameter information labeled in the spoken retrieval corpus for that query domain to generate the retrieval intention recognition model corresponding to the query domain. The retrieval field recognition model and the retrieval intention recognition model generated through training are thus highly applicable and highly automated, so the intention of the user's spoken query and its corresponding parameter information can be acquired accurately, which improves the efficiency and accuracy of spoken query recognition.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a flow diagram of an artificial intelligence based spoken query recognition method according to one embodiment of the invention;

FIG. 2 is a flow diagram of an artificial intelligence based spoken query recognition method according to another embodiment of the invention;

FIG. 3 is a schematic diagram of the structure of a retrieval field recognition model according to one embodiment of the present invention;

FIG. 4 is a flow diagram of an artificial intelligence based spoken query recognition method according to yet another embodiment of the invention;

FIG. 5 is a schematic diagram of the structure of a retrieval intent recognition model, according to one embodiment of the present invention;

FIG. 6 is a flow diagram of an artificial intelligence based spoken query identification method according to yet another embodiment of the invention;

FIG. 7 is a schematic diagram of an artificial intelligence based spoken query recognition apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of an artificial intelligence based spoken query recognition apparatus according to another embodiment of the present invention;

FIG. 9 is a schematic diagram of an artificial intelligence based spoken query recognition apparatus according to yet another embodiment of the present invention; and

FIG. 10 is a schematic structural diagram of an artificial intelligence-based spoken query recognition apparatus according to another embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar function. The embodiments described below with reference to the drawings are illustrative, are intended to explain the invention, and are not to be construed as limiting it.

The following describes a spoken language query recognition method and apparatus based on artificial intelligence according to an embodiment of the present invention with reference to the accompanying drawings.

FIG. 1 is a flow diagram of a method for artificial intelligence based spoken query recognition, according to one embodiment of the invention.

As shown in fig. 1, the artificial intelligence based spoken query recognition method of the embodiment of the present invention includes the following steps:

Step 101, training the convolutional neural network according to the query field labeled by the spoken language retrieval corpus to generate a retrieval field identification model.

Generally, in a human-computer conversation, queries are phrased in spoken form so that they match how people naturally express themselves, but spoken expressions are diverse and complex. For example:

the query "help me book a train ticket from Beijing to Tianjin" and the query "I want to take the train from Beijing to Tianjin" both express that the user wants to book a train ticket, with the departure place "Beijing" and the destination "Tianjin". How to make a computer accurately understand the intention of a spoken query and the parameter information that the intention requires is a problem to be solved in human-computer conversation.

Therefore, the embodiment of the invention provides an artificial intelligence-based spoken language query recognition method, which can accurately acquire the intention of spoken language query of a user and the corresponding parameter information thereof, and improves the efficiency and accuracy of spoken language query recognition.

First, a convolutional neural network is trained according to the query domain labeled in the spoken retrieval corpus to generate a retrieval field identification model, which can determine the domain to which an input spoken query belongs.

It should be noted that a convolutional neural network is essentially an input-to-output mapping: it can learn a large number of mapping relationships between inputs and outputs without requiring any precise mathematical expression relating them. Therefore, once the convolutional neural network has been trained according to the query domain labeled in the spoken retrieval corpus, the resulting retrieval field identification model has the ability to map inputs to outputs.

The spoken retrieval corpus may be processed by preset rules to obtain spoken retrieval corpora labeled with a query domain as training samples, so that sample data is obtained automatically. Training samples may also be obtained by manually labeling the query domain. The labeled spoken retrieval corpora are stored in a training sample set.
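As an illustration of labeling by preset rules, the following is a minimal sketch. The patent does not disclose its rules, so the keyword patterns, the domain names, and the helper functions `label_domain` and `build_training_set` are all assumptions made for this example. Queries matched by exactly one rule are labeled automatically; the rest are set aside for manual labeling.

```python
import re

# Hypothetical preset rules: one keyword pattern per query domain.
# These patterns are illustrative only and are not taken from the patent.
DOMAIN_RULES = {
    "traffic": re.compile(r"\b(train|ticket|flight|taxi)\b", re.IGNORECASE),
    "weather": re.compile(r"\b(weather|rain|temperature)\b", re.IGNORECASE),
}


def label_domain(query):
    """Return the domain if exactly one rule matches, otherwise None."""
    matches = [name for name, pattern in DOMAIN_RULES.items() if pattern.search(query)]
    return matches[0] if len(matches) == 1 else None


def build_training_set(corpus):
    """Split a raw corpus into auto-labeled samples and queries needing manual labels."""
    labeled, needs_manual = [], []
    for query in corpus:
        domain = label_domain(query)
        if domain is None:
            needs_manual.append(query)
        else:
            labeled.append((query, domain))
    return labeled, needs_manual


corpus = [
    "help me book a train ticket from Beijing to Tianjin",
    "what is the weather like in Beijing today",
    "play some music",
]
labeled, needs_manual = build_training_set(corpus)
```

Here the first two queries are labeled automatically, while the third matches no rule and is left for manual labeling.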

To make clearer to those skilled in the art how the retrieval field identification model is generated, the process of training the convolutional neural network on spoken retrieval corpora labeled with query domains is as follows:

Step one, the forward propagation stage: a sample is taken from the training sample set and input into the convolutional neural network to compute the corresponding actual output. In this stage, information is transformed layer by layer as it passes from the input layer to the output layer.

Step two, the backward propagation stage: the difference between the actual output and the ideal output for the sample is computed, and the weight matrices are adjusted by back-propagation so as to minimize the error.

Step three: steps one and two are repeated until the objective function J(θ), output after the Softmax classification layer of the convolutional neural network, is less than or equal to 0.0001, yielding the retrieval field identification model.

Here J(θ) denotes the objective function, m the number of samples, θ the training parameters, and x the hidden layer vector.
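The expression for J(θ) is not reproduced in this text. As an assumption consistent with the symbols listed above and with the Softmax classification layer, a standard softmax cross-entropy objective could be written as follows. The symbols y^(i) (the labeled domain of sample i), T (the number of domains), and the indicator function are introduced here for illustration and may differ from the patent's own formula.

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{T}
  \mathbf{1}\{ y^{(i)} = j \}\,
  \log \frac{e^{\theta_j^{\top} x^{(i)}}}{\sum_{k=1}^{T} e^{\theta_k^{\top} x^{(i)}}}
```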

Further, once the retrieval domain identification model is obtained, it can determine the domain to which an input spoken query belongs. For example, inputting query 1 = "I want to take a train from Beijing to Guangzhou" and query 2 = "the latest departure of the train from Beijing to Guangzhou" into the retrieval domain identification model yields the result that both queries belong to the "traffic" domain.
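The three training steps and the subsequent domain lookup can be sketched as follows. To stay short and dependency-free, this sketch trains only a Softmax classification layer over bag-of-words counts by gradient descent; it stands in for the full convolutional network and is not the patent's architecture. The tiny corpus, the learning rate, and the step cap are assumptions; only the stopping threshold of 0.0001 comes from the text.

```python
import math

DOMAINS = ["traffic", "weather"]

# Hypothetical labeled corpus: (query, index into DOMAINS).
TRAINING_CORPUS = [
    ("book a train ticket from beijing to tianjin", 0),
    ("take the train to guangzhou", 0),
    ("what is weather like today", 1),
    ("will it rain tomorrow", 1),
]

VOCAB = {}
for text, _ in TRAINING_CORPUS:
    for word in text.split():
        VOCAB.setdefault(word, len(VOCAB))


def featurize(text):
    """Bag-of-words count vector; words outside the vocabulary are ignored."""
    vector = [0.0] * len(VOCAB)
    for word in text.lower().split():
        if word in VOCAB:
            vector[VOCAB[word]] += 1.0
    return vector


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(theta, x):
    """Forward propagation: one score per domain, normalized by Softmax."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in theta])


def train(samples, labels, lr=0.5, threshold=1e-4, max_steps=3000):
    theta = [[0.0] * len(samples[0]) for _ in DOMAINS]
    loss = float("inf")
    for _ in range(max_steps):
        # Step one: forward propagation and objective value.
        probs = [forward(theta, x) for x in samples]
        loss = -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(samples)
        # Step three: stop once the objective is small enough.
        if loss <= threshold:
            break
        # Step two: propagate the output error back into the weights.
        for c in range(len(DOMAINS)):
            for j in range(len(theta[c])):
                grad = sum(
                    (p[c] - (1.0 if y == c else 0.0)) * x[j]
                    for p, y, x in zip(probs, labels, samples)
                ) / len(samples)
                theta[c][j] -= lr * grad
    return theta, loss


def predict(theta, query):
    probs = forward(theta, featurize(query))
    return DOMAINS[probs.index(max(probs))]


samples = [featurize(text) for text, _ in TRAINING_CORPUS]
labels = [label for _, label in TRAINING_CORPUS]
theta, final_loss = train(samples, labels)
domain_1 = predict(theta, "I want to take a train from Beijing to Guangzhou")
domain_2 = predict(theta, "the latest departure of the train from Beijing to Guangzhou")
```

With this toy corpus both example queries are assigned to "traffic", because every in-vocabulary word they contain appears only in traffic samples. The step cap exists because plain gradient descent may need many more iterations than shown to reach the 0.0001 threshold.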

Step 102, training the recurrent neural network according to the query intention and the parameter information which are marked by the spoken language retrieval corpus and correspond to the query field to generate a retrieval intention identification model corresponding to the query field.

Specifically, the retrieval intention identification model can identify the retrieval intention and the parameter information within a query domain. While learning to label parameter information, the model also takes retrieval intention identification into account; the two kinds of information are fused and complement each other, which improves identification precision. For example:

when the query is "I want to take a train to Guangzhou", identifying "Guangzhou" as place-name parameter information suggests that the query intention is probably traffic-related. Conversely, once the query intention is identified as taking a train, departure-place and destination parameter information must be identified, whereas weather-type parameter information need not be.

To make clearer to those skilled in the art how the retrieval intention recognition model is generated, the process of training the recurrent neural network on spoken retrieval corpora labeled with the query intention and parameter information for a query domain is as follows:

Step one: the segmented words of the spoken retrieval corpus in a training sample form the input layer of the recurrent neural network, and the real number vector corresponding to each word in the input layer is looked up in a predefined word list.

Step two: the real number vectors form the real number vector layer of the recurrent neural network, and a matrix mapping is applied to the real number vectors to obtain the hidden layer of the recurrent neural network.

Step three: conditioned on the real number vector of each word, the probability of each category for that word is calculated and used as the output layer of the recurrent neural network.

The probability of category i is calculated by formula (2), in which θ denotes the training parameters, x the hidden layer vector, T the total number of categories, and k a constant.
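The formula itself is not reproduced in this text. Given that the output layers described here use Softmax classification, one plausible form, stated as an assumption, is the Softmax probability below. In this reconstruction k is treated as the index that runs over the T categories in the denominator, which may not match the role the patent gives to k.

```latex
P(i \mid x) = \frac{e^{\theta_i^{\top} x}}{\sum_{k=1}^{T} e^{\theta_k^{\top} x}}
\qquad (2)
```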

Step four: the recurrent neural network is trained with the spoken retrieval corpora of a plurality of query domains to obtain the retrieval intention recognition model.
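The forward computation described in steps one to three can be sketched as a simple recurrent network. This is a minimal illustration under stated assumptions: the word list, the category names, the vector sizes, the tanh activation, and the randomly initialized (untrained) weights are all hypothetical and are not specified by the patent.

```python
import math
import random

random.seed(0)
EMBED_DIM, HIDDEN_DIM = 4, 5
CATEGORIES = ["origin", "destination", "other"]  # hypothetical per-word categories


def random_vector(size):
    return [random.uniform(-0.5, 0.5) for _ in range(size)]


def random_matrix(rows, cols):
    return [random_vector(cols) for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Step one: a predefined word list mapping each word to a real number vector.
WORD_TABLE = {w: random_vector(EMBED_DIM) for w in ["i", "want", "train", "beijing", "guangzhou"]}
UNKNOWN = [0.0] * EMBED_DIM

W_IN = random_matrix(HIDDEN_DIM, EMBED_DIM)         # input-to-hidden mapping
W_REC = random_matrix(HIDDEN_DIM, HIDDEN_DIM)       # hidden-to-hidden mapping
W_OUT = random_matrix(len(CATEGORIES), HIDDEN_DIM)  # hidden-to-output mapping


def rnn_forward(tokens):
    """Return one probability distribution over CATEGORIES per input word."""
    hidden = [0.0] * HIDDEN_DIM
    outputs = []
    for token in tokens:
        x = WORD_TABLE.get(token, UNKNOWN)
        # Step two: matrix mapping of the word vector, combined with the previous state.
        pre = [a + b for a, b in zip(matvec(W_IN, x), matvec(W_REC, hidden))]
        hidden = [math.tanh(value) for value in pre]
        # Step three: category probabilities for this word.
        outputs.append(softmax(matvec(W_OUT, hidden)))
    return outputs


tokens = ["i", "want", "train", "beijing", "guangzhou"]
distributions = rnn_forward(tokens)
```

Because the weights are untrained, the probabilities carry no meaning yet; step four would adjust the weights on labeled corpora so that each word's distribution favors its labeled category.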

Further, once the retrieval intention recognition model is obtained, it can identify the query intention of a spoken query within the query domain and the corresponding parameter information. Continuing the example above, both queries were found to belong to the "traffic" domain. The retrieval intention recognition model identifies that their query intentions differ: the intention of query 1 is to book a ticket, and the intention of query 2 is a train time query. The parameter information of both query 1 and query 2 is the departure place "Beijing" and the destination "Guangzhou".
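The way the two models cooperate, with the domain result selecting the intention model for that domain, can be sketched as a simple dispatch. The stub models below are hypothetical keyword and regular-expression stand-ins for the trained networks, and the function names and intention labels are assumptions made for this example.

```python
import re


def stub_domain_model(query):
    """Stand-in for the trained retrieval domain identification model."""
    return "traffic" if "train" in query.lower() else "weather"


def stub_traffic_intent_model(query):
    """Stand-in for the intention model of the traffic domain."""
    text = query.lower()
    intent = "train_time_query" if "departure" in text else "book_ticket"
    match = re.search(r"from (\w+) to (\w+)", text)
    params = {"origin": match.group(1), "destination": match.group(2)} if match else {}
    return intent, params


def stub_weather_intent_model(query):
    """Stand-in for the intention model of the weather domain."""
    return "weather_query", {}


# One intention model per query domain, selected by the domain result.
INTENT_MODELS = {
    "traffic": stub_traffic_intent_model,
    "weather": stub_weather_intent_model,
}


def recognize(query):
    domain = stub_domain_model(query)
    intent, params = INTENT_MODELS[domain](query)
    return {"domain": domain, "intent": intent, "params": params}


result_1 = recognize("I want to take a train from Beijing to Guangzhou")
result_2 = recognize("the latest departure of the train from Beijing to Guangzhou")
```

Both results share the domain and the parameter information but differ in intention, mirroring the example in the text.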

In summary, in the artificial intelligence based spoken language query recognition method of the embodiment of the present invention, a convolutional neural network is first trained according to the query domain labeled in the spoken retrieval corpus to generate a retrieval field recognition model, and a recurrent neural network is then trained according to the query intention and parameter information labeled in the spoken retrieval corpus for that query domain to generate the retrieval intention recognition model corresponding to the query domain. The retrieval field recognition model and the retrieval intention recognition model generated through training are thus highly applicable and highly automated, so the intention of the user's spoken query and its corresponding parameter information can be acquired accurately, which improves the efficiency and accuracy of spoken query recognition.

FIG. 2 is a flow diagram of an artificial intelligence based spoken query recognition method according to another embodiment of the invention.

Referring to fig. 2, after step 102, according to the embodiment of fig. 1, the method further includes:

Step 201, performing word segmentation processing on the spoken language retrieval sentence input by the user, performing part-of-speech tagging, and sequentially inputting the word segmentation results into an input layer of the retrieval field identification model in vector form.

Specifically, after the retrieval field identification model and the retrieval intention identification model have been trained, when a spoken retrieval sentence input by a user is obtained, word segmentation and part-of-speech tagging are first performed on it. Part-of-speech tagging is the process of determining whether each word is a noun, a verb, an adjective, or another part of speech. The word segmentation results are then sequentially input into the input layer of the retrieval field identification model in vector form.

To describe more clearly how the query domain of a spoken retrieval sentence is obtained, the process is detailed below in conjunction with FIG. 3:

as shown in FIG. 3, the spoken retrieval sentence input by the user is "What is the weather like in Beijing today". Word segmentation is performed on the sentence and the parts of speech are tagged: "Beijing" is a noun, "today" is a time word, "weather" is a noun, and "what" is an interrogative word. The word segmentation results are then sequentially input into the input layer of the retrieval field recognition model in vector form.
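Word segmentation, part-of-speech tagging, and conversion to vectors can be sketched as follows. The patent does not name a segmentation algorithm, so this sketch swaps in dictionary-based forward maximum matching. The Chinese sentence is an assumed rendering of the example query, and the lexicon, tags, and two-dimensional vectors are hypothetical values chosen for illustration.

```python
# Hypothetical lexicon with part-of-speech tags (not taken from the patent).
LEXICON = {
    "北京": "noun",             # Beijing
    "今天": "time word",        # today
    "天气": "noun",             # weather
    "怎么样": "interrogative",  # what ... like
}

# Hypothetical word vectors; a real system would look them up in a trained word list.
EMBEDDINGS = {
    "北京": [0.1, 0.3],
    "今天": [0.2, -0.1],
    "天气": [0.7, 0.5],
    "怎么样": [-0.4, 0.2],
}


def segment(sentence, lexicon):
    """Forward maximum matching: take the longest lexicon word at each position."""
    max_len = max(len(word) for word in lexicon)
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # Fall back to a single character when no lexicon word matches.
            if piece in lexicon or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


def tag(tokens, lexicon):
    return [(token, lexicon.get(token, "unknown")) for token in tokens]


def to_vectors(tokens, embeddings, dim=2):
    return [embeddings.get(token, [0.0] * dim) for token in tokens]


sentence = "北京今天天气怎么样"  # assumed original of the example query
tokens = segment(sentence, LEXICON)
tagged = tag(tokens, LEXICON)
vectors = to_vectors(tokens, EMBEDDINGS)
```

The resulting vectors, one per word and in sentence order, are what would be fed sequentially into the input layer of the model.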

Step 202, fusing the currently input word with the previously input words through a hidden layer of the retrieval field identification model, and acquiring the real number vector of the spoken retrieval sentence according to the fusion results of all the words.

Step 203, performing probability analysis on the real number vector through an output layer of the retrieval domain identification model to obtain the query domain of the spoken retrieval sentence.

Specifically, the retrieval domain identification model has the ability to map inputs to outputs. Its hidden layer fuses the currently input word with the previously input words, and the real number vector of the spoken retrieval sentence is obtained from the fusion results of all the words. The output layer of the retrieval domain identification model then performs probability analysis on the real number vector, that is, it classifies with Softmax to determine the query domain category corresponding to the real number vector, which finally gives the query domain of the spoken retrieval sentence.

The retrieval domain identification model comprises an input layer, a hidden layer, and an output layer, namely a Softmax classification layer, where Softmax is a function used to classify results.

Continuing the example above, the hidden layer of the retrieval domain identification model fuses each of the input words "Beijing", "today", "weather", and "what" with the words input before it, obtains the real number vector of the spoken retrieval sentence from the fusion results of all the words, and performs probability analysis on the real number vector using formula (2). For example, the probability for "Beijing" is 0.2, for "today" 0.05, for "weather" 0.7, and for "what" 0.05. The spoken retrieval sentence "What is the weather like in Beijing today" is therefore determined to belong to the "weather" domain.

In summary, in the artificial intelligence-based spoken language query recognition method of the embodiment of the present invention, word segmentation and part-of-speech tagging are performed on the spoken retrieval sentence input by the user, and the word segmentation results are sequentially input into the input layer of the retrieval field recognition model in vector form. The hidden layer of the model fuses the currently input word with the previously input words, and the real number vector of the spoken retrieval sentence is obtained from the fusion results of all the words. Finally, the output layer of the model performs probability analysis on the real number vector to obtain the query domain of the spoken retrieval sentence. A highly applicable and highly automated retrieval field recognition model is thus generated through training, so the query domain of the user's spoken query can be acquired accurately, which improves the efficiency and accuracy of spoken query recognition.

FIG. 4 is a flow diagram of an artificial intelligence based spoken query recognition method according to yet another embodiment of the invention.

Referring to fig. 4, after step 203, according to the embodiment of fig. 2, the method further includes:

Step 401, the word segmentation results are sequentially input into the retrieval intention recognition model corresponding to the query field in a vector form.

Specifically, after the query domain is obtained, the word segmentation results obtained in the above process are sequentially input, in vector form, into the input layer of the retrieval intention recognition model corresponding to that query domain.

To describe more clearly how the query intention of a spoken retrieval sentence and the parameter information corresponding to the query intention are obtained, the process is detailed below in conjunction with FIG. 5:

as shown in FIG. 5, the word segmentation results of the spoken retrieval sentence "What is the weather like in Beijing today" are sequentially input, in vector form, into the input layer of the retrieval intention recognition model.

Step 402, fusing the currently input word with the previously input words through a hidden layer of the retrieval intention recognition model, and acquiring the real number vector of the spoken retrieval sentence according to the fusion results of all the words.

Step 403, performing probability calculation on the real number vector through an intention classification output layer of the retrieval intention identification model to obtain the query intention of the spoken retrieval sentence.

Specifically, the retrieval intention identification model has the ability to map inputs to outputs. Its hidden layer fuses the currently input word with the previously input words, and the real number vector of the spoken retrieval sentence is obtained from the fusion results of all the words. The output layer of the retrieval intention identification model then performs probability analysis on the real number vector, that is, it classifies with Softmax to determine the query intention category corresponding to the real number vector, which finally gives the query intention of the spoken retrieval sentence.

Continuing the example above, the hidden layer of the retrieval intention recognition model fuses each of the input words "Beijing", "today", "weather", and "what" with the words input before it, obtains the real number vector of the spoken retrieval sentence from the fusion results of all the words, and performs probability analysis on the real number vector using formula (2). For example, the probability for "Beijing" is 0.1, for "today" 0.1, for "weather" 0.3, and for "what" 0.5. The query intention of the spoken retrieval sentence "What is the weather like in Beijing today" is therefore determined to be asking what the weather is like.

Step 404, performing probability calculation on the real number vector through a serialized classification output layer of the retrieval intention identification model to acquire the parameter information corresponding to the query intention.

Continuing the example above, probability analysis is performed on the real number vector using formula (2). For example, the probability for "Beijing" is 0.1, for "today" 0.1, for "weather" 0.2, and for "what" 0.6. The parameter information corresponding to the query intention of the spoken retrieval sentence "What is the weather like in Beijing today" is thereby determined to be "Beijing" and "today".
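The two output layers of steps 403 and 404 can be sketched as two Softmax heads over shared hidden states: an intention head that classifies the whole sentence, and a serialized head that assigns a label to every word. In this sketch the intention head reads the last hidden state, which is one common choice and an assumption here. The hidden states, weights, and label names are hand-picked hypothetical values, not outputs of a trained model.

```python
import math

INTENT_LABELS = ["weather_query", "book_ticket"]  # hypothetical intention labels
SLOT_LABELS = ["location", "time", "other"]       # hypothetical parameter labels

# Hypothetical hidden states, one per word: Beijing, today, weather, what.
HIDDEN_STATES = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [-1.0, -1.0]]

# Hypothetical output-layer weights, one row per label.
INTENT_WEIGHTS = [[-1.0, -1.0], [1.0, 1.0]]
SLOT_WEIGHTS = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(weights, state, labels):
    """Apply one Softmax output layer and return the most probable label."""
    probs = softmax([sum(w * h for w, h in zip(row, state)) for row in weights])
    return labels[probs.index(max(probs))]


def joint_outputs(hidden_states):
    # Intention classification output layer: one label for the whole sentence.
    intent = classify(INTENT_WEIGHTS, hidden_states[-1], INTENT_LABELS)
    # Serialized classification output layer: one label per word.
    slots = [classify(SLOT_WEIGHTS, state, SLOT_LABELS) for state in hidden_states]
    return intent, slots


intent, slots = joint_outputs(HIDDEN_STATES)
```

With these values the sentence is classified as a weather query, and the first two words are tagged as location and time parameters, matching the outcome described in the example.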

In summary, in the artificial intelligence based spoken language query recognition method of the embodiment of the present invention, the word segmentation results are sequentially input in vector form into the retrieval intention recognition model corresponding to the query domain. The hidden layer of the model fuses the currently input word with the previously input words, and the real number vector of the spoken retrieval sentence is obtained from the fusion results of all the words. Finally, the intention classification output layer of the model performs probability calculation on the real number vector to obtain the query intention of the spoken retrieval sentence, and the serialized classification output layer performs probability calculation on the real number vector to obtain the parameter information corresponding to the query intention. A highly applicable and highly automated retrieval intention recognition model is thus generated through training, so the query intention of the user's spoken query and the corresponding parameter information can be acquired accurately, which improves the efficiency and accuracy of spoken query recognition.

FIG. 6 is a flow diagram of an artificial intelligence based spoken query identification method according to yet another embodiment of the invention.

Referring to FIG. 6, on the basis of the above embodiments, in order to further improve the efficiency and accuracy of spoken query recognition, the confidence of the query domain and the confidence of the query intention and parameter information are detected, and the retrieval domain recognition model and the retrieval intention recognition model are retrained when the confidence is lower than a preset threshold. The process is described in detail as follows:

Step 601, spoken retrieval sentences for which the confidence of the query domain labeled by the retrieval domain identification model is lower than a preset threshold are manually re-labeled and used as spoken retrieval corpora for retraining.

Step 602, spoken retrieval sentences for which the confidence of the query intention and parameter information labeled by the retrieval intention recognition model is lower than a preset threshold are manually re-labeled and used as spoken retrieval corpora for retraining.

Specifically, the confidence of the query domain labeled by the retrieval domain identification model indicates how accurate that label is. To improve the efficiency and accuracy of spoken query recognition, a preset threshold is set to ensure that the query domain labeled by the retrieval domain identification model has a certain accuracy; similarly, a preset threshold is set to ensure that the query intention and parameter information labeled by the retrieval intention recognition model have a certain accuracy.

Further, when the confidence of the query domain labeled by the retrieval domain identification model is lower than the preset threshold, the corresponding spoken retrieval sentence is manually re-labeled and used as spoken retrieval corpus for retraining, which yields a new retrieval domain identification model.

Further, when the confidence of the query intention and parameter information labeled by the retrieval intention recognition model is lower than the preset threshold, the corresponding spoken retrieval sentence is manually re-labeled and used as spoken retrieval corpus for retraining, which yields a new retrieval intention recognition model.

It should be noted that when the confidence of the query domain labeled by the retrieval domain identification model is not lower than the preset threshold, the corresponding spoken retrieval sentence is neither manually re-labeled nor used as spoken retrieval corpus for retraining.

It should also be noted that when the confidence of the query intention and parameter information labeled by the retrieval intention recognition model is not lower than the preset threshold, the corresponding spoken retrieval sentence is neither manually re-labeled nor used as spoken retrieval corpus for retraining.
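The threshold rule described above can be illustrated with a minimal Python sketch. It only shows how sentences might be routed to manual re-labeling. The `Prediction` record, the `split_by_confidence` helper, the example sentences and the threshold value of 0.5 are all hypothetical and do not come from the embodiment.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    sentence: str      # spoken retrieval sentence
    label: str         # label assigned by a model (here, a query domain)
    confidence: float  # model confidence in that label, in [0, 1]


def split_by_confidence(
    predictions: List[Prediction], threshold: float
) -> Tuple[List[Prediction], List[Prediction]]:
    """Return (to_relabel, accepted).

    Sentences whose confidence is strictly lower than the threshold are
    queued for manual re-labeling and later retraining; all others are kept
    as they are and are not fed back into training.
    """
    to_relabel = [p for p in predictions if p.confidence < threshold]
    accepted = [p for p in predictions if p.confidence >= threshold]
    return to_relabel, accepted


# Hypothetical model outputs, for illustration only.
predictions = [
    Prediction("play a song by Jay Chou", "music", 0.93),
    Prediction("book it for tomorrow", "travel", 0.41),
    Prediction("what is the weather in Beijing", "weather", 0.50),
]
to_relabel, accepted = split_by_confidence(predictions, threshold=0.5)
```

With these example values only "book it for tomorrow" is queued for re-labeling; the sentence whose confidence equals the threshold is accepted, matching the "not lower than" condition above. The same helper could be applied separately to the outputs of the retrieval intention recognition model.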

In summary, the artificial intelligence based spoken query recognition method according to the embodiment of the present invention further manually re-labels spoken retrieval sentences for which the confidence of the query domain labeled by the retrieval domain identification model is lower than the preset threshold and retrains with them as spoken retrieval corpus, and likewise manually re-labels spoken retrieval sentences for which the confidence of the query intention and parameter information labeled by the retrieval intention recognition model is lower than the preset threshold and retrains with them as spoken retrieval corpus. The efficiency and accuracy of spoken query recognition are thereby further improved.

In order to implement the above embodiment, the present invention further provides a spoken language query recognition apparatus based on artificial intelligence.

FIG. 7 is a schematic structural diagram of an artificial intelligence based spoken query recognition apparatus according to an embodiment of the present invention.

As shown in fig. 7, the artificial intelligence based spoken query recognition apparatus includes: a first generating module 710 and a second generating module 720.

The first generating module 710 is configured to train the convolutional neural network according to the query domain labeled by the spoken language retrieval corpus to generate a retrieval domain identification model.

The second generating module 720 is configured to train the recurrent neural network to generate a retrieval intention recognition model corresponding to the query field according to the query intention and the parameter information, which are labeled by the spoken language retrieval corpus and correspond to the query field.

Specifically, the retrieval intention recognition model identifies both the retrieval intention and the parameter information within the query domain. While learning to label parameter information, the model simultaneously takes retrieval intention recognition into account; the two kinds of information fuse with and complement each other, which improves recognition precision.
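One common way to let the two tasks inform each other is to train a shared network on a single combined objective. The embodiment does not state a formula, so the following is only an illustrative assumption; the symbols $x_{1:T}$ (the segmented words of a sentence), $y^{\mathrm{intent}}$ (the labeled query intention), $y^{\mathrm{param}}_t$ (the parameter label of the $t$-th word), $\theta$ (the shared model parameters) and the weighting factor $\lambda$ are all introduced here for illustration.

```latex
\mathcal{L}(\theta)
  = -\log p_\theta\!\left(y^{\mathrm{intent}} \mid x_{1:T}\right)
    \;-\; \lambda \sum_{t=1}^{T} \log p_\theta\!\left(y^{\mathrm{param}}_t \mid x_{1:T}\right)
```

Under this assumed objective, gradients from both the intention term and the parameter-labeling term update the same hidden layer, which is one way the two kinds of information could complement each other.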

Further, after the retrieval intention recognition model is obtained, the query intention and the corresponding parameter information of the spoken query in the query field can be recognized.

It should be noted that the foregoing explanation of the embodiment of the artificial intelligence based spoken language query recognition method is also applicable to the artificial intelligence based spoken language query recognition apparatus of the embodiment, and the implementation principle thereof is similar, and is not repeated here.

In summary, the artificial intelligence based spoken query recognition apparatus according to the embodiment of the present invention trains the convolutional neural network according to the query domain labeled in the spoken retrieval corpus to generate the retrieval domain identification model, and then trains the recurrent neural network according to the query intention and parameter information that are labeled in the spoken retrieval corpus and correspond to the query domain, to generate the retrieval intention recognition model corresponding to that query domain. Because the retrieval domain identification model and the retrieval intention recognition model generated through training are highly applicable and highly automated, the intention of the user's spoken query and its corresponding parameter information can be acquired accurately, improving the efficiency and accuracy of spoken query recognition.

Fig. 8 is a schematic structural diagram of an artificial intelligence based spoken query recognition apparatus according to another embodiment of the present invention.

As shown in fig. 8, on the basis of that shown in fig. 7, the apparatus further includes: a processing module 730, a first obtaining module 740, and a second obtaining module 750.

The processing module 730 is configured to perform word segmentation and part-of-speech tagging on the spoken retrieval sentence input by the user, and to sequentially input the word segmentation results, in vector form, into the input layer of the retrieval domain identification model.

The first obtaining module 740 is configured to fuse the currently input word with the historically input words through a hidden layer of the retrieval domain identification model, and to obtain a real number vector of the spoken retrieval sentence from the fusion result of all the words.

The second obtaining module 750 is configured to perform probability analysis on the real number vector through an output layer of the retrieval domain identification model, so as to obtain the query domain of the spoken retrieval sentence.
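The data flow through modules 730 to 750 can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the trained model of the embodiment: the vocabulary, the domain names, the layer sizes, the randomly initialized weights and the simple recurrent update used to "fuse" the current word with earlier words are all hypothetical, and part-of-speech features are omitted.

```python
import math
import random

random.seed(0)

# Hypothetical vocabulary and query domains, for illustration only.
VOCAB = {"<unk>": 0, "play": 1, "a": 2, "song": 3, "weather": 4, "today": 5}
DOMAINS = ["music", "weather", "travel"]
EMBED_DIM, HIDDEN_DIM = 4, 6


def random_matrix(rows, cols):
    """Untrained weights; a real model would learn these from corpus."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


EMBEDDINGS = random_matrix(len(VOCAB), EMBED_DIM)  # input layer: word -> vector
W_IN = random_matrix(HIDDEN_DIM, EMBED_DIM)        # current word -> hidden
W_REC = random_matrix(HIDDEN_DIM, HIDDEN_DIM)      # previous hidden -> hidden
W_OUT = random_matrix(len(DOMAINS), HIDDEN_DIM)    # hidden -> domain scores


def sentence_vector(words):
    """Fuse each word with the words seen so far; return the final state."""
    hidden = [0.0] * HIDDEN_DIM
    for word in words:
        x = EMBEDDINGS[VOCAB.get(word, VOCAB["<unk>"])]
        current = matvec(W_IN, x)
        history = matvec(W_REC, hidden)
        hidden = [math.tanh(c + h) for c, h in zip(current, history)]
    return hidden


def domain_probabilities(words):
    """Output layer: probability of each query domain for the sentence."""
    return softmax(matvec(W_OUT, sentence_vector(words)))


probs = domain_probabilities(["play", "a", "song"])
best = max(range(len(DOMAINS)), key=lambda i: probs[i])
predicted_domain, confidence = DOMAINS[best], probs[best]
```

Because the weights are random, the predicted domain here is meaningless; the sketch only shows the shape of the computation. The highest probability is one plausible source for the confidence value that is compared against the preset threshold.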

In summary, the artificial intelligence based spoken query recognition apparatus according to the embodiment of the present invention performs word segmentation and part-of-speech tagging on a spoken retrieval sentence input by a user and sequentially inputs the word segmentation results, in vector form, into the input layer of the retrieval domain identification model. It then fuses the currently input word with the historically input words through the hidden layer of the retrieval domain identification model and obtains a real number vector of the spoken retrieval sentence from the fusion result of all the words. Finally, it performs probability analysis on the real number vector through the output layer of the retrieval domain identification model to obtain the query domain of the spoken retrieval sentence. Because the retrieval domain identification model generated through training is highly applicable and highly automated, the query domain of the user's spoken query can be acquired accurately, improving the efficiency and accuracy of spoken query recognition.

Fig. 9 is a schematic structural diagram of an artificial intelligence-based spoken query recognition apparatus according to still another embodiment of the present invention.

As shown in fig. 9, the apparatus further includes, on the basis of those shown in fig. 7 to 8: an input module 760, a third acquisition module 770, a fourth acquisition module 780, and a fifth acquisition module 790.

The input module 760 is configured to sequentially input the segmentation result into the retrieval intention recognition model corresponding to the query field in a vector form.

The third obtaining module 770 is configured to fuse the currently input word with the historically input words through a hidden layer of the retrieval intention recognition model, and to obtain a real number vector of the spoken retrieval sentence from the fusion result of all the words.

The fourth obtaining module 780 is configured to perform probability calculation on the real number vector through an intention classification output layer of the retrieval intention recognition model, so as to obtain the query intention of the spoken retrieval sentence.

The fifth obtaining module 790 is configured to perform probability calculation on the real number vector through a serialized classification output layer of the retrieval intention recognition model, and obtain parameter information corresponding to the query intention.
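The two output layers used by modules 780 and 790 can be sketched as two heads on one shared hidden layer. The following Python sketch is an assumption-laden illustration rather than the embodiment's implementation: the intention names, parameter tags, layer sizes and random weights are hypothetical, the intention head is applied to the final hidden state, and the serialized classification head is applied to the hidden state of each word.

```python
import math
import random

random.seed(1)

# Hypothetical label sets for one query domain (music), for illustration only.
INTENTS = ["play_music", "search_lyrics"]
PARAM_TAGS = ["O", "singer", "song_title"]  # "O" marks a word that is no parameter
EMBED_DIM, HIDDEN_DIM = 4, 6


def random_matrix(rows, cols):
    """Untrained weights; a real model would learn these from corpus."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


W_IN = random_matrix(HIDDEN_DIM, EMBED_DIM)
W_REC = random_matrix(HIDDEN_DIM, HIDDEN_DIM)
W_INTENT = random_matrix(len(INTENTS), HIDDEN_DIM)   # intention classification head
W_PARAM = random_matrix(len(PARAM_TAGS), HIDDEN_DIM)  # serialized classification head


def hidden_states(word_vectors):
    """Shared hidden layer: one state per word, each fusing earlier words."""
    states, hidden = [], [0.0] * HIDDEN_DIM
    for x in word_vectors:
        current, history = matvec(W_IN, x), matvec(W_REC, hidden)
        hidden = [math.tanh(c + h) for c, h in zip(current, history)]
        states.append(hidden)
    return states


def recognize(word_vectors):
    """Return (intent probabilities, per-word parameter tag probabilities)."""
    states = hidden_states(word_vectors)
    intent_probs = softmax(matvec(W_INTENT, states[-1]))
    param_probs = [softmax(matvec(W_PARAM, h)) for h in states]
    return intent_probs, param_probs


# Three hypothetical word vectors standing in for a segmented sentence.
word_vectors = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(3)]
intent_probs, param_probs = recognize(word_vectors)
intent = INTENTS[intent_probs.index(max(intent_probs))]
tags = [PARAM_TAGS[p.index(max(p))] for p in param_probs]
```

Sharing `hidden_states` between the two heads is the design choice that lets intention recognition and parameter labeling draw on the same sentence representation. The function assumes a non-empty sentence.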

In summary, the artificial intelligence based spoken query recognition apparatus according to the embodiment of the present invention sequentially inputs the word segmentation results, in vector form, into the retrieval intention recognition model corresponding to the query domain. It then fuses the currently input word with the historically input words through the hidden layer of the retrieval intention recognition model and obtains a real number vector of the spoken retrieval sentence from the fusion result of all the words. Finally, it performs probability calculation on the real number vector through the intention classification output layer of the retrieval intention recognition model to obtain the query intention of the spoken retrieval sentence, and performs probability calculation on the real number vector through the serialized classification output layer of the retrieval intention recognition model to obtain the parameter information corresponding to the query intention. Because the retrieval intention recognition model generated through training is highly applicable and highly automated, the query intention of the user's spoken query and the corresponding parameter information can be acquired accurately, improving the efficiency and accuracy of spoken query recognition.

As shown in fig. 10, the apparatus further includes, on the basis of those shown in fig. 7 to 9: a first training module 7100 and a second training module 7110.

The first training module 7100 is configured to manually re-label spoken retrieval sentences for which the confidence of the query domain labeled by the retrieval domain identification model is lower than a preset threshold, and to retrain with them as spoken retrieval corpus.

The second training module 7110 is configured to manually re-label spoken retrieval sentences for which the confidence of the query intention and parameter information labeled by the retrieval intention recognition model is lower than a preset threshold, and to retrain with them as spoken retrieval corpus.

In summary, the artificial intelligence based spoken query recognition apparatus according to the embodiment of the present invention further manually re-labels spoken retrieval sentences for which the confidence of the query domain labeled by the retrieval domain identification model is lower than the preset threshold and retrains with them as spoken retrieval corpus, and likewise manually re-labels spoken retrieval sentences for which the confidence of the query intention and parameter information labeled by the retrieval intention recognition model is lower than the preset threshold and retrains with them as spoken retrieval corpus. The efficiency and accuracy of spoken query recognition are thereby further improved.

In the description of the present invention, it is to be understood that the terms "first", "second" and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.

Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. A spoken language query recognition method based on artificial intelligence is characterized by comprising the following steps:

S1, inputting a training sample into a convolutional neural network to calculate the corresponding actual output, wherein the training sample is spoken language retrieval corpus labeled with a retrieval field;

S2, calculating the difference between the actual output and the ideal output corresponding to the sample, and back-propagating to adjust the weight matrix so as to minimize the error;

S3, repeating the operations of S1 and S2 until the objective function output after the classification layer of the convolutional neural network is smaller than a preset threshold value, thereby generating a retrieval domain identification model;

S4, generating an input layer of a recurrent neural network from the segmented words of the training sample, and looking up, in a predefined vocabulary, the real number vector corresponding to each segmented word in the input layer;

S5, generating a real number vector layer of the recurrent neural network from the real number vector corresponding to each segmented word, and performing matrix mapping on the real number vector corresponding to each segmented word to obtain a hidden layer of the recurrent neural network;

S6, taking the real number vector of each segmented word as a condition, calculating the probability of the category corresponding to each segmented word under that condition, and taking the probability as the output layer of the recurrent neural network;

and S7, training the recurrent neural network by using the training samples of a plurality of query fields to obtain a retrieval intention recognition model.

2. The method of claim 1, further comprising:

carrying out word segmentation processing on spoken retrieval sentences input by a user, carrying out part-of-speech tagging, and sequentially inputting word segmentation results into an input layer of the retrieval field identification model in a vector form;

fusing the currently input words and the historically input words through a hidden layer of the retrieval field identification model, and acquiring real number vectors of the spoken retrieval sentences according to the fusion results of all the words;

and performing probability analysis on the real number vector through an output layer of the retrieval field identification model to obtain the query field of the spoken language detection statement.

3. The method of claim 2, after said obtaining a query domain for the spoken language detection statement, further comprising:

sequentially inputting the word segmentation results into a retrieval intention recognition model corresponding to the query field in a vector form;

fusing the currently input words and the historically input words through a hidden layer of the retrieval intention recognition model, and acquiring real number vectors of the spoken retrieval sentences according to the fusion results of all the words;

performing probability calculation on the real number vector through an intention classification output layer of the retrieval intention recognition model to obtain the query intention of the spoken language detection statement;

and performing probability calculation on the real number vector through a serialization classification output layer of the retrieval intention identification model to obtain parameter information corresponding to the query intention.

4. The method of any of claims 1-3, further comprising:

manually re-labeling the spoken language retrieval sentences for which the confidence of the query field labeled by the retrieval field identification model is lower than a preset threshold value, and retraining with them as spoken language retrieval corpora.

5. The method of any of claims 1-3, further comprising:

manually re-labeling the spoken language retrieval sentences for which the confidence of the parameter information labeled by the retrieval intention identification model is lower than the preset threshold value, and retraining with them as spoken language retrieval corpora.

6. An artificial intelligence based spoken language query recognition device, comprising:

a first generation module, configured to: S1, input a training sample into a convolutional neural network to calculate the corresponding actual output, wherein the training sample is spoken language retrieval corpus labeled with a retrieval field; S2, calculate the difference between the actual output and the ideal output corresponding to the sample, and back-propagate to adjust the weight matrix so as to minimize the error; S3, repeat the operations of S1 and S2 until the objective function output after the classification layer of the convolutional neural network is smaller than a preset threshold value, thereby generating a retrieval domain identification model;

a second generating module, configured to: S4, generate an input layer of a recurrent neural network from the segmented words of the training sample, and look up, in a predefined vocabulary, the real number vector corresponding to each segmented word in the input layer; S5, generate a real number vector layer of the recurrent neural network from the real number vector corresponding to each segmented word, and perform matrix mapping on the real number vector corresponding to each segmented word to obtain a hidden layer of the recurrent neural network; S6, taking the real number vector of each segmented word as a condition, calculate the probability of the category corresponding to each segmented word under that condition, and take the probability as the output layer of the recurrent neural network; and S7, train the recurrent neural network by using the training samples of a plurality of query fields to obtain a retrieval intention recognition model.

7. The apparatus of claim 6, further comprising:

the processing module is used for performing word segmentation processing on spoken retrieval sentences input by a user, performing part-of-speech tagging and sequentially inputting word segmentation results into an input layer of the retrieval field identification model in a vector form;

the first acquisition module is used for fusing the currently input words and the historically input words through a hidden layer of the retrieval field identification model and acquiring real number vectors of the spoken retrieval sentences according to the fusion results of all the words;

and the second acquisition module is used for carrying out probability analysis on the real number vector through an output layer of the retrieval field identification model to acquire the query field of the spoken language detection statement.

8. The apparatus of claim 7, further comprising:

the input module is used for sequentially inputting the word segmentation results into a retrieval intention recognition model corresponding to the query field in a vector form;

a third obtaining module, configured to fuse currently input words and historically input words through a hidden layer of the retrieval intention recognition model, and obtain real vectors of the spoken retrieval statement according to a fusion result of all the words;

a fourth obtaining module, configured to perform probability calculation on the real number vector through an intention classification output layer of the retrieval intention recognition model, and obtain a query intention of the spoken language detection statement;

and the fifth obtaining module is used for carrying out probability calculation on the real number vector through a serialization classification output layer of the retrieval intention identification model to obtain parameter information corresponding to the query intention.

9. The apparatus of any of claims 6-8, further comprising:

the first training module is used for manually re-labeling the spoken language retrieval sentences for which the confidence of the query field labeled by the retrieval field identification model is lower than a preset threshold value, and retraining with them as spoken language retrieval corpora.

10. The apparatus of any of claims 6-8, further comprising:

the second training module is used for manually re-labeling the spoken language retrieval sentences for which the confidence of the parameter information labeled by the retrieval intention recognition model is lower than the preset threshold value, and retraining with them as spoken language retrieval corpora.