CN109241265B

CN109241265B - Multi-round query-oriented field identification method and system

Info

Publication number: CN109241265B
Application number: CN201811082183.0A
Authority: CN
Inventors: 许洛; 谭斌; 孙锐; 展华益; 王欣; 杨兰; 饶璐
Original assignee: Sichuan Changhong Electric Co Ltd
Current assignee: Sichuan Changhong Electric Co Ltd
Priority date: 2018-09-17
Filing date: 2018-09-17
Publication date: 2022-06-03
Anticipated expiration: 2038-09-17
Also published as: CN109241265A

Abstract

The invention discloses a multi-round query-oriented field identification method, which comprises the following steps: acquiring the current query text and the above query text of a user; performing character segmentation or word segmentation on the current and above query texts to obtain character strings or word strings; obtaining the vector representations corresponding to the character strings or word strings; inputting the vector representations corresponding to the current and above query texts into a deep neural network; combining the information contained in the current and above queries within the neural network model in one of several ways; and finally determining the field to which the current query belongs. The method not only realizes field identification for multi-round queries, but also makes effective use of the above information through different ways of combining the current query with the above query.

Description

Multi-round query-oriented field identification method and system

Technical Field

The invention relates to the field of natural language processing and the technical field of deep learning, in particular to a multi-round query-oriented field identification method and a multi-round query-oriented field identification system.

Background

With the rapid development of technologies such as speech recognition and natural language processing, human-computer interaction and field intention classification are becoming increasingly common. In the interaction process, understanding the semantics of the user's input is both a key point and a technical difficulty. Field intention recognition of a user's query text determines the direction of the user's need from the natural language query (for example, a speech recognition result) entered by the user. It helps the platform better understand the semantics of the current interaction and is very important for dialogue products and interactive products.

Traditional methods generally perform field classification by extracting features from the query text and feeding them into a trained machine learning model; the machine learning methods used include support vector machines, naive Bayes, and the like. For single-round queries, the field identification accuracy of these models rarely matches that of a deep learning model; for multi-round queries, their field identification performance is poor; and some of these methods require a dedicated feature construction method.

Disclosure of Invention

The invention aims to overcome the defects in the background art, and provides a field identification method and system for multi-round query, which allow a user to directly input a natural language text and quickly and effectively obtain the field to which the query text belongs.

In order to achieve the technical effects, the invention adopts the following technical scheme:

a multi-round query-oriented field identification method comprises the following steps:

A. acquiring a current query text to be identified and judging whether an above query text exists; if so, acquiring the above query text; otherwise, using the current query text as the above query text as well;

the above query text can be a single round of query data or the historical data of multiple rounds of queries; if it is the historical data of multiple rounds of queries, the multiple pieces of data are connected to serve as one above query statement;

B. performing character segmentation or word segmentation on the current query text and the above query text to obtain a character string or word string of the current query text and a character string or word string of the above query text;

segmenting the text into word strings requires a Chinese word segmentation algorithm, which can be any conventional word segmentation method;

C. performing same-dimension vector representation on the character string or word string of the current query text and the character string or word string of the above query text to respectively obtain the vector representation corresponding to the current query text and the vector representation corresponding to the above query text; wherein the vector representation is a 0-1 representation;

D. taking the vector representation corresponding to the current query text and the vector representation corresponding to the above query text as input information of the neural network model;

E. the neural network model obtains the field to which the current query text belongs from the combined information after the input information is combined.

Further, the combination modes of the input information combination in step E include: vector concatenation, weighted addition, and taking the maximum value at corresponding positions of the vectors.

Further, the input information combination in step E is specifically a combination of the vector representation corresponding to the current query text and the vector representation corresponding to the above query text, or a combination of the hidden layer vectors corresponding to the current query text and the above query text in the neural network.

Further, when the input information combination in step E is a combination of the vector representations corresponding to the current query text and the above query text, step E specifically includes the following steps:

s1.1, combining the vector representation corresponding to the current query text and the vector representation corresponding to the above query text into a new vector;

s1.2, sending the new vector obtained by combination to the input layer of the neural network as the input of the neural network, and performing the operations of the neural network;

s1.3, obtaining a Top-k result of the field to which the current query text belongs from the output result of the neural network, and selecting the field with the highest probability as the field to which the current query text belongs.

Further, the neural network model adopts a convolutional neural network, a unidirectional recurrent neural network or a bidirectional recurrent neural network.

Further, when the neural network model adopts a convolutional neural network and the input information combination in step E is a combination of the hidden layer vectors corresponding to the current query text and the above query text in the neural network, step E specifically includes the following steps:

s2.1, sending the vector representation corresponding to the current query text and the vector representation corresponding to the above query text, each as a separate input, to the input layer of the convolutional neural network;

s2.2, in the convolutional neural network, respectively subjecting the vector representation corresponding to the current query text and the vector representation corresponding to the above query text to a plurality of neural network layer calculations, including the convolution and pooling operations of the convolutional neural network and the hidden layer calculation of the recurrent neural network, to respectively obtain a fully connected layer vector corresponding to the current query text and a fully connected layer vector corresponding to the above query text;

s2.3, combining the fully connected layer vector corresponding to the current query text and the fully connected layer vector corresponding to the above query text in the fully connected layer to obtain a new vector;

s2.4, applying a dropout operation to the new vector and then sending it into an activation function;

s2.5, inputting the output of the activation function into a fully connected layer to adjust the vector dimension, and then sending the result into a softmax classifier for field classification;

s2.6, obtaining a Top-k result of the field to which the current query text belongs from the output result of the convolutional neural network, and selecting the field with the highest probability as the field to which the current query text belongs.

Meanwhile, the invention also discloses a multi-round query-oriented field identification system, which comprises:

the acquisition module is used for acquiring the current query text to be identified and extracting the above query text when an above query text exists for the current query text;

the vector representation module is used for performing character segmentation or word segmentation on the current query text and the above query text to obtain character strings or word strings, and performing corresponding same-dimension vector representation on the obtained character strings or word strings to respectively obtain the vector representation corresponding to the current query text and the vector representation corresponding to the above query text;

the domain identification module is used for taking the vector representation corresponding to the current query text and the vector representation corresponding to the above query text as input information of the neural network information combination model and determining the domain of the current query text from the output result of the neural network information combination model;

the neural network information combination model is embedded in the field identification module and used for processing the input information, obtaining the field to which the current query text belongs from the combined information after the input information is combined, and outputting the field, wherein the input information combination is specifically the combination of the vector representation corresponding to the current query text and the vector representation corresponding to the above query text, or the combination of the hidden layer vectors corresponding to the current query text and the above query text in the neural network.

The input end of the vector representation module is connected with the acquisition module, and the output end of the vector representation module is connected with the field identification module.

Further, the vector representation module comprises a dictionary construction unit and a vector representation unit;

the dictionary construction unit is used for constructing a dictionary from the common characters or words appearing in the training corpus, and the vector representation unit is used for carrying out vector representation on the current query text and the above query text according to the dictionary to obtain the vector representation corresponding to the current query text and the vector representation corresponding to the above query text.

Further, when the input information combination is a combination of the hidden layer vectors corresponding to the current query text and the above query text in the neural network, the domain identification module comprises a network input unit, an information combination unit, an information activation unit, a domain classification unit and a domain determination unit;

the network input unit is used for respectively inputting the vector representation corresponding to the current query text and the vector representation corresponding to the above query text into the neural network, where each undergoes a plurality of convolution and pooling operations of the neural network;

the information combination unit is used for respectively obtaining, in the fully connected layer of the neural network, the processed fully connected layer vector corresponding to the current query text and the processed fully connected layer vector corresponding to the above query text, and combining the two fully connected layer vectors in the fully connected layer to obtain a new vector;

the information activation unit is used for activating the new vector, the field classification unit is used for sending the activated new vector into the fully connected layer to perform vector dimension adjustment and then performing field classification, and the field determination unit is used for obtaining a Top-k result of the field to which the current query text belongs from the output result of the neural network and selecting the field with the maximum probability as the field to which the current query text belongs.

Compared with the prior art, the invention has the following beneficial effects:

the multi-round query-oriented field identification method and system of the invention combine the text information of multiple rounds of queries and are based on deep neural network multi-round field identification technology. Compared with field identification methods in the prior art, they have the following advantages:

firstly, the method and system do not need hand-constructed features; they allow a user to input natural language and directly obtain the field to which it belongs;

secondly, through different combination modes, the method and system support several different ways of using the above query information;

thirdly, the method and system achieve a good field identification effect and adapt well to different classification tasks.

Drawings

FIG. 1 is a flow chart of the domain identification method for multi-round query according to the present invention.

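FIG. 2 is a schematic diagram of the flow of the method within the neural network model in one implementation of the invention, in which the vector representations of the query texts are combined.

FIG. 3 is a schematic diagram of the flow of the method within the neural network model in another implementation of the invention, in which the hidden layer vectors corresponding to the query texts are combined.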

Detailed Description

The invention will be further elucidated and described with reference to the embodiments of the invention described hereinafter.

Embodiments:

Embodiment one

As shown in FIG. 1, the multi-round query-oriented domain identification method can also be used for domain identification of single-round queries, and can serve as part of intention identification for single-round queries or for multi-round queries.

The multi-round query-oriented domain identification method of this embodiment can be configured in a multi-round interactive domain identification device. The device may be deployed in a server or in an electronic device, and the embodiment of the present invention is not limited in this respect. The electronic device can be a multi-round interaction device such as a television, a conversation robot or a mobile phone, or another household appliance such as an air conditioner.

The field identification method facing to the multi-round query in the embodiment specifically comprises the following steps:

Step one: acquire the current query text of the user to be identified, and extract the above query text if one exists for the current query. Table 1 lists examples of how the current query text and the above query text are acquired in this embodiment: first the current query text is acquired, for example 'less than five hundred yuan'; then it is judged whether the user has an above query text, and if so, the above query text is acquired.

The above query text may be the query data of a single round or the historical data of multiple rounds of queries; in the latter case, the multiple pieces of data are connected to form one above query statement, as shown in row 3, column 2 of Table 1. If there is no above query, the current query text is also taken as the above query text, as shown in row 5, column 2 of Table 1.

Table 1: Current query text, acquired above query text, and corresponding field

Current query text	Above (historical) query text	Field
Less than five hundred yuan	Check train tickets from Chengdu to Shenzhen	Train ticket
Less than five hundred yuan	Check train tickets from Chengdu to Shenzhen / departing from Nanjing / next week	Train ticket
Less than five hundred yuan	I want to buy a refrigerator	Shopping
Less than five hundred yuan	Less than five hundred yuan	Shopping/other
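The acquisition rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_above_query`, the oldest-first ordering of `history`, and the configurable separator `sep` are hypothetical and are not specified by the text, which only says that multiple rounds are connected into one statement.

```python
from typing import List, Tuple


def build_above_query(current: str, history: List[str], sep: str = "") -> Tuple[str, str]:
    """Return (current_query, above_query).

    history holds the earlier rounds, oldest first (assumed layout).
    """
    if history:
        # One or more earlier rounds: connect them into a single above-query statement.
        above = sep.join(history)
    else:
        # No earlier round: the current query also serves as the above query.
        above = current
    return current, above
```

With an empty history the function returns the current query twice, which corresponds to the last row of Table 1.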

Step two: perform character segmentation or word segmentation on the current query text and the above query text to obtain a character string or a word string. Take the query q 'Check train tickets from Chengdu to Shenzhen', whose length is 11 characters in the original Chinese. Segmented into a character string, q becomes 11 single-character tokens; segmented into a word string, the ideal result (rendered here in English) is 'check / Chengdu / go to / Shenzhen / [particle] / train ticket', with length 6.

Segmenting into word strings requires a Chinese word segmentation algorithm; in this embodiment, any conventional word segmentation method can be used.
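The two segmentation options can be illustrated as follows. This is a sketch, not the method prescribed by the text: forward maximum matching is used here only as one example of a conventional dictionary-based segmenter, and the Chinese string `query` is an assumed back-translation of the example query (chosen because it has 11 characters and 6 words), as is the small hypothetical vocabulary `vocab`.

```python
from typing import List, Set


def char_segment(text: str) -> List[str]:
    """Character segmentation: every non-space character becomes one token."""
    return [ch for ch in text if not ch.isspace()]


def word_segment(text: str, vocab: Set[str], max_len: int = 4) -> List[str]:
    """Forward maximum matching against a vocabulary (one conventional approach)."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Assumed back-translation of the example query; the vocabulary is hypothetical.
query = "查下成都去深圳的火车票"
vocab = {"查下", "成都", "深圳", "火车票"}
chars = char_segment(query)          # 11 single-character tokens
words = word_segment(query, vocab)   # 6 word tokens
```

In practice an off-the-shelf segmenter would normally replace `word_segment`; the point of the sketch is only that the same query yields sequences of different lengths under the two options.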

Step three: obtain the corresponding same-dimension vector representation of the character string or word string. The vector representation is a 0-1 representation. If the vector representation uses d dimensions, a dictionary of size d × 1 is constructed first. The representation of query q then has dimension d × 11 when character strings are used, and d × 6 when q is segmented into the ideal word string.
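A possible reading of the 0-1 representation is a one-hot encoding over the dictionary, which gives one d-dimensional column per token. The text does not state this explicitly, so the sketch below rests on that assumption; the helper names `build_dictionary` and `encode_01`, and the choice to leave out-of-dictionary tokens as all-zero columns, are hypothetical.

```python
from typing import Dict, List


def build_dictionary(corpus_tokens: List[List[str]]) -> Dict[str, int]:
    """Map each distinct token in the training corpus to a row index 0..d-1."""
    index: Dict[str, int] = {}
    for tokens in corpus_tokens:
        for tok in tokens:
            index.setdefault(tok, len(index))
    return index


def encode_01(tokens: List[str], index: Dict[str, int]) -> List[List[int]]:
    """Return a d x n matrix of 0/1 entries: d dictionary rows, one column per token."""
    d, n = len(index), len(tokens)
    matrix = [[0] * n for _ in range(d)]
    for col, tok in enumerate(tokens):
        row = index.get(tok)
        if row is not None:  # unknown tokens stay all-zero (assumption)
            matrix[row][col] = 1
    return matrix
```

For a query with 11 tokens the result has 11 columns, and for one with 6 tokens it has 6 columns, matching the two dimensions given in the text.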

Step four: take the vector representations corresponding to the current query text and the above query text as the input of the neural network model. In practice, the deep neural network may be any one of a convolutional neural network, a unidirectional recurrent neural network, and a bidirectional recurrent neural network. In this embodiment, a convolutional neural network is adopted.

However, whether character strings or word strings are used, different queries still contain different numbers of characters or words, so the vector representation of each query text needs to be unified to a sufficiently large length di. Each query then has dimension d × di: queries shorter than di are padded with zeros, and for the few queries longer than di only the leading part is kept.
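The length unification can be sketched as a small helper that pads or truncates every row of a d × n matrix to exactly di columns. The function name `fit_width` is hypothetical, and padding on the right is an assumption; the text only says that zeros are filled in and that the leading part is kept.

```python
from typing import List


def fit_width(matrix: List[List[int]], di: int) -> List[List[int]]:
    """Zero-pad or truncate each row of a d x n matrix to exactly di columns."""
    fitted = []
    for row in matrix:
        kept = row[:di]                        # keep only the leading part if too long
        kept = kept + [0] * (di - len(kept))   # fill any shortfall with zeros
        fitted.append(kept)
    return fitted
```

After this step every query has the same d × di shape, so the current query and the above query can be fed to the same network or combined element by element.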

Step five: the neural network model obtains the field to which the current query text belongs from the information obtained by combining the current query text and the above query text.

Specifically, there are two ways of combining and processing the vectors, and the operations performed in the neural network differ between them. The two ways are: combining the vector representations of the query texts, and combining the hidden layer vectors corresponding to the query texts in the neural network. Either one can be chosen in an implementation.

In this embodiment, the method of combining the vector representations of the query texts is taken as an example:

FIG. 2 illustrates the corresponding operations performed in the neural network when the method of combining the vector representations of the query texts is selected. The specific steps are:

Step s1: combine the vector representation corresponding to the current query text and the vector representation corresponding to the above query text into a new vector; the dimension of the new vector depends on the combination mode.

Step s2: send the new vector obtained by combination into the input layer of the neural network as the input of the neural network.

Step s3: perform the operations of the neural network. For the convolutional neural network of this embodiment, these are: convolution, pooling, fully connected layers, dropout, softmax classification, and so on.

Step s4: finally, obtain a Top-k result of the field from the output of the neural network, and select the field with the maximum probability as the field to which the current query text belongs.
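The final selection step can be illustrated independently of the network itself. The sketch below assumes the network produces one raw score per field; the scores are made-up values for illustration only, and the field labels are taken from Table 1.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over raw class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs: List[float], labels: List[str], k: int) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical raw scores for three fields.
labels = ["Train ticket", "Shopping", "Other"]
probs = softmax([2.0, 0.5, -1.0])
candidates = top_k(probs, labels, k=2)
field = candidates[0][0]  # the highest-probability field is the final answer
```

Keeping the whole Top-k list rather than only the best entry leaves room for a downstream component to reconsider lower-ranked fields, although the text itself only selects the most probable one.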

Embodiment two

The steps of this embodiment are the same as those of the first embodiment, except that in step five the method of combining the hidden layer vectors corresponding to the query texts in the neural network is selected. The corresponding operations performed in the neural network are shown in FIG. 3 and specifically include the following steps:

Step s1: send the vector representation corresponding to the current query text and the vector representation corresponding to the above query text, each as a separate input, to the input layer of the convolutional neural network.

Step s2: in the convolutional neural network, the vector representation corresponding to the current query text and the vector representation corresponding to the above query text each pass through a plurality of neural network layer calculations, including the convolution and pooling operations of the convolutional neural network and the hidden layer calculation of the recurrent neural network, so that a fully connected layer vector corresponding to the current query text and a fully connected layer vector corresponding to the above query text are obtained respectively.

If the neural network is a recurrent neural network, the current query text and the above query text need to be subjected to hidden layer calculation.

Step s3: combine the fully connected layer vector corresponding to the current query text and the fully connected layer vector corresponding to the above query text in the fully connected layer to obtain a new vector.

Step s4: apply a dropout operation to the new vector and then send it into an activation function, such as the rectified linear unit (ReLU) function.

Step s5: input the output of the activation function into a fully connected layer to adjust the vector dimension, and then send the result to the softmax classifier.

Step s6: finally, obtain a Top-k result of the field from the output of the neural network, and select the field with the maximum probability as the field to which the current query text belongs.
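The two-branch flow of this embodiment can be sketched as a tiny forward pass in plain Python. Everything concrete here is an assumption made for illustration: the weights are random and untrained, both branches share the same filters and fully connected weights, concatenation is the chosen combination mode, and all sizes (`d`, `di`, `w`, `n_filters`, `hidden`, `n_fields`) are arbitrary. Dropout is omitted because at inference time it acts as the identity.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def conv_max_pool(x: Matrix, filters: List[Matrix]) -> List[float]:
    """Slide each d x w filter over the columns of x, then max-pool over positions."""
    d, n = len(x), len(x[0])
    pooled = []
    for f in filters:
        w = len(f[0])
        responses = [
            sum(f[r][c] * x[r][start + c] for r in range(d) for c in range(w))
            for start in range(n - w + 1)
        ]
        pooled.append(max(responses))
    return pooled


def dense(v: List[float], weights: Matrix, bias: List[float]) -> List[float]:
    """Fully connected layer; weights has one row per output unit."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]


def relu(v: List[float]) -> List[float]:
    return [max(0.0, a) for a in v]


def softmax(v: List[float]) -> List[float]:
    peak = max(v)
    exps = [math.exp(a - peak) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def classify(current: Matrix, above: Matrix, params: dict) -> List[float]:
    # s1-s2: each input goes through convolution/pooling and a fully connected layer.
    fc_cur = dense(conv_max_pool(current, params["filters"]), params["w_fc"], params["b_fc"])
    fc_abv = dense(conv_max_pool(above, params["filters"]), params["w_fc"], params["b_fc"])
    # s3: combine the two fully connected layer vectors (concatenation chosen here).
    merged = fc_cur + fc_abv
    # s4: dropout is the identity at inference time, so only the activation is applied.
    activated = relu(merged)
    # s5: a final fully connected layer sets the dimension to the number of fields.
    return softmax(dense(activated, params["w_out"], params["b_out"]))


rng = random.Random(0)
d, di, w, n_filters, hidden, n_fields = 4, 6, 2, 3, 5, 3
params = {
    "filters": [rand_matrix(d, w, rng) for _ in range(n_filters)],
    "w_fc": rand_matrix(hidden, n_filters, rng),
    "b_fc": [0.0] * hidden,
    "w_out": rand_matrix(n_fields, 2 * hidden, rng),
    "b_out": [0.0] * n_fields,
}
# Toy 0-1 inputs of shape d x di standing in for the encoded queries.
current = [[float((r + c) % d == 0) for c in range(di)] for r in range(d)]
above = [[float((r + 2 * c) % d == 0) for c in range(di)] for r in range(d)]
probs = classify(current, above, params)
best = max(range(n_fields), key=probs.__getitem__)  # s6: index of the most probable field
```

A real implementation would use a deep learning framework and trained weights; the sketch only shows where the combination sits relative to the other layers.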

Specifically, in step s1 of the first embodiment and step s3 of the second embodiment, the selectable combination modes when the vectors are combined include: vector concatenation, weighted addition, and taking the maximum value at corresponding positions of the vectors. The different combination modes make use of the information in different ways.

Each combination mode is introduced below; in practice, a mode can be selected according to specific requirements:

(1) Vector concatenation: the two groups of vectors are joined end to end, and the dimension becomes d × (2·di). This mode fully preserves both the current query information and the above query information.

(2) Weighted addition: the two groups of vectors are each multiplied by a weight and then added at corresponding positions. For example, if the vector corresponding to the current query is (C1, …, Cdi) and the vector corresponding to the above query is (H1, …, Hdi), the new vector after combination is (A·C1 + B·H1, …, A·Cdi + B·Hdi), where A and B are the weights. If the current query is dominant and the above query serves as a reference, A should be greater than B. If A and B are equal (for example A = B = 0.5), the average of the two groups of vectors is taken at each position.

(3) Maximum at corresponding positions: the larger of the two values is taken at each corresponding position of the two groups of vectors. For example, if C1 is greater than H1 and Cdi is less than Hdi in the above example, the new vector is (C1, …, Hdi). This mode is similar to max pooling in convolutional neural networks; its aim is to keep the more salient features of the two groups of vectors and discard the less salient ones.
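The three combination modes can be written as three small functions over matrices of equal shape. This is a minimal sketch: the function names are hypothetical, and the inputs are assumed to be lists of rows with d rows and di columns each.

```python
from typing import List

Matrix = List[List[float]]


def concatenate(cur: Matrix, abv: Matrix) -> Matrix:
    """Join the two matrices end to end, doubling the number of columns."""
    return [c_row + a_row for c_row, a_row in zip(cur, abv)]


def weighted_add(cur: Matrix, abv: Matrix, a: float, b: float) -> Matrix:
    """Compute a * cur + b * abv at every position."""
    return [
        [a * c + b * h for c, h in zip(c_row, a_row)]
        for c_row, a_row in zip(cur, abv)
    ]


def elementwise_max(cur: Matrix, abv: Matrix) -> Matrix:
    """Keep the larger of the two values at every position."""
    return [
        [max(c, h) for c, h in zip(c_row, a_row)]
        for c_row, a_row in zip(cur, abv)
    ]


cur = [[1.0, 0.0], [0.0, 1.0]]
abv = [[0.0, 0.0], [1.0, 1.0]]
joined = concatenate(cur, abv)               # 2 x 4
averaged = weighted_add(cur, abv, 0.5, 0.5)  # 2 x 2, equal weights give the average
largest = elementwise_max(cur, abv)          # 2 x 2
```

Concatenation changes the input width that the following layer must accept, whereas weighted addition and the element-wise maximum keep the original shape, which is one practical consideration when choosing between them.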

Embodiment three

A multi-round query-oriented field identification system can directly identify the field to which a user query belongs. The system specifically comprises:

the acquisition module is used for acquiring the current query text of a user to be identified, and extracting the above query text if one exists for the current query;

the vector representation module is used for performing character segmentation or word segmentation on the current query text and the above query text to obtain a character string or word string, and then obtaining the corresponding same-dimension vector representation of the character string or word string.

The input end of the vector representation module is connected with the acquisition module, and the output end of the vector representation module is connected with the domain identification module. In this embodiment, the vector combination method used by the system within the neural network model is the combination of the hidden layer vectors corresponding to the query texts in the neural network.

Specifically, the vector representation module comprises a dictionary construction unit and a vector representation unit.

The dictionary construction unit is used for constructing a dictionary from the common characters or words appearing in the training corpus; depending on the application scenario, the dictionary does not need to be large.

The vector representation unit is used for carrying out vector representation on the current query text and the above query text according to the dictionary to obtain vector representation corresponding to the current query text and vector representation corresponding to the above query text. Specifically, a 0-1 vector representation is performed.

The domain identification module comprises a network input unit, an information combination unit, an information activation unit, a domain classification unit and a domain determination unit.

The network input unit is used for respectively inputting the vector representation corresponding to the current query text and the vector representation corresponding to the above query text into the neural network, where each undergoes a plurality of convolution and pooling operations of the neural network.

The information combination unit is used for respectively obtaining, in the fully connected layer of the neural network, the processed fully connected layer vector corresponding to the current query text and the processed fully connected layer vector corresponding to the above query text, and combining the two fully connected layer vectors in the fully connected layer to obtain a new vector.

The information activation unit is used for activating the new vector through an activation function operation, the field classification unit is used for sending the activated new vector into the fully connected layer to perform vector dimension adjustment and then inputting the vector into the softmax classifier to perform field classification, and the field determination unit is used for obtaining a Top-k result of the field to which the current query text belongs from the output result of the neural network and selecting the field with the maximum probability as the field to which the current query text belongs.
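The way the three modules connect can be sketched with plain Python classes. The class and method names below are hypothetical, the representation module uses character segmentation with an assumed one-hot encoding, and the neural network is replaced by an arbitrary callable so that only the wiring between modules is shown.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[int]]


class AcquisitionModule:
    def acquire(self, current: str, history: List[str]) -> Tuple[str, str]:
        """Return the current query and the above query (falls back to the current one)."""
        above = "".join(history) if history else current
        return current, above


class VectorRepresentationModule:
    def __init__(self, dictionary: Dict[str, int]) -> None:
        self.dictionary = dictionary  # token -> row index, built from a training corpus

    def represent(self, text: str) -> Matrix:
        """Character segmentation followed by a 0-1 encoding (d rows, one column per token)."""
        tokens = list(text)
        matrix = [[0] * len(tokens) for _ in self.dictionary]
        for col, tok in enumerate(tokens):
            if tok in self.dictionary:
                matrix[self.dictionary[tok]][col] = 1
        return matrix


class DomainIdentificationModule:
    def __init__(self, model: Callable[[Matrix, Matrix], List[float]], fields: List[str]) -> None:
        self.model = model    # stands in for the neural network information combination model
        self.fields = fields

    def identify(self, current: Matrix, above: Matrix) -> str:
        probs = self.model(current, above)
        return self.fields[max(range(len(probs)), key=probs.__getitem__)]


def identify_field(current: str, history: List[str], acq: AcquisitionModule,
                   vec: VectorRepresentationModule, dom: DomainIdentificationModule) -> str:
    """Acquisition output feeds vector representation, which feeds domain identification."""
    cur_text, abv_text = acq.acquire(current, history)
    return dom.identify(vec.represent(cur_text), vec.represent(abv_text))
```

Passing the model in as a callable keeps the module boundaries the same regardless of which network type or combination mode is used inside it.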

It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims

1. A multi-round query-oriented field identification method is characterized by comprising the following steps:

C. performing same-dimension vector representation on the character strings or word strings of the current query text and the character strings or word strings of the above query text to respectively obtain the vector representation corresponding to the current query text and the vector representation corresponding to the above query text;

E. the neural network model obtains the field to which the current query text belongs from the combined information after the input information is combined; the combination modes of the input information combination in step E comprise: vector concatenation, weighted addition, and taking the maximum value at corresponding positions of the vectors; the input information combination is specifically a combination of the vector representation corresponding to the current query text and the vector representation corresponding to the above query text, or a combination of the hidden layer vectors corresponding to the current query text and the above query text in the neural network;

when the input information combination in step E is a combination of the vector representations corresponding to the current query text and the above query text, step E specifically includes the following steps:

2. The multi-round query-oriented domain identification method according to claim 1, wherein the neural network model adopts a convolutional neural network, a unidirectional recurrent neural network or a bidirectional recurrent neural network.

3. The multi-round query-oriented domain identification method according to claim 2, wherein when the neural network model adopts a convolutional neural network and the input information combination in step E is a combination of the hidden layer vectors corresponding to the current query text and the above query text in the neural network, step E specifically includes the following steps:

s2.5, inputting the output of the activation function into a fully connected layer to adjust the vector dimension, and then sending the result into a softmax classifier to classify the field categories;

s2.6, obtaining a Top-k result of the field to which the current query text belongs from the output result of the convolutional neural network, and selecting the field with the maximum probability as the field to which the current query text belongs.

4. A multi-round query-oriented domain identification system for implementing the multi-round query-oriented domain identification method according to claim 1, comprising:

the neural network information combination model is embedded in the field identification module and used for processing the input information, obtaining the field to which the current query text belongs from the combined information after the input information is combined, and outputting the field, wherein the input information combination is specifically a combination of the vector representation corresponding to the current query text and the vector representation corresponding to the above query text, or a combination of the hidden layer vectors corresponding to the current query text and the above query text in the neural network;

the input end of the vector representation module is connected with the acquisition module, and the output end of the vector representation module is connected with the domain identification module.

5. The multi-round query oriented domain recognition system of claim 4, wherein the vector representation module comprises a dictionary construction unit and a vector representation unit;

6. The multi-round query oriented domain identification system according to claim 5, wherein when the input information combination is a combination of hidden layer vectors corresponding to a current query text and an above query text in a neural network, the domain identification module comprises a network input unit, an information combination unit, an information activation unit, a domain classification unit, and a domain determination unit;

the network input unit is used for respectively inputting the vector representation corresponding to the current query text and the vector representation corresponding to the above query text into the neural network, where each undergoes a plurality of convolution and pooling operations of the neural network;

the information combination unit is used for respectively obtaining, in the fully connected layer of the neural network, the processed fully connected layer vector corresponding to the current query text and the processed fully connected layer vector corresponding to the above query text, and combining the two fully connected layer vectors in the fully connected layer to obtain a new vector;

the information activation unit is used for activating the new vector, the field classification unit is used for sending the activated new vector into the fully connected layer to perform vector dimension adjustment and then performing field classification, and the field determination unit is used for obtaining a Top-k result of the field to which the current query text belongs from the output result of the neural network and selecting the field with the maximum probability as the field to which the current query text belongs.