CN111222343B - Intention recognition method and intention recognition device - Google Patents


Info

Publication number
CN111222343B
CN111222343B
Authority
CN
China
Prior art keywords
vector
sentence
attention
mapping
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911244722.0A
Other languages
Chinese (zh)
Other versions
CN111222343A (en)
Inventor
黄日星
熊友军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ubtech Robotics Corp filed Critical Ubtech Robotics Corp
Priority to CN201911244722.0A priority Critical patent/CN111222343B/en
Publication of CN111222343A publication Critical patent/CN111222343A/en
Application granted granted Critical
Publication of CN111222343B publication Critical patent/CN111222343B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an intention recognition method and an intention recognition device, wherein the method comprises the following steps: acquiring a sentence to be recognized; converting the sentence to be recognized through an embedding layer to obtain a first vector, a second vector and a third vector; respectively carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector; acquiring a first attention vector according to the first mapping vector, and acquiring a second attention vector according to the second mapping vector; acquiring a first sentence vector according to the first attention vector, and acquiring a second sentence vector according to the second attention vector; obtaining a target sentence vector according to the first sentence vector and the second sentence vector; and inputting the target sentence vector into a linear layer of the Bert model to perform intention recognition, and obtaining an intention recognition result.

Description

Intention recognition method and intention recognition device
Technical Field
The invention relates to the technical field of natural language processing and neural networks, in particular to an intention recognition method and an intention recognition device.
Background
The Transformer model is a natural language processing (NLP) model developed by Google, Inc., and an important innovation for NLP development. The Bert model (Bidirectional Encoder Representations from Transformers) is based on the encoder structure of the Transformer model. The encoder comprises a plurality of encoding layers; one encoding layer comprises a self-attention layer, an addition & normalization layer and a fully-connected feedforward neural network layer. The self-attention layer is formed by stacking N Scaled Dot Product Attention (SDPA) components, i.e., the SDPA is a component of the Bert model. After an input sentence is converted into a feature vector through an embedding layer in the Bert model, the sentence to be recognized is analyzed by the SDPA component through a self-attention mechanism, and final intention recognition is performed. In the conventional SDPA component, the input sentence is processed in the following manner to obtain a sentence vector usable for intention recognition:
Q=K=V=embedding(sen);
vec=linear(sum(Attention(Q,K,V),dim=0));
where sen represents the input sentence, vec represents the final sentence vector used for intention recognition, and the matrices Q, K and V are obtained by converting the input sentence through an embedding layer (embedding). In the conventional SDPA component, since the matrix Q is equal to the matrix K, the point multiplication of the matrix Q and the transpose of the matrix K is approximately a diagonal matrix: the non-diagonal elements are generally small, and when the dimension of the input vector is large, the non-diagonal elements tend toward values close to 0. Consequently, for two different sentences to be recognized, the distance between the point multiplication of the matrix Q1 of one sentence with the transpose of its matrix K1 and the point multiplication of the matrix Q2 of the other sentence with the transpose of its matrix K2 is small. This limits the performance of intention recognition on the input sentence, so that the obtained sentence vector contains less information and the final accuracy of sentence intention recognition is affected.
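As a toy illustration (not code from the patent), the diagonal-dominance behaviour described above can be reproduced in a few lines of numpy; the random matrix below stands in for the output of a real embedding layer:

```python
import numpy as np

def conventional_scores(embedding_matrix):
    """Score matrix of the conventional SDPA component, where Q = K."""
    Q = K = embedding_matrix                  # (N, d): one row per word
    return Q @ K.T / np.sqrt(K.shape[1])      # (N, N) attention scores

rng = np.random.default_rng(0)
emb = rng.standard_normal((5, 512)) / np.sqrt(512)   # stand-in for embedding(sen)
scores = conventional_scores(emb)

# Each diagonal entry is a squared row norm (positive and comparatively large),
# while off-diagonal entries shrink toward 0 as the dimension d grows.
diag_mean = np.abs(np.diag(scores)).mean()
off_diag_mean = (np.abs(scores).sum() - np.abs(np.diag(scores)).sum()) / (5 * 4)
```

With the seed above, the mean diagonal magnitude comes out much larger than the mean off-diagonal magnitude, matching the limitation the paragraph describes.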
Disclosure of Invention
The embodiment of the invention provides an intention recognition method, an intention recognition device and a computer readable storage medium, which are used for solving the problem that the existing intention recognition performance is limited so as to improve the accuracy of sentence intention recognition.
In a first aspect, an embodiment of the present invention provides an intent recognition method, including:
acquiring a sentence to be recognized;
converting the sentence to be recognized through an embedding layer to obtain a first vector, a second vector and a third vector;
respectively carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector;
acquiring a first attention vector according to the first mapping vector, and acquiring a second attention vector according to a second mapping vector;
acquiring a first sentence vector according to the first attention vector, and acquiring a second sentence vector according to the second attention vector;
obtaining a target sentence vector according to the first sentence vector and the second sentence vector;
and inputting the target sentence vector into a linear layer of the Bert model to perform intention recognition, and obtaining an intention recognition result.
Optionally, the acquiring the first attention vector according to the first mapping vector and acquiring the second attention vector according to the second mapping vector includes:
performing point multiplication on the first mapping vector and the transpose of the second mapping vector to obtain a first attention degree, and performing point multiplication on the second mapping vector and the transpose of the first mapping vector to obtain a second attention degree;
respectively carrying out normalization processing on the first attention degree and the second attention degree to obtain a first target attention degree and a second target attention degree;
and performing point multiplication on the first target attention and the third vector to obtain a first attention vector, and performing point multiplication on the second target attention and the third vector to obtain a second attention vector.
Optionally, the performing point multiplication on the first mapping vector and the transpose of the second mapping vector to obtain a first attention degree includes:
performing point multiplication on the first mapping vector and the transpose of the second mapping vector to obtain a first numerical value;
dividing the first value by a scaling factor √d_K to obtain the first attention degree, wherein d_K is the dimension of the second vector;
the performing point multiplication on the second mapping vector and the transpose of the first mapping vector to obtain a second attention degree includes:
performing point multiplication on the second mapping vector and the transpose of the first mapping vector to obtain a second value;
dividing the second value by the scaling factor √d_K to obtain the second attention degree.
Optionally, the obtaining a first sentence vector according to the first attention vector and obtaining a second sentence vector according to the second attention vector includes:
accumulating the Nth dimension of the first attention vector to obtain the first sentence vector;
and accumulating the Nth dimension of the second attention vector to obtain the second sentence vector.
Optionally, N is 0.
Optionally, the obtaining a target sentence vector according to the first sentence vector and the second sentence vector includes:
and merging the first sentence vector and the second sentence vector to obtain the target sentence vector.
Optionally, the converting, by the embedding layer, the sentence to be recognized into a feature vector includes:
and converting the sentence to be recognized into a feature vector by the embedding layer in a Word2Vec mode.
In a second aspect, an embodiment of the present invention provides an intention recognition apparatus, including:
the acquisition module is used for acquiring a sentence to be recognized;
a processing module for:
converting the sentence to be recognized through an embedding layer to obtain a first vector, a second vector and a third vector;
respectively carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector;
acquiring a first attention vector according to the first mapping vector, and acquiring a second attention vector according to a second mapping vector;
obtaining a first sentence vector according to the first attention vector, and obtaining a second sentence vector according to the second attention vector;
obtaining a target sentence vector according to the first sentence vector and the second sentence vector;
the intention recognition module is used for inputting the target sentence vector into a linear layer of the Bert model to perform intention recognition and obtaining an intention recognition result.
Optionally, the processing module is further configured to:
performing point multiplication on the first mapping vector and the transpose of the second mapping vector to obtain a first attention degree, and performing point multiplication on the second mapping vector and the transpose of the first mapping vector to obtain a second attention degree;
respectively carrying out normalization processing on the first attention degree and the second attention degree to obtain a first target attention degree and a second target attention degree;
and performing point multiplication on the first target attention and the third vector to obtain a first attention vector, and performing point multiplication on the second target attention and the third vector to obtain a second attention vector.
Optionally, the processing module is further configured to:
performing point multiplication on the first mapping vector and the transpose of the second mapping vector to obtain a first numerical value;
dividing the first value by a scaling factor √d_K to obtain the first attention degree, wherein d_K is the dimension of the second vector;
performing point multiplication on the second mapping vector and the transpose of the first mapping vector to obtain a second value;
dividing the second value by the scaling factor √d_K to obtain the second attention degree.
Optionally, N is 0.
Optionally, the processing module is further configured to:
and merging the first sentence vector and the second sentence vector to obtain the target sentence vector.
Optionally, the processing module is further configured to:
and converting the sentence to be recognized into a feature vector by the embedding layer in a Word2Vec mode.
A third aspect of the invention provides an intent recognition device comprising a memory, a processor and a computer program stored in said memory and executable on said processor, said processor implementing the steps of the method according to the first aspect of the invention when said computer program is executed.
A fourth aspect of the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the method according to the first aspect of the invention.
In one scheme realized by the intention recognition method, the intention recognition device and the computer-readable storage medium provided by the embodiments of the invention, the sentence to be recognized is converted through an embedding layer into a first vector Q, a second vector K and a third vector V; linear mapping is performed on the first vector Q and the second vector K respectively to obtain a first mapping vector and a second mapping vector; a first attention vector is acquired according to the first mapping vector, and a second attention vector according to the second mapping vector; a first sentence vector is acquired according to the first attention vector, and a second sentence vector according to the second attention vector; a target sentence vector is obtained according to the first sentence vector and the second sentence vector; and the target sentence vector is input into a linear layer of the Bert model to perform intention recognition and obtain an intention recognition result. Because the first vector and the second vector corresponding to the feature vector of the sentence to be recognized are linearly transformed, the obtained first mapping vector is not equal to the second mapping vector, and unequal matrices Q' and K' are obtained from them. This avoids the problem of the conventional SDPA processing method, in which the non-diagonal elements of the approximately diagonal matrix obtained by point-multiplying the matrix Q with the transpose of the matrix K easily approach 0, limiting the performance of intention recognition on the input sentence.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a Bert model architecture based on the Transformer model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a self-attention layer in a coding layer according to an embodiment of the present invention;
FIG. 3 is a flow chart of the method for identifying intent in an embodiment of the present invention;
FIG. 4 is a flow chart of the intent recognition method according to the embodiment of the invention for obtaining the attention vector according to the mapping vector;
FIG. 5 is a schematic diagram of an apparatus for recognizing intention in an embodiment of the present invention;
fig. 6 is another structural diagram of the intention recognition device in the embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The intention recognition method in this embodiment is based on the Bert model; to aid understanding of the embodiments of the invention, the Bert model is first briefly described. The Bert model is based on the encoder structure of the Transformer model. Specifically, as shown in fig. 1, the Transformer model includes an encoder and a decoder, but since the decoder is not used in the Bert model, it is not described here. In the encoder structure of the Transformer model, the encoder includes a plurality of encoding layers; an encoding layer includes a self-attention layer, an addition & normalization layer and a fully-connected feedforward neural network layer. As shown in fig. 2, the self-attention layer is formed by stacking N Scaled Dot Product Attention (SDPA) components, which can be understood as the SDPA being a component of the Bert model. In this embodiment, based on the Bert model, a first vector, a second vector and a third vector are obtained by converting the sentence to be recognized through an embedding layer (embedding) and are input into the SDPA component of the encoder; after processing by the SDPA component in the self-attention layer, the processed target sentence vector is output.
In the embodiment of the invention, the processing of the SDPA component differs from that of the conventional SDPA component, so that the obtained matrix Q' is not equal to the matrix K', and the point multiplication of the matrix Q' with the transpose of the matrix K' is not a diagonal matrix. The situation in which only the diagonal elements are large and the non-diagonal elements are small therefore does not occur, the distance between the score matrices of different sentences is larger, the obtained sentence vector contains more information, and the accuracy of sentence intention recognition is improved. See the description of the embodiments below for details.
specifically, as shown in fig. 3, the intention recognition method includes the steps of:
s10: and acquiring sentences to be identified.
S11: and converting the sentence to be recognized through an embedding layer to obtain a first vector, a second vector and a third vector.
If the sentence to be recognized is sen, the following can be obtained:
first vector = second vector = third vector = embedding(sen).
For ease of understanding, the following description uses actual words. Assume that an input sentence to be recognized includes N words after word segmentation; accordingly, each word is converted by an embedding layer (embedding) into a word vector, giving N word vectors, which correspond to N first vectors, N second vectors and N third vectors. The N first vectors form a matrix Q, the N second vectors form a matrix K, and the N third vectors form a matrix V.
For example, let the 1st vocabulary item be Apple; passing it through the embedding layer (embedding) yields a first vector Q1, and correspondingly the vocabulary item Apple is converted by the embedding layer to obtain a second vector K1 and a third vector V1, where:
first vector Q1 = second vector K1 = third vector V1 = embedding(Apple).
Accordingly, since a sentence to be recognized involves a plurality of words, in practical application the first vectors, second vectors and third vectors corresponding to all words in the sentence can participate in calculation in matrix form, forming a matrix Q, a matrix K and a matrix V respectively. That is, the matrix Q, matrix K and matrix V corresponding to the sentence to be recognized can be obtained by converting the sentence through an embedding layer (embedding), namely:
matrix Q = matrix K = matrix V = embedding(sentence to be recognized).
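A minimal sketch of this conversion, using a hypothetical three-word vocabulary and a random embedding table (both illustrative, not part of the patent):

```python
import numpy as np

vocab = {"Apple": 0, "is": 1, "tasty": 2}        # hypothetical vocabulary
rng = np.random.default_rng(1)
table = rng.standard_normal((len(vocab), 8))     # 8-dimensional word vectors

def embedding(words):
    """Stack the word vector of each word into a matrix (one row per word)."""
    return table[[vocab[w] for w in words]]

# matrix Q = matrix K = matrix V = embedding(sentence to be recognized)
Q = K = V = embedding(["Apple", "is", "tasty"])
```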
Specifically, when intention recognition is required for a sentence, the sentence to be recognized can be obtained and input into the Bert model, and the first vector, the second vector and the third vector are obtained by converting the sentence through the embedding layer of the Bert model. The sentence to be recognized may be in the form of text information or voice, which is not limited here. The sentence to be recognized is segmented and vectorized through the embedding layer and thereby converted into the first vector, the second vector and the third vector, or the matrix Q, matrix K and matrix V; that is, the sentence is converted into vector form or matrix form by the embedding layer. It will be appreciated that the embedding layer may implement the word vector conversion described above using one-hot, Word2Vec or similar methods, which are not limited here.
In one embodiment, the information of the current input is converted into a feature vector through an embedding layer, and specifically, a Word2Vec mode is adopted to convert a sentence of the current input into a first vector, a second vector and a third vector through the embedding layer. The Word2Vec Word embedding mode is one of the embedding layers (embedding), and converts the obtained sentence to be recognized into a vector form by the Word2Vec mode, specifically, according to a given corpus, the Word2Vec can quickly and effectively express a Word into a vector form by an optimized training model, which is not described herein.
S12: and respectively carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector.
In this embodiment, after the first vector and the second vector are obtained, they are linearly mapped respectively to obtain a first mapping vector and a second mapping vector. Specifically, the first vectors, second vectors and third vectors corresponding to all words in the sentence to be recognized may participate in calculation in matrix form as the matrix Q, matrix K and matrix V, and the matrices Q and K corresponding to the first and second vectors are then linearly mapped, which may be expressed as Q' = linear1(Q), K' = linear2(K). That is, after the first vector and the second vector are obtained, the corresponding matrices Q and K may be linearly mapped through linear layers to obtain the corresponding matrices Q' and K'. Because the linear mappings applied to the matrix Q and the matrix K in step S12 are different, the obtained matrix Q' is in general not equal to the matrix K'.
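A sketch of step S12 under the assumption that linear1 and linear2 are two independently initialised, bias-free linear layers (the weight matrices below are illustrative stand-ins); with different weights, Q' and K' come out unequal even though Q = K:

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 3, 8
Q = K = rng.standard_normal((N, d))   # from the embedding layer, Q equals K

W1 = rng.standard_normal((d, d))      # weights of linear1
W2 = rng.standard_normal((d, d))      # weights of linear2 (independent of W1)

Q_prime = Q @ W1                      # Q' = linear1(Q)
K_prime = K @ W2                      # K' = linear2(K)
```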
S13: the first attention vector is acquired according to the first mapping vector, and the second attention vector is acquired according to the second mapping vector.
Based on the steps S10-S12, converting the sentences to be recognized through the embedding layer to obtain a first vector, a second vector and a third vector, and carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector. In step S13, a first attention vector is acquired according to the first mapping vector, and a second attention vector is acquired according to the second mapping vector, and in a specific embodiment, as shown in fig. 4, the first attention vector and the second attention vector may be acquired through steps S131-S133.
S131: and carrying out point multiplication on the first mapping vector and the transposition of the second mapping vector to obtain a first attention, and carrying out point multiplication on the second mapping vector and the transposition of the first mapping vector to obtain a second attention.
After the first mapping vector and the second mapping vector are obtained, a first attention degree is obtained according to the first mapping vector and the transposed second mapping vector, and a second attention degree is obtained according to the second mapping vector and the transposed first mapping vector.
In the embodiment of the present invention, two ways of obtaining the first attention degree and the second attention degree may be adopted, where the first way is to perform point multiplication on the first mapping vector and the transposed second mapping vector to obtain the first attention degree, and perform point multiplication on the second mapping vector and the transposed first mapping vector to obtain the second attention degree, where when the first mapping vector participates in calculation in a matrix form, the specific formula may be as follows:
score1 = Q'K'^T, score2 = (Q'K'^T)^T = K'Q'^T;
where score1 represents the first attention degree, score2 represents the second attention degree, Q' represents the matrix obtained by linearly mapping the matrix Q corresponding to the first vector, K' represents the matrix obtained by linearly mapping the matrix K corresponding to the second vector, and ^T denotes the transpose. The second way is to point-multiply the first mapping vector with the transpose of the second mapping vector to obtain a first value and divide the first value by the scaling factor √d_K to obtain the first attention degree, and to point-multiply the second mapping vector with the transpose of the first mapping vector to obtain a second value and divide the second value by the scaling factor √d_K to obtain the second attention degree, where d_K is the dimension of the second vector. When the vectors participate in calculation in matrix form, this can be written as:
score1 = Q'K'^T/√d_K, score2 = K'Q'^T/√d_K;
in step S131, based on the fact that the matrix Q 'obtained in step S12 is not equal to the matrix K', the dot product of the transpose of the matrix Q 'and the matrix K' is not a diagonal matrix, and the situation that only diagonal elements are large and non-diagonal elements are small does not occur.
S132: and respectively carrying out normalization processing on the first attention degree and the second attention degree to obtain a first target attention degree and a second target attention degree.
After the first attention degree and the second attention degree are obtained, they are respectively normalized to obtain the first target attention degree and the second target attention degree; the normalization is performed for convenience of calculation and analysis. By way of example, the following formulas may be used:
softmax(Q'K'^T);
softmax(K'Q'^T);
or alternatively,
softmax(Q'K'^T/√d_K);
softmax(K'Q'^T/√d_K);
where softmax(Q'K'^T) represents the first target attention degree and softmax(K'Q'^T) represents the second target attention degree; in the scaled variant, softmax(Q'K'^T/√d_K) represents the first target attention degree and softmax(K'Q'^T/√d_K) represents the second target attention degree.
In the second way, the scaling factor is added because, when the dimension d_K is large, the magnitudes of the values obtained by point multiplication are also large, pushing the results into a region where the gradient of the softmax function is small; a small gradient is disadvantageous for back propagation. To overcome this negative effect, dividing by the scaling factor mitigates the problem to some extent. Specifically, the scaling factor usually takes the value √64 = 8, which can be understood as a default; other values may of course be selected, and the scaling factor is not limited here.
In step S132, the results calculated in step S131 are passed to the softmax layer, which normalizes the score values, facilitating processing and analysis by the model.
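A row-wise softmax sketch showing the effect of the scaling factor: dividing the scores by √d_K flattens each normalized row, keeping the softmax out of its small-gradient, near-one-hot regime. The function below is a generic numerically stable softmax, not code from the patent:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable row-wise softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d_K = 64
rng = np.random.default_rng(4)
score1 = rng.standard_normal((4, 4)) * 10.0      # large-magnitude scores

target_unscaled = softmax(score1)                # rows close to one-hot
target_scaled = softmax(score1 / np.sqrt(d_K))   # softer, better-conditioned rows
```

Each row of either result sums to 1; the scaled rows are strictly less peaked than the unscaled ones.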
S133: and performing point multiplication on the first target attention and the third vector to obtain a first attention vector, and performing point multiplication on the second target attention and the third vector to obtain a second attention vector.
After the first target attention degree and the second target attention degree are obtained, each is point-multiplied with the third vector to obtain the first attention vector and the second attention vector respectively. When they participate in the operation in matrix form, this may be written as:
Attention1(Q', K', V) = softmax(Q'K'^T)V;
Attention2(Q', K', V) = softmax(K'Q'^T)V;
or alternatively,
Attention1(Q', K', V) = softmax(Q'K'^T/√d_K)V;
Attention2(Q', K', V) = softmax(K'Q'^T/√d_K)V;
where Attention1 represents the matrix corresponding to the first attention vector, Attention2 represents the matrix corresponding to the second attention vector, and V is the matrix corresponding to the third vector.
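The step S133 computation in the same numpy sketch (softmax as before; all matrices are random stand-ins for the mapped vectors):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(5)
N, d = 4, 8
Q_prime = rng.standard_normal((N, d))
K_prime = rng.standard_normal((N, d))
V = rng.standard_normal((N, d))

score1 = Q_prime @ K_prime.T           # Q'K'^T; its transpose is K'Q'^T
attention1 = softmax(score1) @ V       # matrix of first attention vectors
attention2 = softmax(score1.T) @ V     # matrix of second attention vectors
```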
S14: and acquiring a first sentence vector according to the first attention vector, and acquiring a second sentence vector according to the second attention vector.
In one embodiment, the first sentence vector is obtained according to the first attention vector and the second sentence vector according to the second attention vector. Specifically, the Nth dimension of the first attention vector is accumulated to obtain the first sentence vector, and the Nth dimension of the second attention vector is accumulated to obtain the second sentence vector, which can be expressed by the following formulas:
vec1 = sum(Attention1(Q', K', V), dim=N);
vec2 = sum(Attention2(Q', K', V), dim=N);
where vec1 represents the first sentence vector and vec2 the second sentence vector. In practical application, the 0th dimension of the matrix Attention1 corresponding to the first attention vector can be accumulated to obtain the first sentence vector, and the 0th dimension of the matrix Attention2 corresponding to the second attention vector accumulated to obtain the second sentence vector, which can be expressed as:
vec1 = sum(Attention1(Q', K', V), dim=0);
vec2 = sum(Attention2(Q', K', V), dim=0);
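Accumulating dimension 0 collapses the N word rows into a single sentence vector; a one-line numpy sketch with random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(6)
attention1 = rng.standard_normal((4, 8))   # N = 4 words, 8-dimensional rows

vec1 = attention1.sum(axis=0)              # sum(Attention1, dim=0) -> shape (8,)
```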
s15: and obtaining a target sentence vector according to the first sentence vector and the second sentence vector.
S16: and inputting the target sentence vector into a linear layer of the Bert model to perform intention recognition, and obtaining an intention recognition result.
After the first sentence vector and the second sentence vector are obtained, the target sentence vector can be derived from them and input into the linear layer of the Bert model for intention recognition, yielding the intention recognition result. In one embodiment, combining the first sentence vector vec1 and the second sentence vector vec2 to obtain the target sentence vector vec can be expressed by the following formula:
vec=vec1+vec2。
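Steps S15 and S16 together can be sketched as follows, with a made-up weight matrix W and bias b standing in for the Bert model's linear layer (in practice the layer's parameters are learned, and the intent labels come from the training data):

```python
import numpy as np

d, num_intents = 4, 3  # assumed sentence-vector size and intent count

# Sentence vectors from S14 (made-up values for illustration).
vec1 = np.ones(d)
vec2 = np.arange(d, dtype=float)

# S15: target sentence vector is the element-wise sum.
vec = vec1 + vec2

# S16: hypothetical linear layer mapping the sentence vector to intent logits.
rng = np.random.default_rng(1)
W = rng.normal(size=(num_intents, d))
b = np.zeros(num_intents)
logits = W @ vec + b
intent = int(np.argmax(logits))  # index of the predicted intent
```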
Therefore, in the embodiment of the invention, the obtained matrix Q' is not equal to the matrix K', so the dot product of the matrix Q' and the transpose of the matrix K' is not a diagonal matrix, and the situation in which only the diagonal elements are large while the off-diagonal elements are small does not occur. For two different sentences to be recognized, the distance between the dot product of Q'1 with the transpose of K'1 for one sentence and the dot product of Q'2 with the transpose of K'2 for the other sentence is larger, so the obtained target sentence vector contains more information, which improves the accuracy of recognizing the final sentence.
Fig. 5 shows a schematic block diagram of an intention recognition apparatus in one-to-one correspondence with the intention recognition method in embodiment 1. As shown in fig. 5, the intention recognition device includes an acquisition module 20 and a processing module 21.
An obtaining module 20, configured to obtain a sentence to be identified;
the processing module 21 converts the sentence to be recognized through an embedding layer to obtain a first vector, a second vector and a third vector;
respectively carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector;
acquiring a first attention vector according to the first mapping vector, and acquiring a second attention vector according to a second mapping vector;
obtaining a first sentence vector according to the first attention vector, and obtaining a second sentence vector according to the second attention vector;
obtaining a target sentence vector according to the first sentence vector and the second sentence vector;
and inputting the target sentence vector into a linear layer of the Bert model to perform intention recognition, and obtaining an intention recognition result.
Optionally, the processing module 21 is further configured to:
performing point multiplication on the first mapping vector and the transpose of the second mapping vector to obtain a first attention degree, and performing point multiplication on the second mapping vector and the transpose of the first mapping vector to obtain a second attention degree;
respectively carrying out normalization processing on the first attention degree and the second attention degree to obtain a first target attention degree and a second target attention degree;
and performing point multiplication on the first target attention and the third vector to obtain a first attention vector, and performing point multiplication on the second target attention and the third vector to obtain a second attention vector.
Optionally, the processing module 21 is further configured to:
performing point multiplication on the first mapping vector and the transpose of the second mapping vector to obtain a first value;
dividing the first value by a scaling factor √d_K to obtain the first attention degree, where d_K is the dimension of the second vector;
performing point multiplication on the second mapping vector and the transpose of the first mapping vector to obtain a second value;
dividing the second value by the scaling factor √d_K to obtain the second attention degree.
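The scaled attention degrees described above can be sketched as follows; the shapes and random inputs are illustrative assumptions. Note that the second value is simply the transpose of the first, since K'Q'^T = (Q'K'^T)^T:

```python
import numpy as np

L, d_K = 3, 16  # assumed token count and key dimension
rng = np.random.default_rng(2)
Qp = rng.normal(size=(L, d_K))  # Q': first mapping vector
Kp = rng.normal(size=(L, d_K))  # K': second mapping vector

first_value = Qp @ Kp.T   # Q' dot K'^T
second_value = Kp @ Qp.T  # K' dot Q'^T

# Divide by the scaling factor sqrt(d_K) to obtain the attention degrees.
first_attention = first_value / np.sqrt(d_K)
second_attention = second_value / np.sqrt(d_K)
```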
Optionally, the processing module 21 is further configured to:
accumulating the Nth dimension of the first attention vector to obtain the first sentence vector;
and accumulating the Nth dimension of the second attention vector to obtain the second sentence vector.
Optionally, N is 0.
Optionally, the processing module 21 is further configured to:
and merging the first sentence vector and the second sentence vector to obtain the target sentence vector.
Optionally, the processing module 21 is further configured to:
and converting the sentence to be recognized into a feature vector by the embedding layer in a Word2Vec mode.
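A minimal sketch of the embedding-layer lookup, using a hypothetical hand-made Word2Vec-style table rather than a trained model (in practice the vectors would come from Word2Vec training over a corpus):

```python
import numpy as np

# Hypothetical Word2Vec-style lookup table: token -> dense feature vector.
embedding = {
    "turn": np.array([0.1, 0.3]),
    "on": np.array([0.2, 0.1]),
    "light": np.array([0.4, 0.5]),
}

def embed(sentence):
    # Embedding layer: map each token of the sentence to its feature vector.
    return np.stack([embedding[tok] for tok in sentence.split()])

features = embed("turn on light")
print(features.shape)  # (3, 2)
```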
The functions implemented by the modules correspond to the steps of the intention recognition method in the above embodiment; to avoid redundancy, they are not described in detail here.
The present embodiment provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the intention recognition method described in the embodiment, and to avoid repetition, a detailed description is omitted here. Alternatively, the computer program, when executed by the processor, implements the functions corresponding to the modules of the intention recognition device in the embodiment; to avoid repetition, a description is omitted here. It will be appreciated that the computer-readable storage medium may comprise any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier-wave signal, a telecommunications signal, and the like.
In one embodiment, an intent recognition device is provided, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the intent recognition method described in the foregoing embodiment when executing the computer program, or where the processor implements functions corresponding to each module in the intent recognition device in the foregoing embodiment when executing the computer program, and for avoiding repetition, the description is omitted herein.
Fig. 6 is a schematic diagram of an intention recognition device according to an embodiment of the present invention. As shown in fig. 6, the intention recognition device 60 of this embodiment includes a processor 61, a memory 62, and a computer program 63 stored in the memory 62 and executable on the processor 61. The steps of the intention recognition method in the above embodiment 1, such as steps S10 to S16 shown in fig. 3 or steps S131 to S133 shown in fig. 4, are implemented when the processor 61 executes the computer program 63. Alternatively, the processor 61 implements the functions corresponding to each module of the intention recognition device in the above embodiment when executing the computer program 63; the implemented functions correspond to the steps of the intention recognition method in embodiment 1, and to avoid redundancy, they are not described in detail here.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (7)

1. An intent recognition method, comprising:
acquiring sentences to be identified;
converting the sentence to be recognized through an embedding layer to obtain a first vector, a second vector and a third vector;
respectively carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector;
acquiring a first attention vector according to the first mapping vector, and acquiring a second attention vector according to a second mapping vector;
acquiring a first sentence vector according to the first attention vector, and acquiring a second sentence vector according to the second attention vector;
obtaining a target sentence vector according to the first sentence vector and the second sentence vector;
inputting the target sentence vector into a linear layer of a Bert model to perform intention recognition, and obtaining an intention recognition result;
the obtaining a first attention vector according to the first mapping vector and obtaining a second attention vector according to a second mapping vector includes:
the first and second attention vectors are calculated by the following formulas:
Attention1(Q', K', V) = softmax(Q'K'^T/√d_K)V;
Attention2(Q', K', V) = softmax(K'Q'^T/√d_K)V;
where Attention1 represents the first attention vector, Attention2 represents the second attention vector, V is the matrix corresponding to the third vector, Q' represents the matrix obtained by linearly mapping the matrix Q corresponding to the first vector, K' represents the matrix obtained by linearly mapping the matrix K corresponding to the second vector, Q'^T represents the transpose of the matrix Q', K'^T represents the transpose of the matrix K', and √d_K represents the scaling factor.
2. The method of claim 1, wherein the obtaining a first sentence vector from the first attention vector and a second sentence vector from the second attention vector comprises:
accumulating the N dimension of the first attention vector to obtain the first sentence vector;
accumulating the N dimension of the second attention vector to obtain the second sentence vector;
which can be expressed as follows:
vec1 = sum(Attention1(Q', K', V), dim=N);
vec2 = sum(Attention2(Q', K', V), dim=N);
where vec1 represents a first sentence vector and vec2 represents a second sentence vector.
3. The intention recognition method of claim 2, wherein N is 0.
4. The method of claim 1, wherein the obtaining a target sentence vector from the first sentence vector and the second sentence vector comprises:
and merging the first sentence vector and the second sentence vector to obtain the target sentence vector.
5. The method for recognizing intention according to claim 1, wherein the converting the sentence to be recognized into a feature vector through an embedding layer comprises:
and converting the sentence to be recognized into a feature vector by the embedding layer in a Word2Vec mode.
6. An intent recognition device, comprising:
the acquisition module is used for acquiring sentences to be identified;
a processing module for:
converting the sentence to be recognized through an embedding layer to obtain a first vector, a second vector and a third vector;
respectively carrying out linear mapping on the first vector and the second vector to obtain a first mapping vector and a second mapping vector;
acquiring a first attention vector according to the first mapping vector, and acquiring a second attention vector according to a second mapping vector;
obtaining a first sentence vector according to the first attention vector, and obtaining a second sentence vector according to the second attention vector;
obtaining a target sentence vector according to the first sentence vector and the second sentence vector;
inputting the target sentence vector into a linear layer of a Bert model to perform intention recognition, and obtaining an intention recognition result;
the processing module is further configured to:
the first and second attention vectors are calculated by the following formulas:
Attention1(Q', K', V) = softmax(Q'K'^T/√d_K)V;
Attention2(Q', K', V) = softmax(K'Q'^T/√d_K)V;
where Attention1 represents the first attention vector, Attention2 represents the second attention vector, V is the matrix corresponding to the third vector, Q' represents the matrix obtained by linearly mapping the matrix Q corresponding to the first vector, K' represents the matrix obtained by linearly mapping the matrix K corresponding to the second vector, Q'^T represents the transpose of the matrix Q', K'^T represents the transpose of the matrix K', and √d_K represents the scaling factor.
7. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 5.
CN201911244722.0A 2019-12-06 2019-12-06 Intention recognition method and intention recognition device Active CN111222343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911244722.0A CN111222343B (en) 2019-12-06 2019-12-06 Intention recognition method and intention recognition device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911244722.0A CN111222343B (en) 2019-12-06 2019-12-06 Intention recognition method and intention recognition device

Publications (2)

Publication Number Publication Date
CN111222343A CN111222343A (en) 2020-06-02
CN111222343B true CN111222343B (en) 2023-12-29

Family

ID=70826581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911244722.0A Active CN111222343B (en) 2019-12-06 2019-12-06 Intention recognition method and intention recognition device

Country Status (1)

Country Link
CN (1) CN111222343B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112287108B (en) * 2020-10-29 2022-08-16 四川长虹电器股份有限公司 Intention recognition optimization method in field of Internet of things
CN116308587A (en) * 2023-05-18 2023-06-23 北京宽客进化科技有限公司 Transaction quotation determining method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110192206A (en) * 2017-05-23 2019-08-30 谷歌有限责任公司 Sequence based on attention converts neural network
CN110287283A (en) * 2019-05-22 2019-09-27 中国平安财产保险股份有限公司 Intent model training method, intension recognizing method, device, equipment and medium
CN110390108A (en) * 2019-07-29 2019-10-29 中国工商银行股份有限公司 Task exchange method and system based on deeply study

Also Published As

Publication number Publication date
CN111222343A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
CN111783462B (en) Chinese named entity recognition model and method based on double neural network fusion
CN111931513B (en) Text intention recognition method and device
EP3680894A1 (en) Real-time speech recognition method and apparatus based on truncated attention, device and computer-readable storage medium
US11450332B2 (en) Audio conversion learning device, audio conversion device, method, and program
CN113205817B (en) Speech semantic recognition method, system, device and medium
US11869486B2 (en) Voice conversion learning device, voice conversion device, method, and program
CN110046248B (en) Model training method for text analysis, text classification method and device
US11475225B2 (en) Method, system, electronic device and storage medium for clarification question generation
CN111222343B (en) Intention recognition method and intention recognition device
CN111401084A (en) Method and device for machine translation and computer readable storage medium
You et al. Contextualized attention-based knowledge transfer for spoken conversational question answering
Masumura et al. Sequence-level consistency training for semi-supervised end-to-end automatic speech recognition
Chen et al. Speechformer++: A hierarchical efficient framework for paralinguistic speech processing
CN112735404A (en) Ironic detection method, system, terminal device and storage medium
Wei et al. Attentive contextual carryover for multi-turn end-to-end spoken language understanding
CN109979461B (en) Voice translation method and device
Yang et al. An overview & analysis of sequence-to-sequence emotional voice conversion
Peymanfard et al. Lip reading using external viseme decoding
Zhang et al. Cacnet: Cube attentional cnn for automatic speech recognition
Matsuura et al. Generative adversarial training data adaptation for very low-resource automatic speech recognition
CN113362804A (en) Method, device, terminal and storage medium for synthesizing voice
CN113823259A (en) Method and device for converting text data into phoneme sequence
CN113569584A (en) Text translation method and device, electronic equipment and computer readable storage medium
US20230317059A1 (en) Alignment Prediction to Inject Text into Automatic Speech Recognition Training
CN114333762B (en) Expressive force-based speech synthesis method, expressive force-based speech synthesis system, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant