CN115114407A - Intention recognition method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN115114407A (application CN202210822568.6A)
- Authority
- CN
- China
- Prior art keywords
- vector
- intention
- label
- dimensional
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/3329—Natural language query formulation or dialogue systems
- G06F16/3344—Query execution using natural language analysis
- G06F16/35—Clustering; Classification
- G06F40/30—Semantic analysis
- G06N3/04—Neural networks; architecture, e.g. interconnection topology
- G06N3/08—Neural networks; learning methods
Abstract
The embodiments of this application belong to the field of artificial intelligence and relate to an intention recognition method and apparatus, a computer device, and a storage medium. The intention recognition method includes the following steps: acquiring a training text and a full label sequence; cross coding the training text and the full label sequence to obtain a joint characterization vector, and performing attention interaction between the text sequence characterization vector and the label sequence characterization vector to obtain a training text characterization vector; processing the training text characterization vector through an initial intention recognition model to obtain a multi-intention prediction result; obtaining a two-dimensional co-occurrence prediction result through label two-dimensional co-occurrence prediction, and a high-dimensional co-occurrence prediction result through label high-dimensional co-occurrence prediction; calculating a joint loss based on these prediction results and adjusting the model accordingly to obtain the intention recognition model; and performing intention recognition to obtain a multi-intention recognition result. The application further relates to blockchain technology: the training text and the full label sequence may be stored on a blockchain. The method improves the accuracy of intention recognition.
Description
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to an intention recognition method, an intention recognition apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, performing intent recognition by computer has become increasingly common. Intent recognition typically means inputting text related to a target object into a neural network, which predicts the intent of the target object. In practice, a single text may contain multiple intents.
Current intent recognition techniques usually focus only on contextual interaction within the text during training; they make little use of the labels and of other feature information, which results in low intent recognition accuracy.
Disclosure of Invention
The embodiment of the application aims to provide an intention identification method, an intention identification device, computer equipment and a storage medium, so as to solve the problem of low intention identification accuracy.
In order to solve the above technical problem, an embodiment of the present application provides an intention identification method, which adopts the following technical solutions:
acquiring a training text with an intention label sequence and a full label sequence;
inputting the training text and the full label sequence into an initial intention recognition model to perform cross coding on the training text and the full label sequence to obtain a joint characterization vector;
performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector to obtain a training text characterization vector;
processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result;
randomly selecting an intention label from the intention label sequence and taking its label characterization vector as a first vector, and randomly selecting an intention label from the full label sequence and taking its label characterization vector as a second vector, wherein a label characterization vector is the vector obtained by cross coding the corresponding intention label;
splicing the first vector and the second vector, and inputting the spliced vectors into a label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result;
randomly selecting a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, calculating a fusion vector from their label characterization vectors, and taking the label characterization vectors of the intention labels in the full label sequence other than the high-dimensional prediction labels as complementary set vectors;
splicing the fusion vector with each complementary set vector, and inputting the spliced vectors into a label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result;
calculating a joint loss based on the multi-intent prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result;
adjusting the initial intention recognition model according to the joint loss until the joint loss meets a training stopping condition to obtain an intention recognition model;
and performing intention recognition on the text to be recognized through the intention recognition model to obtain a multi-intention recognition result.
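Taken together, the steps above amount to training with one main loss and two auxiliary losses. The sketch below is a minimal illustration of how the joint loss and the stopping condition might be combined; the loss weights, the threshold, and all function names are assumptions, since the application does not specify them.

```python
# Illustrative sketch of the training procedure described above.
# The loss weights and the stopping threshold are assumptions; the
# application only states that a joint loss is computed from the three
# prediction results and used to adjust the model.

def joint_loss(main_loss, two_dim_loss, high_dim_loss,
               w_main=1.0, w_2d=0.5, w_hd=0.5):
    """Weighted sum of the main multi-intention loss and the two
    auxiliary co-occurrence losses (weights are hypothetical)."""
    return w_main * main_loss + w_2d * two_dim_loss + w_hd * high_dim_loss

def train(step_fn, stop_threshold=0.01, max_steps=100):
    """Run training steps until the joint loss meets the stopping
    condition. `step_fn` performs one forward pass on a batch and
    returns the three losses (main, two-dimensional, high-dimensional);
    the actual model update is elided here."""
    loss = float("inf")
    for _ in range(max_steps):
        main, two_d, high_d = step_fn()
        loss = joint_loss(main, two_d, high_d)
        if loss < stop_threshold:
            break  # joint loss meets the training-stop condition
    return loss
```

In a real implementation `step_fn` would also backpropagate the joint loss to adjust the model parameters; only the control flow is sketched here.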
In order to solve the above technical problem, an intention identifying apparatus is further provided in an embodiment of the present application, which adopts the following technical solutions:
the acquisition module is used for acquiring a training text with an intention label sequence and a full label sequence;
the cross coding module is used for inputting the training text and the full-scale label sequence into an initial intention recognition model so as to carry out cross coding on the training text and the full-scale label sequence to obtain a joint characterization vector;
the vector interaction module is used for carrying out attention interaction on the text sequence characterization vector and the label sequence characterization vector in the combined characterization vector to obtain a training text characterization vector;
the intention prediction module is used for processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result;
the two-dimensional selection module is used for randomly selecting an intention label from the intention label sequence and taking its label characterization vector as a first vector, and randomly selecting an intention label from the full label sequence and taking its label characterization vector as a second vector, wherein a label characterization vector is the vector obtained by cross coding the corresponding intention label;
the two-dimensional prediction module is used for splicing the first vector and the second vector and inputting the spliced vectors into a label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result;
the high-dimensional selection module is used for randomly selecting a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, calculating a fusion vector from their label characterization vectors, and taking the label characterization vectors of the intention labels in the full label sequence other than the high-dimensional prediction labels as complementary set vectors;
the high-dimensional prediction module is used for splicing the fusion vector with each complementary set vector and inputting the spliced vectors into a label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result;
the loss calculation module is used for calculating a joint loss based on the multi-intention prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result;
the model adjusting module is used for adjusting the initial intention recognition model according to the joint loss until the joint loss meets a training stopping condition to obtain an intention recognition model;
and the intention recognition module is used for performing intention recognition on the text to be recognized through the intention recognition model to obtain a multi-intention recognition result.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:
acquiring a training text with an intention label sequence and a full label sequence;
inputting the training text and the full-scale label sequence into an initial intention recognition model to perform cross coding on the training text and the full-scale label sequence to obtain a joint characterization vector;
performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector to obtain a training text characterization vector;
processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result;
randomly selecting an intention label from the intention label sequence and taking its label characterization vector as a first vector, and randomly selecting an intention label from the full label sequence and taking its label characterization vector as a second vector, wherein a label characterization vector is the vector obtained by cross coding the corresponding intention label;
splicing the first vector and the second vector, and inputting the spliced vectors into a label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result;
randomly selecting a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, calculating a fusion vector from their label characterization vectors, and taking the label characterization vectors of the intention labels in the full label sequence other than the high-dimensional prediction labels as complementary set vectors;
splicing the fusion vector with each complementary set vector, and inputting the spliced vectors into a label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result;
calculating a joint loss based on the multi-intent prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result;
adjusting the initial intention recognition model according to the joint loss until the joint loss meets a training stopping condition to obtain an intention recognition model;
and performing intention recognition on the text to be recognized through the intention recognition model to obtain a multi-intention recognition result.
In order to solve the foregoing technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:
acquiring a training text with an intention label sequence and a full label sequence;
inputting the training text and the full-scale label sequence into an initial intention recognition model to perform cross coding on the training text and the full-scale label sequence to obtain a joint characterization vector;
performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector to obtain a training text characterization vector;
processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result;
randomly selecting an intention label from the intention label sequence and taking its label characterization vector as a first vector, and randomly selecting an intention label from the full label sequence and taking its label characterization vector as a second vector, wherein a label characterization vector is the vector obtained by cross coding the corresponding intention label;
splicing the first vector and the second vector, and inputting the spliced vectors into a label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result;
randomly selecting a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, calculating a fusion vector from their label characterization vectors, and taking the label characterization vectors of the intention labels in the full label sequence other than the high-dimensional prediction labels as complementary set vectors;
splicing the fusion vector with each complementary set vector, and inputting the spliced vectors into a label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result;
calculating a joint loss based on the multi-intent prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result;
adjusting the initial intention recognition model according to the joint loss until the joint loss meets a training stopping condition to obtain an intention recognition model;
and performing intention recognition on the text to be recognized through the intention recognition model to obtain a multi-intention recognition result.
Compared with the prior art, the embodiments of the present application mainly have the following beneficial effects: a training text with an intention label sequence is obtained together with a full label sequence recording all intention labels; the training text and the full label sequence are cross coded to obtain a joint characterization vector, and attention interaction is performed between the text sequence characterization vector and the label sequence characterization vector within it, which strengthens the semantic connections between text characters and intention labels and between the intention labels themselves, improves the feature extraction capability of the model, and yields a training text characterization vector with richer information, thereby improving the accuracy of the multi-intention prediction result that the initial intention recognition model generates from that vector. An intention label is selected from the intention label sequence and another from the full label sequence for label two-dimensional co-occurrence prediction, producing a two-dimensional co-occurrence prediction result; several intention labels are selected from the intention label sequence and combined with the remaining labels in the full label sequence for label high-dimensional co-occurrence prediction, producing a high-dimensional co-occurrence prediction result; both auxiliary predictions enhance label correlation learning. A joint loss is calculated from the multi-intention, two-dimensional co-occurrence, and high-dimensional co-occurrence prediction results and used to adjust the model, yielding the intention recognition model; the text to be recognized is then input into the intention recognition model to obtain a multi-intention recognition result. In this way, the feature extraction capability of the model is improved and the text representation is enriched in the main task of intention recognition, label correlation is learned in the auxiliary task of label co-occurrence prediction, and the intention recognition accuracy of the trained intention recognition model is greatly improved.
Drawings
In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of an intent recognition method according to the present application;
FIG. 3 is a schematic diagram of an embodiment of an intent recognition apparatus according to the present application;
FIG. 4 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the foregoing drawings are used for distinguishing between different objects and not for describing a particular sequential order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein may be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that the intention identifying method provided in the embodiments of the present application is generally executed by a server, and accordingly, the intention identifying device is generally disposed in the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow diagram of one embodiment of an intent recognition method in accordance with the present application is shown. The intention identification method comprises the following steps:
step S201, a training text with an intention label sequence and a full label sequence are obtained.
In the present embodiment, an electronic device (e.g., the server shown in fig. 1) on which the intention recognition method runs may communicate with the terminal through a wired or wireless connection. It should be noted that the wireless connection may include, but is not limited to, a 3G/4G/5G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a ZigBee connection, a UWB (Ultra Wideband) connection, and other wireless connection means now known or developed in the future.
A training text is a text used to train the model and may contain multiple intentions; its intention label sequence contains several intention labels recording the intentions the training text expresses.
The method detects the intentions contained in a text within a given scenario. All intentions that can appear in the scenario are defined in advance, and the intention labels corresponding to these intentions form the full label sequence.
Specifically, the method performs multi-intention recognition through an intention recognition model, which must first be obtained through model training. To that end, the training text with its intention label sequence, together with the full label sequence, must be acquired.
The method can be applied in various scenarios. In one embodiment it is applied to customer service semantic quality inspection, where the intentions in a conversation between a customer service agent and a customer are detected in order to check whether the agent expresses the required semantic information. The agent's dialogue speech can be converted into dialogue text, which is used as the training text.
For example, T_i (text) is the i-th dialogue text paragraph of the customer service agent (e.g., the content may be: "Hello, I am xxx; I am calling about the product xxx you consulted about"), which may be used as a training text. A training text T_i of length n consists of n characters (tokens): T_i = {x_1, x_2, …, x_n}. L (label) is the full label sequence, which may include intention labels such as self-introduction, product recommendation, and product question answering: L = {y_1, y_2, …, y_m} contains m intention labels.
Each training text T_i has a corresponding intention label sequence Y_i, where Y_i is a proper subset of the full label sequence L. For example, T_1: Y_1 = {y_1, y_4} indicates that the first training text contains the two intention labels self-introduction and product question answering. The method learns from the labeled training text T_i spliced with the full label sequence L and the intention label sequence Y_i, so that the intention recognition model can identify the intentions of an unlabeled text to be recognized T_j and obtain the intention label sequence Y_j composed of the intention types it contains.
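The notation above can be made concrete with a toy example. The label names follow the customer-service illustration in the text; the utterance and variable names are otherwise invented:

```python
# Toy data in the notation above. The label names follow the
# customer-service example in the text; everything else is illustrative.

# Full label sequence L: all m intent labels defined for the scenario.
full_labels = ["self-introduction", "product recommendation", "product QA"]

# A training text T_i of n characters, here a short service utterance.
training_text = "Hello, I am the agent for the product you asked about."

# Intention label sequence Y_i: a proper subset of L recorded for T_i.
intent_labels = ["self-introduction", "product QA"]

assert set(intent_labels) < set(full_labels)  # Y_i is a proper subset of L
```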
Step S202, inputting the training text and the full label sequence into an initial intention recognition model to perform cross coding on the training text and the full label sequence, obtaining a joint characterization vector.
The initial intention recognition model is an intention recognition model that has not yet been trained; it can be built on a neural network and supports multi-intention recognition.
Specifically, the training text T and the full label sequence L are spliced and then input into the initial intention recognition model. The initial intention recognition model has an encoder that cross codes the training text and the full label sequence; that is, each character in the training text and each intention label in the full label sequence are encoded together, realizing attention interaction between characters, between characters and intention labels, and between intention labels, to obtain the joint characterization vector.
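As a rough illustration of such cross coding, a single self-attention pass over the concatenated character and label embeddings lets every position attend to every other position, covering all three kinds of interaction named above. This NumPy sketch assumes untrained random embeddings and omits the learned Q/K/V projections and the multiple layers a real encoder would use; `cross_encode` and all shapes are illustrative, not the application's actual encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_encode(text_emb, label_emb):
    """Single self-attention pass over the concatenation of character
    embeddings (n, d) and label embeddings (m, d). Every position
    attends to every other, so characters interact with characters,
    characters with labels, and labels with labels. Returns joint
    characterization vectors of shape (n + m, d)."""
    joint = np.concatenate([text_emb, label_emb], axis=0)
    d = joint.shape[1]
    scores = joint @ joint.T / np.sqrt(d)   # (n+m, n+m) attention scores
    return softmax(scores, axis=-1) @ joint

rng = np.random.default_rng(0)
text_emb = rng.normal(size=(5, 8))    # n = 5 characters, d = 8
label_emb = rng.normal(size=(3, 8))   # m = 3 intention labels
joint_vec = cross_encode(text_emb, label_emb)
```

The first n rows of `joint_vec` play the role of the text sequence characterization vector and the last m rows the label sequence characterization vector discussed in the next step.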
Step S203, performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector to obtain a training text characterization vector.
Specifically, the joint characterization vector comprises a text sequence characterization vector and a label sequence characterization vector, wherein the text sequence characterization vector is obtained by cross coding each character in a training text, and the label sequence characterization vector is obtained by cross coding each intention label in a full-scale label sequence.
The initial intention recognition model performs interactive calculation between the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector, continuing the cross-attention interaction to obtain the training text characterization vector.
Through cross coding, semantic connections between text characters, between text characters and intention labels, and between intention labels are introduced, which improves the feature extraction capability of the model. The characters in the training text and the intention labels in the full label sequence are thus jointly embedded and represented, replacing the traditional approach of extracting the CLS vector as the text representation. At the same time, cross attention interaction between the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector yields a text characterization vector with richer semantic information.
And step S204, processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result.
Specifically, the initial intention recognition model can perform intention recognition according to the training text representation vector to obtain a multi-intention prediction result. At least one intent type may be included in the multi-intent prediction result.
Step S205, randomly selecting an intention label from the intention label sequence to take a label characterization vector thereof as a first vector, and randomly selecting an intention label from the full quantity label sequence to take a label characterization vector thereof as a second vector; and the label characterization vector is a vector obtained by cross coding the intention label.
Specifically, the multi-label classification task for intention recognition is the main task; the method further comprises two auxiliary tasks, namely a label two-dimensional co-occurrence prediction task and a label high-dimensional co-occurrence prediction task. Label correlation learning through these auxiliary tasks alleviates four problems: low prediction accuracy on low-frequency labels (real-world classification problems usually exhibit a long-tailed label distribution, and low-frequency labels relate to only a few examples and are hard to learn); label dependency (some labels have similar content and tend to appear together); the diversity of label combinations; and slow model inference (some models predict in an autoregressive manner, which is slow).
In a label two-dimensional co-occurrence prediction task, an intention label is randomly selected from an intention label sequence, and a label representation vector corresponding to the intention label is set as a first vector; and then randomly selecting an intention label from the full-quantity label sequence, and taking a label characterization vector corresponding to the intention label as a second vector.
The full label sequence generates a label sequence characterization vector through the cross coding of the initial intention recognition model; this label sequence characterization vector is composed of the label characterization vectors of all intention labels. Since the intention label sequence is a proper subset of the full label sequence, once the label sequence characterization vector is obtained, the label characterization vector of each intention label in the intention label sequence is also available.
And S206, splicing the first vector and the second vector, and inputting the spliced vectors into a label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result.
Specifically, the first vector and the second vector are spliced together, for example through a concat operation, and input into the label two-dimensional co-occurrence prediction model. It should be understood that the intention label corresponding to the first vector is necessarily contained in the training text, whereas the intention label corresponding to the second vector comes from the full label sequence and may or may not be contained in the training text.
The label two-dimensional co-occurrence prediction model can be a binary classifier built on a neural network, such as an MLP binary classifier. The model outputs a two-dimensional co-occurrence prediction result, which predicts whether the intention label corresponding to the first vector and the intention label corresponding to the second vector are present in the training text at the same time.
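The two-dimensional co-occurrence prediction step can be sketched as follows. This is a minimal NumPy illustration with randomly initialized weights, not the trained model described above; the hidden size of 64 and all tensor values are hypothetical.

```python
import numpy as np

d = 768                                   # hidden dimension of each label vector
rng = np.random.default_rng(1)
first = rng.standard_normal(d)            # label characterization vector from the intention label sequence
second = rng.standard_normal(d)           # label characterization vector from the full label sequence

pair = np.concatenate([first, second])    # concat operation -> 2 * 768 = 1536 dims

# Toy MLP binary classifier: one ReLU hidden layer, sigmoid output
W1, b1 = rng.standard_normal((2 * d, 64)) * 0.01, np.zeros(64)
W2, b2 = rng.standard_normal((64, 1)) * 0.01, np.zeros(1)
hidden = np.maximum(pair @ W1 + b1, 0)                    # ReLU
score = 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))         # co-occurrence probability
```

The scalar `score` is the predicted probability that both intention labels appear in the training text at the same time.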
Step S207, randomly selecting a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, calculating a fusion vector according to the label characterization vectors, and setting the label characterization vectors of the intention labels except the high-dimensional prediction labels in the full quantity label sequence as complement vectors.
In the label high-dimensional co-occurrence prediction task, a preset number of intention labels are extracted from the intention label sequence as high-dimensional prediction labels. For example, if the intention label sequence comprises g intention labels, k intention labels are randomly selected from it as the high-dimensional prediction labels, where 0 ≤ k < g. The label characterization vectors corresponding to the high-dimensional prediction labels are summed and then averaged to obtain a fusion vector.
In the full label sequence containing m intention labels, removing the k intention labels extracted above leaves (m − k) intention labels; these are the complement labels. The label high-dimensional co-occurrence prediction task is to predict whether the extracted high-dimensional prediction labels co-occur with each complement label, that is, whether they are contained in the training text at the same time. The label characterization vector corresponding to each complement label is set as a complement vector.
And S208, splicing the fusion vector and each complementary set vector, and inputting the spliced fusion vector and each complementary set vector into a label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result.
Specifically, the fusion vector and each complementary set vector are spliced respectively, and then a label high-dimensional co-occurrence prediction model is input. In one embodiment, after the fused vector and each complementary set vector are spliced respectively, a full connection layer may be input first, and then a label high-dimensional co-occurrence prediction model may be input.
The label high-dimensional co-occurrence prediction model may include multiple binary classifiers, such as MLP binary classifiers. When the number of complement labels is (m − k), the fusion vector is spliced with each complement vector to obtain (m − k) spliced vectors, which are then input into the (m − k) binary classifiers of the label high-dimensional co-occurrence prediction model.
The output of each binary classifier is its prediction of whether the high-dimensional prediction labels and the corresponding complement label coexist in the training text, and the outputs of all binary classifiers together form the high-dimensional co-occurrence prediction result.
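The fusion and splicing of steps S207–S208 can be sketched as follows. For simplicity this assumes the g intention labels of the training text occupy the first g positions of the full label sequence; the sizes m = 10, g = 4, k = 2 and all vector values are hypothetical.

```python
import numpy as np

m, g, k, d = 10, 4, 2, 768                # full labels, positive labels, sampled labels, hidden dim
rng = np.random.default_rng(2)
label_vecs = rng.standard_normal((m, d))  # label sequence characterization vector (m x 768)

# Randomly select k of the g positive labels as high-dimensional prediction labels
picked = rng.choice(g, size=k, replace=False)
fusion = label_vecs[picked].mean(axis=0)  # sum then average -> 1 x 768 fusion vector

# The remaining (m - k) labels are the complement labels
complement_idx = [i for i in range(m) if i not in picked]
pairs = np.stack([np.concatenate([fusion, label_vecs[i]]) for i in complement_idx])
# pairs: (m - k) spliced vectors, each of dimension 2 * 768, one per binary classifier
```

Each row of `pairs` would be fed to one of the (m − k) binary classifiers (optionally through a fully connected layer first, as noted in step S208).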
In one embodiment, the semantic labels contained in a training text form a subset of the full label sequence (possibly all of it, none of it, or a portion of it); a mentioned intention label is denoted Y+ and an unmentioned one Y−. When constructing the dataset of the label two-dimensional co-occurrence prediction task, each element of the dataset contains a pair of label characterization vectors <Ya, Yb>, where Ya is sampled only from Y+ while Yb is sampled from both Y+ and Y−. Similarly, when constructing the dataset of the label high-dimensional co-occurrence prediction task, k intention labels are randomly selected from Y+ to form Ya, and the task is to predict whether each of the remaining intention labels in Y+ and Y− co-occurs with Ya.
In step S209, a joint loss is calculated based on the multi-intent prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result.
Specifically, the training of the application relates to three models, namely an initial intention recognition model, a label two-dimensional co-occurrence prediction model and a label high-dimensional co-occurrence prediction model. The initial intention recognition model outputs a multi-intention prediction result, the label two-dimensional co-occurrence prediction model outputs a two-dimensional co-occurrence prediction result, and the label high-dimensional co-occurrence prediction model outputs a high-dimensional co-occurrence prediction result.
According to the method and the device, loss calculation needs to be carried out jointly according to the multi-intention prediction result, the two-dimensional co-occurrence prediction result and the high-dimensional co-occurrence prediction result, and the joint loss is obtained.
And step S210, adjusting the initial intention recognition model according to the joint loss until the joint loss meets the training stopping condition to obtain the intention recognition model.
Specifically, the initial intention recognition model is adjusted according to the joint loss. After each adjustment, iterative training is carried out on the initial intention recognition model with the training text and the full label sequence until the obtained joint loss meets the training stop condition; training then stops, and the initial intention recognition model at that point is set as the intention recognition model.
And step S211, performing intention recognition on the text to be recognized through the intention recognition model to obtain a multi-intention recognition result.
Specifically, when the method is applied, a text to be recognized is obtained, the text to be recognized is input into the intention recognition model for intention recognition, and then a multi-intention recognition result can be obtained. The multi-intent recognition result may contain at least one intent type.
In this embodiment, a training text with an intention label sequence and a full label sequence recording all intention labels are obtained. The training text and the full label sequence are cross coded to obtain a joint characterization vector, and attention interaction is performed on the text sequence characterization vector and the label sequence characterization vector. This increases the semantic connections between text characters and intention labels and between intention labels, improves the feature extraction capability of the model, produces a training text characterization vector with richer information, and improves the accuracy with which the initial intention recognition model generates the multi-intention prediction result. One intention label is selected from the intention label sequence and one from the full label sequence for label two-dimensional co-occurrence prediction, yielding a two-dimensional co-occurrence prediction result; several intention labels are selected from the intention label sequence and combined with the remaining labels in the full label sequence for label high-dimensional co-occurrence prediction, yielding a high-dimensional co-occurrence prediction result, thereby enhancing label correlation learning. A joint loss is calculated from the multi-intention prediction result, the two-dimensional co-occurrence prediction result and the high-dimensional co-occurrence prediction result, and the model is adjusted accordingly to obtain the intention recognition model. Inputting the text to be recognized into the intention recognition model then gives a multi-intention recognition result. In this way, the feature extraction capability of the model is improved and the text representation enriched in the main intention recognition task, label correlation is learned in the auxiliary label co-occurrence prediction tasks, and the accuracy of the trained intention recognition model is greatly improved.
Further, the step S202 may include: constructing an initial sequence according to each character in the training text and each intention label in the full-scale label sequence; mapping the initial sequence into a vector sequence; inputting the vector sequence into an encoder, and performing cross coding on character vectors and label vectors in the vector sequence through a plurality of coding layers in the encoder to obtain a text sequence characterization vector and a label sequence characterization vector output by the last coding layer; and determining the text sequence characterization vector and the label sequence characterization vector as a joint characterization vector.
Specifically, the training text is composed of a plurality of characters (tokens), and the full label sequence comprises a plurality of intention labels. The training text T and the full label sequence L are separated by a sep separator, a cls character is added in front of the training text as the start character, and a sep is appended at the end of the full label sequence L as the end character, yielding the initial sequence.
The initial sequence is mapped into a vector sequence through an embedding layer, so that each character and intention label in the vector sequence has a vector representation at token level. The hidden dimension of each token can be set to 768, giving the vector sequence EX = [[cls], [x1], …, [xn], [sep], [y1], …, [ym], [sep]], where each element in brackets is a 1 × 768 vector and the dimension of EX is (n + m + 3) × 768.
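The construction of the initial sequence can be sketched in plain Python as follows. The example text and label names are hypothetical, and the bracketed token names merely mirror the cls/sep markers described above:

```python
def build_initial_sequence(text_chars, intent_labels):
    """Concatenate [cls] + characters + [sep] + intention labels + [sep],
    giving n + m + 3 tokens for n characters and m labels."""
    return ["[cls]"] + list(text_chars) + ["[sep]"] + list(intent_labels) + ["[sep]"]

# 4 characters (n = 4) and 2 labels (m = 2) -> 4 + 2 + 3 = 9 tokens
seq = build_initial_sequence("查询余额", ["query_balance", "transfer"])
```

Each token of `seq` would then be looked up in the embedding layer to form the (n + m + 3) × 768 vector sequence EX.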
The vector sequence EX is then input into an encoder, which contains multiple encoding layers. In one embodiment, the encoder may be constructed based on a bert model, which contains 12 coding layers, i.e., a 12-layer encoder structure. Each layer of coding layer can carry out token-level attention interaction, character vectors and label vectors in vector sequences are subjected to cross coding, semantic connection between the character vectors and the character vectors, semantic connection between the character vectors and the label vectors and semantic connection between the label vectors are introduced, and the output of the upper layer of coding layer is the input of the lower layer of coding layer.
Finally, the text sequence characterization vector and the label sequence characterization vector output by the 12th coding layer of the bert model are obtained. The text sequence characterization vector is the vector sequence obtained by encoding each character in the training text, and the label sequence characterization vector is the vector sequence obtained by encoding each intention label in the full label sequence. The text sequence characterization vector and the label sequence characterization vector output by the last coding layer are determined as the joint characterization vector, that is, E12X = [[hcls], [hx1], …, [hxn], [hsep], [hy1], …, [hym], [hsep]] is determined as the joint characterization vector. The dimensions of EX and E12X are the same; only the numerical values change.
In this embodiment, an initial sequence is constructed according to each character and each intention tag, the initial sequence is mapped to a vector sequence, and then the vector sequence is input to an encoder to perform cross coding between the character and the intention tag, so that interaction between the character and the tag and interaction between the tag and the label are additionally increased, and the feature extraction capability of the model is improved.
Further, the step S203 may include: transposing the label sequence characterization vectors in the combined characterization vectors to obtain transposed label sequence characterization vectors; performing point multiplication operation on the text sequence characterization vector and the transposed label sequence characterization vector in the combined characterization vector to obtain a correlation degree score matrix; the relevancy score matrix characterizes the relevancy between the character and the intention label; inputting the correlation fraction matrix into the activation model to obtain a third vector; inputting the third vector into the first activation function, and transposing a function result of the activation function to obtain a fourth vector; and performing point multiplication operation on the fourth vector and the text sequence characterization vector to obtain a training text characterization vector.
Specifically, the joint characterization vector includes a text sequence characterization vector Hx and a label sequence characterization vector Hy, where Hx = [[hx1], …, [hxn]] is the text sequence characterization vector in E12X with dimension n × 768, and Hy = [[hy1], …, [hym]] is the label sequence characterization vector in E12X with dimension m × 768.
Transposing the label sequence characterization vector Hy gives the transposed label sequence characterization vector; a dot product operation between the text sequence characterization vector Hx and the transposed label sequence characterization vector then gives the relevance score matrix W, which puts character tokens and intention label tokens in one-to-one correspondence. W is an n × m matrix, and the value of element W_ij represents the relevance score between the i-th character token in the training text and the j-th intention label token in the full label sequence.
Because the dot product operation is a linear operation, the relevance score matrix W is input into an activation model in order to improve the effectiveness of sparse regularization, prevent overfitting and enhance the generalization capability of the model. The activation model can be constructed based on a CNN model with a ReLU activation function; the relevance score matrix W is activated through the CNN model and max pooling is applied to obtain a third vector of dimension n × 1. The third vector is then passed through a first activation function, which can be the tanh hyperbolic tangent activation function, and the function result is transposed to obtain a fourth vector.
A dot product operation between the fourth vector and the text sequence characterization vector Hx then gives the training text characterization vector New_x, with dimension 1 × 768.
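The interaction of step S203 can be sketched with NumPy under the stated dimensions. As a simplifying assumption, the CNN-based activation model is reduced here to ReLU followed by max pooling over the label axis; the sizes n = 5, m = 8 and all tensor values are hypothetical.

```python
import numpy as np

n, m, d = 5, 8, 768            # characters, intention labels, hidden dimension
rng = np.random.default_rng(0)
Hx = rng.standard_normal((n, d))   # text sequence characterization vector
Hy = rng.standard_normal((m, d))   # label sequence characterization vector

W = Hx @ Hy.T                                        # n x m relevance score matrix
third = np.maximum(W, 0).max(axis=1, keepdims=True)  # ReLU + max pooling -> n x 1
fourth = np.tanh(third).T                            # tanh, then transpose -> 1 x n
New_x = fourth @ Hx                                  # 1 x 768 training text characterization
```

The resulting `New_x` weights each character vector by its (activated) best relevance score against the label set, rather than simply taking the CLS vector.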
In the embodiment, attention interaction is performed on the text sequence representation vector and the label sequence representation vector, and the traditional method that CLS is directly used as the text representation is replaced, so that the training text representation vector with richer semantic information is obtained.
Further, the step S204 may include: inputting the training text representation vector into a full-connection layer of the initial intention recognition model to obtain a fifth vector; inputting the fifth vector into a second activation function to obtain a sixth vector; and generating a multi-purpose prediction result according to the numerical value of each element in the sixth vector.
Specifically, the training text characterization vector New_x (of dimension 1 × 768) passes through a fully connected dense layer, which captures more fine-grained features from different regions of the training text, yielding a fifth vector.
And inputting the fifth vector into a second activation function, wherein the second activation function can be a sigmoid activation function, and a sixth vector is obtained.
The sixth vector has dimension 1 × m and comprises m elements, each taking a value between 0 and 1 and representing the score of the training text on the corresponding one of the m intention labels. If a value is greater than 0.5, the training text mentions the intention label semantics corresponding to that element, which is marked as 1; if the value is less than or equal to 0.5, the training text does not mention the corresponding intention label semantics, which is marked as 0. A multi-intention prediction result represented by 0s and 1s can thus be generated.
In this embodiment, the intention prediction is performed through the initial intention recognition model to obtain a sixth vector, and a numerical value of each position in the sixth vector is a prediction of whether related intention label semantics are mentioned in the training text, so that a multi-intention prediction result can be generated.
Further, the step S209 may include: calculating a first loss based on the multi-intent prediction result and the sequence of intent tags; constructing a two-dimensional co-occurrence label and a high-dimensional co-occurrence label according to the selected intention label; calculating a second loss through the two-dimensional co-occurrence prediction result and the two-dimensional co-occurrence label; calculating a third loss according to the high-dimensional co-occurrence prediction result and the high-dimensional co-occurrence label; and performing linear operation on the first loss, the second loss and the third loss to obtain a combined loss.
Specifically, the joint loss comprises three parts: a first loss, a second loss and a third loss. The first loss is calculated from the multi-intention prediction result and the intention label sequence, in the form sigmoid + binary cross entropy; the binary cross entropy losses of the training text over all m intention labels are summed to give the first loss Loss_m of the training text on the multi-label classification task.
The second Loss is calculated according to the two-dimensional co-occurrence prediction result and the two-dimensional co-occurrence label, and the Loss function still adopts a form of sigmoid + Binary Cross Entropy and is marked as Loss _ b.
The third loss is calculated from the high-dimensional co-occurrence prediction result and the high-dimensional co-occurrence label; the loss function still adopts the form sigmoid + binary cross entropy, and the (m − k) losses are summed and recorded as the loss Loss_c of the high-dimensional co-occurrence prediction task.
Weights are added to the first loss, the second loss and the third loss, and a linear operation, specifically a weighted summation, is performed on the weighted losses to obtain the joint loss: Loss = Loss_m + r · Loss_b + (1 − r) · Loss_c, where r may be a predefined hyper-parameter taking a real value in (0, 1).
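The joint loss calculation can be sketched with a plain binary cross entropy helper; the prediction/target values below are hypothetical and only illustrate the weighted summation Loss = Loss_m + r · Loss_b + (1 − r) · Loss_c:

```python
import numpy as np

def bce(pred, target, eps=1e-9):
    """Binary cross entropy over sigmoid outputs, averaged over labels."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

loss_m = bce(np.array([0.9, 0.2]), np.array([1.0, 0.0]))  # multi-label main task
loss_b = bce(np.array([0.7]), np.array([1.0]))            # 2-D co-occurrence task
loss_c = bce(np.array([0.4, 0.6]), np.array([0.0, 1.0]))  # high-D co-occurrence task

r = 0.5                                                   # predefined hyper-parameter in (0, 1)
joint = loss_m + r * loss_b + (1 - r) * loss_c
```

All three model components are then updated by backpropagating `joint`, as described in step S210.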
Two-dimensional co-occurrence labels and high-dimensional co-occurrence labels can be constructed. The intention label sequence is a label of the training text, which indicates the intention labels related to the training text, and after the intention labels are selected from the full amount of label sequences, the intention label sequence is compared to know whether the label co-occurrence is generated or not, so that a two-dimensional co-occurrence label and a high-dimensional co-occurrence label can be constructed.
In the embodiment, the joint loss is calculated based on the multi-intention prediction result, the two-dimensional co-occurrence prediction result and the high-dimensional co-occurrence prediction result, the loss caused by the intention identification main task and the label co-occurrence prediction auxiliary task is considered, and the accuracy of loss calculation is improved.
Further, the step S210 may include: adjusting model parameters of an initial intention recognition model, a two-dimensional co-occurrence prediction model and a high-dimensional co-occurrence prediction model with the aim of reducing joint loss; and performing iterative training on the initial intention recognition model, the two-dimensional co-occurrence prediction model and the high-dimensional co-occurrence prediction model after the parameters are adjusted until the joint loss meets the training stopping condition to obtain the intention recognition model.
Specifically, with the aim of reducing the joint loss, the server simultaneously adjusts the model parameters of the initial intention recognition model, the two-dimensional co-occurrence prediction model and the high-dimensional co-occurrence prediction model. After each parameter adjustment, iterative training is performed with the training text and the full label sequence, and training stops when the obtained joint loss meets the training stop condition, yielding the intention recognition model. The training stop condition may be that the joint loss is less than a preset loss threshold.
In the embodiment, the joint loss is reduced, and the model parameters of the initial intention recognition model, the two-dimensional co-occurrence prediction model and the high-dimensional co-occurrence prediction model are adjusted at the same time until the training is finished to obtain the intention recognition model, so that the intention recognition model can be used for performing multi-intention recognition.
Further, the step S211 may include: acquiring a text to be identified; inputting a text to be recognized and a full label sequence into an intention recognition model, and performing cross coding on the text to be recognized and the full label sequence to obtain a joint characterization vector; performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the combined characterization vector to obtain a text characterization vector; and processing the text representation vector through the intention recognition model to obtain a multi-intention recognition result.
Specifically, in application, the text to be recognized is obtained and input, together with the full label sequence, into the trained intention recognition model. The intention recognition model processes them in the same way as during training: the text to be recognized and the full label sequence are cross coded to obtain a joint characterization vector; attention interaction is performed on the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector to obtain a text characterization vector; and intention recognition is performed according to the text characterization vector to obtain a multi-intention recognition result.
The multi-intent recognition result may include m elements, each element takes a value of 0 or 1, and is used to indicate whether the text to be recognized refers to the corresponding intent tag semantics. When the method is applied, the label two-dimensional co-occurrence prediction model and the label high-dimensional co-occurrence prediction model do not play a role any more.
In the embodiment, the trained intention recognition model increases semantic connections between text characters and intention labels and between intention labels, improves the feature extraction capability of the model, obtains text representation vectors with richer semantic information, and improves the accuracy of intention recognition according to the text representation vectors.
It should be emphasized that, in order to further ensure the privacy and security of the training text with the intention label sequence and the full label sequence, the training text with the intention label sequence and the full label sequence may also be stored in a node of a blockchain; it is to be understood that the text to be recognized may also be stored in a node of a blockchain.
The block chain referred by the application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
This application belongs to the smart city field, for example smart home and smart living; by realizing intention recognition, the scheme can promote the construction of smart cities.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware associated with computer readable instructions, which can be stored in a computer readable storage medium, and when executed, the processes of the embodiments of the methods described above can be included. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not bound to a strict order and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may comprise multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different moments, and not necessarily in sequence: they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of an intention identifying apparatus, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied in various electronic devices.
As shown in fig. 3, the intention recognition apparatus 300 according to the present embodiment includes: an obtaining module 301, a cross coding module 302, a vector interaction module 303, an intention predicting module 304, a two-dimensional selecting module 305, a two-dimensional predicting module 306, a high-dimensional selecting module 307, a high-dimensional predicting module 308, a loss calculating module 309, a model adjusting module 310, and an intention identifying module 311, wherein:
an obtaining module 301, configured to obtain a training text with an intention label sequence, and a full label sequence.
And a cross coding module 302, configured to input the training text and the full label sequence into the initial intention recognition model, so as to cross code the training text and the full label sequence to obtain a joint characterization vector.
And the vector interaction module 303 is configured to perform attention interaction on the text sequence characterization vector and the label sequence characterization vector in the combined characterization vector to obtain a training text characterization vector.
And the intention prediction module 304 is used for processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result.
A two-dimensional selection module 305, configured to randomly select an intention label from the intention label sequence and use its label characterization vector as a first vector, and to randomly select an intention label from the full label sequence and use its label characterization vector as a second vector; a label characterization vector is the vector obtained by cross coding the corresponding intention label.
And the two-dimensional prediction module 306, configured to splice the first vector and the second vector and input the spliced vector into the label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result.
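A minimal sketch of this splicing-and-prediction step, assuming a single linear layer with a sigmoid as the label two-dimensional co-occurrence prediction model (the patent does not specify its architecture); the vectors and weights are illustrative stand-ins:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def two_dim_co_occurrence(first_vec, second_vec, W, b):
    # Splice (concatenate) the two label characterization vectors and score
    # the pair with a linear layer + sigmoid, an assumed stand-in for the
    # label two-dimensional co-occurrence prediction model.
    spliced = np.concatenate([first_vec, second_vec])
    return float(sigmoid(spliced @ W + b))

rng = np.random.default_rng(2)
dim = 8
first_vec = rng.standard_normal(dim)    # from the sample's intention label sequence
second_vec = rng.standard_normal(dim)   # from the full label sequence
W, b = rng.standard_normal(2 * dim), 0.0
p = two_dim_co_occurrence(first_vec, second_vec, W, b)  # probability in (0, 1)
```

The returned score would be compared against a two-dimensional co-occurrence label during training.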
A high-dimensional selection module 307, configured to randomly select a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, to calculate a fusion vector from their label characterization vectors, and to set the label characterization vectors of the intention labels in the full label sequence other than the high-dimensional prediction labels as complementary set vectors.
And the high-dimensional prediction module 308 is configured to splice the fusion vector and each complementary set vector and input the spliced fusion vector and each complementary set vector into the label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result.
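A minimal sketch of the high-dimensional step, assuming mean pooling as the fusion operation and a linear-plus-sigmoid head (both unspecified in the patent); the label vectors and weights are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def high_dim_co_occurrence(label_vecs, chosen, W, b):
    # Fuse the chosen high-dimensional prediction labels (mean pooling is an
    # assumption; the patent only says a fusion vector is calculated), then
    # splice the fusion vector with each complementary set vector and score it.
    fusion = label_vecs[chosen].mean(axis=0)
    complement = [i for i in range(len(label_vecs)) if i not in chosen]
    return {i: float(sigmoid(np.concatenate([fusion, label_vecs[i]]) @ W + b))
            for i in complement}

rng = np.random.default_rng(3)
label_vecs = rng.standard_normal((5, 8))  # full label sequence: 5 labels, dim 8
W, b = rng.standard_normal(16), 0.0
preds = high_dim_co_occurrence(label_vecs, [0, 2], W, b)  # one score per complement label
```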
A loss calculation module 309, configured to calculate a joint loss based on the multi-intent prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result.
And the model adjusting module 310 is configured to adjust the initial intention recognition model according to the joint loss until the joint loss meets the training stopping condition, so as to obtain the intention recognition model.
And the intention identifying module 311 is configured to perform intention identification on the text to be identified through an intention identification model, so as to obtain a multi-intention identification result.
In this embodiment, a training text carrying an intention label sequence is obtained, together with a full label sequence recording all intention labels. The training text and the full label sequence are cross coded to obtain a joint characterization vector, and attention interaction is performed between the text sequence characterization vector and the label sequence characterization vector. This strengthens the semantic connections between text characters and intention labels, and among the intention labels themselves, improves the feature extraction capability of the model, and yields a training text characterization vector carrying richer information, which in turn improves the accuracy with which the initial intention recognition model generates a multi-intention prediction result. One intention label is selected from the intention label sequence and one from the full label sequence for two-dimensional label co-occurrence prediction, yielding a two-dimensional co-occurrence prediction result; several intention labels are selected from the intention label sequence and combined with the remaining labels in the full label sequence for high-dimensional label co-occurrence prediction, yielding a high-dimensional co-occurrence prediction result and enhancing the learning of label correlations. A joint loss is then calculated from the multi-intention prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result, and the model is adjusted accordingly to obtain the intention recognition model. Finally, the text to be recognized is input into the intention recognition model to obtain a multi-intention recognition result. In this way, the feature extraction capability of the model is improved and the text representation is enriched in the main intention recognition task, while label correlations are learned in the auxiliary label co-occurrence prediction task, so the accuracy of intention recognition by the trained intention recognition model is greatly improved.
In some optional implementations of this embodiment, the cross coding module 302 may include: the device comprises an initial construction submodule, an initial mapping submodule, a cross coding submodule and a vector determination submodule, wherein:
and the initial construction sub-module is used for constructing an initial sequence according to each character in the training text and each intention label in the full quantity label sequence.
And the initial mapping submodule is used for mapping the initial sequence into a vector sequence.
And the cross coding submodule is used for inputting the vector sequence into the encoder so as to carry out cross coding on the character vector and the label vector in the vector sequence through a plurality of coding layers in the encoder to obtain a text sequence characterization vector and a label sequence characterization vector output by the last coding layer.
And the vector determination submodule is used for determining the text sequence characterization vector and the label sequence characterization vector as a joint characterization vector.
In this embodiment, an initial sequence is constructed from each character and each intention label, the initial sequence is mapped into a vector sequence, and the vector sequence is input into the encoder for cross coding between characters and intention labels. This adds extra interaction between characters and labels, and among the labels themselves, improving the feature extraction capability of the model.
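The construction and mapping steps above can be sketched as follows. The text, label names, and random embedding table are illustrative stand-ins; in practice a trained multi-layer encoder (e.g. a Transformer) would replace the toy `embed` function and refine the vectors layer by layer:

```python
import numpy as np

def build_joint_input(text, labels):
    # Initial sequence: each character of the training text followed by
    # each intention label from the full label sequence.
    return list(text) + labels

def embed(sequence, dim=8, seed=0):
    # Toy embedding lookup standing in for the mapping step; a real encoder
    # would cross code these vectors through several coding layers.
    rng = np.random.default_rng(seed)
    table = {tok: rng.standard_normal(dim) for tok in dict.fromkeys(sequence)}
    return np.stack([table[tok] for tok in sequence])

text = "book a flight"                                   # hypothetical input
labels = ["book_flight", "play_music", "check_weather"]  # hypothetical labels
seq = build_joint_input(text, labels)
vectors = embed(seq)
# The first len(text) rows play the role of the text sequence characterization
# vectors; the remaining rows are the label sequence characterization vectors.
text_vecs, label_vecs = vectors[:len(text)], vectors[len(text):]
```

Together, `text_vecs` and `label_vecs` correspond to the joint characterization vector of the last coding layer.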
In some optional implementations of this embodiment, the vector interaction module 303 may include: the device comprises a sequence transposition submodule, a point multiplication operation submodule, a matrix input submodule, a vector input submodule and a vector operation submodule, wherein:
and the sequence transposition submodule is used for transposing the label sequence characterization vectors in the combined characterization vectors to obtain transposed label sequence characterization vectors.
The point multiplication operation submodule is used for carrying out point multiplication operation on the text sequence representation vector and the transposed label sequence representation vector in the combined representation vector to obtain a correlation score matrix; the relevancy score matrix characterizes the relevancy between the character and the intent tag.
And the matrix input submodule is used for inputting the correlation fraction matrix into the activation model to obtain a third vector.
And the vector input submodule is used for inputting the third vector into the first activation function and transposing the function result of the activation function to obtain a fourth vector.
And the vector operation submodule is used for performing point multiplication operation on the fourth vector and the text sequence characterization vector to obtain a training text characterization vector.
In this embodiment, attention interaction is performed between the text sequence characterization vector and the label sequence characterization vector, replacing the conventional approach of directly using the CLS token as the text representation, so that a training text characterization vector with richer semantic information is obtained.
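One plausible reading of the interaction steps above can be sketched as follows. The tanh pooling, softmax, and toy dimensions are assumptions, since the patent does not pin down the "activation model" or the "first activation function":

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_interaction(text_vecs, label_vecs):
    # Relevance score matrix: dot product of the text sequence characterization
    # vectors with the transposed label sequence characterization vectors.
    scores = text_vecs @ label_vecs.T            # (n_chars, n_labels)
    # Assumed "activation model": tanh, then max-pool over labels to get
    # one relevance score per character.
    per_char = np.tanh(scores).max(axis=1)
    # Assumed "first activation function": softmax over characters.
    weights = softmax(per_char)
    # Weighted sum of character vectors -> training text characterization vector.
    return weights @ text_vecs

rng = np.random.default_rng(1)
text_vecs = rng.standard_normal((13, 8))   # 13 characters, dimension 8
label_vecs = rng.standard_normal((3, 8))   # 3 intention labels
rep = attention_interaction(text_vecs, label_vecs)
```

The result is a single vector summarizing the text, with characters relevant to some intention label weighted more heavily.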
In some optional implementations of this embodiment, the intent prediction module 304 includes: a characterization input submodule, an activation input submodule, and a prediction generation submodule, wherein:
and the representation input submodule is used for inputting the training text representation vector into the full-connection layer of the initial intention recognition model to obtain a fifth vector.
And the activation input submodule is used for inputting the fifth vector into the second activation function to obtain a sixth vector.
And the prediction generation submodule is used for generating a multi-intention prediction result according to the numerical value of each element in the sixth vector.
In this embodiment, the intention prediction is performed through the initial intention recognition model to obtain a sixth vector, and a numerical value of each position in the sixth vector is a prediction of whether related intention label semantics are mentioned in the training text, so that a multi-intention prediction result can be generated.
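The fully connected layer and activation described above can be sketched as follows; the label names, toy weights, and 0.5 decision threshold are assumptions chosen to make the outcome easy to verify:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_intents(text_rep, W, b, labels, threshold=0.5):
    # Fully connected layer ("fifth vector") followed by an element-wise
    # sigmoid ("sixth vector"); each position scores one intention label.
    # The 0.5 threshold for deciding a label is mentioned is an assumption.
    scores = sigmoid(text_rep @ W + b)
    return [lab for lab, s in zip(labels, scores) if s > threshold]

labels = ["book_flight", "play_music", "check_weather"]  # hypothetical labels
W = np.zeros((8, 3))
b = np.array([2.0, -2.0, 1.0])  # toy weights chosen to force a known outcome
text_rep = np.zeros(8)
predicted = predict_intents(text_rep, W, b, labels)
# predicted -> ["book_flight", "check_weather"]
```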
In some optional implementations of this embodiment, the loss calculating module 309 may include: the label computation system comprises a first computation submodule, a label construction submodule, a second computation submodule, a third computation submodule and a joint computation submodule, wherein:
a first computation submodule for computing a first penalty based on the multi-intent prediction result and the sequence of intent tags.
And the label constructing submodule is used for constructing a two-dimensional co-occurrence label and a high-dimensional co-occurrence label according to the selected intention label.
And the second calculation submodule is used for calculating second loss according to the two-dimensional co-occurrence prediction result and the two-dimensional co-occurrence label.
And the third calculation submodule is used for calculating a third loss according to the high-dimensional co-occurrence prediction result and the high-dimensional co-occurrence label.
And the joint calculation submodule is used for carrying out linear operation on the first loss, the second loss and the third loss to obtain the joint loss.
In the embodiment, the joint loss is calculated based on the multi-intention prediction result, the two-dimensional co-occurrence prediction result and the high-dimensional co-occurrence prediction result, the loss caused by the intention identification main task and the label co-occurrence prediction auxiliary task is considered, and the accuracy of loss calculation is improved.
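The linear combination of the three losses can be sketched as follows, assuming binary cross-entropy for each component loss and illustrative weights (neither choice is fixed by the patent, which states only that a linear operation is performed):

```python
import numpy as np

def bce(pred, target, eps=1e-12):
    # Binary cross-entropy, an assumed choice for the individual losses.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def joint_loss(first, second, third, weights=(1.0, 0.5, 0.5)):
    # Linear operation over the three losses; the weights are illustrative.
    return weights[0] * first + weights[1] * second + weights[2] * third

first = bce(np.array([0.9, 0.2]), np.array([1.0, 0.0]))   # multi-intention loss
second = bce(np.array([0.8]), np.array([1.0]))            # 2-D co-occurrence loss
third = bce(np.array([0.3]), np.array([0.0]))             # high-dim co-occurrence loss
total = joint_loss(first, second, third)
```

Minimizing `total` jointly adjusts the main intention recognition task and both auxiliary co-occurrence tasks.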
In some optional implementations of this embodiment, the model adjustment module 310 may include: parameter adjustment submodule and iterative training submodule, wherein:
and the parameter adjusting submodule is used for adjusting the model parameters of the initial intention recognition model, the two-dimensional co-occurrence prediction model and the high-dimensional co-occurrence prediction model by taking the joint loss reduction as a target.
And the iterative training submodule is used for iteratively training the initial intention recognition model, the two-dimensional co-occurrence prediction model and the high-dimensional co-occurrence prediction model after the parameters are adjusted until the joint loss meets the training stopping condition to obtain the intention recognition model.
In the embodiment, the joint loss is reduced, and the model parameters of the initial intention recognition model, the two-dimensional co-occurrence prediction model and the high-dimensional co-occurrence prediction model are adjusted at the same time until the training is finished to obtain the intention recognition model, so that the intention recognition model can be used for performing multi-intention recognition.
In some optional implementations of this embodiment, the intention identifying module 311 may include: the text acquisition submodule, the coding submodule, the interaction submodule and the intention identification submodule, wherein:
and the text acquisition submodule is used for acquiring the text to be recognized.
And the coding sub-module is used for inputting the text to be recognized and the full label sequence into the intention recognition model so as to carry out cross coding on the text to be recognized and the full label sequence to obtain a joint characterization vector.
And the interaction submodule is used for performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the combined characterization vector to obtain a text characterization vector.
And the intention identification submodule is used for processing the text characterization vectors through the intention identification model to obtain a multi-intention identification result.
In the embodiment, the trained intention recognition model increases semantic connections between text characters and intention labels and between intention labels, improves the feature extraction capability of the model, obtains text representation vectors with richer semantic information, and improves the accuracy of intention recognition according to the text representation vectors.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43 communicatively connected to each other via a system bus. It is noted that only a computer device 4 with components 41-43 is shown, but it should be understood that not all of the shown components need be implemented, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the computer device 4. Of course, the memory 41 may also include both internal and external storage devices of the computer device 4. In this embodiment, the memory 41 is generally used for storing an operating system and various types of application software installed on the computer device 4, such as computer readable instructions of the intention identification method. Further, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute computer readable instructions stored in the memory 41 or process data, such as executing computer readable instructions of the intention identification method.
The network interface 43 may comprise a wireless network interface or a wired network interface, and the network interface 43 is generally used for establishing communication connection between the computer device 4 and other electronic devices.
The computer device provided in the present embodiment may perform the above-described intention identifying method. The intention identification method here may be the intention identification method of the respective embodiments described above.
In this embodiment, a training text carrying an intention label sequence is obtained, together with a full label sequence recording all intention labels. The training text and the full label sequence are cross coded to obtain a joint characterization vector, and attention interaction is performed between the text sequence characterization vector and the label sequence characterization vector. This strengthens the semantic connections between text characters and intention labels, and among the intention labels themselves, improves the feature extraction capability of the model, and yields a training text characterization vector carrying richer information, which in turn improves the accuracy with which the initial intention recognition model generates a multi-intention prediction result. One intention label is selected from the intention label sequence and one from the full label sequence for two-dimensional label co-occurrence prediction, yielding a two-dimensional co-occurrence prediction result; several intention labels are selected from the intention label sequence and combined with the remaining labels in the full label sequence for high-dimensional label co-occurrence prediction, yielding a high-dimensional co-occurrence prediction result and enhancing the learning of label correlations. A joint loss is then calculated from the multi-intention prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result, and the model is adjusted accordingly to obtain the intention recognition model. Finally, the text to be recognized is input into the intention recognition model to obtain a multi-intention recognition result. In this way, the feature extraction capability of the model is improved and the text representation is enriched in the main intention recognition task, while label correlations are learned in the auxiliary label co-occurrence prediction task, so the accuracy of intention recognition by the trained intention recognition model is greatly improved.
The present application further provides another embodiment, which is to provide a computer-readable storage medium storing computer-readable instructions executable by at least one processor to cause the at least one processor to perform the steps of the intent recognition method as described above.
In this embodiment, a training text carrying an intention label sequence is obtained, together with a full label sequence recording all intention labels. The training text and the full label sequence are cross coded to obtain a joint characterization vector, and attention interaction is performed between the text sequence characterization vector and the label sequence characterization vector. This strengthens the semantic connections between text characters and intention labels, and among the intention labels themselves, improves the feature extraction capability of the model, and yields a training text characterization vector carrying richer information, which in turn improves the accuracy with which the initial intention recognition model generates a multi-intention prediction result. One intention label is selected from the intention label sequence and one from the full label sequence for two-dimensional label co-occurrence prediction, yielding a two-dimensional co-occurrence prediction result; several intention labels are selected from the intention label sequence and combined with the remaining labels in the full label sequence for high-dimensional label co-occurrence prediction, yielding a high-dimensional co-occurrence prediction result and enhancing the learning of label correlations. A joint loss is then calculated from the multi-intention prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result, and the model is adjusted accordingly to obtain the intention recognition model. Finally, the text to be recognized is input into the intention recognition model to obtain a multi-intention recognition result. In this way, the feature extraction capability of the model is improved and the text representation is enriched in the main intention recognition task, while label correlations are learned in the auxiliary label co-occurrence prediction task, so the accuracy of intention recognition by the trained intention recognition model is greatly improved.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
It is to be understood that the above embodiments are merely illustrative of some, but not all, embodiments of the present application, and the appended drawings illustrate preferred embodiments without limiting the scope of the application. This application may be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that the solutions described in the foregoing embodiments may still be modified, or some of their features may be replaced by equivalents. All equivalent structures made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, fall within the protection scope of the present application.
Claims (10)
1. An intention recognition method, characterized by comprising the steps of:
acquiring a training text with an intention label sequence and a full label sequence;
inputting the training text and the full label sequence into an initial intention recognition model to cross code the training text and the full label sequence to obtain a joint characterization vector;
performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector to obtain a training text characterization vector;
processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result;
randomly selecting an intention label from the intention label sequence and taking its label characterization vector as a first vector, and randomly selecting an intention label from the full label sequence and taking its label characterization vector as a second vector; wherein the label characterization vector is a vector obtained by cross coding the intention label;
splicing the first vector and the second vector, and inputting the spliced first vector and second vector into a label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result;
randomly selecting a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, calculating a fusion vector according to their label characterization vectors, and setting the label characterization vectors of the intention labels in the full label sequence other than the high-dimensional prediction labels as complementary set vectors;
splicing the fusion vector and each complementary set vector, and inputting the spliced fusion vector and each complementary set vector into a label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result;
calculating a joint loss based on the multi-intent prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result;
adjusting the initial intention recognition model according to the joint loss until the joint loss meets a training stopping condition to obtain an intention recognition model;
and performing intention recognition on the text to be recognized through the intention recognition model to obtain a multi-intention recognition result.
2. The intention recognition method of claim 1, wherein the step of inputting the training text and the full label sequence into an initial intention recognition model to cross code the training text and the full label sequence to obtain a joint characterization vector comprises:
constructing an initial sequence according to each character in the training text and each intention label in the full label sequence;
mapping the initial sequence into a vector sequence;
inputting the vector sequence into an encoder, and performing cross coding on a character vector and a label vector in the vector sequence through a plurality of coding layers in the encoder to obtain a text sequence characterization vector and a label sequence characterization vector output by the last coding layer;
determining the text sequence characterization vector and the tag sequence characterization vector as a joint characterization vector.
3. The method according to claim 2, wherein the step of performing attention interaction on the text sequence token vector and the tag sequence token vector in the joint token vector to obtain a training text token vector comprises:
transposing the label sequence characterization vector in the combined characterization vector to obtain a transposed label sequence characterization vector;
performing point multiplication operation on the text sequence characterization vector in the combined characterization vector and the transposed label sequence characterization vector to obtain a correlation score matrix; the relevancy score matrix characterizes the relevancy between the character and the intention label;
inputting the correlation degree fractional matrix into an activation model to obtain a third vector;
inputting the third vector into a first activation function, and transposing a function result of the activation function to obtain a fourth vector;
and performing point multiplication operation on the fourth vector and the text sequence characterization vector to obtain a training text characterization vector.
4. The method of claim 1, wherein the step of processing the training text characterization vector by the initial intent recognition model to obtain a multi-intent prediction result comprises:
inputting the training text representation vector into a full-connection layer of the initial intention recognition model to obtain a fifth vector;
inputting the fifth vector into a second activation function to obtain a sixth vector;
and generating a multi-intention prediction result according to the numerical value of each element in the sixth vector.
5. The intention recognition method of claim 1, wherein the step of calculating a joint loss based on the multi-intention prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result comprises:
calculating a first loss based on the multi-intent prediction result and the sequence of intent tags;
constructing a two-dimensional co-occurrence label and a high-dimensional co-occurrence label according to the selected intention label;
calculating a second loss through the two-dimensional co-occurrence prediction result and the two-dimensional co-occurrence label;
calculating a third loss according to the high-dimensional co-occurrence prediction result and the high-dimensional co-occurrence label;
and performing linear operation on the first loss, the second loss and the third loss to obtain a joint loss.
6. The intention recognition method of claim 1, wherein the step of adjusting the initial intention recognition model according to the joint loss until the joint loss satisfies a training stop condition to obtain the intention recognition model comprises:
adjusting model parameters of the initial intent recognition model, the two-dimensional co-occurrence prediction model, and the high-dimensional co-occurrence prediction model with a goal of reducing the joint loss;
and performing iterative training on the initial intention recognition model, the two-dimensional co-occurrence prediction model and the high-dimensional co-occurrence prediction model after parameter adjustment until the joint loss meets a training stopping condition to obtain an intention recognition model.
7. The intention recognition method of claim 1, wherein the step of performing intention recognition on the text to be recognized through the intention recognition model to obtain multiple intention recognition results comprises:
acquiring a text to be identified;
inputting the text to be recognized and the full label sequence into an intention recognition model so as to carry out cross coding on the text to be recognized and the full label sequence to obtain a joint characterization vector;
performing attention interaction on the text sequence characterization vector and the label sequence characterization vector in the joint characterization vector to obtain a text characterization vector;
and processing the text characterization vector through the intention recognition model to obtain a multi-intention recognition result.
8. An intention recognition apparatus, comprising:
the acquisition module is used for acquiring a training text with an intention label sequence and a full label sequence;
the cross coding module is used for inputting the training text and the full label sequence into an initial intention recognition model so as to carry out cross coding on the training text and the full label sequence to obtain a joint characterization vector;
the vector interaction module is used for carrying out attention interaction on the text sequence characterization vector and the label sequence characterization vector in the combined characterization vector to obtain a training text characterization vector;
the intention prediction module is used for processing the training text representation vector through the initial intention recognition model to obtain a multi-intention prediction result;
a two-dimensional selection module for randomly selecting an intention label from the intention label sequence to take its label characterization vector as a first vector, and randomly selecting an intention label from the full label sequence to take its label characterization vector as a second vector; wherein a label characterization vector is a vector obtained by cross coding the corresponding intention label;
the two-dimensional prediction module is used for splicing the first vector and the second vector and then inputting the spliced vectors into a label two-dimensional co-occurrence prediction model to obtain a two-dimensional co-occurrence prediction result;
the high-dimensional selection module is used for randomly selecting a preset number of intention labels from the intention label sequence as high-dimensional prediction labels, calculating a fusion vector from the label characterization vectors of these intention labels, and taking the label characterization vectors of the intention labels in the full label sequence other than the high-dimensional prediction labels as complementary set vectors;
the high-dimensional prediction module is used for splicing the fusion vector and each complementary set vector and then inputting the spliced fusion vector and each complementary set vector into a label high-dimensional co-occurrence prediction model to obtain a high-dimensional co-occurrence prediction result;
a loss calculation module for calculating a joint loss based on the multi-intent prediction result, the two-dimensional co-occurrence prediction result, and the high-dimensional co-occurrence prediction result;
the model adjusting module is used for adjusting the initial intention recognition model according to the joint loss until the joint loss meets a training stopping condition to obtain an intention recognition model;
and the intention recognition module is used for performing intention recognition on the text to be recognized through the intention recognition model to obtain a multi-intent recognition result.
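The two-dimensional selection step of the apparatus above can be sketched as follows. Note one interpretive assumption: the sketch sets the two-dimensional co-occurrence label to 1 when the label drawn from the full label sequence also occurs in the sample's intention label sequence — a natural reading of "co-occurrence", but not a detail the claims spell out. The example label names are hypothetical.

```python
import random

def make_pair(intent_labels, all_labels, rng=random.Random(0)):
    """Select one label from each sequence and build the 2D co-occurrence label."""
    first = rng.choice(intent_labels)   # from the sample's intention label sequence
    second = rng.choice(all_labels)     # from the full label sequence
    label = 1 if second in intent_labels else 0  # assumed co-occurrence criterion
    return first, second, label

# Hypothetical label names for illustration.
intents = ["transfer", "balance"]
full = ["transfer", "balance", "card_loss", "loan"]
pair = make_pair(intents, full)
```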
9. A computer device comprising a memory having computer readable instructions stored therein and a processor which, when executing the computer readable instructions, implements the steps of the intention recognition method of any one of claims 1 to 7.
10. A computer-readable storage medium having computer-readable instructions stored thereon which, when executed by a processor, implement the steps of the intention recognition method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210822568.6A CN115114407B (en) | 2022-07-12 | 2022-07-12 | Intention recognition method, device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115114407A true CN115114407A (en) | 2022-09-27 |
CN115114407B CN115114407B (en) | 2024-04-19 |
Family
ID=83332467
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210822568.6A Active CN115114407B (en) | 2022-07-12 | 2022-07-12 | Intention recognition method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115114407B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---|
CN116050428A (en) * | 2023-03-07 | 2023-05-02 | 腾讯科技(深圳)有限公司 | Intention recognition method, device, equipment and storage medium |
CN116050428B (en) * | 2023-03-07 | 2023-06-09 | 腾讯科技(深圳)有限公司 | Intention recognition method, device, equipment and storage medium |
CN116628177A (en) * | 2023-05-22 | 2023-08-22 | 福建省网络与信息安全测评中心 | Interactive data processing method and system for network security platform |
CN116628177B (en) * | 2023-05-22 | 2023-11-14 | 福建省网络与信息安全测评中心 | Interactive data processing method and system for network security platform |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190139537A1 (en) * | 2017-11-08 | 2019-05-09 | Kabushiki Kaisha Toshiba | Dialogue system and dialogue method |
CN110287283A (en) * | 2019-05-22 | 2019-09-27 | 中国平安财产保险股份有限公司 | Intent model training method, intension recognizing method, device, equipment and medium |
CN111462752A (en) * | 2020-04-01 | 2020-07-28 | 北京思特奇信息技术股份有限公司 | Client intention identification method based on attention mechanism, feature embedding and BI-L STM |
CN112069302A (en) * | 2020-09-15 | 2020-12-11 | 腾讯科技(深圳)有限公司 | Training method of conversation intention recognition model, conversation intention recognition method and device |
US20210174783A1 (en) * | 2019-12-05 | 2021-06-10 | Soundhound, Inc. | Synthesizing Speech Recognition Training Data |
CN113220828A (en) * | 2021-04-28 | 2021-08-06 | 平安科技(深圳)有限公司 | Intention recognition model processing method and device, computer equipment and storage medium |
CN114528844A (en) * | 2022-01-14 | 2022-05-24 | 中国平安人寿保险股份有限公司 | Intention recognition method and device, computer equipment and storage medium |
WO2022141864A1 (en) * | 2020-12-31 | 2022-07-07 | 平安科技(深圳)有限公司 | Conversation intent recognition model training method, apparatus, computer device, and medium |
Non-Patent Citations (1)
Title |
---|
YANG Chunni; FENG Chaosheng: "Multi-intent recognition model combining syntactic features and convolutional neural networks", Journal of Computer Applications, no. 07, 20 March 2018 (2018-03-20) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112685565B (en) | Text classification method based on multi-mode information fusion and related equipment thereof | |
CN111444340B (en) | Text classification method, device, equipment and storage medium | |
CN112732911B (en) | Semantic recognition-based speaking recommendation method, device, equipment and storage medium | |
CN112069302A (en) | Training method of conversation intention recognition model, conversation intention recognition method and device | |
CN116415654A (en) | Data processing method and related equipment | |
CN115114407B (en) | Intention recognition method, device, computer equipment and storage medium | |
CN112863683A (en) | Medical record quality control method and device based on artificial intelligence, computer equipment and storage medium | |
CN113987169A (en) | Text abstract generation method, device and equipment based on semantic block and storage medium | |
CN114091452B (en) | Migration learning method, device, equipment and storage medium based on adapter | |
CN113947095B (en) | Multilingual text translation method, multilingual text translation device, computer equipment and storage medium | |
CN113505601A (en) | Positive and negative sample pair construction method and device, computer equipment and storage medium | |
CN114282528A (en) | Keyword extraction method, device, equipment and storage medium | |
CN115757731A (en) | Dialogue question rewriting method, device, computer equipment and storage medium | |
CN115438149A (en) | End-to-end model training method and device, computer equipment and storage medium | |
CN114492661B (en) | Text data classification method and device, computer equipment and storage medium | |
CN114282055A (en) | Video feature extraction method, device and equipment and computer storage medium | |
CN115757725A (en) | Question and answer processing method and device, computer equipment and storage medium | |
CN118070072A (en) | Problem processing method, device, equipment and storage medium based on artificial intelligence | |
CN113723077B (en) | Sentence vector generation method and device based on bidirectional characterization model and computer equipment | |
CN116205700A (en) | Recommendation method and device for target product, computer equipment and storage medium | |
CN113987162A (en) | Text abstract generation method and device and computer equipment | |
CN114330704A (en) | Statement generation model updating method and device, computer equipment and storage medium | |
CN116186295B (en) | Attention-based knowledge graph link prediction method, attention-based knowledge graph link prediction device, attention-based knowledge graph link prediction equipment and attention-based knowledge graph link prediction medium | |
CN113569094A (en) | Video recommendation method and device, electronic equipment and storage medium | |
CN116701593A (en) | Chinese question-answering model training method based on GraphQL and related equipment thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||