CN111680514A - Information processing and model training method, device, equipment and storage medium - Google Patents

Information processing and model training method, device, equipment and storage medium

Info

Publication number
CN111680514A
CN111680514A
Authority
CN
China
Prior art keywords
intention
training data
training
data
extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910138086.7A
Other languages
Chinese (zh)
Other versions
CN111680514B (en)
Inventor
胡伟 (Hu Wei)
宋俊 (Song Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Orion Star Technology Co Ltd
Original Assignee
Beijing Orion Star Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Orion Star Technology Co Ltd filed Critical Beijing Orion Star Technology Co Ltd
Priority to CN201910138086.7A priority Critical patent/CN111680514B/en
Publication of CN111680514A publication Critical patent/CN111680514A/en
Application granted granted Critical
Publication of CN111680514B publication Critical patent/CN111680514B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

An embodiment of the invention provides an information processing and model training method, device, equipment, and storage medium. The intention recognition result is obtained first, and slot information is then extracted by the slot extraction submodel corresponding to that result; as a consequence, irrelevant slot information outside the intention corresponding to the recognition result is not extracted, and the accuracy of slot extraction is improved.

Description

Information processing and model training method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of natural language processing, and in particular to an information processing and model training method, device, equipment, and storage medium.
Background
In natural language applications, some intelligent terminal devices need to understand a user's spoken interaction in order to perform the correct operation. For example, a smart speaker needs to understand the user's intention so as to play the correct resource or respond correctly. The device must therefore perform intention recognition and slot extraction on the user interaction information, where a slot is a piece of information that must be extracted in order to turn the interaction into a definite instruction. For example, if the user says "I want to listen to Liu Dehua's 'Forgetting Water'", the device should recognize that the domain is music and the intention is playing music, and extract the singer "Liu Dehua" and the song title "Forgetting Water" from the interaction information; the singer and the song title are the slot information.
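The target output described in this example can be sketched as a small data structure; the field names and the hard-coded result below are hypothetical illustrations of the output format, not part of the invention:

```python
def understand(utterance):
    """Toy stand-in for intention recognition plus slot extraction.

    In a real system this result would come from trained models; here it
    is hard-coded to show the kind of structured output the patent targets.
    """
    return {
        "domain": "music",
        "intent": "play_music",
        "slots": {"singer": "Liu Dehua", "song": "Forgetting Water"},
    }

result = understand("I want to listen to Liu Dehua's 'Forgetting Water'")
```

Both the intent label and the slot dictionary must be produced for the instruction to be actionable.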
In the prior art, intention recognition and slot extraction on user interaction information are usually performed as two separate tasks: intention recognition is performed by a first model, and slot extraction is then performed by a second model, each trained on a pre-obtained training data set. A slot extraction model trained in this way often extracts slot information that is irrelevant to the true intention, so the accuracy of slot extraction is low.
Disclosure of Invention
The invention provides an information processing and model training method, device, equipment, and storage medium, which are used to improve the accuracy of slot extraction from interaction information.
A first aspect of the present invention provides an information processing method, including:
acquiring interactive information to be processed;
determining an intention recognition result corresponding to the interactive information;
and determining a slot position extraction result corresponding to the interactive information according to the slot position extraction submodel corresponding to the intention recognition result through a pre-trained joint learning model.
Optionally, the determining an intention recognition result corresponding to the interaction information includes:
and acquiring an intention identification result corresponding to the interactive information through an intention identification submodel in the joint learning model.
Optionally, the determining an intention recognition result corresponding to the interaction information includes:
matching the interaction information with a preset grammar set;
and if the matched grammar can be obtained from the preset grammar set, determining an intention identification result corresponding to the interactive information according to the matched grammar.
The second aspect of the present invention provides a model training method, including:
acquiring a plurality of training data and marking data corresponding to the training data, wherein the marking data comprises intention classification marking data and slot position extraction marking data;
classifying the training data into a plurality of training data groups according to intention classification labeling data;
and for each training data group, training a joint learning model by using the training data contained in the training data group and the slot extraction labeling data of the training data, wherein different intention classifications in the joint learning model correspond to different slot extraction submodels.

Further, where, for each training data group, the training data contained in the training data group and the slot extraction labeling data of the training data are used to train the joint learning model, the method further includes:
and aiming at each training data group, carrying out joint learning training on an intention recognition sub-model and the slot position extraction sub-model in a joint learning model by using training data contained in the training data group and intention classification marking data of the training data.
Further, before training the joint learning model, the method further includes:
performing word segmentation processing on the training data to obtain each word vector corresponding to the training data, and obtaining context information corresponding to the word vector;
the training of the joint learning model comprises:
and performing joint learning training on the intention identification submodel and the slot position extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification marking data of the training data and the slot position extraction marking data.
Further, before training the joint learning model, the method further includes:
performing word segmentation processing on the training data through a pre-trained word vector model to obtain each word vector corresponding to the training data and context information corresponding to the word vector;
the training of the joint learning model comprises:
and performing joint learning training on the intention identification submodel and the slot position extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification marking data of the training data and the slot position extraction marking data.
Further, the performing the joint learning training on the intention recognition submodel and the slot position extraction submodel in the joint learning model includes:
aiming at each training data set, respectively inputting training data contained in the training data set into the intention recognition submodel and the slot position extraction submodel corresponding to the intention classification marking data to obtain a prediction result of intention recognition and a prediction result of slot position extraction;
acquiring intention identification loss according to intention classification marking data of the training data and a prediction result of intention identification, and acquiring slot extraction loss according to the slot extraction marking data and the prediction result of slot extraction;
and integrating the intention identification loss and the slot position extraction loss to obtain the total loss of a joint learning model, and respectively carrying out parameter optimization on the intention identification submodel and the slot position extraction submodel according to the total loss of the joint learning model.
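A minimal sketch of this loss combination, assuming scalar per-task losses and a weighted sum (the weights are an assumption; the text only says the two losses are integrated into a total loss for the joint learning model):

```python
def total_loss(intent_loss, slot_loss, intent_weight=1.0, slot_weight=1.0):
    """Combine the intention-recognition loss and the slot-extraction loss.

    Both submodels are then optimized against this single joint objective.
    The weighted sum is one common choice; the patent does not fix the form.
    """
    return intent_weight * intent_loss + slot_weight * slot_loss

loss = total_loss(0.5, 0.25)
```

With the default weights this reduces to a plain sum of the two task losses; tuning the weights trades off the two tasks during joint training.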
A third aspect of the present invention is to provide an information processing apparatus comprising:
the acquisition module is used for acquiring interactive information to be processed;
the processing module is used for determining an intention recognition result corresponding to the interactive information; and determining a slot position extraction result corresponding to the interactive information according to the slot position extraction submodel corresponding to the intention recognition result through a pre-trained joint learning model.
Optionally, the processing module is configured to obtain an intention identification result corresponding to the interaction information through an intention identification submodel in the joint learning model.
Optionally, the processing module is configured to match the interaction information with a preset grammar set; and if the matched grammar can be obtained from the preset grammar set, determining an intention identification result corresponding to the interactive information according to the matched grammar.
A fourth aspect of the present invention provides a model training apparatus comprising:
the data acquisition module is used for acquiring a plurality of training data and marking data corresponding to the training data, wherein the marking data comprises intention classification marking data and slot extraction marking data;
the data grouping module is used for dividing the training data into a plurality of training data groups according to intention classification marking data;
and the training module is used for training a joint learning model by utilizing the training data contained in the training data group and the slot extraction marking data of the training data aiming at each training data group, wherein different intention classifications in the joint learning model correspond to different slot extraction submodels.
Further, the training module is further configured to:
and aiming at each training data group, carrying out joint learning training on an intention recognition sub-model and the slot position extraction sub-model in a joint learning model by using training data contained in the training data group and intention classification marking data of the training data.
Further, the data obtaining module is further configured to perform word segmentation processing on the training data to obtain each word vector corresponding to the training data, and obtain context information corresponding to the word vector;
the training module is used for carrying out joint learning training on the intention identification submodel and the slot position extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification marking data of the training data and the slot position extraction marking data.
Further, the data acquisition module is further configured to perform word segmentation processing on the training data through a pre-trained word vector model to obtain each word vector corresponding to the training data and context information corresponding to the word vector;
the training module is used for carrying out joint learning training on the intention identification submodel and the slot position extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification marking data of the training data and the slot position extraction marking data.
Further, the training module is configured to:
aiming at each training data set, respectively inputting training data contained in the training data set into the intention recognition submodel and the slot position extraction submodel corresponding to the intention classification marking data to obtain a prediction result of intention recognition and a prediction result of slot position extraction;
acquiring intention identification loss according to intention classification marking data of the training data and a prediction result of intention identification, and acquiring slot extraction loss according to the slot extraction marking data and the prediction result of slot extraction;
and integrating the intention identification loss and the slot position extraction loss to obtain the total loss of a joint learning model, and respectively carrying out parameter optimization on the intention identification submodel and the slot position extraction submodel according to the total loss of the joint learning model.
A fifth aspect of the present invention provides an information processing apparatus comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of the first aspect.
A sixth aspect of the present invention is to provide an information processing model training apparatus, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of the second aspect.
A seventh aspect of the present invention is to provide a computer-readable storage medium having a computer program stored thereon;
which when executed by a processor implements the method according to the first aspect.
An eighth aspect of the present invention is to provide a computer-readable storage medium having stored thereon a computer program;
which when executed by a processor implements the method according to the second aspect.
According to the information processing and model training method, device, equipment, and storage medium provided by the invention, the interaction information to be processed is obtained, the intention recognition result corresponding to the interaction information is determined, and the slot extraction result corresponding to the interaction information is then determined through the slot extraction submodel corresponding to the intention recognition result in the pre-trained joint learning model. Because the intention recognition result is obtained first and the slot information is extracted by the slot extraction submodel corresponding to that result, irrelevant slot information outside the recognized intention is not extracted, and the accuracy of slot extraction is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of an information processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a model training method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a model training method according to another embodiment of the present invention;
FIG. 4 is a block diagram of an information processing apparatus according to an embodiment of the present invention;
FIG. 5 is a block diagram of a model training apparatus according to an embodiment of the present invention;
FIG. 6 is a block diagram of an electronic device provided by an embodiment of the invention;
fig. 7 is a block diagram of an electronic device according to another embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments of the present invention are only a part of the embodiments of the present invention, and not all of the embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an information processing method according to an embodiment of the present invention. As shown in fig. 1, the present embodiment provides an information processing method, which includes the following specific steps:
s101, obtaining interactive information to be processed.
In this embodiment, the interaction information to be processed may be user interaction speech (a query), whose corresponding text information is obtained through speech recognition. The interaction information may also be text information entered by the user through a keyboard or touch screen, or interaction information acquired through other channels.
And S102, determining an intention identification result corresponding to the interactive information.
In this embodiment, optionally, in S102, an intention identification result corresponding to the interaction information may be obtained through an intention identification submodel in a pre-trained joint learning model.
In this embodiment, the joint learning model is obtained through pre-training and includes an intention recognition submodel and a plurality of slot extraction submodels. The intention recognition submodel is used to obtain an intention recognition result from the interaction information, and each intention recognition result corresponds to one slot extraction submodel. The intention recognition submodel and the slot extraction submodels are trained jointly; the training process is described in detail later.
Optionally, in S102 the interaction information may instead be matched against a preset grammar set; if a matching grammar can be obtained from the preset grammar set, the intention recognition result corresponding to the interaction information is determined according to the matched grammar.
In this embodiment, the preset grammar set includes a plurality of grammars, each corresponding to an intention; equivalently, the set can cover multiple intentions, each intention including multiple grammars. The interaction information is matched against each grammar in the preset grammar set, and once a matching grammar is found, the intention corresponding to that grammar is taken as the intention recognition result. Because the grammars in the preset set are limited, a matching grammar may not exist for some interaction information; in that case, the intention recognition result corresponding to the interaction information can be obtained through the intention recognition submodel in the joint learning model.
S103, determining a slot position extraction result corresponding to the interactive information according to the slot position extraction sub-model corresponding to the intention recognition result through a pre-trained joint learning model.
In this embodiment, after the intention recognition result is obtained, the slot extraction submodel corresponding to that result is selected from the joint learning model, and the slot extraction result corresponding to the interaction information is obtained through this submodel. Because each slot extraction submodel corresponds to one intention, it extracts only the slot information required under that intention and does not extract unrelated slot information. For example, if a chat intention needs no time slot, the submodel corresponding to the chat intention will not extract time slot information; likewise, if a play-song intention needs no weather slot, the submodel corresponding to the play-song intention will not extract weather slot information. Obtaining the intention recognition result first and then extracting slots with the corresponding submodel therefore improves the accuracy of the slot extraction result.
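The routing step described above can be sketched as follows; the intent names and toy extractor functions are illustrative placeholders, not the patent's trained submodels:

```python
def extract_music_slots(text):
    # Toy "submodel" for the play_music intention: it only knows
    # music-related slots, so it can never emit a weather or time slot.
    slots = {}
    if "Liu Dehua" in text:
        slots["singer"] = "Liu Dehua"
    return slots

def extract_chat_slots(text):
    # Toy "submodel" for the chat intention: this intention needs no slots.
    return {}

# One slot-extraction submodel per intention classification.
SLOT_SUBMODELS = {"play_music": extract_music_slots, "chat": extract_chat_slots}

def process(text, recognize_intent):
    intent = recognize_intent(text)        # step 1: intention recognition
    extractor = SLOT_SUBMODELS[intent]     # step 2: pick the matching submodel
    return intent, extractor(text)         # step 3: intent-specific slots only

intent, slots = process("play Liu Dehua", lambda t: "play_music")
```

Because the submodel is selected by the recognized intention, slots defined only under other intentions are structurally impossible outputs.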
In the information processing method provided by this embodiment, the interaction information to be processed is obtained, the intention recognition result corresponding to the interaction information is determined, and the slot extraction result is then determined through the slot extraction submodel corresponding to the intention recognition result in the pre-trained joint learning model. Because the intention recognition result is obtained first and the slot information is extracted by the submodel corresponding to that intention, irrelevant slot information outside the recognized intention is not extracted, which improves the accuracy of slot extraction.
On the basis of any of the above embodiments, before obtaining an intention recognition result corresponding to the interaction information through an intention recognition submodel in a joint learning model, the method further includes:
performing word segmentation processing on the interactive information to obtain each word vector corresponding to the interactive information, and obtaining context information corresponding to the word vector;
and inputting each word vector corresponding to the interactive information and the context information corresponding to the word vector into the joint learning model.
In this embodiment, each word vector corresponding to the interaction information can be obtained by inputting the interaction information into a word vector model such as word2vec or GloVe. Because word vectors are isolated, different pieces of interaction information may share the same word vectors while expressing different intentions; the semantics and context of each word vector therefore also need to be analyzed, that is, the context information corresponding to each word vector must be obtained. Specifically, each word vector can be input into a bottom-layer model to obtain a context semantic vector that serves as its context information. The bottom-layer model may be an LSTM (Long Short-Term Memory network), a Bi-LSTM (Bi-directional Long Short-Term Memory network), or an RNN (Recurrent Neural Network); the specific processing can follow existing methods and is not repeated here. Each word vector corresponding to the interaction information, together with its context information, is then input to the intention recognition submodel in the joint learning model for intention recognition, and afterwards to the slot extraction submodel corresponding to the intention recognition result for slot extraction.
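As a toy illustration of this preprocessing (a stand-in for word2vec plus a Bi-LSTM, with a hand-made embedding table and left/right vector sums in place of learned recurrent states; everything here is an assumption for demonstration):

```python
# Hypothetical 2-dimensional word-vector table (real systems use word2vec/GloVe).
EMBED = {"play": [1.0, 0.0], "music": [0.0, 1.0]}

def embed(tokens):
    # Look up a vector per token; unknown tokens get a zero vector.
    return [EMBED.get(t, [0.0, 0.0]) for t in tokens]

def bidirectional_context(vectors):
    # Toy stand-in for a Bi-LSTM: each position's "context" is the sum of
    # the vectors to its left concatenated with the sum of those to its right.
    ctx = []
    for i in range(len(vectors)):
        left = [sum(v[d] for v in vectors[:i]) for d in range(2)]
        right = [sum(v[d] for v in vectors[i + 1:]) for d in range(2)]
        ctx.append(left + right)
    return ctx

vecs = embed(["play", "music"])
ctx = bidirectional_context(vecs)
```

The point is only the data flow: tokens become isolated word vectors, and a second pass attaches per-position context information before both are fed to the submodels.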
Further, the intention recognition result corresponding to the interaction information is obtained through the intention recognition submodel in the joint learning model: the submodel outputs the probability that the interaction information belongs to each intention, and the intention with the highest probability is taken as the intention recognition result.
In another optional embodiment, the interactive information may be subjected to word segmentation processing through a pre-trained word vector model, so as to obtain each word vector corresponding to the interactive information and context information corresponding to the word vector; and inputting each word vector corresponding to the interactive information and the context information corresponding to the word vector into the joint learning model.
On the basis of the above embodiment, when the intention recognition result is obtained through a preset grammar set, before the interaction information is matched with the preset grammar set, the method further includes:
the method comprises the steps of segmenting the interactive information to obtain a plurality of vocabularies contained in the interactive information, matching the plurality of vocabularies with each grammar in a preset grammar set according to the plurality of vocabularies, specifically, matching the plurality of vocabularies with the nodes of the grammar, determining the grammar to be the matched grammar if the plurality of vocabularies can be matched with the nodes of a certain grammar, and taking the intention corresponding to the grammar as the intention recognition result corresponding to the interactive information.
Fig. 2 is a flowchart of a model training method according to an embodiment of the present invention. As shown in fig. 2, the embodiment provides a model training method, which includes the following specific steps:
s201, obtaining a plurality of training data and marking data corresponding to the training data, wherein the marking data comprises intention classification marking data and slot extraction marking data.
In this embodiment, training data is obtained in advance. The training data is interaction information whose intention and slot information are known; that is, each training datum has corresponding annotation data, which includes intention classification annotation data and slot extraction annotation data.
S202, classifying the training data into a plurality of training data groups according to intention classification labeling data.
In this embodiment, the training data is classified according to the intention classification labeling data to obtain a plurality of training data sets, and each training data set corresponds to one intention classification.
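Step S202 can be sketched as a simple grouping by the intention label (the field layout of the samples is an assumption for illustration):

```python
from collections import defaultdict

def group_by_intent(samples):
    # samples: (text, intent_label, slot_annotations) triples.
    # Produces one training data group per intention classification.
    groups = defaultdict(list)
    for text, intent, slots in samples:
        groups[intent].append((text, slots))
    return dict(groups)

data = [
    ("play a song", "play_music", {"song": None}),
    ("what's the weather", "query_weather", {"city": None}),
    ("play Forgetting Water", "play_music", {"song": "Forgetting Water"}),
]
groups = group_by_intent(data)
```

Each resulting group then trains the slot extraction submodel for its own intention classification.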
S203, aiming at each training data group, training a joint learning model by using training data contained in the training data group and slot extraction marking data of the training data, wherein different intention classifications in the joint learning model correspond to different slot extraction submodels.
In this embodiment, when the intention recognition submodel and the slot extraction submodels in the joint learning model are trained jointly, the slot extraction submodels must be trained separately per intention. For example, the training data contained in training data group A1, which corresponds to intention classification a1, and the slot extraction annotation data of that training data are used to train an initial slot extraction submodel, yielding the slot extraction submodel C1 corresponding to intention classification a1. Likewise, the training data contained in training data group A2, corresponding to intention classification a2, and its slot extraction annotation data yield the slot extraction submodel C2 corresponding to intention classification a2, and so on. With n intention classifications, n slot extraction submodels are obtained; each has the same architecture (based on the same initial model) but different model parameters. Because the slot extraction submodels are trained separately per intention classification, and because the slot information of training data group A1 differs from that of other groups such as A2, the submodel C1 trained on group A1 and its slot extraction annotation data does not extract slot information that is irrelevant to intention classification a1.
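A toy sketch of this per-intent training idea: each "submodel" below merely records the slot types present in its own intention's group, so by construction it cannot emit a slot unrelated to that intention (real submodels share one architecture and differ in learned parameters, not in a recorded slot vocabulary):

```python
def train_slot_submodels(groups):
    # groups: intent -> list of (text, slot_annotations) training pairs.
    submodels = {}
    for intent, samples in groups.items():
        slot_types = set()
        for _text, slots in samples:
            slot_types.update(slots)  # collect slot types seen for this intent
        # Same "architecture" (a set of known slot types), different "parameters".
        submodels[intent] = slot_types
    return submodels

groups = {
    "play_music": [("play a song", {"song": "x", "singer": "y"})],
    "chat": [("hello there", {})],
}
models = train_slot_submodels(groups)
```

The chat submodel ends up with no slot types at all, mirroring the claim that a submodel trained only on its own intention's group never extracts unrelated slot information.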
In an alternative embodiment, before the training of the joint learning model in S203, the method further includes:
performing word segmentation processing on the training data to obtain each word vector corresponding to the training data, and obtaining context information corresponding to the word vector.
In this embodiment, the training data needs to be preprocessed: any training datum can be input into a word vector model such as word2vec or GloVe to obtain each word vector corresponding to the training data. Because word vectors are isolated, different training data may share the same word vectors while expressing different intentions; the semantics and context of each word vector therefore also need to be analyzed, that is, the context information corresponding to each word vector must be obtained. Specifically, each word vector is input into a bottom-layer model to obtain a context semantic vector that serves as its context information. The bottom-layer model may be an LSTM, a Bi-LSTM, or an RNN; the specific processing can follow existing methods and is not repeated here.
Correspondingly, in S203, the training of the joint learning model includes:
training the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, and the slot extraction annotation data of the training data.
In another alternative embodiment, before training the joint learning model in S203, the method further includes:
performing word segmentation processing on the training data through a pre-trained word vector model to obtain each word vector corresponding to the training data and context information corresponding to the word vector;
correspondingly, in S203, the training of the joint learning model includes:
training the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, and the slot extraction annotation data of the training data.
In the model training method provided by this embodiment, a plurality of training data and the annotation data corresponding to the training data are obtained, where the annotation data includes intention classification annotation data and slot extraction annotation data; the training data is divided into a plurality of training data groups according to the intention classification annotation data; and for each training data group, the joint learning model is trained using the training data contained in that group and the slot extraction annotation data of that training data, where different intention classifications in the joint learning model correspond to different slot extraction submodels. Because the slot extraction submodels are trained separately on the training data groups of different intention classifications, a distinct slot extraction submodel is obtained for each intention classification; this prevents a slot extraction submodel from extracting slot information irrelevant to its intention classification and improves the accuracy of the slot extraction result.
On the basis of any one of the above embodiments, in S203, training the joint learning model for each training data group by using the training data contained in the training data group and the slot extraction annotation data of the training data further includes:
for each training data group, performing joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model by using the training data contained in the training data group and the intention classification annotation data of the training data.
In an optional embodiment, before performing the joint learning training on the intent recognition submodel and the slot extraction submodel in the joint learning model, the method further includes:
performing word segmentation processing on the training data to obtain each word vector corresponding to the training data, and obtaining context information corresponding to the word vector.
Correspondingly, performing the joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model includes:
performing joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification annotation data of the training data, and the slot extraction annotation data.
In another optional embodiment, before performing the joint learning training on the intent recognition submodel and the slot position extraction submodel in the joint learning model, the method further includes:
performing word segmentation processing on the training data through a pre-trained word vector model to obtain each word vector corresponding to the training data and context information corresponding to the word vector;
correspondingly, performing the joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model includes:
performing joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification annotation data of the training data, and the slot extraction annotation data.
In this embodiment, the intention recognition submodel and the slot extraction submodel are trained jointly. Compared with performing intention recognition first and then performing the slot extraction task, this prevents intention recognition errors from being propagated to the slot extraction task, so the accuracy of both intention recognition and slot extraction is improved.
Specifically, as shown in fig. 3, performing the joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model includes:
S301, for each training data group, respectively inputting the training data contained in the training data group into the intention recognition submodel and into the slot extraction submodel corresponding to the intention classification annotation data, to obtain an intention recognition prediction result and a slot extraction prediction result;
S302, obtaining an intention recognition loss according to the intention classification annotation data of the training data and the intention recognition prediction result, and obtaining a slot extraction loss according to the slot extraction annotation data and the slot extraction prediction result;
S303, combining the intention recognition loss and the slot extraction loss to obtain a total loss of the joint learning model, and optimizing the parameters of the intention recognition submodel and the slot extraction submodel respectively according to the total loss of the joint learning model.
In this embodiment, for each training data group, the intention recognition loss of the intention recognition submodel and the slot extraction loss of the slot extraction submodel are obtained separately and then combined into a single loss value, namely the total loss of the joint learning model. Back propagation is then performed according to this total loss, and the process is iterated, so that the parameters of the intention recognition submodel and the slot extraction submodel are optimized and the final intention recognition submodel and slot extraction submodel corresponding to each intention classification are obtained. The total loss of the joint learning model may be obtained by summing the intention recognition loss and the slot extraction loss, or by a weighted sum of the two; the embodiment of the present invention does not limit the specific implementation.
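The loss combination in S303 can be written down directly. Both the plain sum and the weighted sum are mentioned by the embodiment; the particular weighting scheme and the hyperparameter `alpha` below are assumptions for illustration, since the patent leaves the specific implementation open.

```python
# Sketch of S303: combine the intention recognition loss and the slot
# extraction loss into one total loss, which would then drive back
# propagation through both submodels.
def total_loss(intent_loss, slot_loss, alpha=None):
    if alpha is None:
        # Plain sum of the two losses.
        return intent_loss + slot_loss
    # Weighted sum; alpha trades off the two tasks (assumed convention).
    return alpha * intent_loss + (1.0 - alpha) * slot_loss

assert abs(total_loss(0.4, 0.6) - 1.0) < 1e-12
assert abs(total_loss(0.4, 0.6, alpha=0.25) - 0.55) < 1e-12
```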
Fig. 4 is a block diagram of an information processing apparatus according to an embodiment of the present invention. The information processing apparatus provided by this embodiment may execute the processing flow provided by the information processing method embodiment. As shown in fig. 4, the information processing apparatus 40 includes an obtaining module 41 and a processing module 42.
The acquiring module 41 is configured to acquire interactive information to be processed;
a processing module 42, configured to determine an intention recognition result corresponding to the interactive information, and to determine, through a pre-trained joint learning model, a slot extraction result corresponding to the interactive information according to the slot extraction submodel corresponding to the intention recognition result.
Optionally, the processing module 42 is configured to obtain the intention recognition result corresponding to the interactive information through the intention recognition submodel in the joint learning model.
Optionally, the processing module 42 is configured to match the interactive information against a preset grammar set, and, if a matching grammar can be obtained from the preset grammar set, to determine the intention recognition result corresponding to the interactive information according to the matching grammar.
The information processing apparatus provided in the embodiment of the present invention may be specifically configured to execute the method embodiment provided in fig. 1, and specific functions are not described herein again.
The information processing apparatus provided by the embodiment of the present invention acquires the interactive information to be processed, determines the intention recognition result corresponding to the interactive information, and then determines the slot extraction result corresponding to the interactive information through the slot extraction submodel, in the pre-trained joint learning model, that corresponds to the intention recognition result. Because the intention recognition result is obtained first and the slot information is then extracted by the slot extraction submodel corresponding to that result, slot information irrelevant to the recognized intention is not extracted, and the accuracy of slot extraction is improved.
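The inference flow of the apparatus — determine the intention first, then consult only the matching slot extraction submodel — can be sketched as below. All names and the keyword-based stand-ins for the submodels are hypothetical; the point is only the routing, which keeps slot types of unrelated intentions out of the result.

```python
# Sketch of the processing module's flow: intent recognition, then
# slot extraction with the submodel for that intent only.
def recognize_intent(utterance):
    # Stand-in for the intention recognition submodel / grammar matching.
    return "play_music" if "play" in utterance else "weather"

# Stand-ins for the per-intent slot extraction submodels.
SLOT_SUBMODELS = {
    "play_music": lambda u: {"artist": u.split()[-1]},
    "weather":    lambda u: {"city": u.split()[-1]},
}

def process(utterance):
    intent = recognize_intent(utterance)
    slots = SLOT_SUBMODELS[intent](utterance)  # only this intent's slots
    return intent, slots

intent, slots = process("play mozart")
assert intent == "play_music"
assert slots == {"artist": "mozart"}
assert "city" not in slots  # no slot types from unrelated intents
```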
Fig. 5 is a structural diagram of a model training apparatus according to an embodiment of the present invention. The model training apparatus provided in this embodiment may execute the processing procedure provided in the embodiment of the model training method, as shown in fig. 5, the model training apparatus 50 includes a data obtaining module 51, a data grouping module 52, and a training module 53.
The data obtaining module 51 is configured to obtain a plurality of training data and annotation data corresponding to the training data, where the annotation data includes intention classification annotation data and slot extraction annotation data;
a data grouping module 52, configured to divide the training data into a plurality of training data groups according to the intention classification annotation data;
and a training module 53, configured to train, for each training data group, a joint learning model by using the training data contained in the training data group and the slot extraction annotation data of the training data, where different intention classifications in the joint learning model correspond to different slot extraction submodels.
Further, the training module 53 is further configured to:
and aiming at each training data group, carrying out joint learning training on an intention recognition sub-model and the slot position extraction sub-model in a joint learning model by using training data contained in the training data group and intention classification marking data of the training data.
Further, the data obtaining module 51 is further configured to perform word segmentation on the training data to obtain each word vector corresponding to the training data, and to obtain the context information corresponding to the word vector;
the training module 53 is configured to perform joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification annotation data of the training data, and the slot extraction annotation data.
Further, the data obtaining module 51 is further configured to perform word segmentation on the training data through a pre-trained word vector model to obtain each word vector corresponding to the training data and the context information corresponding to the word vector;
the training module 53 is configured to perform joint learning training on the intention recognition submodel and the slot extraction submodel in the joint learning model according to each word vector corresponding to the training data, the context information corresponding to the word vector, the intention classification annotation data of the training data, and the slot extraction annotation data.
Further, the training module 53 is configured to:
for each training data group, respectively input the training data contained in the training data group into the intention recognition submodel and into the slot extraction submodel corresponding to the intention classification annotation data, to obtain an intention recognition prediction result and a slot extraction prediction result;
obtain an intention recognition loss according to the intention classification annotation data of the training data and the intention recognition prediction result, and obtain a slot extraction loss according to the slot extraction annotation data and the slot extraction prediction result;
and combine the intention recognition loss and the slot extraction loss to obtain a total loss of the joint learning model, and optimize the parameters of the intention recognition submodel and the slot extraction submodel respectively according to the total loss of the joint learning model.
The model training apparatus provided in the embodiment of the present invention may be specifically configured to execute the method embodiments provided in fig. 2 and fig. 3, and specific functions are not described herein again.
According to the model training apparatus provided by the embodiment of the present invention, a plurality of training data and the annotation data corresponding to the training data are obtained, where the annotation data includes intention classification annotation data and slot extraction annotation data; the training data is divided into a plurality of training data groups according to the intention classification annotation data; and for each training data group, the joint learning model is trained using the training data contained in that group and the slot extraction annotation data of the training data, where different intention classifications in the joint learning model correspond to different slot extraction submodels. Because the slot extraction submodels are trained separately on the training data groups of different intention classifications, a distinct slot extraction submodel is obtained for each intention classification, which prevents a slot extraction submodel from extracting slot information irrelevant to its intention classification and improves the accuracy of the slot extraction result.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device provided in the embodiment of the present invention may execute the processing flow provided in the information processing method embodiment, as shown in fig. 6, the electronic device 60 includes a memory 61, a processor 62, a computer program, and a communication interface 63; wherein the computer program is stored in the memory 61 and is configured to be executed by the processor 62 to execute the information processing method described in the above embodiment.
The electronic device in the embodiment shown in fig. 6 may be configured to execute the technical solution of the above-mentioned information processing method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device provided by the embodiment of the present invention may execute the processing flow provided by the embodiment of the model training method, as shown in fig. 7, the electronic device 70 includes a memory 71, a processor 72, a computer program, and a communication interface 73; wherein a computer program is stored in the memory 71 and is configured to be executed by the processor 72 for performing the model training method as described in the above embodiments.
The electronic device in the embodiment shown in fig. 7 may be used to implement the technical solution of the above-mentioned model training method embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
In addition, the present embodiment also provides a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the information processing method described in the above embodiment.
In addition, the present embodiment also provides a computer-readable storage medium, on which a computer program is stored, the computer program being executed by a processor to implement the model training method described in the above embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An information processing method characterized by comprising:
acquiring interactive information to be processed;
determining an intention recognition result corresponding to the interactive information;
and determining a slot position extraction result corresponding to the interactive information according to the slot position extraction submodel corresponding to the intention recognition result through a pre-trained joint learning model.
2. The method according to claim 1, wherein the determining the intention recognition result corresponding to the interactive information comprises:
acquiring the intention recognition result corresponding to the interactive information through an intention recognition submodel in the joint learning model.
3. The method according to claim 1, wherein the determining the intention recognition result corresponding to the interactive information comprises:
matching the interactive information with a preset grammar set;
and if a matching grammar can be obtained from the preset grammar set, determining the intention recognition result corresponding to the interactive information according to the matching grammar.
4. A model training method, characterized by comprising:
acquiring a plurality of training data and annotation data corresponding to the training data, wherein the annotation data comprises intention classification annotation data and slot extraction annotation data;
dividing the training data into a plurality of training data groups according to the intention classification annotation data;
and for each training data group, training a joint learning model by using the training data contained in the training data group and the slot extraction annotation data of the training data, wherein different intention classifications in the joint learning model correspond to different slot extraction submodels.
5. The method according to claim 4, wherein, for each training data group, training the joint learning model by using the training data contained in the training data group and the slot extraction annotation data of the training data further comprises:
for each training data group, performing joint learning training on an intention recognition submodel and the slot extraction submodel in the joint learning model by using the training data contained in the training data group and the intention classification annotation data of the training data.
6. The method according to claim 5, wherein the joint learning training of the intention recognition submodel and the slot extraction submodel in the joint learning model comprises:
for each training data group, respectively inputting the training data contained in the training data group into the intention recognition submodel and into the slot extraction submodel corresponding to the intention classification annotation data, to obtain an intention recognition prediction result and a slot extraction prediction result;
acquiring an intention recognition loss according to the intention classification annotation data of the training data and the intention recognition prediction result, and acquiring a slot extraction loss according to the slot extraction annotation data and the slot extraction prediction result;
and combining the intention recognition loss and the slot extraction loss to obtain a total loss of the joint learning model, and optimizing the parameters of the intention recognition submodel and the slot extraction submodel respectively according to the total loss of the joint learning model.
7. An information processing apparatus characterized by comprising:
the acquisition module is used for acquiring interactive information to be processed;
the processing module is used for determining an intention recognition result corresponding to the interactive information; and determining a slot position extraction result corresponding to the interactive information according to the slot position extraction submodel corresponding to the intention recognition result through a pre-trained joint learning model.
8. A model training apparatus, characterized by comprising:
a data acquisition module, configured to acquire a plurality of training data and annotation data corresponding to the training data, wherein the annotation data comprises intention classification annotation data and slot extraction annotation data;
a data grouping module, configured to divide the training data into a plurality of training data groups according to the intention classification annotation data;
and a training module, configured to train, for each training data group, a joint learning model by using the training data contained in the training data group and the slot extraction annotation data of the training data, wherein different intention classifications in the joint learning model correspond to different slot extraction submodels.
9. An electronic device, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any of claims 1-3 or 4-8.
10. A computer-readable storage medium, having stored thereon a computer program;
the computer program, when executed by a processor, implements the method of any of claims 1-3 or 4-8.
CN201910138086.7A 2019-02-25 2019-02-25 Information processing and model training method, device, equipment and storage medium Active CN111680514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910138086.7A CN111680514B (en) 2019-02-25 2019-02-25 Information processing and model training method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111680514A true CN111680514A (en) 2020-09-18
CN111680514B CN111680514B (en) 2024-03-01

Family

ID=72433185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910138086.7A Active CN111680514B (en) 2019-02-25 2019-02-25 Information processing and model training method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111680514B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180121415A1 (en) * 2016-11-03 2018-05-03 Conduent Business Services, Llc Probabilistic matching for dialog state tracking with limited training data
WO2018196684A1 (en) * 2017-04-24 2018-11-01 北京京东尚科信息技术有限公司 Method and device for generating conversational robot
CN108920622A (en) * 2018-06-29 2018-11-30 北京奇艺世纪科技有限公司 A kind of training method of intention assessment, training device and identification device
CN109214417A (en) * 2018-07-25 2019-01-15 百度在线网络技术(北京)有限公司 The method for digging and device, computer equipment and readable medium that user is intended to
CN109241524A (en) * 2018-08-13 2019-01-18 腾讯科技(深圳)有限公司 Semantic analysis method and device, computer readable storage medium, electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fu Bo; Chen Yiheng; Shao Yanqiu; Liu Ting: "Consumption Intent Recognition of Microblog Text Based on Users' Natural Annotations" *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112489639A (en) * 2020-11-26 2021-03-12 北京百度网讯科技有限公司 Audio signal processing method, device, system, electronic equipment and readable medium
CN112926313A (en) * 2021-03-10 2021-06-08 新华智云科技有限公司 Method and system for extracting slot position information
CN112926313B (en) * 2021-03-10 2023-08-15 新华智云科技有限公司 Method and system for extracting slot position information

Also Published As

Publication number Publication date
CN111680514B (en) 2024-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant