CN109213851B - Cross-language migration method for spoken language understanding in dialog system - Google Patents
- Publication number
- CN109213851B (granted from application CN201810724523.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention relates to the field of language processing and provides a cross-language migration method for spoken language understanding in a dialog system, aiming at the technical problem that migration results are of poor quality because semantic labels are difficult to migrate and language cultures differ. To this end, the cross-language migration method comprises: acquiring labeled source-language spoken language understanding data to be migrated; migrating the to-be-migrated data with category labels using a pre-constructed spoken language understanding migration model to obtain a first migration result with category labels; and performing culture migration on the first migration result to obtain spoken language understanding data of the target language. Based on these steps, cross-language migration of spoken language understanding data can be carried out rapidly and accurately, the poor performance of supervised training caused by insufficient bilingual data with category labels is alleviated, and the cost of collecting and labeling data for model training is reduced.
Description
Technical Field
The invention relates to the technical field of human-machine conversation, and in particular to a cross-language migration method for spoken language understanding in a dialog system under low-resource conditions.
Background
A task-oriented dialog system is a human-computer interaction system that assists a user in completing tasks in a specific domain (such as restaurants, hotels, or air tickets) through natural language interaction. A task-oriented dialog system needs four basic functions: spoken language understanding, dialog state tracking, dialog policy, and dialog generation. Among them, Spoken Language Understanding (SLU), the entry point of the entire system, is a very important technical module that understands the meaning of utterances in the context defined by the dialog system. It generally comprises three subtasks (domain classification, user intention detection, and semantic slot filling), and some systems merge domain classification with intention detection. If a dialog system is to support different languages for different markets, a large amount of training data must be collected and labeled for the SLU of each language, which is very time-consuming and labor-intensive. Therefore, migrating a single-language SLU system to other languages has high application value.
Currently, methods for SLU cross-language migration mainly fall into source-end testing and target-end training. Source-end testing translates user input in other languages into the source language and then processes it with the source language's SLU system. Target-end training migrates the training corpus of the source language to the target-language end and then trains an SLU system for the target language. Source-end testing is simple to implement, but the semantic slot values it outputs are in the source language and must be translated back to the target language before subsequent modules in the target-language dialog system can use them. Target-end training supports direct training and adjustment of the model at the target-language end, is more flexible, and spends no extra time translating each user input once the system is online.
However, each SLU corpus entry consists of a spoken text and its semantic annotation, for example: "Play <song>joy</song> in the album <album>lute phase</album>". To migrate such a source-language training corpus, not only must the text be translated into the target language, but the semantic tags in the text must also be migrated correctly, so a general-purpose machine translator cannot be used directly, while the large amount of bilingual data needed to train a specialized translator is lacking. Besides semantic tag migration, language-culture migration must also be considered; for example, a user living in London would more likely say "Call a taxi to Tower of London" than "Call a taxi to Forbidden City". Therefore, corpus migration must account not only for the migration of semantic tags but also for cultural differences.
Disclosure of Invention
In order to solve the above problems in the prior art, namely the technical problem that cross-language migration of spoken language understanding in a dialog system is difficult because semantic tags are hard to migrate and language cultures differ, the present invention provides a cross-language migration method for spoken language understanding in a dialog system.
In a first aspect, the cross-language migration method for spoken language understanding in a dialog system provided by the present invention comprises the following steps: obtaining source language spoken language understanding data to be migrated; migrating the data to be migrated by using a pre-constructed spoken language understanding migration model to obtain a first migration result; performing culture migration on the first migration result to obtain spoken language understanding data of a target language;
wherein, the method for constructing the spoken language understanding migration model comprises the following steps:
training a general machine translation model with a preset general parallel corpus to obtain a first optimized translation model; labeling a small amount of bilingual spoken language understanding corpora, performing semantic slot replacement on them, and taking the replaced corpora as the bilingual category-labeled corpus; carrying out supervised incremental training on the first optimized translation model with the bilingual category-labeled corpus to obtain a second optimized translation model; and performing source-end supervised reinforced translation incremental training on the second optimized translation model with the data to be migrated with category labels to obtain the spoken language understanding migration model. The data to be migrated with category labels is obtained by replacing the semantic slots of the source-language spoken language understanding data with category labels.
Further, in a preferred technical solution provided by the present invention, the semantic slots include semantic tags and semantic slot values, and the step of replacing the semantic slots in the labeled bilingual spoken language understanding corpus with category labels includes: replacing the semantic tags and semantic slot values in the labeled bilingual spoken language understanding corpus with the specified category labels.
Further, in a preferred technical solution provided by the present invention, the step of performing source-side supervised reinforced translation incremental training on the second optimized translation model by using the to-be-migrated data with the category label to obtain the spoken language understanding migration model includes:
step 41, inputting the monolingual data of the data to be migrated with the category labels to obtain the output of the softmax layer of the second optimized translation model, and obtaining the probability distribution with the dimension being the size of the vocabulary; sampling according to the probability distribution to obtain a sampling sentence; performing local optimal decoding according to the probability distribution to obtain a decoding sentence; respectively calculating the semantic slot retention rates of the sampling sentence and the decoding sentence to serve as a reward item and a reward regular item for training the second optimized translation model; and using the sampling sentence as a label, and optimizing the parameters of the second optimized translation model by using a policy gradient method.
Further, in a preferred technical solution provided by the present invention, the step of "calculating the semantic slot retention rates of the sampling sentence and the decoding sentence respectively" includes: and calculating the semantic slot retention rate of the sampling sentence and the decoding sentence by using the following formula according to the missing and wrong translation information of the semantic slot in the decoding sentence.
where $g(c_i, s)$ is a function counting the number of occurrences of semantic slot $s$ in the sampled sentence $c_i$, and $g(e_i, s)$ is a function counting the number of occurrences of semantic slot $s$ in the decoded sentence $e_i$.
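Concretely, the counting functions above can be folded into a slot-retention score. The exact SKR formula appears only as an image in the original patent, so the clipped-count ratio below is one plausible reading of the surrounding text, not the patent's definitive definition:

```python
from collections import Counter

def slot_retention_rate(source_slots, translated_tokens):
    """Fraction of the source sentence's category labels (e.g. '<song>')
    that survive in a candidate translation.

    Counts are clipped per label, so a translation that duplicates a
    label is not rewarded for the extra copies.  NOTE: the patent's SKR
    formula is shown only as an image; this ratio is an assumption.
    """
    if not source_slots:
        return 1.0  # nothing to retain
    need = Counter(source_slots)  # occurrences of each slot in the source
    have = Counter(t for t in translated_tokens if t in need)
    kept = sum(min(have[s], need[s]) for s in need)
    return kept / sum(need.values())

# A translation that keeps <song> but drops <album> retains half the slots:
skr = slot_retention_rate(["<song>", "<album>"],
                          "play <song> in the album".split())
```

During training, such a score would be computed once for the sampled sentence (the reward term) and once for the greedy-decoded sentence (the reward regularization term).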
Further, in a preferred embodiment of the present invention, the step of "optimizing the parameters of the second optimized translation model by using the policy gradient method with the sampling sentence as a tag" includes: optimizing said second optimized translation model using said reward function as a loss function according to a mathematical expectation of maximizing a reward:
$$ L(\theta) = \mathbb{E}_{w^g \sim p(\theta)}\big[\, r(w^g, x) \,\big] \qquad (2) $$

where $x = \{x_1, x_2, \ldots\}$ is the input sentence, $w^g = \{w_1^g, w_2^g, \ldots\}$ is the sentence generated by decoding, $w^g \sim p(\theta)$ denotes that the generated sentence obeys the probability distribution $p(\theta)$ defined by the model, and $r(w^g, x)$ denotes the reward term, i.e., the SKR value; the whole expression is the mathematical expectation of the reward term, where $\mathbb{E}$ denotes mathematical expectation.
Further, in a preferred embodiment of the present invention, the step of "using the sampling sentence as a label and optimizing the parameters of the second optimized translation model by using a policy gradient method" further includes: approximating the mathematical expectation using Monte Carlo sampling and reducing the variance of the reward function, and optimizing the parameters of the second optimized translation model by:
$$ \frac{\partial L(\theta)}{\partial y_t} \approx \big( r(w^s, x) - r(w^b, x) \big)\big( p_\theta(y_t \mid h_t) - 1_{w_t^s} \big) \qquad (3) $$

where $w^b$ is the translation result of greedy decoding, $w^s$ is the translation result of sampling decoding, $y_t$ is the input of the decoder softmax function, $h_t$ is the input vector of the fully-connected layer before the softmax function, and $1_{w_t^s}$ is the one-hot vector of $w_t^s$.
Further, in a preferred embodiment of the present invention, the step of performing culture migration on the first migration result to obtain the spoken language understanding data of the target language includes: determining the culture-dependent semantic slots in the bilingual category-labeled corpus, and collecting target-language semantic slot values for those slots; constructing a culture-dependent semantic slot database from the culture-dependent semantic slots and the target-language slot values; and replacing the category labels in the bilingual category-labeled corpus with slot values queried from the culture-dependent semantic slot database, to obtain the culture-migrated spoken language understanding data of the target language.
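A minimal sketch of this culture-migration step, replacing category labels with target-language slot values queried from a culture-dependent slot database, follows; the database contents and function names here are illustrative assumptions, not values from the patent:

```python
import random

# Hypothetical culture-dependent slot database: for each culture-sensitive
# semantic tag, slot values that sound natural in the target locale.
CULTURE_SLOT_DB = {
    "<poi>": ["Tower of London", "British Museum"],
    "<contact_name>": ["Oliver Smith", "Emily Jones"],
}

def culture_migrate(sentence, slot_db, rng=random):
    """Replace each category label in a migrated sentence with a
    target-culture slot value drawn from the database."""
    out = []
    for token in sentence.split():
        out.append(rng.choice(slot_db[token]) if token in slot_db else token)
    return " ".join(out)

migrated = culture_migrate("Call a taxi to <poi>", CULTURE_SLOT_DB,
                           rng=random.Random(0))
```

A real system would key the lookup on domain and intention as well, but the database-query pattern is the same.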
In a second aspect, the present application also provides a storage device carrying one or more programs adapted to be loaded by a processor to perform any of the methods described above.
In a third aspect, the present application also provides a processing apparatus comprising a processor and a storage device adapted to store a plurality of programs; wherein the programs are adapted to be loaded by the processor to perform any of the methods described above.
Compared with the closest prior art, the technical scheme at least has the following beneficial effects:
according to the cross-language migration method for the spoken language understanding in the voice system, the cross-language migration is performed on the existing monolingual spoken language understanding data through the pre-constructed spoken language understanding migration model, so that the acquisition of the labeled data for the spoken language understanding of the target language is realized, and the cost of collecting and labeling the data of the target language is reduced. In the process of training the spoken language understanding migration model, the migration effect of the semantic labels, namely the quality of the migration result, is effectively improved through a source-end supervised reinforced translation increment training mode. And moreover, through culture migration, the culture difference caused by direct migration is weakened, and the quality of the spoken language understanding corpus of the target language is improved.
Drawings
FIG. 1 is a diagram illustrating the main steps of a cross-language migration method for spoken language understanding in a dialog system according to an embodiment of the present invention;
FIG. 2a is an attention weight heatmap of a spoken language understanding migration model based on a cross-language migration method of spoken language understanding in a dialog system of the present invention;
FIG. 2b is an attention weight heatmap of a second optimized translation model in an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to FIG. 1, FIG. 1 illustrates the main steps of a cross-language migration method for spoken language understanding in a dialog system. As shown in FIG. 1, the main steps of the method in this embodiment are:
step 101, obtaining source language spoken language understanding data to be migrated.
In this embodiment, the method can run on an electronic device or an application platform, which obtains the source-language spoken language understanding data to be migrated. The electronic device may be a server that performs spoken language understanding data processing or task control in a task-oriented dialog system; it may obtain the data to be migrated from a terminal device or application platform communicatively connected to it. Specifically, in a human-computer dialog system, the module with the spoken language understanding function acquires the user's input through a speech recognition module and extracts its meaning within the context defined by the dialog system. The terminal device can obtain the spoken language understanding data to be migrated through a sound pickup device connected to its speech recognition module. In some interaction domains (e.g., restaurant ordering, hotel services, or air ticket booking), human-computer interaction systems assist users in completing tasks through natural language interaction. For different markets, it is desirable that such a task-oriented dialog system support different languages, i.e., that a single-language system be migrated into a multi-language system. The language data can be collected through the terminal device or obtained from a storage unit in the system; it includes source-language data and target-language data.
And 102, migrating the to-be-migrated data with the category labels by using a pre-constructed spoken language understanding migration model to obtain a first migration result with the category labels.
In this embodiment, based on the source-language spoken language understanding data acquired in step 101, cross-language migration is performed with a pre-constructed spoken language understanding migration model to obtain target-language translation data with category labels, referred to as the first migration result. The spoken language understanding migration model may be built on a deep neural network, for example a Transformer network, which completes the cross-language migration of the data to be migrated and outputs target-language translation data with category labels. The model takes source-language spoken language understanding data with category labels as input, outputs a probability distribution over target-language translations with category labels, and determines the corresponding translation of the data to be migrated from the probability distribution output by the decoder, i.e., it determines the target-language spoken language understanding data corresponding to the source-language data.
The method for constructing the spoken language understanding migration model comprises the following sub-steps of:
and a substep 1021, training a general machine translation model by using a preset general parallel corpus to obtain a first optimized translation model.
In sub-step 1021, the parallel corpora may be corpora obtained from a predetermined bilingual/multilingual corpus, including source language corpora and target language corpora.
The text of the source-language corpus and its parallel target-language translation form the parallel corpus; the alignment of a parallel corpus can be at the word, sentence, paragraph, or document level. The source-language and target-language corpora in the parallel corpus are used as sample data to train a translation model, and the first optimized translation model obtained by this training can then perform translation.
And a substep 1022 of labeling the bilingual parallel spoken language understanding corpus in the bilingual parallel corpus, replacing the semantic slots in the labeled bilingual parallel spoken language understanding corpus with category labels, and using the replaced bilingual spoken language understanding corpus as the bilingual category label corpus.
Here, a small number of bilingual parallel spoken language understanding corpora may be labeled manually. The labeling may extract a small amount of target-language data from the general bilingual parallel corpus of sub-step 1021, add spoken language understanding annotations, and then manually translate it into the source language, yielding labeled bilingual spoken language understanding corpora; alternatively, a small number of samples may be extracted from the source-language spoken language understanding data to be migrated obtained in step 101, annotated for spoken language understanding, and translated into the target language to obtain labeled bilingual spoken language understanding data.
The labeling of the bilingual parallel spoken language understanding corpus can mark the domain, semantic slots, and user intention of the data; after labeling, the labeled spoken language understanding data to be migrated is obtained. The data then undergoes category label replacement, which means that each semantic slot as a whole is replaced by its category label. For example, for the spoken language understanding data "Play <song>joy</song> in the album <album>lute phase</album>", <song></song> and <album></album> are semantic tags, while "<album>lute phase</album>" and "<song>joy</song>" are two different semantic slots; the result after replacement is "Play <song> in <album>", where <song> is a category label and <album> is a category label.
By way of example, the source-language text in the labeled bilingual spoken language understanding data is: "I want to dial the telephone number of <contact_name>Xiaoxia Bai</contact_name>"; the corresponding target-language text is: "I would love to make a call to <contact_name>Xiaoxia Bai</contact_name>'s number please". After semantic slot replacement, the source-language text of the resulting bilingual parallel spoken language understanding data is: "I want to dial the telephone number of <contact_name>", and the target-language text is: "I would love to make a call to <contact_name>'s number please".
Further, in a preferred technical solution provided in this embodiment, the semantic slots include semantic tags and semantic slot values, and the step of "replacing semantic slots in the labeled bilingual spoken language understanding corpus with category labels" includes: replacing the semantic tags and semantic slot values in the labeled bilingual spoken language understanding corpus with the specified category labels. Here, the semantic slots in the bilingual spoken language understanding corpus, i.e., the semantic tags together with the semantic slot values, may be uniformly replaced with designated category labels; for example, "Play <song>joy</song> in the album <album>lute phase</album>" is replaced with "Play <song> in the album <album>".
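Given the XML-like annotation format used in the examples, the whole-slot replacement (tag plus value collapsed to the bare category label) can be sketched with a regular expression; the helper name is illustrative:

```python
import re

# Match one annotated semantic slot, e.g. "<song>joy</song>":
# an opening tag, its slot value, and the matching closing tag.
SLOT_RE = re.compile(r"<(?P<tag>\w+)>.*?</(?P=tag)>")

def to_category_labels(annotated):
    """Collapse every semantic slot to its bare category label."""
    return SLOT_RE.sub(lambda m: "<" + m.group("tag") + ">", annotated)

labeled = to_category_labels(
    "Play <song>joy</song> in the album <album>lute phase</album>")
# "Play <song> in the album <album>"
```

The non-greedy `.*?` with a backreference to the tag name keeps each replacement confined to a single slot.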
And a substep 1023 of performing supervised incremental training on the first optimized translation model by using the bilingual category labeled corpus to obtain a second optimized translation model.
Here, the bilingual category-labeled corpus is used as training data, and all category labels are added to both the source-language and target-language vocabularies of the translation model to train the first optimized translation model, enabling it to further learn how to translate data with category labels; the trained second optimized translation model can then translate corpora with category labels.
And a substep 1024 of performing source-end supervised reinforced translation incremental training on the second optimized translation model by using the data to be migrated with the category label to obtain a spoken language understanding migration model.
Here, the language data acquired in step 101 may be used to perform source-end supervised reinforced translation incremental training on the second optimized translation model. Generally, the source-language spoken language understanding data to be migrated is unlabeled; spoken language understanding labeling can be performed manually, i.e., the domain, semantic slots, and user intention are annotated, yielding labeled data to be migrated with category labels. The semantic slots are then replaced with category labels, and the result is used as training sample data.
Further, in a preferred technical solution provided in this embodiment, the step of performing source-end supervised reinforced translation incremental training on the second optimized translation model by using the to-be-migrated data with the category label to obtain the spoken language understanding migration model includes:
inputting the monolingual data to be migrated with the category labels to obtain the output of the softmax layer of the second optimized translation model, and obtaining the probability distribution with the dimension being the size of the vocabulary; sampling according to the probability distribution to obtain a sampling sentence; performing local optimal decoding according to the probability distribution to obtain a decoding sentence; respectively calculating the semantic slot retention rates of the sampling sentence and the decoding sentence to serve as a reward item and a reward regular item for training the second optimized translation model; and taking the sampling sentence as a label, and optimizing the parameters of the second optimized translation model by using a policy gradient method to obtain a spoken language understanding migration model.
The semantic slots are slots that parse user input into semantics predefined for different scenarios in Natural Language Understanding (NLU). For example, in the corpus "Play <song>joy</song> in the album <album>lute phase</album>", a semantic slot consists of two parts: a semantic tag and the content it labels. "<album>lute phase</album>" is a semantic slot whose slot value is "lute phase"; "<song>joy</song>" is a semantic slot whose slot value is "joy".
Further, in a preferred technical solution provided in this embodiment, the step of "calculating the semantic slot retention rates of the sampling sentence and the decoding sentence respectively" includes:
according to the missing and wrong information of the semantic slots in the decoding sentences, calculating the semantic slot retention rates of the sampling sentences and the decoding sentences by using the following formulas:
where $g(c_i, s)$ is a function counting the number of occurrences of semantic slot $s$ in the sampled sentence $c_i$, and $g(e_i, s)$ is a function counting the number of occurrences of semantic slot $s$ in the decoded sentence $e_i$.
In order to directly optimize the spoken language understanding migration model, the translation problem is cast into the framework of reinforcement learning: the second optimized translation model is the agent, which interacts with an external environment, namely the words. The network parameters of the model define the policy; executing the policy, i.e., the agent's action, means predicting the next word of the generated sequence at each time step. After each action, the agent updates its internal state, i.e., the attention weights. Once the agent generates the terminator EOS, i.e., a sentence translation is complete, the agent observes the reward signal. In this task the reward signal is the semantic slot retention rate of the spoken language understanding migration model. During training the agent selects actions according to the current policy and observes the reward only after the terminator is generated; the reward is calculated from the sampled sentence and the decoded sentence.
Here, the policy is an abstract concept jointly defined by the internal parameters of the whole translation model. Viewing the model as a function y = f(x) with input x and output y, the function f is the policy, the output y is the action, and the input x is the current state; the translation model determines its output from its parameters and the current input (including the hidden variables of the already-translated part), just as a reinforcement learning agent determines its action from the policy and the state. The size of the action space is the size of the whole vocabulary; choosing different words as the translation at the current moment means taking different actions.
Further, in a preferred technical solution provided in this embodiment, the step of "using the sampled sentence as a label and optimizing the parameters of the second optimized translation model by a policy gradient method" includes:
optimizing said second optimized translation model using said reward function as a loss function, according to the mathematical expectation of maximizing the reward:

L(θ) = -E_{w^g ~ p(θ)}[ r(w^g, x) ]    (2)

wherein x = x_1, x_2, ... is the input sentence, w^g is the sentence generated by decoding, w^g ~ p(θ) denotes that the generated sentence obeys the distribution p(θ), and r(w^g, x) denotes the reward term; equation (2) is the (negated) mathematical expectation of the reward term, where the symbol E denotes mathematical expectation, so that minimizing the loss maximizes the expected reward.
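To make the expectation concrete, here is a toy Monte Carlo estimate of E_{w^g ~ p(θ)}[r(w^g, x)] under a hypothetical model distribution over just two candidate translations; the candidates, probabilities and SKR rewards are invented for the example:

```python
import random

def expected_reward(sample_fn, reward_fn, n=20000, rng=None):
    # Monte Carlo estimate of E_{w^g ~ p(theta)}[ r(w^g, x) ]: draw
    # sentences from the model distribution and average their rewards.
    rng = rng or random.Random(0)
    return sum(reward_fn(sample_fn(rng)) for _ in range(n)) / n

# Hypothetical model distribution: two candidate translations, p = 0.7 / 0.3.
def sample_fn(rng):
    return "keeps-all-slots" if rng.random() < 0.7 else "drops-a-slot"

# Hypothetical SKR rewards for the two candidates.
rewards = {"keeps-all-slots": 1.0, "drops-a-slot": 0.5}

est = expected_reward(sample_fn, rewards.get, rng=random.Random(1))
# The true expectation is 0.7 * 1.0 + 0.3 * 0.5 = 0.85; the estimate is close.
print(est)
```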
Further, the step of using the sampled sentence as a label and optimizing the parameters of the second optimized translation model by a policy gradient method further includes:
approximating the mathematical expectation by Monte Carlo sampling, reducing the variance of the reward function, and optimizing the parameters of the second optimized translation model by:

∂L(θ)/∂y_t = (r(w^s, x) - r(w^b, x)) · (softmax(y_t) - 1(w^s_t))    (3)

wherein w^b is the translation result of greedy decoding, w^s is the translation result of sampled decoding, y_t is the input of the softmax function, h_t is the input vector of the fully connected layer before the softmax function (so that softmax(y_t) = p(w_t | h_t)), and 1(w^s_t) is the one-hot vector of the sampled word w^s_t.
The expectation in equation (2) can be approximated by Monte Carlo sampling; meanwhile, a baseline (regularization) term is introduced to reduce the variance caused by the sampling, chosen as the semantic slot retention rate of the translation produced by greedy decoding. Equation (3) is then obtained by the chain rule of gradients, and the second optimized translation model is optimized according to equation (3).
In this optimization, the second optimized translation model first decodes the current input sentence with a greedy algorithm to obtain the sentence w^b, from which the reward r^b is calculated; the current second optimized translation model then performs sampled decoding of the same input to obtain the sentence w^s, from which the reward r^s is calculated. Using r^b, r^s and w^s, gradient optimization is performed on the second optimized translation model and its parameters are updated, yielding the spoken language understanding migration model.
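A minimal sketch of this self-critical update at a single time step, assuming the gradient takes the standard form (r(w^s) - r(w^b)) · (softmax(y_t) - onehot(w^s_t)); the scores and rewards are toy values:

```python
import math

def softmax(y):
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    z = sum(exps)
    return [e / z for e in exps]

def policy_gradient_step(y_t, sampled_id, r_s, r_b):
    # Gradient of the loss w.r.t. the pre-softmax scores y_t at one step:
    # (r_s - r_b) * (softmax(y_t) - onehot(sampled_id)), where r_s is the
    # SKR reward of the sampled translation w^s and r_b that of the greedy
    # baseline translation w^b.
    p = softmax(y_t)
    return [(r_s - r_b) * (p[i] - (1.0 if i == sampled_id else 0.0))
            for i in range(len(y_t))]

# If the sampled sentence keeps more semantic slots than the greedy baseline
# (r_s > r_b), gradient descent lowers the loss by raising the score of the
# sampled word (its gradient component is negative).
grad = policy_gradient_step([1.0, 2.0, 0.5], sampled_id=0, r_s=0.9, r_b=0.6)
print(grad)
```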
Step 103: performing cultural migration on the first migration result to obtain the spoken language understanding data of the target language.
In this embodiment, the object of the cultural migration is the semantic slot: cultural migration is performed on the semantic slots in the first migration result to obtain the spoken language understanding data of the target language. The step of performing cultural migration on the first migration result comprises the following steps:
determining the culture-dependent semantic slots in the parallel corpus, and collecting target language semantic slot values for these culture-dependent semantic slots;
constructing a semantic slot database from the culture-dependent semantic slots and the collected semantic slot values; and replacing, by way of database query, the category labels in the bilingual category-labeled corpus with semantic slots from the culture-dependent semantic slot database, to obtain the spoken language understanding data of the target language after cultural migration.
The category labels in the first migration result can be replaced at random according to the database query results, that is, each category label is replaced by a corresponding semantic slot; for example, "Play <song> in <album>" is replaced by "Play <song>Thunder</song> in <album>Evolve</album>".
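This replacement can be sketched as a dictionary lookup; the slot database contents below are hypothetical example values, not data from the patent:

```python
import random
import re

# Hypothetical culture-dependent semantic slot database for the target side.
slot_db = {
    "song": ["Thunder", "Believer"],
    "album": ["Evolve", "Origins"],
}

def cultural_migrate(labeled, rng=None):
    # Replace each bare category label "<slot>" with a value drawn at random
    # from the slot database, wrapped in <slot>...</slot> tags. re.sub scans
    # the original string once, so inserted tags are not re-matched.
    rng = rng or random.Random(0)
    pattern = re.compile("<(%s)>" % "|".join(map(re.escape, slot_db)))
    def fill(match):
        slot = match.group(1)
        value = rng.choice(slot_db[slot])
        return "<{0}>{1}</{0}>".format(slot, value)
    return pattern.sub(fill, labeled)

migrated = cultural_migrate("Play <song> in <album>")
print(migrated)
```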
By way of example, the technical scheme of the invention is compared with a translate-then-align method (word alignment after translation) and a category translation method that directly translates sentences carrying semantic labels. The evaluation indexes comprise the semantic slot retention rate SKR, the F1 value of the semantic slot filling task, and the accuracy of domain classification. The test data are 1500 pieces of English data, the data to be migrated are 3000 pieces of Chinese data, and attached Tables 1 and 2 give the results of incremental training with 1200 pieces and 90 pieces of SLU bilingual data, respectively. The technical scheme at least has the following beneficial effects:
Attached Table 1: experimental results of the present invention, the translation method and the category translation method with 1200 pieces of SLU bilingual data:
Attached Table 2: experimental results of the present invention, the translation method and the category translation method with 90 pieces of SLU bilingual data:
In the above attached Tables 1 and 2, SKR, Slot_F1 and Dom_Acc denote, respectively, the evaluation indexes in spoken language understanding data translation: the retention rate of semantic slots, the F1 value of the semantic slot filling task, and the accuracy of domain classification.
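For illustration, the SKR index can be computed by comparing slot occurrence counts between a source sentence and its translation; the exact counting scheme below (a min-over-counts ratio) is an assumption, since the patent gives the formula only by reference:

```python
import re
from collections import Counter

def slot_counts(sentence):
    # Count each semantic slot label occurring as an opening tag <slot>;
    # closing tags </slot> are not matched because of the leading slash.
    return Counter(re.findall(r"<(\w+)>", sentence))

def skr(source, translation):
    # Semantic slot retention rate: the fraction of slot occurrences in the
    # source that survive in the translation; missing slots lower the score.
    src = slot_counts(source)
    tgt = slot_counts(translation)
    total = sum(src.values())
    if total == 0:
        return 1.0  # nothing to retain
    kept = sum(min(n, tgt[s]) for s, n in src.items())
    return kept / total

print(skr("Play <song> in <album>", "Play <song> in <album>"))  # 1.0
print(skr("Play <song> in <album>", "Play <song>"))             # 0.5
```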
Referring to fig. 2a and 2b, which compare the present invention with category-based translation on attention weight heat maps during translation, it can be seen that with the method of the present invention the spoken language understanding migration model attends to the category labels at the correct positions.
The present application also provides a storage device carrying one or more programs adapted to be loaded and executed by a processor; when executed, the programs implement any of the methods in the above embodiments.
The present application further provides a processing apparatus comprising a processor adapted to execute various programs, and a storage device adapted to store a plurality of programs, wherein the programs are adapted to be loaded and executed by the processor to implement any of the methods in the above embodiments.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
Claims (9)
1. A method for cross-language migration of spoken language understanding in a dialog system, the method comprising:
obtaining source language spoken language understanding data to be migrated;
migrating the source language spoken language understanding data to be migrated by using a pre-constructed spoken language understanding migration model to obtain a first migration result;
performing cultural migration on the first migration result to obtain spoken language understanding data of a target language; wherein the construction of the spoken language understanding migration model comprises the following steps:
training a general machine translation model by using a preset general parallel corpus to obtain a first optimized translation model; wherein the parallel corpus is formed by source language texts paired with their corresponding parallel target language translations;
labeling bilingual parallel spoken language understanding corpora in the parallel corpus, replacing the whole semantic slots in the labeled bilingual spoken language understanding corpora with category labels, and taking the replaced bilingual spoken language understanding corpora as the bilingual category-labeled corpus;
carrying out supervised incremental training on the first optimized translation model by using the bilingual category labeled corpus to obtain a second optimized translation model;
performing source-side supervised reinforcement translation incremental training on the second optimized translation model by using the data to be migrated with category labels, to obtain the spoken language understanding migration model; wherein the data to be migrated with category labels is obtained by replacing the semantic slots of the source language spoken language understanding data with category labels.
2. The method for cross-language migration of spoken language understanding in a dialog system according to claim 1, characterized in that the semantic slots comprise semantic tags and semantic slot values, and
the step of replacing the whole semantic slots in the labeled bilingual spoken language understanding corpus with category labels comprises: replacing the labeled semantic labels and semantic slot values in the bilingual spoken language understanding corpus with the specified category labels.
3. The method for cross-language migration of spoken language understanding in a dialog system according to claim 1, wherein the step of performing source-side supervised reinforcement translation incremental training on the second optimized translation model using the data to be migrated with category labels to obtain the spoken language understanding migration model comprises:
inputting the monolingual data of the data to be migrated with category labels into the second optimized translation model, and obtaining from the output of its softmax layer a probability distribution whose dimension is the vocabulary size;
sampling according to the probability distribution to obtain a sampling sentence;
performing locally optimal (greedy) decoding according to the probability distribution to obtain a decoded sentence;
respectively calculating the semantic slot retention rates SKR of the sampled sentence and the decoded sentence, to serve as the reward term and the reward baseline (regularization) term for training the second optimized translation model;
and using the sampled sentence as a label, optimizing the parameters of the second optimized translation model by a policy gradient method.
4. The method for cross-language migration of spoken language understanding in dialog systems according to claim 3, characterized in that the step of "calculating semantic slot retention SKR of the sampled sentence and the decoded sentence, respectively" comprises:
calculating, according to the missing and wrong semantic slot information in the decoded sentences, the semantic slot retention rates of the sampled sentences and the decoded sentences using the following formula:
wherein g(c_i, s) is a function counting the number of occurrences of the semantic slot s in the sampled sentence c_i, and g(e_i, s) is a function counting the number of occurrences of the semantic slot s in the decoded sentence e_i.
5. The method for cross-language migration of spoken language understanding in a dialog system according to claim 4, wherein the step of optimizing the parameters of the second optimized translation model by a policy gradient method with the sampled sentence as a label comprises:
optimizing the second optimized translation model, according to the mathematical expectation of maximizing the reward, using the following reward function as the loss function:

L(θ) = -E_{w^g ~ p(θ)}[ r(w^g, x) ]    (2)

wherein x = x_1, x_2, ... is the input sentence, w^g is the sentence generated by decoding, w^g ~ p(θ) denotes that the generated sentence obeys the distribution p(θ), r(w^g, x) denotes the reward term, i.e., the SKR value, the whole equation is the mathematical expectation of the reward term, and E denotes mathematical expectation.
6. The method for cross-language migration of spoken language understanding in dialog systems according to claim 5, wherein the step of optimizing the parameters of the second optimized translation model using a policy gradient method with the sample sentence as a label further comprises:
approximating the mathematical expectation by Monte Carlo sampling, reducing the variance of the reward function, and optimizing the parameters of the second optimized translation model by:

∂L(θ)/∂y_t = (r(w^s, x) - r(w^b, x)) · (softmax(y_t) - 1(w^s_t))    (3)

wherein w^b is the translation result of greedy decoding, w^s is the translation result of sampled decoding, y_t is the input of the softmax function, and 1(w^s_t) is the one-hot vector of w^s_t.
7. The method for cross-language migration of spoken language understanding in a dialog system according to claim 1, wherein the step of performing cultural migration on the first migration result to obtain the spoken language understanding data of the target language comprises:
determining the culture-dependent semantic slots in the bilingual category-labeled corpus, and collecting target language semantic slot values for the culture-dependent semantic slots;
constructing a culture-dependent semantic slot database from the culture-dependent semantic slots and the target language semantic slot values;
and replacing, by way of database query, the category labels in the bilingual category-labeled corpus with semantic slots from the culture-dependent semantic slot database, to obtain the spoken language understanding data of the target language after cultural migration.
8. A storage device having a plurality of programs stored therein, wherein the programs are adapted to be loaded by a processor to execute the method for cross-language migration of spoken language understanding in a dialog system according to any one of claims 1-7.
9. A processing apparatus, comprising a processor adapted to execute various programs, and a storage device adapted to store a plurality of programs; wherein the programs are adapted to be loaded by the processor to execute the method for cross-language migration of spoken language understanding in a dialog system according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810724523.9A CN109213851B (en) | 2018-07-04 | 2018-07-04 | Cross-language migration method for spoken language understanding in dialog system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109213851A CN109213851A (en) | 2019-01-15 |
CN109213851B true CN109213851B (en) | 2021-05-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||