CN109213851B - Cross-language migration method for spoken language understanding in dialog system - Google Patents

Info

Publication number
CN109213851B
CN109213851B CN201810724523.9A CN201810724523A CN109213851B CN 109213851 B CN109213851 B CN 109213851B CN 201810724523 A CN201810724523 A CN 201810724523A CN 109213851 B CN109213851 B CN 109213851B
Authority
CN
China
Prior art keywords
migration
spoken language
language
language understanding
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810724523.9A
Other languages
Chinese (zh)
Other versions
CN109213851A (en
Inventor
周玉
白赫
张家俊
宗成庆
赵亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Mobvoi Information Technology Co Ltd
Original Assignee
Institute of Automation of Chinese Academy of Science
Mobvoi Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science and Mobvoi Information Technology Co Ltd
Priority to CN201810724523.9A
Publication of CN109213851A
Application granted
Publication of CN109213851B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the field of language processing and provides a cross-language migration method for spoken language understanding in a dialog system. It aims to solve the technical problem that, in cross-language migration of spoken language understanding in a dialog system, migration results are of poor quality because semantic tags are difficult to migrate and because of differences in language culture. To this end, the cross-language migration method for spoken language understanding of the invention comprises: acquiring labeled spoken language understanding data to be migrated; migrating the to-be-migrated data with category labels by using a pre-constructed spoken language understanding migration model to obtain a first migration result with category labels; and performing culture migration on the first migration result to obtain spoken language understanding data of the target language. Based on these steps, cross-language migration of spoken language understanding data can be carried out rapidly and accurately, the poor performance of supervised training methods caused by insufficient bilingual category-labeled data is addressed, and the cost of data collection and labeling in model training is reduced.

Description

Cross-language migration method for spoken language understanding in dialog system
Technical Field
The invention relates to the technical field of human-machine dialog, and in particular to a cross-language migration method for spoken language understanding in a dialog system under low-resource conditions.
Background
A task-oriented dialog system is a human-machine interaction system that assists a user in completing tasks in a specific domain (such as restaurants, hotels, or air tickets) through natural language interaction. Such a system needs four basic functions: spoken language understanding, dialog state tracking, dialog policy, and dialog generation. Among them, Spoken Language Understanding (SLU), as the entry point of the entire system, is a very important module that interprets the meaning of utterances within the context defined by the human-machine dialog system. It generally comprises three subtasks: domain classification, user intent detection, and semantic slot filling; some systems combine domain classification and intent detection. If a dialog system is to support different languages for different markets, a large amount of training data must be collected and labeled for the SLU of each language, which is very time-consuming and labor-intensive. Migrating a single-language SLU system to other languages therefore has high practical value.
Currently, methods for SLU cross-language migration mainly include source-side testing and target-side training. Source-side testing translates user input in another language into the source language and then processes it with the source-language SLU system. Target-side training migrates the source-language training corpus to the target language and then trains an SLU system for the target language. Source-side testing is simple to implement, but the semantic slot results it outputs are in the source language and must be translated back into the target language before subsequent modules of the target-language dialog system can use them. Target-side training supports direct training and tuning of the model on the target-language side, is more flexible, and requires no extra translation time for each user input once the system is online.
However, each SLU corpus entry consists of a spoken text and its semantic annotation, for example: "play <song>cheer</song> in the album <album>lute phase</album>". To migrate such a source-language training corpus, the text must be translated into the target language and the semantic tags in the text must also be migrated correctly. A general-purpose machine translation system therefore cannot be used directly, and the large amount of bilingual data needed to train a dedicated translation system is lacking. Besides semantic tag migration, language and culture migration must also be considered; for example, a user living in London is more likely to say "Call a taxi to Tower of London" than "Call a taxi to Forbidden City". The corpus migration process therefore has to account for both semantic tag migration and cultural differences.
Disclosure of Invention
In order to solve the above problems in the prior art, that is, to solve the technical problem that in a dialog system, cross-language migration of spoken language understanding is difficult due to difficulty in migration of semantic tags and differences in language culture, the present invention provides a cross-language migration method of spoken language understanding in a dialog system, so as to solve the above technical problem.
In a first aspect, the cross-language migration method for spoken language understanding in a dialog system provided by the present invention comprises the following steps: obtaining source language spoken language understanding data to be migrated; migrating the data to be migrated by using a pre-constructed spoken language understanding migration model to obtain a first migration result; performing culture migration on the first migration result to obtain spoken language understanding data of a target language;
wherein, the method for constructing the spoken language understanding migration model comprises the following steps:
training a general machine translation model by using a preset general parallel corpus to obtain a first optimized translation model; labeling a small amount of bilingual spoken language understanding corpus, then replacing the semantic slots in the labeled corpus with category labels, and taking the replaced corpus as the bilingual category labeled corpus; performing supervised incremental training on the first optimized translation model by using the bilingual category labeled corpus to obtain a second optimized translation model; performing source-end supervised reinforced translation incremental training on the second optimized translation model by using the data to be migrated with the category labels to obtain the spoken language understanding migration model; wherein the data to be migrated with the category labels is obtained by replacing the semantic slots of the source language spoken language understanding data with category labels.
Further, in a preferred technical solution provided by the present invention, the semantic slots include semantic tags and semantic slot values, and the step of replacing the semantic slots in the labeled bilingual spoken language understanding corpus with category labels includes: replacing the semantic tags and semantic slot values in the labeled bilingual spoken language understanding corpus with the specified category labels.
Further, in a preferred technical solution provided by the present invention, the step of performing source-side supervised reinforced translation incremental training on the second optimized translation model by using the to-be-migrated data with the category label to obtain the spoken language understanding migration model includes:
step 41, inputting the monolingual data of the data to be migrated with the category labels into the second optimized translation model and taking the output of its softmax layer, which is a probability distribution whose dimension is the size of the vocabulary; sampling from the probability distribution to obtain a sampling sentence; performing locally optimal (greedy) decoding according to the probability distribution to obtain a decoding sentence; calculating the semantic slot retention rates of the sampling sentence and the decoding sentence respectively, to serve as the reward term and the reward regularization term for training the second optimized translation model; and using the sampling sentence as the label, optimizing the parameters of the second optimized translation model by a policy gradient method.
Further, in a preferred technical solution provided by the present invention, the step of "calculating the semantic slot retention rates of the sampling sentence and the decoding sentence respectively" includes: calculating the semantic slot retention rates of the sampling sentence and the decoding sentence by the following formula, according to the missing and mistranslated semantic slots in the decoding sentence:
[Formula image BDA0001719341650000031]
where $g(c_i, s)$ is a function that counts the occurrences of semantic slot $s$ in the sampling sentence $c_i$, and $g(e_i, s)$ is a function that counts the occurrences of semantic slot $s$ in the decoding sentence $e_i$.
Further, in a preferred embodiment of the present invention, the step of "optimizing the parameters of the second optimized translation model by using the policy gradient method with the sampling sentence as a tag" includes: optimizing said second optimized translation model using said reward function as a loss function according to a mathematical expectation of maximizing a reward:
[Formula image BDA0001719341650000032]
where $x = x_1, x_2, \ldots$ is the input sentence, $w^g$ is the sentence generated by decoding, $w^g \sim p(\theta)$ indicates that the generated sentence follows the distribution $p(\theta)$, $r(w^g, x)$ is the reward term, i.e., the SKR (semantic slot retention rate) value, and $\mathbb{E}$ denotes the mathematical expectation; the whole expression is the mathematical expectation of the reward term.
Further, in a preferred embodiment of the present invention, the step of "using the sampling sentence as a label and optimizing the parameters of the second optimized translation model by using a policy gradient method" further includes: approximating the mathematical expectation by Monte Carlo sampling, reducing the variance of the reward estimate, and optimizing the parameters of the second optimized translation model by the following formula:
[Formula image BDA0001719341650000034]
where $w^b$ is the translation result of greedy decoding, $w^s$ is the translation result of sampling-based decoding, $y_t$ is the input of the decoder softmax function, $h_t$ is the input vector of the fully-connected layer before the softmax function, and the symbol in image BDA0001719341650000041 is the one-hot vector of the symbol in image BDA0001719341650000042.
Further, in a preferred embodiment of the present invention, the step of performing culture migration on the first migration result to obtain the spoken language understanding data of the target language includes: determining the culture-dependent semantic slots in the bilingual category labeled corpus, and collecting target language semantic slot values for those culture-dependent semantic slots; constructing a culture-dependent semantic slot database from the culture-dependent semantic slots and the target language semantic slot values; and replacing, by database query, the category labels in the bilingual category labeled corpus with semantic slots from the culture-dependent semantic slot database, to obtain the spoken language understanding data of the target language after culture migration.
In a second aspect, the present application also provides a storage device storing one or more programs adapted to be loaded by a processor to perform any of the methods described above.
In a third aspect, the present application also provides a processing apparatus comprising a processor and a storage device, the storage device being adapted to store a plurality of programs; wherein the programs are adapted to be loaded by the processor to perform any of the methods described above.
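The database-query replacement used for culture migration can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `culture_migrate`, the in-memory dictionary standing in for the culture-dependent semantic slot database, the `landmark` label, and the example slot values are all hypothetical.

```python
import random
import re

# Hypothetical stand-in for the culture-dependent semantic slot database:
# category label -> slot values collected for the target language.
SLOT_DB = {
    "landmark": ["Tower of London", "Big Ben"],
}

# A bare category label such as <landmark> (closing tags do not match).
LABEL = re.compile(r"<(\w+)>")


def culture_migrate(sentence, slot_db, rng):
    """Replace each category label with a tagged target-language slot value."""

    def fill(match):
        tag = match.group(1)
        values = slot_db.get(tag)
        if not values:
            # Label is not culture-dependent (not in the database): keep it.
            return match.group(0)
        return f"<{tag}>{rng.choice(values)}</{tag}>"

    return LABEL.sub(fill, sentence)


migrated = culture_migrate("Call a taxi to <landmark>", SLOT_DB, random.Random(0))
```

Labels that are missing from the database are left untouched here, on the assumption that only culture-dependent slots are filled in this step.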
Compared with the closest prior art, the technical scheme at least has the following beneficial effects:
According to the cross-language migration method for spoken language understanding in a dialog system of the present invention, cross-language migration is performed on existing monolingual spoken language understanding data through the pre-constructed spoken language understanding migration model, so that labeled spoken language understanding data for the target language is obtained and the cost of collecting and labeling target-language data is reduced. During training of the spoken language understanding migration model, source-end supervised reinforced translation incremental training effectively improves the migration of semantic tags, that is, the quality of the migration result. In addition, culture migration reduces the cultural mismatch introduced by direct migration and improves the quality of the target-language spoken language understanding corpus.
Drawings
FIG. 1 is a diagram illustrating the main steps of a cross-language migration method for spoken language understanding in a dialog system according to an embodiment of the present invention;
FIG. 2a is an attention weight heatmap of a spoken language understanding migration model based on a cross-language migration method of spoken language understanding in a dialog system of the present invention;
FIG. 2b is an attention weight heatmap of a second optimized translation model in an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to FIG. 1, FIG. 1 illustrates the main steps of a cross-language migration method for spoken language understanding in a dialog system. As shown in FIG. 1, the main steps of the method in this embodiment are:
Step 101, obtaining source language spoken language understanding data to be migrated.
In this embodiment, an electronic device or application platform to which the cross-language migration method for spoken language understanding in a dialog system is applied obtains the source language spoken language understanding data to be migrated. The electronic device may be a server that performs spoken language understanding data processing or task control in a task-oriented dialog system, and it may obtain the data to be migrated from a terminal device or application platform communicatively connected to it. Specifically, in a human-machine dialog system, the module with the spoken language understanding function receives the user's input through a speech recognition module and interprets that input within the context defined by the dialog system. The terminal device can capture the spoken language understanding data to be migrated through a sound pickup device in the speech recognition module connected to it. Task-oriented dialog systems assist users in completing tasks through natural language interaction in particular domains (e.g., restaurant ordering, hotel services, or airline ticket booking). It is desirable for such a system to support different languages for different markets, that is, for a single-language system to be migrated to other languages and become a multilingual system. The language data can be collected through the terminal device or read from a storage unit in the system; it includes source language data and target language data.
Step 102, migrating the to-be-migrated data with the category labels by using a pre-constructed spoken language understanding migration model to obtain a first migration result with the category labels.
In this embodiment, the source language spoken language understanding data to be migrated acquired in step 101 is migrated across languages by the pre-constructed spoken language understanding migration model, yielding target-language translation data with category labels, referred to as the first migration result. The spoken language understanding migration model may be built on a deep neural network, for example a Transformer network model. The model takes source-language spoken language understanding data with category labels as input and outputs a probability distribution over target-language translations with category labels; the translation of the data to be migrated, that is, the target-language spoken language understanding data corresponding to the source-language data, is determined from the probability distribution output by the decoder.
The method for constructing the spoken language understanding migration model comprises the following sub-steps of:
Sub-step 1021, training a general machine translation model by using a preset general parallel corpus to obtain a first optimized translation model.
In sub-step 1021, the parallel corpus may be obtained from a preset bilingual or multilingual corpus and includes source language text and target language text.
A source language text and its corresponding target language translation together form a parallel corpus entry; the alignment of the parallel corpus may be at the word, sentence, paragraph, or document level. A translation model is trained with the source language and target language texts in the parallel corpus as sample data, and the resulting first optimized translation model can be used for translation.
Sub-step 1022, labeling the bilingual parallel spoken language understanding corpus in the bilingual parallel corpus, replacing the semantic slots in the labeled bilingual parallel spoken language understanding corpus with category labels, and using the replaced corpus as the bilingual category labeled corpus.
Here, a small amount of bilingual parallel spoken language understanding corpus may be labeled manually. One option is to extract a small number of samples from the target language data of the general bilingual parallel corpus used in sub-step 1021, annotate them for spoken language understanding, and manually translate them into the source language, thereby obtaining a labeled bilingual spoken language understanding corpus. Another option is to extract a small number of samples from the source language spoken language understanding data to be migrated obtained in step 101, annotate them for spoken language understanding, and translate them into the target language to obtain labeled bilingual spoken language understanding data.
The labeling of the bilingual parallel spoken language understanding corpus may annotate the domain, the semantic slots, and the user intent of the data. The semantic slots in the labeled data are then replaced with category labels, meaning that each semantic slot as a whole is replaced with a category label. For example, in the spoken language understanding data "play <song>cheer</song> in the album <album>lute phase</album>", <song></song> and <album></album> are semantic tags, and <album>lute phase</album> and <song>cheer</song> are two different semantic slots; the result after replacement is "play <song> in the album <album>", where <song> and <album> are category labels.
As a further example, the source language text in the labeled bilingual spoken language understanding data is "I want to dial the telephone number of <contact_name>white Xiaoxiana</contact_name>", and the corresponding target language text is "I would love to make a call to <contact_name>Xaoxaiia Bai</contact_name>'s number please". After semantic slot replacement, the source language text in the resulting bilingual parallel spoken language understanding data is "I want to dial the telephone number of <contact_name>", and the target language text is "I would love to make a call to <contact_name>'s number please".
Further, in a preferred technical solution provided in this embodiment, the semantic slots include semantic tags and semantic slot values, and the step of "replacing semantic slots in the labeled bilingual spoken language understanding corpus with category labels" includes: replacing the semantic tags and semantic slot values in the labeled bilingual spoken language understanding corpus with the specified category labels. That is, each semantic slot in the bilingual spoken language understanding corpus, meaning the semantic tag together with its slot value, is uniformly replaced with the designated category label; for example, "play <song>happy</song> in the album <album>lute</album>" is replaced with "play <song> in the album <album>".
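The category label replacement described here can be illustrated with a short regular-expression sketch. The function name `replace_slots` and the assumption that slots are written as well-formed, non-nested `<tag>value</tag>` pairs are illustrative choices, not details given in the patent text.

```python
import re

# Matches one complete semantic slot: <tag>value</tag>.
SLOT = re.compile(r"<(\w+)>(.*?)</\1>")


def replace_slots(sentence):
    """Replace every <tag>value</tag> slot with the bare category label <tag>."""
    return SLOT.sub(r"<\1>", sentence)


delexicalized = replace_slots(
    "play <song>happy</song> in the album <album>lute</album>"
)
```

Applying the same function to both sides of a labeled sentence pair would yield one entry of the bilingual category labeled corpus.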
Sub-step 1023, performing supervised incremental training on the first optimized translation model by using the bilingual category labeled corpus to obtain a second optimized translation model.
Here, the bilingual category label corpus is used as training data, and all category labels are added into a source language vocabulary and a target language vocabulary of a translation model at the same time to train the first optimized translation model; enabling the first optimized translation model to further learn how to translate the data with the category labels; and the trained second optimized translation model can translate the linguistic data with the category marks.
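Adding the category labels to both vocabularies can be sketched as below. The token-to-id dictionary layout, the helper name `add_category_labels`, and the example tokens are assumptions made for illustration; an actual translation toolkit would manage its vocabulary and embedding tables in its own way.

```python
def add_category_labels(vocab, labels):
    """Append labels missing from a token->id vocabulary and return it."""
    for label in labels:
        if label not in vocab:
            # Assumes ids are contiguous, so the next free id is len(vocab).
            vocab[label] = len(vocab)
    return vocab


labels = ["<song>", "<album>", "<contact_name>"]
source_vocab = add_category_labels({"<unk>": 0, "播放": 1}, labels)
target_vocab = add_category_labels({"<unk>": 0, "play": 1}, labels)
```

Because each label is a single token on both sides, the model can learn to copy it across languages rather than translate it piece by piece.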
Sub-step 1024, performing source-end supervised reinforced translation incremental training on the second optimized translation model by using the data to be migrated with the category labels to obtain the spoken language understanding migration model.
Here, the language data acquired in step 101 may be used for the source-end supervised reinforced translation incremental training of the second optimized translation model. The source language spoken language understanding data to be migrated is generally unlabeled, so spoken language understanding labels (domain, semantic slots, and user intent) can be added manually to obtain labeled data to be migrated. The semantic slots in the labeled data are then replaced with category labels, and the result is used as training sample data.
Further, in a preferred technical solution provided in this embodiment, the step of performing source-end supervised reinforced translation incremental training on the second optimized translation model by using the to-be-migrated data with the category label to obtain the spoken language understanding migration model includes:
inputting the monolingual data to be migrated with the category labels into the second optimized translation model and taking the output of its softmax layer, which is a probability distribution whose dimension is the size of the vocabulary; sampling from the probability distribution to obtain a sampling sentence; performing locally optimal (greedy) decoding according to the probability distribution to obtain a decoding sentence; calculating the semantic slot retention rates of the sampling sentence and the decoding sentence respectively, to serve as the reward term and the reward regularization term for training the second optimized translation model; and using the sampling sentence as the label, optimizing the parameters of the second optimized translation model by a policy gradient method to obtain the spoken language understanding migration model.
A semantic slot is a predefined, scenario-specific slot into which user input is parsed in Natural Language Understanding (NLU). A semantic slot consists of two parts: the semantic tag and the content it marks. For example, in the corpus entry "play <song>cheer</song> in the album <album>lute phase</album>", <album>lute phase</album> is a semantic slot whose slot value is "lute phase", and <song>cheer</song> is a semantic slot whose slot value is "cheer".
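The two-part structure of a semantic slot (tag plus marked content) can be made concrete with a small parser. The name `extract_slots` and the well-formed `<tag>value</tag>` markup are assumptions for illustration only.

```python
import re
from typing import List, Tuple

# One semantic slot: an opening tag, the slot value, and the matching closing tag.
SLOT = re.compile(r"<(\w+)>(.*?)</\1>")


def extract_slots(sentence: str) -> List[Tuple[str, str]]:
    """Return (semantic tag, slot value) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in SLOT.finditer(sentence)]


slots = extract_slots(
    "play <song>cheer</song> in the album <album>lute phase</album>"
)
```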
Further, in a preferred technical solution provided in this embodiment, the step of "calculating the semantic slot retention rates of the sampling sentence and the decoding sentence respectively" includes:
calculating the semantic slot retention rates of the sampling sentence and the decoding sentence by the following formula, according to the missing and mistranslated semantic slots in the decoding sentence:
[Formula (1): image BDA0001719341650000081]
where $g(c_i, s)$ is a function that counts the occurrences of semantic slot $s$ in the sampling sentence $c_i$, and $g(e_i, s)$ is a function that counts the occurrences of semantic slot $s$ in the decoding sentence $e_i$.
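The retention-rate formula itself is present in this text only as an image reference, so its exact form cannot be reproduced here. The sketch below shows one plausible way to turn the counting function g into a ratio. It is an assumption, not the patented formula: matches are clipped with `min` so that missing slots lower the score, and the denominator uses `max` so that spurious or mistranslated labels also lower it. The names `slot_counts` and `slot_retention_rate` are hypothetical.

```python
import re
from collections import Counter

# A bare category label such as <song>.
LABEL = re.compile(r"<\w+>")


def slot_counts(sentence):
    """g(sentence, s): occurrences of each category label s in the sentence."""
    return Counter(LABEL.findall(sentence))


def slot_retention_rate(reference, hypothesis):
    """Assumed ratio: clipped label matches over the larger per-label count."""
    ref, hyp = slot_counts(reference), slot_counts(hypothesis)
    labels = set(ref) | set(hyp)
    if not labels:
        return 1.0  # no labels on either side: nothing was lost
    matched = sum(min(ref[s], hyp[s]) for s in labels)
    total = sum(max(ref[s], hyp[s]) for s in labels)
    return matched / total
```

Under this assumed definition, a translation that keeps every label exactly once scores 1.0, and each dropped or substituted label lowers the score.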
To optimize the spoken language understanding migration model directly, the translation problem is recast in the framework of reinforcement learning: the second optimized translation model acts as the agent and interacts with an external environment, which here consists of words. The network parameters of the second optimized translation model define the policy, and executing the policy yields the agent's action, namely the model's prediction of the next word in the generated sequence at each time step. After each action, the agent updates its internal state, i.e., the attention weights. Once the agent generates the end-of-sentence token EOS, that is, once a sentence has been fully translated, the agent observes the reward signal, which in this task is the semantic slot retention rate. During training, the agent selects actions according to the current policy and observes the reward only after the end-of-sentence token has been generated; the reward is calculated from the sampling sentence and the decoding sentence.
Here, the policy is an abstract concept defined jointly by all internal parameters of the translation model. The model can be viewed as a function y = f(x) with the input x as the independent variable and the output y as the dependent variable: the function f is the policy, the output y is the action, and the input x is the current state. The translation model determines its output from its parameters and the current input (including the hidden variables of the already translated part), just as a reinforcement learning agent determines its action from its policy and state. The size of the action space equals the size of the vocabulary, and choosing a different word as the translation at the current time step corresponds to a different action.
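The two ways of choosing an action from the softmax output, greedy (locally optimal) selection and sampling, can be sketched for a single time step as follows. The five-word vocabulary and the probability values are hypothetical, and a real decoder would repeat this step until the end-of-sentence token is produced.

```python
import random


def greedy_action(probs):
    """Locally optimal (greedy) choice: index of the most probable word."""
    return max(range(len(probs)), key=probs.__getitem__)


def sampled_action(probs, rng):
    """Stochastic choice: draw a word index according to the distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


vocab = ["play", "<song>", "in", "<album>", "<eos>"]
step_probs = [0.1, 0.6, 0.1, 0.15, 0.05]  # hypothetical softmax output at one step

greedy_word = vocab[greedy_action(step_probs)]
sampled_word = vocab[sampled_action(step_probs, random.Random(0))]
```

Greedy selection always returns the same word for the same distribution, while sampling explores alternative translations whose rewards can then be compared against the greedy result.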
Further, in a preferred technical solution provided in this embodiment, the step of "using the sampling sentence as a label and optimizing the parameter of the second optimized translation model by using a policy gradient method" includes:
optimizing the second optimized translation model by maximizing the mathematical expectation of the reward, with the reward function used to define the loss function:
[Formula (2): image BDA0001719341650000091]
where $x = x_1, x_2, \ldots$ is the input sentence, $w^g$ is the sentence generated by decoding, $w^g \sim p(\theta)$ indicates that the generated sentence follows the distribution $p(\theta)$, and $r(w^g, x)$ is the reward term; formula (2) is the mathematical expectation of the reward term, and the symbol $\mathbb{E}$ denotes the mathematical expectation operator.
Further, the step of optimizing the parameters of the second optimized translation model using a policy gradient method using the sampling sentence as a label further includes:
approximating the mathematical expectation by Monte Carlo sampling, reducing the variance of the reward function, and optimizing the parameters of the second optimized translation model according to:

$$\frac{\partial L(\theta)}{\partial y_t} \approx \left( r(w^s, x) - r(w^b, x) \right) \left( p_\theta(w_t \mid h_t) - 1_{w^s_t} \right) \qquad (3)$$

where $w^b$ is the translation result of greedy decoding, $w^s$ is the translation result of sampling decoding, $y_t$ is the input to the softmax function, $h_t$ is the input vector of the fully connected layer preceding the softmax function, and $1_{w^s_t}$ is the one-hot vector of $w^s_t$.
The expectation in formula (2) can be approximated by Monte Carlo sampling. A regularization (baseline) term is introduced to reduce the variance caused by the Monte Carlo sampling; the semantic slot retention rate of the model's greedy-decoding translation result is chosen as this term. Formula (3) is then obtained by applying the chain rule for gradients. The second optimized translation model is optimized using formula (3) as the loss function.
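The step from formula (2) to formula (3) can be sketched as follows. This is the standard policy-gradient derivation with a baseline. It assumes that formula (2) is the negative expected reward and that the baseline $b$ is the reward $r(w^b, x)$ of the greedy-decoded sentence; it is an illustrative reconstruction rather than text from the original publication.

```latex
\begin{align*}
L(\theta) &= -\mathbb{E}_{w^g \sim p_\theta}\left[ r(w^g, x) \right] \\
\nabla_\theta L(\theta)
  &= -\mathbb{E}_{w^g \sim p_\theta}\left[ r(w^g, x)\, \nabla_\theta \log p_\theta(w^g) \right] \\
  &= -\mathbb{E}_{w^g \sim p_\theta}\left[ \left( r(w^g, x) - b \right) \nabla_\theta \log p_\theta(w^g) \right]
  && \text{since } \mathbb{E}_{w^g \sim p_\theta}\left[ b\, \nabla_\theta \log p_\theta(w^g) \right] = 0 \\
  &\approx -\left( r(w^s, x) - r(w^b, x) \right) \nabla_\theta \log p_\theta(w^s)
  && \text{one Monte Carlo sample } w^s,\ b = r(w^b, x) \\
\frac{\partial L(\theta)}{\partial y_t}
  &\approx \left( r(w^s, x) - r(w^b, x) \right) \left( p_\theta(w_t \mid h_t) - 1_{w^s_t} \right)
  && \text{using } \frac{\partial \log \operatorname{softmax}(y_t)_{w^s_t}}{\partial y_t} = 1_{w^s_t} - p_\theta(w_t \mid h_t)
\end{align*}
```

The baseline leaves the expected gradient unchanged because it does not depend on the sampled sentence, which is why it can reduce the variance without biasing the estimate.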
In this loss function, the second optimized translation model first decodes the currently input sample sentence with the greedy algorithm to obtain a sentence $w^b$, from which the reward $r_b$ is calculated. The current second optimized translation model then decodes the same input sentence by sampling to obtain a sentence $w^s$, from which the reward $r_s$ is calculated. Using $r_b$, $r_s$, and $w^s$, gradient optimization is performed on the second optimized translation model, and its parameters are updated to obtain the spoken language understanding migration model.
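The update described above can be sketched in Python with NumPy. This is a minimal illustration and not the patented implementation: the function names, the toy logits, the reward values, and the learning rate of 0.1 are all assumptions, and the rewards $r_s$ and $r_b$ are passed in as plain numbers instead of being computed from semantic slot retention.

```python
import numpy as np

def softmax_rows(logits):
    """Row-wise softmax over a (T, V) matrix of softmax inputs y_t."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def self_critical_grad(logits, sampled_ids, r_s, r_b):
    """Gradient of the loss w.r.t. y_t: (r_s - r_b) * (p - one_hot(w_s))."""
    probs = softmax_rows(logits)
    one_hot = np.zeros_like(probs)
    one_hot[np.arange(len(sampled_ids)), sampled_ids] = 1.0
    return (r_s - r_b) * (probs - one_hot)

# Hypothetical example: 2 time steps, vocabulary of 3 words.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
w_s = np.array([1, 2])                 # word ids of the sampled sentence
grad = self_critical_grad(logits, w_s, r_s=1.0, r_b=0.5)

# One gradient-descent step directly on the logits (illustration only).
updated = logits - 0.1 * grad
```

When the sampled sentence earns a higher reward than the greedy one ($r_s > r_b$), the step raises the scores of the sampled words; when the two rewards are equal, the gradient vanishes.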
Step 103: performing cultural migration on the first migration result to obtain the spoken language understanding data of the target language.
In this embodiment, the object of the cultural migration is the semantic slot: cultural migration is performed on the semantic slots in the first migration result to obtain the spoken language understanding data of the target language. This step comprises:
determining the culture-dependent semantic slots in the parallel corpus, and collecting target-language semantic slot values for those culture-dependent semantic slots;
constructing a semantic slot database from the semantic slots and the semantic slot values; and replacing the category labels in the bilingual category label corpus with culture-dependent semantic slots from the semantic slot database by database query, to obtain the spoken language understanding data of the target language after cultural migration.
The category labels in the first migration result can be replaced at random according to the database query results, that is, each category label is replaced by a corresponding semantic slot; for example, "Play <song> in <album>" is replaced by "Play <song>Thunder</song> in <album>Evolve</album>".
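The replacement step can be sketched in Python as follows. The in-memory dictionary stands in for the semantic slot database, and the function name, the regular expression, and the extra slot value "Believer" are hypothetical additions for illustration; only "Thunder" and "Evolve" come from the example above.

```python
import random
import re

# Hypothetical stand-in for the culture-dependent semantic slot database:
# category label -> collected target-language slot values.
SLOT_DB = {
    "song": ["Thunder", "Believer"],
    "album": ["Evolve"],
}

# Matches a bare category label such as <song>; closing tags are not matched.
LABEL = re.compile(r"<(\w+)>")

def culture_migrate(sentence, slot_db, rng):
    """Replace each category label with a randomly chosen slot value."""
    def fill(match):
        label = match.group(1)
        values = slot_db.get(label)
        if not values:                    # label unknown to the database: keep it
            return match.group(0)
        return f"<{label}>{rng.choice(values)}</{label}>"
    return LABEL.sub(fill, sentence)

rng = random.Random(0)
result = culture_migrate("Play <song> in <album>", SLOT_DB, rng)
```

Because values are drawn at random, the same labelled sentence can yield several target-language training sentences, one per draw.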
By way of example, the technical solution of the present invention is compared with a translation method that aligns words after translation and with a category translation method that translates the semantic labels directly. The evaluation metrics are the semantic slot retention rate (SKR), the F1 score of the semantic slot filling task, and the accuracy of domain classification. The test data consist of 1,500 English data entries and the data to be migrated consist of 3,000 Chinese data entries; Attached Table 1 and Attached Table 2 show the results of incremental training with 1,200 and 90 SLU bilingual data entries, respectively. The technical solution has at least the following beneficial effects:
Attached Table 1. Experimental results of the present invention, the word-alignment translation method, and the category translation method with 1,200 SLU bilingual data entries:
Figure BDA0001719341650000111
Attached Table 2. Experimental results of the present invention, the word-alignment translation method, and the category translation method with 90 SLU bilingual data entries:
Figure BDA0001719341650000112
In Attached Tables 1 and 2, SKR, Slot_F1, and Dom_Acc denote the evaluation metrics for spoken language understanding data translation: the semantic slot retention rate, the F1 score of the semantic slot filling task, and the domain classification accuracy, respectively.
Referring to Figs. 2a and 2b, the present invention is compared with category-based translation in terms of the attention-weight heat map during translation. It can be seen that, with the method of the present invention, the spoken language understanding migration model attends to the category labels at the correct positions.
The present application also provides a storage device carrying one or more programs adapted to be loaded and executed by a processor, which when executed by the device may be adapted to carry out any of the methods of the embodiments described above.
The present application further provides a processing apparatus comprising a processor adapted to execute various programs; and a storage device adapted to store a plurality of programs; wherein the program is adapted to be loaded and executed by a processor to implement any of the methods in the above embodiments.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (9)

1. A method for cross-language migration of spoken language understanding in a dialog system, the method comprising:
obtaining source language spoken language understanding data to be migrated;
migrating the source language spoken language understanding data to be migrated by using a pre-constructed spoken language understanding migration model to obtain a first migration result;
performing culture migration on the first migration result to obtain spoken language understanding data of a target language; wherein the construction of the spoken language understanding migration model comprises the following steps:
training a general machine translation model by using a preset general parallel corpus to obtain a first optimized translation model; wherein, the text of the source language corpus and the translated text of the target language corresponding to the source language corpus in parallel form a parallel corpus;
marking bilingual parallel spoken language understanding corpora in the parallel corpora, replacing the whole semantic slots in the marked bilingual spoken language understanding corpora with category marks, and taking the replaced bilingual spoken language understanding corpora as bilingual category mark corpora;
carrying out supervised incremental training on the first optimized translation model by using the bilingual category labeled corpus to obtain a second optimized translation model;
performing source-end supervised reinforced translation incremental training on the second optimized translation model by using the data to be migrated with the category label to obtain a spoken language understanding migration model; and the data to be migrated with the category labels is obtained by replacing semantic slots of the source language spoken language understanding data with the category labels.
2. The method for cross-language migration of spoken language understanding in a dialog system according to claim 1, characterized in that the semantic slots comprise semantic tags and semantic slot values, and
the step of replacing the whole semantic slots in the annotated bilingual spoken language understanding corpus with category labels comprises: replacing the annotated semantic labels and semantic slot values in the bilingual spoken language understanding corpus with the specified category labels.
3. The method for cross-language migration of spoken language understanding in a dialog system according to claim 1, wherein the step of performing source-end supervised reinforced translation incremental training on the second optimized translation model using data to be migrated with category labels to obtain a spoken language understanding migration model comprises:
inputting the monolingual data of the data to be migrated with the category labels to obtain the output of the softmax layer of the second optimized translation model, and obtaining the probability distribution with the dimension being the size of the vocabulary;
sampling according to the probability distribution to obtain a sampling sentence;
performing local optimal decoding according to the probability distribution to obtain a decoding sentence;
respectively calculating semantic slot retention rates SKRs of the sampling sentence and the decoding sentence to serve as reward items and reward regular items for training the second optimized translation model;
and using the sampling sentence as a label, and optimizing parameters of the second optimized translation model by using a policy gradient method.
4. The method for cross-language migration of spoken language understanding in dialog systems according to claim 3, characterized in that the step of "calculating semantic slot retention SKR of the sampled sentence and the decoded sentence, respectively" comprises:
according to the missing and wrong information of the semantic slots in the decoding sentences, calculating the semantic slot retention rates of the sampling sentences and the decoding sentences by using the following formulas:
Figure FDA0002950643790000021
wherein $g(c_i, s)$ is a function that counts the number of occurrences of semantic slot $s$ in the sampling sentence $c_i$, and $g(e_i, s)$ is a function that counts the number of occurrences of semantic slot $s$ in the decoding sentence $e_i$.
5. The method for cross-language migration of spoken language understanding in a dialog system according to claim 4, characterized in that the step of optimizing the parameters of the second optimized translation model using a policy gradient method with the sampling sentence as a label comprises:
optimizing the second optimized translation model according to the mathematical expectation of maximizing reward using the following reward function as a loss function:

$$L(\theta) = -\mathbb{E}_{w^g \sim p(\theta)}\left[ r(w^g, x) \right]$$

wherein $x = x_1, x_2, \ldots$ is an input sentence, $w^g = w^g_1, w^g_2, \ldots$ is a sentence generated by decoding, $w^g \sim p(\theta)$ represents that the generated sentence obeys the probability distribution $p(\theta)$, $r(w^g, x)$ represents the reward term, i.e., the SKR value, the entire formula takes the mathematical expectation of the reward term, and $\mathbb{E}$ denotes the mathematical expectation.
6. The method for cross-language migration of spoken language understanding in dialog systems according to claim 5, wherein the step of optimizing the parameters of the second optimized translation model using a policy gradient method with the sample sentence as a label further comprises:
approximating the mathematical expectation by Monte Carlo sampling, reducing the variance of the reward function, and optimizing the parameters of the second optimized translation model according to:

$$\frac{\partial L(\theta)}{\partial y_t} \approx \left( r(w^s, x) - r(w^b, x) \right) \left( p_\theta(w_t \mid h_t) - 1_{w^s_t} \right)$$

wherein $w^b$ is the translation result of greedy decoding, $w^s$ is the translation result of sampling decoding, $y_t$ is the input to the softmax function, $h_t$ is the input vector of the fully connected layer preceding the softmax function, and $1_{w^s_t}$ is the one-hot vector of $w^s_t$.
7. The method for cross-language migration of spoken language understanding in dialog system according to claim 1, wherein the step of performing cultural migration of the first migration result to obtain spoken language understanding data of the target language comprises:
determining a semantic slot for cultural dependence in the bilingual category label corpus, and collecting a target language semantic slot value based on the semantic slot for cultural dependence;
constructing a semantic slot database of the cultural dependence according to the semantic slot of the cultural dependence and the semantic slot value of the target language;
and replacing the category labels in the bilingual category label corpus by using the semantic slots in the semantic slot database depending on the culture according to a database query mode to obtain the spoken language understanding data of the target language after the culture migration.
8. A storage means having stored therein a plurality of programs, characterized in that said programs are adapted to be loaded by a processor for performing a cross-language migration method for spoken language understanding in a dialog system according to any of claims 1-7.
9. A processing apparatus comprising a processor and a storage device adapted to store a plurality of programs; characterized in that said programs are adapted to be loaded by the processor to execute a cross-language migration method for spoken language understanding in a dialog system as claimed in any of the claims 1 to 7.
CN201810724523.9A 2018-07-04 2018-07-04 Cross-language migration method for spoken language understanding in dialog system Active CN109213851B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810724523.9A CN109213851B (en) 2018-07-04 2018-07-04 Cross-language migration method for spoken language understanding in dialog system

Publications (2)

Publication Number Publication Date
CN109213851A (en) 2019-01-15
CN109213851B (en) 2021-05-25

Family

ID=64990157

Country Status (1)

Country Link
CN (1) CN109213851B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant