CN113312453B - Model pre-training system for cross-language dialogue understanding - Google Patents

Model pre-training system for cross-language dialogue understanding

Info

Publication number
CN113312453B
CN113312453B (application CN202110667409.9A)
Authority
CN
China
Prior art keywords
module
dialogue
word
language
cross
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110667409.9A
Other languages
Chinese (zh)
Other versions
CN113312453A (en)
Inventor
车万翔 (Che Wanxiang)
李祺欣 (Li Qixin)
覃立波 (Qin Libo)
刘挺 (Liu Ting)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202110667409.9A priority Critical patent/CN113312453B/en
Publication of CN113312453A publication Critical patent/CN113312453A/en
Application granted granted Critical
Publication of CN113312453B publication Critical patent/CN113312453B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3337 Translation of the query language, e.g. Chinese to English
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a model pre-training system for cross-language dialogue understanding. The invention aims to solve the problem that, in existing cross-language dialogue understanding scenarios, the scarcity of annotated corpora in low-resource languages limits model training, so that an accurate dialogue understanding system cannot be obtained and accurate replies to user utterances cannot be produced. The model pre-training system for cross-language dialogue understanding comprises: a data acquisition module, a dialogue domain label sorting and merging module, a training corpus sorting module, a target language determining module, a static dictionary determining module, a word replacing module, a coding module, a word replacement prediction module, a sample dialogue domain prediction module, an overall model acquisition module, a training module and a cross-language dialogue understanding downstream task fine-tuning module. The invention is used in the field of cross-language dialogue understanding.

Description

Model pre-training system for cross-language dialogue understanding
Technical Field
The invention relates to a model pre-training system oriented to cross-language dialogue understanding, and in particular to a cross-language model pre-training system and a dialogue understanding model training system in the field of natural language processing.
Background
Currently, man-machine dialogue systems have become a leading research hotspot in academia and industry due to their great practical value and prospects. As early as the 1960s, Professor Joseph Weizenbaum of the Massachusetts Institute of Technology developed the man-machine dialogue system ELIZA (Weizenbaum J. ELIZA-a computer program for the study of natural language communication between man and machine [J]. Communications of the ACM, 1966, 9(1): 36-45.), which could mimic the responses of a psychotherapist and provide assistance to patients with psychological illnesses. In the following years, thanks to the rapid development of natural language processing (Chowdhury G. Natural language processing [J]. Annual Review of Information Science and Technology, 2003, 37(1): 51-89.) and deep learning (LeCun Y, Bengio Y, Hinton G. Deep learning [J]. Nature, 2015, 521(7553): 436-444.), man-machine dialogue systems for various purposes have been developed. The most important module behind these man-machine dialogue systems is the dialogue understanding system.
A dialogue understanding system is able to understand the user's intention and give corresponding replies and help, for example weather inquiries, flight reservations, meal ordering, device control for smart homes, voice control for in-car devices, and so on. At present, many dialogue understanding systems in industry are deployed on mobile phones or smart home devices, but most of them only support widely used languages such as Chinese and English. Likewise, pre-training of dialogue understanding models in academia (Wu C S, Hoi S, Socher R, et al. TOD-BERT: Pre-trained natural language understanding for task-oriented dialogue [J]. arXiv preprint arXiv:2004.06871, 2020.) is also limited to English, and cross-language scenarios are rarely studied. An important reason for this situation is that annotated dialogue understanding corpora are scarce in low-resource languages; how to effectively use the existing dialogue understanding corpora to assist training in cross-language scenarios is therefore a problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to solve the problem that, in existing cross-language dialogue understanding scenarios, the scarcity of annotated corpora in low-resource languages limits model training, so that an accurate dialogue understanding system cannot be obtained and accurate replies to user utterances cannot be produced, and provides a model pre-training system oriented to cross-language dialogue understanding.
A model pre-training system for cross-language dialogue understanding comprises:
the system comprises a data acquisition module, a dialogue field label sorting and merging module, a training corpus sorting module, a target language type determining module, a static dictionary determining module, a word replacing module, a coding module, a word replacing and predicting module, a sample belonging dialogue field predicting module, an integral model acquiring module, a training module and a cross-language dialogue understanding field downstream task fine-tuning module;
the data acquisition module is used for collecting English data sets in the labeled dialogue understanding field;
the dialogue domain label sorting and merging module is used for sorting dialogue domain labels marked on all data sets in the data acquisition module and merging dialogue domain labels with the same meaning on different data sets;
the training corpus sorting module is used for dividing conversation corpuses in all data sets collected by the data acquisition module, taking user utterances and system replies in one round of conversation as a sample, performing word segmentation on the user utterances and the system replies respectively, and labeling conversation field labels on each sample by using conversation field label information combined in the conversation field label sorting and combining module;
the target language determining module is used for determining a target language;
the static dictionary determining module is used for respectively collecting static dictionaries translated from English vocabulary to all target languages according to the target languages determined by the target language determining module;
the word replacing module is used for randomly selecting a certain proportion of English words on each sample marked with the dialogue field labels in the training corpus sorting module, randomly selecting a language from the target language determined in the target language determining module for each randomly selected word, translating each randomly selected word to a word corresponding to the target language by using a static dictionary collected by the static dictionary determining module, replacing the English word with the word corresponding to the target language, and simultaneously keeping the original English word as a label to be predicted;
the coding module obtains a coding representation of the processed sample in the word replacement module by using a cross-language coding model;
the word replacement prediction module uses a fully-connected neural network to compute, from the encoded representation of each word in the sample obtained by the coding module, the probability over the dictionary of the word that may have been replaced, and the cross entropy loss is calculated from the labels to be predicted in the word replacing module;
the dialogue domain prediction module to which the sample belongs uses a fully-connected neural network, the dialogue domain to which the sample belongs is judged by the coding expression of the whole sentence of the sample obtained by the coding module, and the cross entropy loss is calculated through the dialogue domain label marked in the training corpus sorting module;
the integral model acquisition module adds the cross entropy loss obtained by the word replacement prediction module and the cross entropy loss obtained by the dialogue field prediction module to which the sample belongs to obtain the final loss;
through the final loss, performing back propagation on the integral model and updating parameters of the integral model;
the overall model in the overall model acquisition module is the whole formed by the cross-language coding model in the coding module, the fully-connected neural network in the word replacement prediction module and the fully-connected neural network in the sample dialogue domain prediction module;
the training module trains an integral model in the integral model acquisition module by using the processed data in the training corpus sorting module and the word replacement module;
and the downstream task fine-tuning module in the cross-language dialogue understanding field uses the whole model trained by the training module as a pre-training model, and completes the tasks in the cross-language dialogue understanding field based on the pre-training model.
The invention has the beneficial effects that:
the invention provides a model pre-training system for cross-language dialogue understanding, which does not depend on cross-language labeled dialogue understanding data and can pre-train a dialogue understanding model in a cross-language scene only by utilizing the existing English data. In addition, the invention designs a self-supervision task, and utilizes a dictionary to automatically label, so that the model can learn the mapping relation between English words and words of other languages which are translation pairs mutually in the pre-training process, thereby improving the integral expression between other languages and English on the pre-training model. Particularly, the invention also summarizes the dialogue domain labels in different English dialogue understanding data sets, and trains the model by using the labeled information, so that the model can learn the special knowledge of the dialogue understanding domain in the pre-training process. The problem of current cross-language dialogue understanding scene because the chinese language corpus is scarce leads to the model training effect limited, can't obtain accurate dialogue understanding system, can't accomplish accurate reply to user's words is solved.
The invention is evaluated on a dialogue language understanding dataset in ten languages: Arabic, German, Spanish, French, Italian, Malay, Polish, Russian, Thai and Turkish. The task comprises the two most classical subtasks in the dialogue understanding field: intent recognition and slot extraction. Experimental results show that the model pre-trained by the method obtains better results than the baseline model when trained on the downstream task.
The invention trains on the dialogue language understanding datasets of the ten languages with five random seeds each, takes the average result over the five random seeds as the result for each language, and compares the averages over the ten languages. The model pre-trained by the method achieves an intent recognition accuracy of 93.73%, 4.17% higher than the baseline model; a slot extraction F1 value of 66.80%, 3.03% higher than the baseline model; and an overall intent-and-slot prediction accuracy of 38.01%, 3.6% higher than the baseline model. The large improvement on every metric shows that the proposed system is very effective for pre-training cross-language dialogue understanding models.
Drawings
FIG. 1 is a Sankey diagram of the dialogue domain label consolidation results over multiple dialogue understanding datasets.
Detailed Description
The first specific implementation way is as follows: the model pre-training system for cross-language dialogue understanding includes:
the system comprises a data acquisition module, a dialogue field label sorting and merging module, a training corpus sorting module, a target language type determining module, a static dictionary determining module, a word replacing module, a coding module, a word replacing and predicting module, a sample belonging dialogue field predicting module, an integral model acquiring module, a training module and a cross-language dialogue understanding field downstream task fine-tuning module;
the data acquisition module is used for collecting an English data set in the labeled dialogue understanding field;
Eight classical public English dialogue understanding datasets widely used in the community were collected, including CamRest676 (Tsung-Hsien Wen, David Vandyke, Nikola Mrkšić, Milica Gašić, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, and Steve Young. 2016. A network-based end-to-end trainable task-oriented dialogue system. arXiv preprint arXiv:1604.04562.), WOZ (Nikola Mrkšić, Diarmuid Ó Séaghdha, Tsung-Hsien Wen, Blaise Thomson, and Steve Young. 2016. Neural belief tracker: Data-driven dialogue state tracking. arXiv preprint arXiv:1606.03777.), SMD (Mihail Eric and Christopher D. Manning. 2017. Key-value retrieval networks for task-oriented dialogue. arXiv preprint arXiv:1705.05414.), MSR-E2E (Xiujun Li, Sarah Panda, JJ (Jingjing) Liu, and Jianfeng Gao. 2018. Microsoft dialogue challenge: Building end-to-end task-completion dialogue systems. In SLT 2018.), Taskmaster (Bill Byrne, Karthik Krishnamoorthi, Chinnadhurai Sankar, Arvind Neelakantan, Daniel Duckworth, Semih Yavuz, Ben Goodrich, Amit Dubey, Andy Cedilnik, and Kyu-Young Kim. 2019. Taskmaster-1: Toward a realistic and diverse dialog dataset. arXiv preprint arXiv:1909.05358.), Schema (Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, and Pranav Khaitan. 2019. Towards scalable multi-domain conversational agents: The schema-guided dialogue dataset. arXiv preprint arXiv:1909.05855.), MetaLWOZ (Sungjin Lee, Hannes Schulz, Adam Atkinson, Jianfeng Gao, Kaheer Suleman, Layla El Asri, Mahmoud Adada, Minlie Huang, Shikhar Sharma, Wendy Tay, and Xiujun Li. 2019. Multi-domain task-completion dialog challenge. In Dialog System Technology Challenges 8.), and MultiWOZ (Paweł Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Tseng, Iñigo Casanueva, Stefan Ultes, Osman Ramadan, and Milica Gašić. 2018. MultiWOZ - a large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modelling. arXiv preprint arXiv:1810.00278.).
The dialogue domain tag sorting and merging module is used for sorting dialogue domain tags marked on all data sets in the data acquisition module (such as weather query, flight reservation, meal ordering, equipment control of intelligent home, voice control of vehicle-mounted equipment and the like), and merging the dialogue domain tags with the same meaning on different data sets (such as weather query, flight reservation, meal ordering, equipment control of intelligent home, voice control of vehicle-mounted equipment and the like);
the training corpus sorting module is used for dividing conversation corpuses in all data sets collected by the data acquisition module, taking user words and system replies in a round of conversation as a sample, respectively segmenting the user words and the system replies according to blank spaces, and labeling a conversation field label for each sample by utilizing conversation field label information combined in the conversation field label sorting and combining module;
the target language determining module is used for determining a target language in the pre-training process according to the research current situations in the academic world and the industry and the use range and frequency of each international language;
we manually selected 10 representative languages from among the languages used in various countries, namely: Arabic, German, Spanish, French, Italian, Malay, Polish, Russian, Thai and Turkish;
the static dictionary determining module is used for respectively collecting static dictionaries translated from English vocabulary to various target languages according to the target languages determined by the target language determining module;
the word replacing module is used for randomly selecting a certain proportion of English words on each sample marked with the dialogue field labels in the training corpus sorting module, randomly selecting a language from the target language determined in the target language determining module for each randomly selected word, translating each randomly selected word to a word corresponding to the target language by using a static dictionary collected by the static dictionary determining module, replacing the English words with the words corresponding to the target language, and simultaneously keeping original English words (randomly selecting a certain proportion of English words on each sample marked with the dialogue field labels in the training corpus sorting module) as labels to be predicted;
the coding module obtains a coded representation of the processed sample in the word replacement module by using a cross-language coding model;
the word replacement prediction module uses a fully-connected neural network (the fully-connected neural networks of the word replacement prediction module and of the sample dialogue domain prediction module are different and have different parameters) to compute, from the encoded representation of each word in the sample obtained by the coding module, the probability over the dictionary of the word that may have been replaced, and the cross entropy loss is calculated from the labels to be predicted in the word replacing module;
the sample dialogue domain prediction module uses a fully-connected neural network (the fully-connected neural networks of the word replacement prediction module and of the sample dialogue domain prediction module are different and have different parameters) to judge, from the encoded representation of the whole sentence of the sample obtained by the coding module (one sample is the user utterance and system reply of one round of dialogue; a sample contains several words, the words form a sentence, and one sample is a whole sentence), the dialogue domain to which the sample belongs, and the cross entropy loss is calculated from the dialogue domain labels marked in the training corpus sorting module;
the integral model acquisition module adds the cross entropy loss obtained by the word replacement prediction module and the cross entropy loss obtained by the dialogue field prediction module to which the sample belongs to obtain the final loss;
performing back propagation on the integral model through the final loss and updating the parameters of the integral model;
the overall model in the overall model acquisition module is the whole formed by the cross-language coding model in the coding module, the fully-connected neural network in the word replacement prediction module and the fully-connected neural network in the sample dialogue domain prediction module;
the training module trains an integral model in the integral model acquisition module by using the processed data in the training corpus sorting module and the word replacement module;
the downstream task fine tuning module in the cross-language dialogue understanding field uses the whole model trained by the training module as a pre-training model, and completes tasks in the cross-language dialogue understanding field based on the pre-training model;
tasks in the cross-language dialogue understanding field include: cross-language dialogue language understanding, cross-language intent recognition, cross-language dialogue state tracking, cross-language dialogue act prediction, cross-language response selection, and the like. The parameters of the pre-trained model are used as the initialization parameters of the BERT-architecture models for cross-language dialogue language understanding, cross-language intent recognition, cross-language dialogue state tracking, cross-language dialogue act prediction, cross-language response selection and other tasks; these models are then trained to obtain the corresponding trained models, thereby completing the tasks of cross-language dialogue language understanding, cross-language intent recognition, cross-language dialogue state tracking, cross-language dialogue act prediction, cross-language response selection, and the like.
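As an illustration of how the fine-tuning module can reuse the pre-trained encoder, the sketch below initializes a BERT-architecture cross-language intent recognition model from saved weights. This is only a minimal sketch under assumptions: the save directory, class name and intent count are illustrative, and the patent does not prescribe a particular toolkit.

```python
# Minimal sketch (assumptions: Hugging Face format weights, illustrative paths/names).
import torch
from transformers import XLMRobertaModel, XLMRobertaTokenizerFast

PRETRAINED_DIR = "./cross_lingual_dialogue_pretrained"  # assumed save path of the trained overall model

class IntentClassifier(torch.nn.Module):
    def __init__(self, num_intents: int):
        super().__init__()
        # Encoder parameters are initialized from the pre-trained cross-lingual dialogue model.
        self.encoder = XLMRobertaModel.from_pretrained(PRETRAINED_DIR)
        self.head = torch.nn.Linear(self.encoder.config.hidden_size, num_intents)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.last_hidden_state[:, 0])  # classify from the first-position representation

tokenizer = XLMRobertaTokenizerFast.from_pretrained(PRETRAINED_DIR)
model = IntentClassifier(num_intents=7)
```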
The second embodiment is as follows: the difference between the present embodiment and the specific embodiment is that the dialogue domain tag sorting and merging module is configured to sort dialogue domain tags marked on all data sets in the data acquisition module (for example, weather inquiry, scheduled flight, meal ordering, device control of smart home, voice control of vehicle-mounted devices, and the like), and merge dialogue domain tags having the same meaning on different data sets (for example, weather inquiry, scheduled flight, meal ordering, device control of smart home, voice control of vehicle-mounted devices, and the like); the specific process is as follows:
step 2.1: sort the dialogue domain labels marked on all datasets in the data acquisition module: CamRest676 has 1 dialogue domain label, WOZ has 1, SMD has 3, MSR-E2E has 3, Taskmaster has 6, Schema has 17, MetaLWOZ has 47, and MultiWOZ has 6;
step 2.2: through manual screening, dialogue domain labels with the same meaning on different datasets are classified into the same category; the classification result is shown in figure 1. In figure 1, the text on the left gives the name of each dataset and its number of samples, the text on the right gives the name of each merged dialogue domain and the number of samples it contains, and the sums of the numbers on the two sides are equal; an arc connecting the left and right sides indicates that part of the samples of the dataset on the left carry the dialogue domain label on the right, and the width of the arc indicates the proportion of those samples among all samples. Collating and merging the 8 datasets yields 59 dialogue domain labels.
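A minimal sketch of the kind of manually curated merge table this step produces; the entries and the fallback rule are hypothetical examples, not the actual 59 merged labels.

```python
# Hypothetical merge table: (dataset, raw label) -> canonical dialogue-domain label.
CANONICAL_DOMAIN = {
    ("CamRest676", "restaurant"): "restaurant",
    ("MultiWOZ", "restaurant"): "restaurant",
    ("SMD", "weather"): "weather",
    ("Schema", "Weather_1"): "weather",
    ("Taskmaster", "uber_lyft"): "ride_booking",
}

def merge_domain_label(dataset: str, raw_label: str) -> str:
    # Labels without a merge rule keep a dataset-qualified name (assumed fallback).
    return CANONICAL_DOMAIN.get((dataset, raw_label), f"{dataset}/{raw_label}")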
Other steps and parameters are the same as those in the first embodiment.
The third concrete implementation mode: this embodiment differs from the first or second embodiment in that the training corpus sorting module is used for segmenting the dialogue corpora in all datasets collected by the data acquisition module, taking the user utterance and system reply in one round of dialogue as one sample, segmenting the user utterance and the system reply into words by spaces, and labeling each sample with a dialogue domain label using the dialogue domain label information merged in step 2; the specific process is as follows:
Step 3.1: the dialogue understanding corpora in the datasets collected by the data acquisition module are multi-turn dialogues, and each dialogue can be expressed as D = {U_1, R_1, ..., U_N, R_N};
where N denotes the number of dialogue rounds, U_1 and R_1 denote the user utterance and system reply of the 1st round of dialogue respectively, and U_N and R_N denote the user utterance and system reply of the N-th round of dialogue respectively;
the user utterance and the system reply in one round are taken as one sample and segmented into words by spaces; a separator [SEP] is inserted between them, and an identifier [CLS] is inserted at the beginning of the sentence to represent global information, giving a sample S = {[CLS], u_1, u_2, ..., u_i, [SEP], r_1, r_2, ..., r_j};
where u_1 and r_1 denote the 1st word in the user utterance and in the system reply respectively, u_2 and r_2 denote the 2nd word in the user utterance and in the system reply respectively, u_i denotes the i-th word in the user utterance, r_j denotes the j-th word in the system reply, i denotes the length of the segmented user utterance, and j denotes the length of the segmented system reply;
Step 3.2: each sample is labeled with a dialogue domain label using the dialogue domain label information merged in the dialogue domain label sorting and merging module (since part of the samples collected in step 1 are labeled with several dialogue domain labels and the invention only considers the single-label case, samples that belong to several dialogue domains after labeling are ignored). Each sample labeled with a dialogue domain label is expressed as:
S = {S_tokens = [CLS], u_1, u_2, ..., u_i, [SEP], r_1, r_2, ..., r_j; S_domain = d},
where d is the dialogue domain label corresponding to the sample, S_tokens is the processed input token sequence of the sample (the tokens are the words u_1, u_2, r_1, r_2, ... together with [CLS] and [SEP]), and S_domain is the dialogue domain label of the sample;
The collated samples total 457,555.
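For concreteness, a small sketch of this corpus-sorting step, assuming dialogues are given as lists of (user utterance, system reply) turns; samples mapped to more than one dialogue domain are dropped, as stated above. Function and field names are illustrative.

```python
def build_samples(dialogue, domain_labels):
    """dialogue: list of (user_utterance, system_reply) turns; domain_labels: merged labels of the dialogue."""
    if len(domain_labels) != 1:            # only the single-domain case is considered
        return []
    samples = []
    for user_utt, sys_reply in dialogue:
        s_tokens = ["[CLS]"] + user_utt.split() + ["[SEP]"] + sys_reply.split()
        samples.append({"S_tokens": s_tokens, "S_domain": domain_labels[0]})
    return samples

# Hypothetical turn:
print(build_samples([("book a table for two", "which restaurant do you prefer ?")], ["restaurant"]))
```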
other steps and parameters are the same as those in the first or second embodiment.
The fourth concrete implementation mode: the difference between this embodiment and the first to third embodiments is that the static dictionary determining module is configured to collect, according to the target language determined by the target language determining module, static dictionaries translated from english vocabulary to target languages respectively; the specific process is as follows:
dictionaries translated from English to each target language are downloaded from https://github.com/facebook/MUSE.
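A sketch of loading such a static dictionary, assuming the common MUSE format of one "english_word target_word" pair per whitespace-separated line; the file name is an assumption.

```python
# Minimal sketch (assumption: plain-text bilingual dictionary, one word pair per line).
from collections import defaultdict

def load_static_dictionary(path):
    en2tgt = defaultdict(list)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                en2tgt[parts[0]].append(parts[1])  # one English word may have several translations
    return dict(en2tgt)

# e.g. en_de = load_static_dictionary("en-de.txt")
```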
Other steps and parameters are the same as those in one of the first to third embodiments.
The fifth concrete implementation mode: the embodiment is different from the first to the fourth specific embodiments in that the word replacement module is configured to randomly select an english word with a certain proportion on each sample labeled with a dialogue field tag in the training corpus sorting module, randomly select a language from the target language determined by the target language determination module for each randomly selected word, translate each randomly selected word to a word corresponding to the target language by using a static dictionary collected by the static dictionary determination module, replace the english word with a word corresponding to the target language, and simultaneously retain an original english word (an english word with a certain proportion is randomly selected on each sample labeled with a dialogue field tag in the training corpus sorting module) as a tag to be predicted; the specific process is as follows:
setting the randomly selected proportion as p% (15%);
At the same time, an array S_goldens is created to store the gold labels that the model needs to predict; it has the same length as S_tokens and is initialized with [PAD] as a placeholder, i.e. S_goldens = [PAD], ..., [PAD].
In addition, an array S_masks is created to store the position information of the words replaced by the model; it has the same length as S_tokens and is initialized as an all-zero array, i.e. S_masks = 0, ..., 0;
For each token t in S_tokens of every sample labeled with a dialogue domain label in the training corpus sorting module, a random number between 0 and 1 is generated; if the random number is smaller than p%, a target language is selected with equal probability from the 10 languages determined in the target language determining module, and t is translated into the corresponding word t^x of the randomly selected target language using the static dictionary collected in the static dictionary determining module; t^x replaces t at that position in the sample, the replaced t is stored at the same position of S_goldens as the label to be predicted, and the value of S_masks at that position is set to 1;
t ∈ {t | t ∈ S_tokens, t ≠ [CLS], t ≠ [SEP]}
An example of a sample after word substitution is
S = {S_tokens = [CLS], u_1, ..., u_k^x, ..., [SEP], r_1, ..., r_l^x, ..., r_m^x, ..., r_j; S_goldens = [PAD], ..., u_k, ..., r_l, ..., r_m, ..., [PAD]; S_masks = 0, ..., 1, ..., 1, ..., 1, ..., 0}
where u_k^x denotes the target-language word obtained by replacing the word u_k at position k of the user utterance, r_l^x denotes the target-language word obtained by replacing the word r_l at position l of the system reply, and r_m^x denotes the target-language word obtained by replacing the word r_m at position m of the system reply.
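A sketch of the word-replacing procedure described above. The behaviour when a selected word is absent from the chosen dictionary (here: leave it unchanged) and the language-code list are assumptions.

```python
# Minimal sketch of the word replacement module (assumed ISO language codes).
import random

TARGET_LANGS = ["ar", "de", "es", "fr", "it", "ms", "pl", "ru", "th", "tr"]

def replace_words(s_tokens, dictionaries, p=0.15):
    """dictionaries: {language code: {english word: [target-language words]}}."""
    tokens = list(s_tokens)
    s_goldens = ["[PAD]"] * len(tokens)
    s_masks = [0] * len(tokens)
    for i, t in enumerate(s_tokens):
        if t in ("[CLS]", "[SEP]") or random.random() >= p:
            continue
        lang = random.choice(TARGET_LANGS)              # equal-probability language choice
        candidates = dictionaries.get(lang, {}).get(t)
        if not candidates:                              # word not covered by the static dictionary (assumed skip)
            continue
        tokens[i] = random.choice(candidates)           # target-language word replaces t
        s_goldens[i] = t                                # original English word is the label to predict
        s_masks[i] = 1
    return tokens, s_goldens, s_masks
```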
Other steps and parameters are the same as in one of the first to fourth embodiments.
The sixth specific implementation mode: this embodiment differs from one of the first to fifth embodiments in that S_goldens, S_tokens and S_masks all have the same length, but S_goldens contains the replaced word t only at the positions of replaced words, while all other positions are [PAD], meaning that no prediction is required there.
Other steps and parameters are the same as those in one of the first to fifth embodiments.
The seventh concrete implementation mode: the difference between the present embodiment and one of the first to sixth embodiments is that the encoding module uses a cross-language encoding model to obtain an encoded representation of a processed sample in the word replacement module; the specific process is as follows:
XLM-RoBERTa-base (Conneau A, Khandelwal K, Goyal N, et al. Unsupervised cross-lingual representation learning at scale [J]. arXiv preprint arXiv:1911.02116, 2019.) is selected as the cross-language coding model, and the S_tokens processed by word replacement in the word replacing module is encoded to obtain the encoded representation of each token:
h_[CLS], h_u1, ..., h_ui, h_[SEP], h_r1, ..., h_rj = Cross_Lingual_Encoder(S_tokens)
where Cross_Lingual_Encoder is the cross-language coding model, h_[CLS] and h_[SEP] denote the encoded representations of the [CLS] and [SEP] tags after encoding by the cross-language coding model, h_u1 denotes the encoded representation of u_1 after encoding by the cross-language coding model, and h_r1 denotes the encoded representation of r_1 after encoding by the cross-language coding model.
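A sketch of the coding module using the Hugging Face implementation of XLM-RoBERTa-base. How the word-level tokens of S_tokens are mapped to subword pieces, and how subword representations are pooled back to words, is not specified in the text, so the example simply encodes the words as given; the sample is hypothetical.

```python
# Minimal sketch (assumption: Hugging Face transformers as the XLM-RoBERTa implementation).
import torch
from transformers import XLMRobertaModel, XLMRobertaTokenizerFast

tokenizer = XLMRobertaTokenizerFast.from_pretrained("xlm-roberta-base")
encoder = XLMRobertaModel.from_pretrained("xlm-roberta-base")

s_tokens = ["book", "a", "table", "für", "zwei", "Personen"]   # hypothetical replaced sample
enc = tokenizer(s_tokens, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**enc).last_hidden_state                  # (1, sequence length, hidden size)
```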
Other steps and parameters are the same as those in one of the first to sixth embodiments.
The specific implementation mode eight: this embodiment differs from the first to seventh embodiments in that the word replacement prediction module uses a fully-connected neural network (the fully-connected neural networks of the word replacement prediction module and of the sample dialogue domain prediction module are different and have different parameters) to compute, from the encoded representation of each word in the sample obtained by the coding module, the probability over the dictionary of the word that may have been replaced, and the cross entropy loss is calculated from the labels to be predicted in the word replacing module; the specific process is as follows:
Step 8.1: using a fully-connected neural network, the probability over the dictionary of the possibly replaced word is computed from the encoded representation of each word in the sample obtained by the coding module:
z_i = softmax(W h_i + b)
where W is the weight of the fully-connected neural network, b is the bias of the fully-connected neural network, h_i is the encoded representation of the i-th position obtained in the coding module, and z_i is the predicted word probability at the i-th position (the representation of the word at the 1st position gives z_1, at the x-th position z_x; the word replacement task below predicts the replaced word separately for each position);
Step 8.2: from S_goldens and S_masks constructed in the word replacing module, the cross entropy loss of the word replacement (WR) task is calculated:
L_WR_i = -Σ_{k=1}^{V} ŷ_{i,k} log(z_{i,k})
where V is the size of the vocabulary, z_{i,k} denotes the predicted probability of the k-th word at the i-th position, ŷ_{i,k} is the true label of the k-th word at the i-th position (0 or 1, where 1 means that the k-th word is consistent with the word at the i-th position of S_goldens, otherwise 0), and L_WR_i is the cross entropy loss at position i;
L_WR = Σ_{i: S_masks[i]=1} L_WR_i
where i ranges over the positions of replaced words stored in S_masks, S_masks[i] denotes the value of the i-th position of S_masks, and L_WR is the sum of the losses at the positions of all replaced words.
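A PyTorch sketch of the word replacement prediction head and the loss L_WR: a linear layer over every token representation, with cross entropy summed only over positions where S_masks is 1. Converting S_goldens to vocabulary ids is assumed to happen elsewhere; names are illustrative.

```python
# Minimal sketch of the word replacement (WR) head and loss.
import torch
import torch.nn.functional as F

class WordReplacementHead(torch.nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.proj = torch.nn.Linear(hidden_size, vocab_size)   # weight W and bias b

    def forward(self, hidden_states):                          # (batch, seq_len, hidden)
        return self.proj(hidden_states)                        # logits; softmax is folded into the loss

def wr_loss(logits, gold_ids, masks):
    """gold_ids: vocabulary ids of S_goldens; masks: S_masks as a 0/1 tensor."""
    per_position = F.cross_entropy(
        logits.view(-1, logits.size(-1)), gold_ids.view(-1), reduction="none")
    return (per_position * masks.view(-1).float()).sum()       # sum over replaced positions only
```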
Other steps and parameters are the same as those in one of the first to seventh embodiments.
The specific implementation method nine: this embodiment differs from one of the first to eighth embodiments in that the sample dialogue domain prediction module uses a fully-connected neural network (the fully-connected neural networks of the word replacement prediction module and of the sample dialogue domain prediction module are different and have different parameters) to judge, from the encoded representation of the whole sentence of the sample obtained by the coding module (one sample is the user utterance and system reply of one round of dialogue; a sample contains several words, the words form a sentence, and one sample is a whole sentence), the dialogue domain to which the sample belongs, and the cross entropy loss is calculated from the dialogue domain labels marked in the training corpus sorting module; the specific process is as follows:
Step 9.1: using a fully-connected neural network, the probability of the dialogue domain to which the sample belongs is calculated from the encoded representation h_[CLS] of the identifier [CLS] in the sample obtained by the coding module:
z' = softmax(W' h_[CLS] + b')
where W' is the weight of the fully-connected neural network and b' is the bias of the fully-connected neural network;
Step 9.2: the cross entropy loss of the dialogue domain classification task (Domain Classifier, DC for short) is calculated from the dialogue domain labels marked in the training corpus sorting module.
Other steps and parameters are the same as those in one to eight of the embodiments.
The detailed implementation mode ten: this embodiment differs from the first to ninth embodiments in that in Step 9.2 the cross entropy loss of the dialogue domain classification task (Domain Classifier, DC for short) is calculated from the dialogue domain labels marked in the training corpus sorting module as:
L_DC = -Σ_{i=1}^{D} ŷ'_i log(z'_i)
where D is the number of dialogue domain labels collected in the dialogue domain label sorting and merging module, z'_i is the predicted probability in z' for the i-th dialogue domain label, and ŷ'_i is the true label of the current sample for the i-th dialogue domain (0 or 1, where 1 means that the i-th dialogue domain label is consistent with S_domain, otherwise 0).
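A sketch of the dialogue domain classification head and the joint objective of the overall model module (final loss = L_WR + L_DC). The function and class names are illustrative; L_WR is assumed to be computed as in the preceding sketch.

```python
# Minimal sketch of the domain classification (DC) head and the final pre-training loss.
import torch
import torch.nn.functional as F

class DomainHead(torch.nn.Module):
    """Fully-connected network for dialogue-domain classification (weight W', bias b')."""
    def __init__(self, hidden_size: int, num_domains: int):
        super().__init__()
        self.proj = torch.nn.Linear(hidden_size, num_domains)

    def forward(self, cls_repr):              # h_[CLS], shape (batch, hidden)
        return self.proj(cls_repr)            # logits z' before softmax

def pretraining_loss(l_wr, dc_logits, domain_ids):
    """Final loss of the overall-model module: L = L_WR + L_DC."""
    l_dc = F.cross_entropy(dc_logits, domain_ids)   # dialogue-domain classification loss
    return l_wr + l_dc                              # back-propagated to update the overall model
```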
Other steps and parameters are the same as those in one of the first to ninth embodiments.
The following examples were used to demonstrate the beneficial effects of the present invention:
Example 1:
This example selects the dialogue language understanding task as the downstream task in the dialogue understanding field. Given a cross-language dialogue language understanding dataset, the task is to classify the intent of an utterance and to extract the corresponding slots in the sentence. The example is prepared according to the following steps:
Step one, collecting English dialogue language understanding data, translating it into the target languages, and annotating the translated texts for training, validating and testing the model;
we downloaded the SNIPS dataset (Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut Lavril, et al. 2018. Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces. arXiv preprint arXiv:1805.10190.), split 700 samples from its validation set (7 intents, 100 samples per intent) into two halves (350 samples each, 50 per intent) to serve as training and validation sets, and at the same time randomly extracted a test set of 350 samples (50 per intent).
For the extracted training, validation and test sets (1,050 samples in total), experts were asked to translate them into Arabic, German, Spanish, French, Italian, Malay, Polish, Russian, Thai and Turkish (10 languages in total), and to re-annotate the slots in the translated texts while keeping the original intent labels, for use by the model.
Setting a baseline cross-language pre-training model;
XLM-RoBERTA-base was chosen as the baseline cross-language pre-training model for this example.
Step three, setting a dialogue language understanding task model architecture;
our model adopts a pipeline architecture. The overall model is composed of two models: an intent classification model and a slot extraction model.
Step four, training an intention classification model;
Step 4.1: obtain the encoded representation of the sample using the cross-language pre-training model:
h_[CLS], h_1, ..., h_k = Cross_Lingual_Encoder(Input)
where Input is the input sample, k is the sample length, h_[CLS] is the encoded representation at the [CLS] tag of the sample, h_1 is the encoded representation of the first word in the sample, and h_k is the encoded representation of the k-th word in the sample;
Step 4.2: using a fully-connected neural network, calculate the probability of the intent label of the current sample:
z = softmax(W h_[CLS] + b)
where W is the weight of the fully-connected neural network and b is the bias of the fully-connected neural network;
Step 4.3: calculate the cross entropy loss from the predicted probability of Step 4.2:
L_Intent = -Σ_{i=1}^{I} ŷ_i log(z_i)
where I is the number of intent labels, ŷ_i is the true label of the current sample for the i-th intent (0 or 1, where 1 means the i-th intent label is the gold label of the sample, otherwise 0), and z_i is the predicted probability of the model for the i-th intent label;
Step 4.4: back-propagate the loss calculated in Step 4.3 and update the model parameters;
step five, training a slot position extraction model;
Step 5.1: obtain the encoded representation of the sample using the cross-language pre-training model:
h_[CLS], h_1, ..., h_k = Cross_Lingual_Encoder(Input)
where Input is the input sample, k is the sample length, h_[CLS] is the encoded representation at the [CLS] tag of the sample, h_1 is the encoded representation of the first word in the sample, and h_k is the encoded representation of the k-th word in the sample;
Step 5.2: create a separate fully-connected neural network for each intent, and predict the probability of the slot label at each token position of the current sample through the network selected by the gold intent label of the sample:
z_k = softmax(W_i h_k + b_i)
where W_i is the weight of the fully-connected neural network corresponding to the i-th intent label, b_i is the bias of the fully-connected neural network for the i-th intent label, h_k is the encoded representation of the word at position k in the sample, and z_k is the predicted slot probability of that word after passing through the model;
Step 5.3: calculate the cross entropy loss from the predicted probability of Step 5.2:
L_Slot = -Σ_{k=1}^{L} Σ_{s=1}^{S_i} ŷ_{k,s} log(z_{k,s})
where L is the sample length, S_i is the number of slot labels corresponding to the i-th intent label, ŷ_{k,s} is the true label of position k of the current sample for the s-th slot (0 or 1, where 1 means that the s-th slot label is the gold slot label at position k of the sample, otherwise 0), and z_{k,s} is the predicted probability of the model at position k of the current sample for the s-th slot label;
Step 5.4: back-propagate the loss calculated in Step 5.3 and update the model parameters;
the training processes of the intention classification model and the slot extraction model in the fourth step and the fifth step are mutually independent, and the two trained models form the integral model in the third step.
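As an illustration of this pipeline, the sketch below places an intent head and one slot head per intent on top of the pre-trained encoder; at inference the predicted intent selects the slot head (during training the gold intent would select it instead, as in Step 5.2). Class and variable names are assumptions.

```python
# Minimal sketch of the intent + per-intent slot pipeline (illustrative names).
import torch

class DialogueLanguageUnderstanding(torch.nn.Module):
    def __init__(self, encoder, num_intents, slots_per_intent):
        super().__init__()
        self.encoder = encoder                              # pre-trained cross-lingual encoder
        hidden = encoder.config.hidden_size
        self.intent_head = torch.nn.Linear(hidden, num_intents)
        self.slot_heads = torch.nn.ModuleList(              # one fully-connected network per intent
            [torch.nn.Linear(hidden, n) for n in slots_per_intent])

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        intent_logits = self.intent_head(h[:, 0])           # intent from the [CLS]-position representation
        intents = intent_logits.argmax(dim=-1)              # predicted intent picks the slot head
        slot_logits = [self.slot_heads[i](h[b]) for b, i in enumerate(intents.tolist())]
        return intent_logits, slot_logits
```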
Predicting a final result and calculating an index;
Step 6.1: predict the final result;
The intent label of a sample is first predicted with the trained intent classification model; the prediction result is then used to select the corresponding fully-connected neural network of the slot extraction model trained in step five, which predicts the slot labels of the sample.
Step 6.2: calculate the metrics;
Let the number of samples whose intent is predicted correctly be C_Intent and the total number of samples be A; the intent recognition accuracy (Intent Acc) is then
Intent Acc = C_Intent / A
Over all tokens of all samples, let the number of correctly predicted slot labels be TP, the number of incorrectly predicted slots be FP, and the number of slots that were not predicted be FN; the slot extraction F1 value (Slot F1) is then calculated as:
P = TP / (TP + FP)
R = TP / (TP + FN)
Slot F1 = 2 × P × R / (P + R)
Let the number of samples for which the intent and all slots are predicted correctly be C_Overall and the total number of samples be A; the overall recognition accuracy (Overall Acc) is then
Overall Acc = C_Overall / A
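A small sketch of these three metrics as defined above; counting slot TP, FP and FN over spans is assumed to be done by the caller, and function names are illustrative.

```python
# Minimal sketch of the evaluation metrics.
def intent_accuracy(pred_intents, gold_intents):
    correct = sum(p == g for p, g in zip(pred_intents, gold_intents))
    return correct / len(gold_intents)                      # C_Intent / A

def slot_f1(tp, fp, fn):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

def overall_accuracy(pred_intents, gold_intents, pred_slots, gold_slots):
    correct = sum(pi == gi and ps == gs
                  for pi, gi, ps, gs in zip(pred_intents, gold_intents, pred_slots, gold_slots))
    return correct / len(gold_intents)                      # C_Overall / A
```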
To balance fluctuations in the results caused by the small amount of test data, experiments are run with 5 different random seeds, the average of each metric for each language over the 5 random seeds is computed, and finally the average experimental result over the 10 languages is reported.
The final experimental results on the test set are shown in table 1.
TABLE 1 Average experimental results of the dialogue language understanding task in ten languages
The best results are shown in bold in the table.
Where the first row of experimental results shows our experimental results on the baseline model.
The second row shows the experimental results of a model pre-training system oriented to cross-language dialogue understanding according to the present invention.
The third row shows the experimental results when the word replacement method in the above scheme of the invention is replaced by a masked language model (Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [J]. arXiv preprint arXiv:1810.04805, 2018.).
The fourth row shows the experimental results after the dialogue domain classification fully-connected neural network in the scheme of the invention is removed.
The fifth row shows the experimental results after the word replacement method in the above scheme of the present invention is changed into Masked Language Model and the classified fully-connected neural network in the dialogue domain is removed.
As can be seen from the experimental results of the ablation experiments in the third, fourth and fifth rows of Table 1, all parts in the scheme of the present invention are indispensable, and the combined training of the word replacement model and the classification model in the dialogue domain can make the model effect better.
As can be seen from Table 1, the intention recognition accuracy of the cross-language dialogue understanding pre-training model trained by the method is improved by 4.17% compared with that of the baseline model, the slot extraction F1 value is improved by 3.03% compared with that of the baseline model, and the accuracy of the whole intention and slot prediction is improved by 3.60% compared with that of the baseline model. The method also proves that the overall effect of the cross-language dialogue understanding model can be remarkably improved.
The present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof, and it is therefore intended that all such changes and modifications be considered as within the spirit and scope of the appended claims.

Claims (9)

1. A model pre-training system for cross-language conversational understanding, characterized by: the system comprises:
the system comprises a data acquisition module, a dialogue field label sorting and merging module, a training corpus sorting module, a target language type determining module, a static dictionary determining module, a word replacing module, a coding module, a word replacing and predicting module, a sample belonging dialogue field predicting module, an integral model acquiring module, a training module and a cross-language dialogue understanding field downstream task fine-tuning module;
the data acquisition module is used for collecting an English data set in the labeled dialogue understanding field;
the dialogue domain label sorting and merging module is used for sorting dialogue domain labels marked on all data sets in the data acquisition module and merging dialogue domain labels with the same meaning on different data sets;
the training corpus sorting module is used for dividing conversation corpuses in all data sets collected by the data acquisition module, taking user words and system replies in a round of conversation as a sample, respectively segmenting the user words and the system replies, and simultaneously labeling a conversation field label for each sample by utilizing conversation field label information combined in the conversation field label sorting and combining module;
the target language determining module is used for determining a target language;
the static dictionary determining module is used for respectively collecting static dictionaries translated from English vocabulary to various target languages according to the target languages determined by the target language determining module;
the word replacing module is used for randomly selecting a certain proportion of English words on each sample marked with the dialogue field labels in the training corpus sorting module, randomly selecting a language from the target language determined in the target language determining module for each randomly selected word, translating each randomly selected word to a word corresponding to the target language by using a static dictionary collected by the static dictionary determining module, replacing the English word with the word corresponding to the target language, and simultaneously keeping the original English word as a label to be predicted;
the coding module obtains a coded representation of the processed sample in the word replacement module by using a cross-language coding model;
the word replacement prediction module uses a fully-connected neural network to compute, from the encoded representation of each word in the sample obtained by the coding module, the probability over the dictionary of the word that may have been replaced, and the cross entropy loss is calculated from the labels to be predicted in the word replacing module;
the dialogue domain prediction module to which the sample belongs uses a fully-connected neural network, the dialogue domain to which the sample belongs is judged by the coding expression of the whole sentence of the sample obtained by the coding module, and the cross entropy loss is calculated through the dialogue domain label marked in the training corpus sorting module;
the integral model acquisition module adds the cross entropy loss obtained by the word replacement prediction module and the cross entropy loss obtained by the dialogue field prediction module to which the sample belongs to obtain the final loss;
through the final loss, performing back propagation on the integral model and updating parameters of the integral model;
the overall model in the overall model acquisition module is the whole formed by the cross-language coding model in the coding module, the fully-connected neural network in the word replacement prediction module and the fully-connected neural network in the sample dialogue domain prediction module;
the training module trains an integral model in the integral model acquisition module by using the processed data in the training corpus sorting module and the word replacement module;
and the downstream task fine-tuning module in the cross-language dialogue understanding field uses the whole model trained by the training module as a pre-training model, and completes the tasks in the cross-language dialogue understanding field based on the pre-training model.
2. The model pre-training system for cross-language dialogue understanding according to claim 1, wherein: the dialogue domain label sorting and merging module is used for sorting dialogue domain labels marked on all data sets in the data acquisition module and merging dialogue domain labels with the same meaning on different data sets; the specific process is as follows:
step 2.1: sorting the dialogue domain labels marked on all datasets in the data acquisition module;
step 2.2: classifying dialogue domain labels with the same meaning on different datasets into the same category through manual screening.
3. The model pre-training system for cross-language dialogue understanding according to claim 2, wherein: the training corpus sorting module is used for dividing conversation corpuses in all data sets collected by the data acquisition module, taking user words and system replies in a round of conversation as a sample, segmenting words of the user words and the system replies respectively, and labeling a conversation field label for each sample by using the conversation field label information combined in the step two; the specific process is as follows:
step 3.1: the dialogue understanding corpora in the data sets collected by the data acquisition module are multi-turn dialogues, and each dialogue can be expressed as D = {U_1, R_1, ..., U_N, R_N};
wherein N represents the number of dialogue turns, U_1 and R_1 represent the user utterance and the system reply of the 1st turn respectively, and U_N and R_N represent the user utterance and the system reply of the N-th turn respectively;
the user utterance and the system reply of one turn are taken as one sample; word segmentation is performed on the user utterance and the system reply respectively, a separator [SEP] is inserted between the user utterance and the system reply, and an identifier [CLS] is inserted at the beginning of the sentence to represent global information, giving the sample S = {[CLS], u_1, u_2, ..., u_i, [SEP], r_1, r_2, ..., r_j};
wherein u_1 and r_1 represent the 1st word in the user utterance and the system reply respectively, u_2 and r_2 represent the 2nd word in the user utterance and the system reply respectively, u_i represents the i-th word in the user utterance, r_j represents the j-th word in the system reply, i is the length of the user utterance after word segmentation, and j is the length of the system reply after word segmentation;
step 3.2: each sample is annotated with a dialogue domain label by using the dialogue domain label information merged in the dialogue domain label sorting and merging module; each sample annotated with a dialogue domain label is represented as:
S = {S_tokens = [CLS], u_1, u_2, ..., u_i, [SEP], r_1, r_2, ..., r_j; S_domain = d},
wherein d is the dialogue domain label corresponding to the sample, S_tokens is the sequence of processed input tokens in the sample, and S_domain is the dialogue domain label of the sample.
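For illustration, a sample as defined in step 3.1 and step 3.2 could be built along these lines (a sketch only; the tokenizer choice and function names are assumptions, and the actual word segmentation used by the system is not specified beyond the claims):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")  # assumed word segmenter

def build_sample(user_utterance: str, system_reply: str, domain_label: str) -> dict:
    """Package one dialogue turn as {S_tokens; S_domain}."""
    u_tokens = tokenizer.tokenize(user_utterance)   # u_1 ... u_i
    r_tokens = tokenizer.tokenize(system_reply)     # r_1 ... r_j
    s_tokens = ["[CLS]"] + u_tokens + ["[SEP]"] + r_tokens
    return {"S_tokens": s_tokens, "S_domain": domain_label}

# e.g. build_sample("book a table for two", "which restaurant do you prefer?", "restaurant")
```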
4. The model pre-training system for cross-language dialogue understanding according to claim 3, wherein: the word replacement module is used for randomly selecting a certain proportion of English words in each sample annotated with a dialogue domain label in the training corpus sorting module, randomly selecting, for each selected word, one language from the target languages determined in the target language determining module, translating the selected word into the corresponding word of that target language by using the static dictionary collected by the static dictionary determining module, replacing the English word with the target-language word, and keeping the original English word as the label to be predicted; the specific process is as follows:
setting the randomly selected proportion as p%;
an array S_goldens is created for storing the labels to be predicted, and is initialized with [PAD] placeholders, i.e. S_goldens = {[PAD], ..., [PAD]};
an array S_masks is created for storing the position information of replaced words, and is initialized with all zeros, i.e. S_masks = {0, ..., 0};
for each word t in S_tokens of every sample annotated with a dialogue domain label in the training corpus sorting module, a random number between 0 and 1 is generated; if the random number is less than p%, t is translated into the corresponding word t^x of a randomly selected target language by using the static dictionary collected by the static dictionary determining module, t^x replaces t at its position in the sample, the replaced t is stored at that position in S_goldens as the label to be predicted, and the value of S_masks at that position is set to 1;
t ∈ {t | t ∈ S_tokens, t ≠ [CLS], t ≠ [SEP]}
an example of a sample after word replacement is:
S = {S_tokens = [CLS], u_1, ..., u_k^x, ..., u_i, [SEP], r_1, ..., r_l^x, ..., r_m^x, ..., r_j; S_goldens = [PAD], ..., u_k, ..., r_l, ..., r_m, ..., [PAD]; S_masks = 0, ..., 1, ..., 1, ..., 1, ..., 0}
wherein u_k^x represents the target-language word obtained by replacing u_k at position k of the user utterance, r_l^x represents the target-language word obtained by replacing r_l at position l of the system reply, and r_m^x represents the target-language word obtained by replacing r_m at position m of the system reply.
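A sketch of this replacement step, under the assumption that the static dictionary is a nested mapping {language: {English word: translation}}; the proportion p and the skipping of words missing from the dictionary are illustrative choices, not statements about the claimed system:

```python
import random

def replace_words(s_tokens, target_languages, static_dict, p=15):
    """Replace roughly p% of the words with static-dictionary translations and
    record the original words (S_goldens) and the replaced positions (S_masks)."""
    s_goldens = ["[PAD]"] * len(s_tokens)
    s_masks = [0] * len(s_tokens)
    replaced = list(s_tokens)
    for idx, t in enumerate(s_tokens):
        if t in ("[CLS]", "[SEP]"):                    # special symbols are never replaced
            continue
        if random.random() < p / 100:
            lang = random.choice(target_languages)     # pick one target language at random
            t_x = static_dict.get(lang, {}).get(t)     # dictionary lookup may fail
            if t_x is None:
                continue                               # illustrative choice: skip untranslatable words
            replaced[idx] = t_x                        # put the target-language word in the sample
            s_goldens[idx] = t                         # keep the original word as the label to predict
            s_masks[idx] = 1                           # mark the replaced position
    return replaced, s_goldens, s_masks
```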
5. The model pre-training system for cross-language dialogue understanding according to claim 4, wherein: S_goldens, S_tokens and S_masks all have the same length, but S_goldens contains the replaced word t only at the positions of replaced words; all other positions are [PAD], meaning that no prediction is required there.
6. The model pre-training system for cross-language dialogue understanding according to claim 5, wherein: the encoding module obtains an encoded representation of the processed samples in the word replacement module using a cross-language encoding model; the specific process is as follows:
XLM-RoBERTa-base is selected as the cross-language encoding model, and the S_tokens processed by the word replacement module is encoded to obtain the encoded representation of each token:
h_[CLS], h_{u_1}, ..., h_{u_i}, h_[SEP], h_{r_1}, ..., h_{r_j} = Cross_Lingual_Encoder(S_tokens)
wherein Cross_Lingual_Encoder is the cross-language encoding model, h_[CLS] and h_[SEP] represent the encoded representations of the [CLS] and [SEP] tokens after encoding by the cross-language encoding model, h_{u_1} represents the encoded representation of u_1 after encoding by the cross-language encoding model, and h_{r_1} represents the encoded representation of r_1 after encoding by the cross-language encoding model.
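As a sketch of this encoding step using the public Hugging Face XLM-RoBERTa-base checkpoint; note that XLM-R's own sentence markers (`<s>`, `</s>`) stand in for the [CLS]/[SEP] symbols of the claims, and replaced words outside the subword vocabulary would map to the unknown token in this simplified version:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")        # Cross_Lingual_Encoder

def encode(s_tokens):
    """Return one hidden vector per token; row 0 corresponds to h_[CLS]."""
    mapped = [tokenizer.cls_token if t == "[CLS]"
              else tokenizer.sep_token if t == "[SEP]" else t
              for t in s_tokens]
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(mapped)])
    with torch.no_grad():
        hidden = encoder(input_ids=input_ids).last_hidden_state
    return hidden[0]                                            # shape: (len(s_tokens), hidden_size)
```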
7. The model pre-training system for cross-language dialogue understanding according to claim 6, wherein: the word replacement prediction module uses a fully-connected neural network to calculate, from the encoded representation of each word in the sample obtained by the encoding module, the probability of each possibly replaced word in the vocabulary, and calculates the cross-entropy loss against the labels to be predicted in the word replacement module; the specific process is as follows:
step 8.1: using a fully-connected neural network, the probability of each possibly replaced word in the vocabulary is calculated from the encoded representation of each word in the sample obtained by the encoding module:
z_i = softmax(W · h_i + b)
wherein W is the weight of the fully-connected neural network, b is the bias of the fully-connected neural network, h_i is the encoded representation of the i-th position obtained by the encoding module, and z_i is the predicted probability of the word at the i-th position;
step 8.2: the cross-entropy loss of the word replacement task is calculated by means of the S_goldens and S_masks constructed in the word replacement module:
l_i = -Σ_{k=1}^{V} y_{i,k} · log(z_{i,k})
wherein V is the size of the vocabulary, z_{i,k} represents the predicted probability of the k-th word at the i-th position, y_{i,k} represents the true label of the k-th word at the i-th position, and l_i is the cross-entropy loss at the i-th position;
L_replace = Σ_{i : S_masks[i] = 1} l_i
wherein i ranges over the positions of replaced words recorded in S_masks, S_masks[i] denotes the value of the i-th position in S_masks, and L_replace is the sum of the losses over the positions of all replaced words.
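The two formulas of step 8.1 and step 8.2 can be written compactly as follows (a sketch; the hidden size and vocabulary size shown are the usual XLM-RoBERTa-base values and are assumptions here):

```python
import torch.nn as nn
import torch.nn.functional as F

hidden_size, vocab_size = 768, 250002                 # assumed XLM-RoBERTa-base dimensions
word_head = nn.Linear(hidden_size, vocab_size)        # weight W and bias b

def word_replacement_loss(hidden_states, golden_ids, masks):
    """z_i = softmax(W h_i + b); cross entropy summed over the replaced positions."""
    logits = word_head(hidden_states)                  # (seq_len, V) scores per position
    per_position = F.cross_entropy(logits, golden_ids, reduction="none")   # l_i for every i
    return (per_position * masks.float()).sum()        # keep only positions with S_masks[i] = 1
```

Positions holding [PAD] labels can carry any valid id in `golden_ids`, since the multiplication by the mask removes their contribution.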
8. The model pre-training system for cross-language dialogue understanding according to claim 7, wherein: the module for predicting the dialogue domain to which the sample belongs uses a fully-connected neural network to judge the dialogue domain to which the sample belongs from the encoded representation of the whole sentence of the sample obtained by the encoding module, and calculates the cross-entropy loss against the dialogue domain label annotated in the training corpus sorting module; the specific process is as follows:
step 9.1: using a fully-connected neural network, the probability of the dialogue domain to which the sample belongs is calculated from the encoded representation h_[CLS] of the identifier [CLS] in the sample obtained by the encoding module:
z' = softmax(W' · h_[CLS] + b')
wherein W' is the weight of the fully-connected neural network and b' is the bias of the fully-connected neural network;
step 9.2: the cross-entropy loss of the dialogue domain classification task is calculated by means of the dialogue domain labels annotated in the training corpus sorting module.
9. The model pre-training system for cross-language dialogue understanding according to claim 8, wherein: in step 9.2, the cross-entropy loss of the dialogue domain classification task is calculated by means of the dialogue domain labels annotated in the training corpus sorting module:
L_domain = -Σ_{i=1}^{D} y'_i · log(z'_i)
wherein D is the number of dialogue domain labels summarized in the dialogue domain label sorting and merging module, z'_i is the predicted probability of the i-th dialogue domain label in z', and y'_i is the true label of the i-th dialogue domain for the current sample.
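The dialogue domain head of step 9.1 and step 9.2 in the same sketch style (the number of merged domain labels below is a placeholder):

```python
import torch.nn as nn
import torch.nn.functional as F

hidden_size, num_domains = 768, 20                     # placeholder number of merged domain labels
domain_head = nn.Linear(hidden_size, num_domains)      # weight W' and bias b'

def domain_classification_loss(h_cls, domain_label):
    """z' = softmax(W' h_[CLS] + b'); cross entropy against the annotated domain label."""
    logits = domain_head(h_cls.unsqueeze(0))            # (1, D) scores from the [CLS] representation
    return F.cross_entropy(logits, domain_label.unsqueeze(0))
```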
CN202110667409.9A 2021-06-16 2021-06-16 Model pre-training system for cross-language dialogue understanding Active CN113312453B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110667409.9A CN113312453B (en) 2021-06-16 2021-06-16 Model pre-training system for cross-language dialogue understanding

Publications (2)

Publication Number Publication Date
CN113312453A CN113312453A (en) 2021-08-27
CN113312453B true CN113312453B (en) 2022-09-23

Family

ID=77379146

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110667409.9A Active CN113312453B (en) 2021-06-16 2021-06-16 Model pre-training system for cross-language dialogue understanding

Country Status (1)

Country Link
CN (1) CN113312453B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005549B (en) * 2021-11-09 2024-06-18 哈尔滨理工大学 Enhanced automatic medical diagnosis dialogue system based on deep learning
CN115455981B (en) * 2022-11-11 2024-03-19 合肥智能语音创新发展有限公司 Semantic understanding method, device and equipment for multilingual sentences and storage medium
CN116628160B (en) * 2023-05-24 2024-04-19 中南大学 Task type dialogue method, system and medium based on multiple knowledge bases
CN116805004B (en) * 2023-08-22 2023-11-14 中国科学院自动化研究所 Zero-resource cross-language dialogue model training method, device, equipment and medium
CN117149987B (en) * 2023-10-31 2024-02-13 中国科学院自动化研究所 Training method and device for multilingual dialogue state tracking model
CN117648430B (en) * 2024-01-30 2024-04-16 南京大经中医药信息技术有限公司 Dialogue type large language model supervision training evaluation system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960317A (en) * 2018-06-27 2018-12-07 哈尔滨工业大学 Across the language text classification method with Classifier combination training is indicated based on across language term vector
CN109213851A (en) * 2018-07-04 2019-01-15 中国科学院自动化研究所 Across the language transfer method of speech understanding in conversational system
CN111326138A (en) * 2020-02-24 2020-06-23 北京达佳互联信息技术有限公司 Voice generation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"INJECTING WORD INFORMATION WITH MULTI-LEVEL WORD ADAPTER FOR CHINESE SPOKEN LANGUAGE UNDERSTANDING"; Dechuan Teng et al.; 2021 IEEE International Conference on Acoustics, Speech and Signal Processing; 2021-05-13; full text *

Also Published As

Publication number Publication date
CN113312453A (en) 2021-08-27

Similar Documents

Publication Publication Date Title
CN113312453B (en) Model pre-training system for cross-language dialogue understanding
CN107729309B (en) Deep learning-based Chinese semantic analysis method and device
CN108614875B (en) Chinese emotion tendency classification method based on global average pooling convolutional neural network
CN106202010B (en) Method and apparatus based on deep neural network building Law Text syntax tree
CN110969020B (en) CNN and attention mechanism-based Chinese named entity identification method, system and medium
CN108009148B (en) Text emotion classification representation method based on deep learning
CN110851599B (en) Automatic scoring method for Chinese composition and teaching assistance system
CN110083700A (en) A kind of enterprise's public sentiment sensibility classification method and system based on convolutional neural networks
CN106980608A (en) A kind of Chinese electronic health record participle and name entity recognition method and system
CN113033438B (en) Data feature learning method for modal imperfect alignment
Horn Context encoders as a simple but powerful extension of word2vec
CN113128232B (en) Named entity identification method based on ALBERT and multiple word information embedding
CN108563725A (en) A kind of Chinese symptom and sign composition recognition methods
CN111222318A (en) Trigger word recognition method based on two-channel bidirectional LSTM-CRF network
CN111523420A (en) Header classification and header list semantic identification method based on multitask deep neural network
CN109977402A (en) A kind of name entity recognition method and system
CN110222338A (en) A kind of mechanism name entity recognition method
CN110807323A (en) Emotion vector generation method and device
CN110472245A (en) A kind of multiple labeling emotional intensity prediction technique based on stratification convolutional neural networks
CN112699685A (en) Named entity recognition method based on label-guided word fusion
CN113312918B (en) Word segmentation and capsule network law named entity identification method fusing radical vectors
CN113705222A (en) Slot recognition model training method and device and slot filling method and device
CN117033423A (en) SQL generating method for injecting optimal mode item and historical interaction information
CN109919657A (en) Acquisition methods, device, storage medium and the speech ciphering equipment of user demand information
CN115905527A (en) Priori knowledge-based method for analyzing aspect-level emotion of BERT model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant