CN112818097A - Off-task training system based on dialog state tracking model - Google Patents


Info

Publication number
CN112818097A
Authority
CN
China
Prior art keywords
module
task
auxiliary
dst
training
Prior art date
Legal status
Pending
Application number
CN202110104849.3A
Other languages
Chinese (zh)
Inventor
潘晓光
焦璐璐
令狐彬
宋晓晨
韩丹
Current Assignee
Shanxi Sanyouhe Smart Information Technology Co Ltd
Original Assignee
Shanxi Sanyouhe Smart Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanxi Sanyouhe Smart Information Technology Co Ltd filed Critical Shanxi Sanyouhe Smart Information Technology Co Ltd
Priority to CN202110104849.3A
Publication of CN112818097A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention belongs to the field of natural language data processing, and particularly relates to an off-task training system based on a dialog state tracking model. The invention supports model training with auxiliary task data, in particular through the use of MTL, and greatly improves performance on difficult tasks. At the same time, it opens the door to a large body of unrelated natural language processing corpora: auxiliary tasks defined on broad non-dialogue corpora alleviate the data sparsity problem in DST. The method is used for off-task training of the tracking model.

Description

Off-task training system based on dialog state tracking model
Technical Field
The invention belongs to the field of natural language data processing, and particularly relates to an off-task training system based on a dialog state tracking model.
Background
In today's task-oriented dialogue systems, the role of the dialogue state tracker is to summarize the dialogue history so far and to extract the user's goals. Dialogue State Tracking (DST) is severely affected by data sparsity. While many Natural Language Processing (NLP) tasks benefit from transfer learning and multi-task learning, in dialogue these methods are limited by the amount of data available and by the specificity of the dialogue application; dialogue state tracking thus suffers from serious data sparsity, and existing natural language processing approaches either do not solve this problem or do not work well on the dialogue in question.
Disclosure of Invention
Aiming at the technical problem that dialogue state tracking is severely affected by data sparsity, the invention provides an off-task training system based on a dialogue state tracking model that is efficient, low in error and highly stable.
In order to solve the technical problems, the invention adopts the technical scheme that:
a task external training system based on a dialog box state tracking model comprises a DST module, an auxiliary task module, an ITFT module and an MTL module, wherein the ITFT module is connected with the MTL module, the MTL module is connected with the DST module, and the MTL module is connected with the auxiliary task module;
the DST module is used for extracting meaning and intention from user input and keeping and updating the information in the continuous process of the conversation;
the auxiliary task module is used for supporting model training;
the ITFT module is used for guiding the parameters of the encoder to a favorable direction so that subsequent fine adjustment can find better local optimization;
the MTL module is used to train the same model between the auxiliary task and the target task simultaneously.
In the DST module, DST (dialog state tracking) processes the data set using the DST model TripPy, and a RoBERTa encoder is used because BERT's segment distinction does not adapt well to dialogue.
The auxiliary task module comprises sentence-level and sentence-pair-level classification tasks, and adopts the following training constraints: the auxiliary task is either a classification problem or a span prediction problem; only one auxiliary task is used at a time.
The ITFT module is an intermediate-task fine-tuning module that trains the same model successively on two unrelated tasks, namely an auxiliary task and the DST task.
The MTL module is a multi-task learning module: DST training is performed at every step, with additional training on the auxiliary task; training alternates between the auxiliary task and the target task at the step level, the two tasks share one optimizer, and two consecutive updates are performed.
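The module layout described above can be sketched as follows. This is an illustrative reconstruction only: the `Module` class, `connect` method and `build_system` helper are invented for the sketch, and the patent specifies no implementation.

```python
# Illustrative sketch of the stated module wiring: the ITFT module
# connects to the MTL module, which in turn connects to both the DST
# module and the auxiliary task module. All names are invented here.


class Module:
    def __init__(self, name):
        self.name = name
        self.connections = set()

    def connect(self, other):
        # Connections between cooperating modules are bidirectional.
        self.connections.add(other.name)
        other.connections.add(self.name)


def build_system():
    dst = Module("DST")
    aux = Module("auxiliary")
    itft = Module("ITFT")
    mtl = Module("MTL")
    itft.connect(mtl)   # ITFT module is connected with the MTL module
    mtl.connect(dst)    # MTL module is connected with the DST module
    mtl.connect(aux)    # MTL module is connected with the auxiliary module
    return {m.name: m for m in (dst, aux, itft, mtl)}
```

The MTL module is the hub of this topology, consistent with its role of scheduling training between the DST target task and the auxiliary task.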
Compared with the prior art, the invention has the following beneficial effects:
the invention is beneficial to supporting model training through auxiliary task data, especially the use of MTL, and greatly improves the performance of processing high-difficulty tasks. Meanwhile, the method opens a door for a large number of irrelevant natural language processing corpora, and the corpora are defined in a wide non-dialogue task to relieve the data sparsity problem in the DST.
Drawings
FIG. 1 is a flow chart of the main steps of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An off-task training system based on a dialog state tracking model is disclosed, as shown in FIG. 1, comprising a DST module, an auxiliary task module, an ITFT module and an MTL module, wherein the ITFT module is connected with the MTL module, the MTL module is connected with the DST module, and the MTL module is connected with the auxiliary task module;
Further, the DST module is used for extracting meaning and intention from user input and retaining and updating this information as the conversation continues;
Further, the auxiliary task module is used for effectively supporting model training;
Further, the ITFT module is used for steering the encoder parameters in a favorable direction so that subsequent fine-tuning can find a better local optimum;
Further, the MTL module is used to train the same model on the auxiliary task and the target task simultaneously.
Further, in the DST module, DST (dialog state tracking) extracts meaning and intention from user input and retains and updates this information as the dialogue continues. The published DST model TripPy is used; its performance depends on the individual contributions of the context encoder, the slot gate and the span prediction head, i.e. any of these parts may benefit from off-task training, which makes state-of-the-art performance on the data set possible. A RoBERTa encoder was chosen because BERT's segment distinction does not adapt well to dialogue.
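The three TripPy components named above (context encoder, slot gate, span prediction) can be illustrated with a toy sketch. Everything here is an assumption for illustration: `DummyEncoder` merely stands in for a pretrained RoBERTa encoder, and the gate labels and threshold are invented, not taken from TripPy or the patent.

```python
# Toy sketch of a TripPy-style DST head structure: a shared context
# encoder feeds a slot gate (classification) and a span prediction
# head. All values and thresholds below are invented stand-ins.

from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DummyEncoder:
    """Stand-in for a pretrained context encoder such as RoBERTa."""
    hidden_size: int = 8

    def encode(self, tokens: List[str]) -> List[List[float]]:
        # One hidden vector per token (trivial length-based features).
        return [[float(len(t))] * self.hidden_size for t in tokens]


def slot_gate(cls_vec: List[float]) -> str:
    """Classify how a slot is filled this turn (TripPy uses gate labels
    such as none / copy_value; the threshold here is made up)."""
    return "copy_value" if sum(cls_vec) > 10.0 else "none"


def span_head(hidden: List[List[float]]) -> Dict[str, int]:
    """Pick start/end token indices for a value span (argmax stand-in)."""
    scores = [sum(h) for h in hidden]
    start = scores.index(max(scores))
    return {"start": start, "end": start}


def dst_turn(encoder: DummyEncoder, tokens: List[str]) -> Dict[str, object]:
    hidden = encoder.encode(tokens)
    gate = slot_gate(hidden[0])  # [CLS]-style summary vector
    span: Optional[Dict[str, int]] = (
        span_head(hidden) if gate == "copy_value" else None
    )
    return {"gate": gate, "span": span}
```

Because the encoder is shared by both heads, off-task training that improves the encoder can, as the description argues, benefit every downstream component at once.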
Further, in the auxiliary task module, auxiliary tasks unrelated to DST are considered, comprising sentence-level and sentence-pair-level classification tasks aimed at capturing linguistic phenomena. The following training constraints were found applicable: first, the auxiliary task may be either a classification problem or a span prediction problem; second, only one auxiliary task is used at a time. The latter makes it possible to clearly identify the effect of a specific auxiliary task.
Further, in the ITFT module, ITFT (Intermediate-Task Fine-Tuning) trains the same model successively on two unrelated tasks, namely the auxiliary task and the DST task. The purpose of ITFT is to steer the encoder parameters in a favorable direction so that subsequent fine-tuning can find a better local optimum.
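The two-phase training described for ITFT — first the auxiliary task, then DST on the same parameters — can be sketched with a toy one-parameter model. The loss gradients and their targets (1.0 and 2.0) are arbitrary stand-ins invented for this sketch, not anything specified by the patent.

```python
# Toy sketch of intermediate-task fine-tuning (ITFT): the same weights
# are trained first on an auxiliary objective, then fine-tuned on the
# DST objective, so the auxiliary phase steers the parameters.


def sgd_step(weights, grads, lr=0.1):
    """One plain gradient-descent update."""
    return [w - lr * g for w, g in zip(weights, grads)]


def aux_grad(weights):
    # Invented auxiliary-task gradient: pulls weights toward 1.0.
    return [w - 1.0 for w in weights]


def dst_grad(weights):
    # Invented DST gradient: pulls weights toward 2.0.
    return [w - 2.0 for w in weights]


def itft(weights, aux_steps=50, dst_steps=50):
    # Phase 1: fine-tune on the auxiliary task to steer the parameters.
    for _ in range(aux_steps):
        weights = sgd_step(weights, aux_grad(weights))
    # Phase 2: continue fine-tuning the *same* parameters on DST.
    for _ in range(dst_steps):
        weights = sgd_step(weights, dst_grad(weights))
    return weights
```

After phase 1 the weights sit near the auxiliary optimum, so phase 2 starts from a different (and, per the patent's argument, more favorable) point than training on DST alone would.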
Further, in the MTL module, MTL (Multi-Task Learning) trains the same model on two unrelated tasks simultaneously. DST training is performed at every step, with additional training on the auxiliary task; that is, at the step level, training alternates between the auxiliary task and the target task, the two tasks share one optimizer, and two consecutive updates are performed.
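The step-level alternation described for MTL can likewise be sketched with the same kind of toy model: one DST update immediately followed by one auxiliary-task update at every step, on shared parameters. Again, the gradients and their targets are invented for illustration.

```python
# Toy sketch of the MTL schedule described above: at each training step
# the model gets one update on the target (DST) task and one on the
# auxiliary task, back to back, on the same parameters. The objectives
# are invented stand-ins (DST pulls toward 2.0, auxiliary toward 1.0).


def sgd_step(weights, grads, lr=0.1):
    return [w - lr * g for w, g in zip(weights, grads)]


def dst_grad(weights):
    return [w - 2.0 for w in weights]


def aux_grad(weights):
    return [w - 1.0 for w in weights]


def mtl_train(weights, steps=100):
    for _ in range(steps):
        # Two consecutive updates per step; the shared optimizer is
        # modeled simply as the same learning rate and parameter vector.
        weights = sgd_step(weights, dst_grad(weights))  # target task
        weights = sgd_step(weights, aux_grad(weights))  # auxiliary task
    return weights
```

Unlike the sequential ITFT schedule, this alternation drives the parameters toward a compromise between the two objectives rather than the optimum of either alone.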
All the modules can be packaged into an application program and, by calling one another through their interfaces, cooperatively implement the off-task training function based on the dialog state tracking model.
Although only the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art, and all changes are encompassed in the scope of the present invention.

Claims (5)

1. An off-task training system based on a dialog state tracking model, characterized in that: the system comprises a DST module, an auxiliary task module, an ITFT module and an MTL module, wherein the ITFT module is connected with the MTL module, the MTL module is connected with the DST module, and the MTL module is connected with the auxiliary task module;
the DST module is used for extracting meaning and intention from user input and retaining and updating this information as the conversation continues;
the auxiliary task module is used for supporting model training;
the ITFT module is used for steering the encoder parameters in a favorable direction so that subsequent fine-tuning can find a better local optimum;
the MTL module is used to train the same model on the auxiliary task and the target task simultaneously.
2. The off-task training system based on the dialog state tracking model of claim 1, wherein: in the DST module, DST (dialog state tracking) processes the data set using the DST model TripPy, and a RoBERTa encoder is used because BERT's segment distinction does not adapt well to dialogue.
3. The off-task training system based on the dialog state tracking model of claim 1, wherein: the auxiliary task module comprises sentence-level and sentence-pair-level classification tasks, and adopts the following training constraints: the auxiliary task is either a classification problem or a span prediction problem; only one auxiliary task is used at a time.
4. The off-task training system based on the dialog state tracking model of claim 1, wherein: the ITFT module is an intermediate-task fine-tuning module that trains the same model successively on two unrelated tasks, namely an auxiliary task and the DST task.
5. The off-task training system based on the dialog state tracking model of claim 1, wherein: the MTL module is a multi-task learning module: DST training is performed at every step, with additional training on the auxiliary task; training alternates between the auxiliary task and the target task at the step level, the two tasks share one optimizer, and two consecutive updates are performed.
CN202110104849.3A 2021-01-26 2021-01-26 Off-task training system based on dialog state tracking model Pending CN112818097A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110104849.3A CN112818097A (en) 2021-01-26 2021-01-26 Off-task training system based on dialog state tracking model

Publications (1)

Publication Number Publication Date
CN112818097A (en) 2021-05-18

Family

ID=75859417

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110104849.3A Pending CN112818097A (en) 2021-01-26 2021-01-26 Off-task training system based on dialog state tracking model

Country Status (1)

Country Link
CN (1) CN112818097A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107357838A (en) * 2017-06-23 2017-11-17 上海交通大学 Dialog strategy canbe used on line method based on multi-task learning
CN109885668A (en) * 2019-01-25 2019-06-14 中译语通科技股份有限公司 A kind of expansible field interactive system status tracking method and apparatus
CN110209791A (en) * 2019-06-12 2019-09-06 百融云创科技股份有限公司 It is a kind of to take turns dialogue intelligent speech interactive system and device more
US20200152184A1 (en) * 2018-11-08 2020-05-14 PolyAI Limited Dialogue system, a dialogue method, a method of generating data for training a dialogue system, a system for generating data for training a dialogue system and a method of training a dialogue system
CN111241279A (en) * 2020-01-07 2020-06-05 华东师范大学 Natural language relation extraction method based on multi-task learning mechanism
CN112164476A (en) * 2020-09-28 2021-01-01 华南理工大学 Medical consultation conversation generation method based on multitask and knowledge guidance



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210518