CN111724767A - Spoken language understanding method based on Dirichlet variational self-encoder and related equipment - Google Patents

Spoken language understanding method based on Dirichlet variational self-encoder and related equipment

Info

Publication number
CN111724767A
CN111724767A (application CN201911247568.2A; granted as CN111724767B)
Authority
CN
China
Prior art keywords
corpus
sampling
dirichlet
encoder
spoken language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911247568.2A
Other languages
Chinese (zh)
Other versions
CN111724767B (en)
Inventor
高望
朱珣
邓宏涛
王煜炜
曾凡琮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jianghan University
Original Assignee
Jianghan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jianghan University filed Critical Jianghan University
Priority to CN201911247568.2A priority Critical patent/CN111724767B/en
Publication of CN111724767A publication Critical patent/CN111724767A/en
Application granted granted Critical
Publication of CN111724767B publication Critical patent/CN111724767B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/18 - Speech classification or search using natural language modelling
    • G10L 15/1822 - Parsing for meaning understanding
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 - Training
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a spoken language understanding method based on a Dirichlet variational self-encoder, which belongs to the field of computer technology and comprises the following steps: sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set; performing data enhancement according to the sampling corpus set; and generating a training corpus. The invention introduces a semi-supervised learning method based on the Dirichlet variational self-encoder into the modeling process of spoken language understanding, learns the latent semantic features of the original data and generates high-quality new data, reduces the labeling cost, and achieves the beneficial effect of improving the spoken language understanding model.

Description

Spoken language understanding method based on Dirichlet variational self-encoder and related equipment
Technical Field
The invention relates to the technical field of computers, in particular to a spoken language understanding method based on a Dirichlet variational self-encoder and related equipment.
Background
A task-based dialog system is a human-computer interaction system that helps a user complete a specific task through multiple rounds of dialog; it is a research direction that has attracted wide attention and has broad application prospects. Currently, many research institutes and technology companies are active in the field of task-based dialog systems, with products such as Siri and Microsoft's Xiaona (Cortana). Spoken language understanding is a core technology for building task-based dialog systems: it parses the natural language originally input by the user into a computer-understandable structured semantic expression. This expression contains the semantic units that best represent the user's intention and is important for the development of human-computer interaction systems.
In recent years, great progress has been made in spoken language understanding models based on deep neural networks, in particular joint learning models of semantic slot filling (Slot Filling) and intent recognition (Intent Classification). The basic idea of such a model is to use a neural network to learn the semantic information of an input sentence and then output the intent category of the whole sentence together with the semantic slot label corresponding to each word. In such a model, the prediction of the intent category and of the semantic slot labels can learn from each other, and performance improves jointly. Compared with traditional methods based on machine learning and hand-written rules, the joint learning model has higher accuracy, does not require handwritten templates, and adapts well.
However, like most natural language processing tasks, the joint learning model faces a serious data scarcity problem. Moreover, the sparsity problem is exacerbated by the near-infinite domain space of spoken language understanding datasets and by the labor-intensive labeling task. Traditional data enhancement and generation methods rely on hand-designed enhancement/generation functions, and the sentences they generate are generally poor in robustness and diversity. As a result, the joint learning model tends to overfit and lacks generalization ability, which degrades the spoken language understanding effect; this is the key problem to be solved by the invention.
Disclosure of Invention
The invention provides a spoken language understanding method based on a Dirichlet variational self-encoder and related equipment, which are used for solving the technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a spoken language understanding method based on a Dirichlet variational self-encoder, where the method includes: sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set; performing data enhancement according to the sampling corpus set; and generating a training corpus.
Further, in the first aspect, sampling the training corpus with the Dirichlet variational self-encoder to generate the sampling corpus set specifically includes: giving the number n of sampled corpora, and initializing an empty corpus set M; while the number of corpora in M is less than n, looping S1121-S1124: S1121, selecting a real word sequence w; S1122, deducing the approximate posterior parameters α̂ and β̂ by the inverse gamma distribution function approximation method; S1123, sampling ŵ from the variational distribution q_φ(w|z); S1124, adding the sampled corpus ŵ to M; and generating the sampling corpus set.
Further, in the first aspect, generating the training corpus specifically includes the following steps: first sampling z ~ q_φ(z), and then approximating p_η(w|z) with the Dirichlet variational self-encoder; sampling from p_η(w|z) to obtain a generated word sequence ŵ; using the generated word sequence ŵ to train the joint model for spoken language understanding and perform inference; generating the slot filling and intent recognition results (ŝ, ŷ); and combining ŵ and (ŝ, ŷ) into a new corpus (ŵ, ŝ, ŷ), which is added to the generated corpus set.
Further, in the first aspect, performing data enhancement specifically includes: performing data enhancement for the semantic slot filling and intent recognition tasks through the latent variable z and the sampled corpus ŵ.
In a second aspect, an embodiment of the present invention provides a spoken language understanding system based on a Dirichlet variational self-encoder, where the system includes: a sampling corpus generating module configured to sample the training corpus with a Dirichlet variational self-encoder to generate a sampling corpus set; a data enhancement module configured to perform data enhancement according to the sampling corpus set; and a corpus generating module configured to generate a training corpus.
Further, in the second aspect, the sampling corpus generating module specifically includes: a first sub-module configured to initialize an empty corpus set M given the number n of sampled corpora; a second sub-module configured to loop S1121-S1124 while the number of corpora in M is less than n: S1121, selecting a real word sequence w; S1122, deducing the approximate posterior parameters α̂ and β̂ by the inverse gamma distribution function approximation method; S1123, sampling ŵ from the variational distribution q_φ(w|z); S1124, adding the sampled corpus ŵ to M; and a third sub-module configured to generate the sampling corpus set.
Further, in the second aspect, the corpus generating module specifically includes: a first subunit configured to first sample z ~ q_φ(z) and then approximate p_η(w|z) with the Dirichlet variational self-encoder; a second subunit configured to sample from p_η(w|z) to obtain a generated word sequence ŵ; a third subunit configured to use the generated word sequence ŵ to train the joint model for spoken language understanding and perform inference; a fourth subunit configured to generate the slot filling and intent recognition results (ŝ, ŷ); and a fifth subunit configured to combine ŵ and (ŝ, ŷ) into a new corpus (ŵ, ŝ, ŷ) and add it to the generated corpus set.
Further, in the second aspect, the data enhancement module is further specifically configured to: perform data enhancement for the semantic slot filling and intent recognition tasks through the latent variable z and the sampled corpus ŵ.
In a third aspect, the present invention further provides an apparatus for spoken language understanding based on a Dirichlet variational self-encoder, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the following steps when executing the program: sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set; performing data enhancement according to the sampling corpus set; and generating a training corpus.
In a fourth aspect, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of: sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set; performing data enhancement according to the sampling corpus set; and generating a training corpus.
One or more technical solutions provided in the embodiments of the invention have at least the following technical effects or advantages:
The invention provides a spoken language understanding method based on a Dirichlet variational self-encoder, which first samples the training corpus with the Dirichlet variational self-encoder to generate a sampling corpus set; then performs data enhancement according to the sampling corpus set; and finally generates a training corpus. The semi-supervised learning method based on the Dirichlet variational self-encoder is thereby introduced into the modeling process of spoken language understanding, the latent semantic features of the original data are learned and high-quality new data are generated, the labeling cost is reduced, and the beneficial effect of improving the spoken language understanding model is achieved.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a flowchart of a spoken language understanding method based on a dirichlet variational auto-encoder in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of DirVAE-SLU model in the embodiment of the present application;
FIG. 3 is a schematic structural diagram of a computer device in an embodiment of the present application;
fig. 4 is a schematic structural diagram of a computer-readable storage medium in an embodiment of the present application.
Detailed Description
The spoken language understanding method based on the Dirichlet variational self-encoder provided by the invention realizes that a semi-supervised learning method based on the Dirichlet variational self-encoder is introduced into a modeling process of spoken language understanding, potential semantic features of original data are learned and high-quality new data are generated, the labeling cost is reduced, and the beneficial effect of improving a spoken language understanding model is achieved.
Referring to fig. 1-2, the technical solution in the embodiment of the present invention is as follows:
s11, sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set;
s12, enhancing data according to the sampling corpus;
and S13, generating a training corpus.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Moreover, the terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The term "and/or" in the description and claims of the present invention and the above drawings is only one kind of association relationship describing the associated object, and means that there may be three relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
Example one
An embodiment of the present invention provides a spoken language understanding method based on a dirichlet variational self-encoder, please refer to fig. 1, where the method includes:
s11, sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set;
s12, enhancing data according to the sampling corpus;
and S13, generating a training corpus.
According to the research of the inventor, the joint learning model, like most natural language processing tasks, faces a serious data scarcity problem. Moreover, the sparsity problem is exacerbated by the near-infinite domain space of spoken language understanding datasets and by the labor-intensive labeling task. Traditional data enhancement and generation methods rely on hand-designed enhancement/generation functions, and the sentences they generate are generally poor in robustness and diversity. This causes the joint learning model to overfit and lack generalization ability, thereby affecting the spoken language understanding effect. Based on this, the present invention provides a spoken language understanding method and related device based on a Dirichlet variational auto-encoder, so as to solve the above technical problems.
In the following, the spoken language understanding method based on a Dirichlet variational self-encoder according to an embodiment of the present invention is described in detail with reference to fig. 1:
s11, sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set;
the standard spoken language understanding model is a discriminant model highly related to a data set, and the data set understood by the spoken language at least comprises an input word sequence w, a tag sequence s filled with semantic slots and a tag y recognized by an intention. For the training data set (w, s, y), the loss function is shown in equation (1):
L(θ;w,s,y)=-logpθ(s,y|w) (1)
where θ represents the parameters that the model needs to solve. Given an input word sequence w, the joint model can simultaneously predict the semantic slot sequence ŝ and the recognized intent ŷ by maximizing the log-likelihood, as shown in equation (2):

(ŝ, ŷ) = argmax_(s,y) log p_θ(s, y|w)    (2)
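To make the joint objective of equations (1)-(2) concrete, the short sketch below shows one way a slot filling / intent recognition joint model and its negative log-likelihood loss could be written. It is a minimal PyTorch-style illustration; the module layout, layer sizes, and mean-pooling for the intent are assumptions for illustration, not the patent's reference implementation.

    import torch
    import torch.nn as nn

    class JointSLU(nn.Module):
        # Joint model p_theta(s, y | w): per-word slot labels and a sentence-level intent.
        def __init__(self, vocab_size, num_slots, num_intents, emb_dim=300, hidden=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.encoder = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
            self.slot_head = nn.Linear(2 * hidden, num_slots)      # semantic slot filling
            self.intent_head = nn.Linear(2 * hidden, num_intents)  # intent recognition

        def forward(self, w):                                  # w: (batch, seq_len) word ids
            h, _ = self.encoder(self.embed(w))                 # (batch, seq_len, 2*hidden)
            slot_logits = self.slot_head(h)                    # per-token slot scores
            intent_logits = self.intent_head(h.mean(dim=1))    # pooled sentence representation
            return slot_logits, intent_logits

    def joint_loss(slot_logits, intent_logits, s, y):
        # Equation (1): L = -log p_theta(s, y | w), factored into slot and intent terms;
        # prediction as in equation (2) is the argmax over both heads.
        ce = nn.CrossEntropyLoss()
        return ce(slot_logits.flatten(0, 1), s.flatten()) + ce(intent_logits, y)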
the sampling process is a key step of the Dirichlet variational self-encoder, and the training corpus can be sampled through the sampling process, so that the semantic features of sentences or vocabularies are obtained. A good sampling process can effectively improve the performance of the data-enhanced spoken language understanding model. Assuming that the corpus x is sampled from a real but unknown probability distribution P (x) e P, the exploratory sampling process is a sampling process that approximates the real distribution P (x) by introducing a latent variable z. Specifically, the dirichlet variational self-encoder approximates the true distribution p (x) by using the variational posterior distribution q (z | x) and the parameters (h, f), and measures the difference between the variational posterior distribution q (z | x) and the true posterior distribution p (z | x) by the KL divergence (KL divergence), and the loss function of the model is shown in formula (3):
Figure BDA0002308096470000051
applying equation (3) to the spoken language understanding task for data enhancement, then:
L(η, φ; w) = -E_(z~q_φ(z|w))[log p_η(w|z)] + KL(q_φ(z|w) ‖ p(z))    (4)

When the optimized parameters of the model (η*, φ*) have been solved, a new word sequence ŵ can be obtained by sampling from the variational distribution of w, and data enhancement is performed on the spoken language understanding model, as shown in formula (5):

z ~ q_φ*(z|w),  ŵ ~ p_η*(w|z)    (5)

The conventional variational auto-encoder assumes that the prior distribution of the latent variables is a continuous random variable, while the Dirichlet variational auto-encoder uses the Dirichlet distribution, conjugate to the multinomial distribution, as the prior distribution of the latent variables, which is more suitable for the spoken language understanding model, as shown in equation (6):

z ~ p(z) = Dirichlet(α),  w ~ p_η(w|z)    (6)

where α denotes the Dirichlet hyperparameter. In the encoder, ẑ is sampled from the approximate variational posterior distribution q_φ(z|w), whose approximate posterior parameters are α̂ and β̂. Instead of sampling z directly from the Dirichlet distribution, the method exploits the property that a Dirichlet distribution can be composed from several independent gamma distributions: the latent variables are sampled by gamma composition as v ~ MultiGamma(α, β, 1_K), where MultiGamma(α, β, 1_K) represents K random variables each following a gamma distribution; v is then normalized by the sum term ∑_i v_i. The loss function is:
L(η, φ; w) = -E_(v~q_φ(v|w))[log p_η(w | v/∑_i v_i)] + KL(q_φ(v|w) ‖ p(v))    (7)
for equation (7), an inverse Gamma Distribution Function Approximation (inverse Gamma Distribution Function) method may enable a back-propagating flow to an input through a stochastic gradient method to infer model parameters-1(u;α,β)≈β-1(ua(α))1/αTherefore, the invention replaces the randomness of v by introducing auxiliary variables u-Uniform (0,1), and takes the Gamma sampled v as the determined values of α and β. the exploratory sampling process of DirVAE-SLU may specifically include the following steps:
s111, giving the number n of the sampled corpuses, and initializing a null corpus set M;
s112, when the number of the corpora in the M is less than n, circulating S1121-S1124:
s1121, selecting a real word sequence w;
s1122, deducing approximate posterior parameters by an inverse gamma distribution function approximation method
Figure BDA0002308096470000061
S1123, distributing q by variationφ(w | z) sampling
Figure BDA0002308096470000062
S1124, corpus of samples
Figure BDA0002308096470000063
Adding into M;
and S13, generating the sampling corpus set.
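The reparameterized sampling referred to above can be sketched as follows; this is a minimal illustration under the inverse gamma CDF approximation given before the step listing, and the encoder/decoder interfaces, the random selection of real sentences, and all variable names are assumptions introduced for illustration only.

    import random
    import torch

    def inverse_gamma_cdf_approx(u, alpha, beta):
        # F^-1(u; alpha, beta) ~= beta^-1 * (u * alpha * Gamma(alpha))^(1/alpha);
        # u ~ Uniform(0,1) carries the randomness, so the sample is a
        # differentiable (deterministic) function of alpha and beta.
        return (u * alpha * torch.exp(torch.lgamma(alpha))) ** (1.0 / alpha) / beta

    def sample_dirichlet(alpha_hat, beta_hat):
        # Gamma composition: v_k ~ Gamma(alpha_k, beta_k), then z = v / sum(v).
        u = torch.rand_like(alpha_hat)
        v = inverse_gamma_cdf_approx(u, alpha_hat, beta_hat)
        return v / v.sum(dim=-1, keepdim=True)

    def build_sampling_corpus(encoder, decoder, real_corpus, n):
        # Exploratory sampling loop S111-S113 (assumed encoder/decoder interfaces).
        M = []
        while len(M) < n:                                  # S112
            w = random.choice(real_corpus)                 # S1121: pick a real word sequence
            alpha_hat, beta_hat = encoder(w)               # S1122: approximate posterior parameters
            z = sample_dirichlet(alpha_hat, beta_hat)      # reparameterized Dirichlet latent
            w_hat = decoder.sample(z)                      # S1123: sample a resampled sentence
            M.append(w_hat)                                # S1124: add to M
        return M                                           # S113: the sampling corpus set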
Then S12 is executed: data enhancement is performed according to the sampling corpus set;
specifically, after obtaining the sample corpus, the DirVAE-SLU passes the latent variable z and the sample corpus
ŵ for data enhancement of the semantic slot filling and intent recognition tasks, and equation (1) can be transformed into:

L(φ, ζ; w, ŵ, s, y) = -log p_φ(s, y|w) - log p_ζ(s, y|ŵ)    (8)

where φ represents the parameters for the original corpus w, and ζ represents the parameters for filling semantic slots and identifying intent on the sampled corpus ŵ. Considering the Dirichlet variational auto-encoder for data enhancement and spoken language understanding together, the joint training loss function of DirVAE-SLU is as follows:
L(η, φ, ζ; w, s, y) = L(η, φ; w) + L(φ, ζ; w, ŵ, s, y)    (9)

Structurally, the DirVAE-SLU model can be divided into two parts: a data enhancement part that performs latent variable inference with the Dirichlet variational auto-encoder and generates the sampling corpus, and a part that realizes spoken language understanding through the sampling corpus. In the data enhancement part, the DirVAE-SLU model uses a bidirectional Long Short-Term Memory (LSTM) network in the encoder and three unidirectional LSTM networks in the decoder. The training process of the model solves for the optimal parameters (η*, φ*, ζ*) by minimizing the loss function of equation (9):

(η*, φ*, ζ*) = argmin_(η,φ,ζ) L(η, φ, ζ; w, s, y)    (10)
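To make the two-part structure and the training objective of equations (9)-(10) concrete, the sketch below outlines a DirVAE-SLU-style model with a bidirectional LSTM encoder producing the approximate posterior parameters, three unidirectional LSTM decoders, and a combined loss. The class layout, pooling, positivity transform, KL term, and loss weighting are illustrative assumptions, not the patent's reference implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def sample_dirichlet(alpha, beta):
        # Reparameterized gamma composition (see the earlier sampling sketch).
        u = torch.rand_like(alpha)
        v = (u * alpha * torch.exp(torch.lgamma(alpha))) ** (1.0 / alpha) / beta
        return v / v.sum(dim=-1, keepdim=True)

    class DirVAESLU(nn.Module):
        def __init__(self, vocab_size, num_slots, num_intents,
                     emb_dim=300, enc_hidden=256, dec_hidden=1024, latent_dim=100):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # Data enhancement part: BiLSTM encoder -> Dirichlet posterior parameters.
            self.encoder = nn.LSTM(emb_dim, enc_hidden, bidirectional=True, batch_first=True)
            self.to_alpha = nn.Linear(2 * enc_hidden, latent_dim)
            self.to_beta = nn.Linear(2 * enc_hidden, latent_dim)
            # Three unidirectional LSTM decoders: word reconstruction, slots, intent.
            self.word_dec = nn.LSTM(latent_dim, dec_hidden, batch_first=True)
            self.slot_dec = nn.LSTM(latent_dim, dec_hidden, batch_first=True)
            self.intent_dec = nn.LSTM(latent_dim, dec_hidden, batch_first=True)
            self.word_out = nn.Linear(dec_hidden, vocab_size)
            self.slot_out = nn.Linear(dec_hidden, num_slots)
            self.intent_out = nn.Linear(dec_hidden, num_intents)

        def forward(self, w):                                    # w: (batch, seq_len) word ids
            h, _ = self.encoder(self.embed(w))
            pooled = h.mean(dim=1)
            alpha_hat = F.softplus(self.to_alpha(pooled)) + 1e-4  # keep gamma parameters positive
            beta_hat = F.softplus(self.to_beta(pooled)) + 1e-4
            z = sample_dirichlet(alpha_hat, beta_hat)
            z_seq = z.unsqueeze(1).expand(-1, w.size(1), -1)      # broadcast latent along the sequence
            word_logits = self.word_out(self.word_dec(z_seq)[0])  # p_eta(w | z)
            slot_logits = self.slot_out(self.slot_dec(z_seq)[0])  # slot filling head
            intent_logits = self.intent_out(self.intent_dec(z_seq)[0][:, -1])  # intent head
            return word_logits, slot_logits, intent_logits, (alpha_hat, beta_hat)

    def dirvae_slu_loss(model, w, s, y, alpha0=0.99):
        # Joint objective in the spirit of equations (9)-(10): reconstruction + KL term
        # of the Dirichlet VAE plus the slot filling and intent recognition losses.
        word_logits, slot_logits, intent_logits, (alpha_hat, beta_hat) = model(w)
        ce = nn.CrossEntropyLoss()
        recon = ce(word_logits.flatten(0, 1), w.flatten())
        slu = ce(slot_logits.flatten(0, 1), s.flatten()) + ce(intent_logits, y)
        prior = torch.distributions.Gamma(torch.full_like(alpha_hat, alpha0),
                                          torch.ones_like(beta_hat))
        posterior = torch.distributions.Gamma(alpha_hat, beta_hat)
        kl = torch.distributions.kl_divergence(posterior, prior).sum(-1).mean()
        return recon + kl + slu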
Finally, S13 is executed to generate the training corpus.
In detail, in the process of generating the training corpus, the DirVAE-SLU model performs sampling with the inverse gamma distribution function approximation method. The method can comprehensively consider factors such as the corpus balance of the real dataset and the computational resource overhead when selecting data. Once data are selected, DirVAE-SLU generates a sufficient number of corpora using the following process:
1. First, sample z ~ q_φ(z), and then approximate p_η(w|z) with the Dirichlet variational self-encoder;
2. Sample from p_η(w|z) to obtain a generated word sequence ŵ;
3. Use the generated word sequence ŵ to train the joint model for spoken language understanding and perform inference;
4. Generate the slot filling and intent recognition results (ŝ, ŷ);
5. Combine ŵ and (ŝ, ŷ) into a new corpus (ŵ, ŝ, ŷ) and add it to the generated corpus set.
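As a rough illustration of steps 1-5, the following sketch assembles generated word sequences and their inferred labels into the generated corpus set; the prior_sampler, decoder, and joint_slu interfaces are assumptions introduced for illustration, not names from the patent.

    def generate_corpus(prior_sampler, decoder, joint_slu, n_generate):
        # Steps 1-5: draw latents, decode word sequences, label them with the
        # trained joint model, and collect (w_hat, s_hat, y_hat) triples.
        generated = []
        for _ in range(n_generate):
            z = prior_sampler()                             # step 1: z ~ q_phi(z)
            w_hat = decoder.sample(z)                       # step 2: w_hat from p_eta(w | z)
            slot_logits, intent_logits = joint_slu(w_hat)   # step 3: inference with the joint model
            s_hat = slot_logits.argmax(dim=-1)              # step 4: slot filling result
            y_hat = intent_logits.argmax(dim=-1)            #         intent recognition result
            generated.append((w_hat, s_hat, y_hat))         # step 5: add the new corpus entry
        return generated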
The efficiency of the method provided by the invention can be verified by comparing data enhancement experiments on a reference model. The present invention uses two open-source evaluation datasets for the experiments: the Airline Travel Information Systems (ATIS) dataset and the virtual assistant corpus Snips. In the experiments, α is 0.99·1_100 and β is 1, the input layer uses GloVe 300-dimensional word vectors, the hidden layer dimension of the bidirectional LSTM in the encoder is 256, the hidden layer dimension of each of the three unidirectional LSTMs in the decoder is 1024, and a Slot-Gated model is used as the reference model.
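For quick reference, the experimental configuration described above can be collected into a simple settings dictionary; the key names are hypothetical, and the values restate the setup given in the text.

    experiment_config = {
        "datasets": ["ATIS", "Snips"],
        "dirichlet_alpha": 0.99,          # per-dimension Dirichlet hyperparameter
        "gamma_beta": 1.0,
        "word_vectors": "GloVe-300d",
        "encoder_bilstm_hidden": 256,
        "decoder_lstm_hidden": 1024,      # each of the three unidirectional LSTMs
        "baseline_model": "Slot-Gated",
    }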
TABLE 1 comparison of data enhancement effects on different datasets
As can be seen from the experimental results in Table 1, after data enhancement with DirVAE-SLU, the spoken language understanding performance of the reference model improves on both datasets, which verifies the advantage of the invention.
That is, the embodiment of the invention introduces the semi-supervised learning method based on the Dirichlet variational self-encoder into the modeling process of spoken language understanding, learns the latent semantic features of the original data and generates high-quality new data, reduces the labeling cost, and achieves the beneficial effect of improving the spoken language understanding model.
Based on the same inventive concept, an embodiment of the invention further provides a device corresponding to the method in the first embodiment; see the second embodiment.
Example two
An embodiment of the present invention provides a system, where the system includes:
the sampling corpus generating module is configured to sample the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus;
the data enhancement module is configured to enhance data according to the sampling corpus;
and the corpus generating module is configured to generate corpus.
In the second embodiment of the present invention, the sampling corpus generating module specifically includes: a first sub-module configured to give the number n of sampled corpora and initialize an empty corpus set M; a second sub-module configured to loop S1121-S1124 while the number of corpora in M is less than n: S1121, selecting a real word sequence w; S1122, deducing the approximate posterior parameters α̂ and β̂ by the inverse gamma distribution function approximation method; S1123, sampling ŵ from the variational distribution q_φ(w|z); S1124, adding the sampled corpus ŵ to M; and a third sub-module configured to generate the sampling corpus set.
In the second embodiment of the present invention, the corpus generating module specifically includes: a first subunit configured to first sample z ~ q_φ(z) and then approximate p_η(w|z) with the Dirichlet variational self-encoder; a second subunit configured to sample from p_η(w|z) to obtain a generated word sequence ŵ; a third subunit configured to use the generated word sequence ŵ to train the joint model for spoken language understanding and perform inference; a fourth subunit configured to generate the slot filling and intent recognition results (ŝ, ŷ); and a fifth subunit configured to combine ŵ and (ŝ, ŷ) into a new corpus (ŵ, ŝ, ŷ) and add it to the generated corpus set.
In the second embodiment of the present invention, the data enhancement module is further specifically configured to: perform data enhancement for the semantic slot filling and intent recognition tasks through the latent variable z and the sampled corpus ŵ.
Since the system described in the second embodiment of the present invention is a device used for implementing the method of the first embodiment of the present invention, based on the method described in the first embodiment of the present invention, a person skilled in the art can understand the specific structure and the deformation of the device, and thus the detailed description is omitted here. All the devices adopted by the method of the first embodiment of the invention belong to the protection scope of the invention.
EXAMPLE III
Based on the same inventive concept as the first and second embodiments, a third embodiment of the present invention provides an apparatus, including: Radio Frequency (RF) circuitry 310, memory 320, an input unit 330, a display unit 340, audio circuitry 350, a WiFi module 360, a processor 370, and a power supply 380. The memory 320 stores a computer program operable on the processor 370, and the processor 370, when executing the computer program, implements the steps S11, S12 and S13 described in the first embodiment.
In a specific implementation process, when the processor executes the computer program, either implementation manner of the first embodiment or the second embodiment can be realized.
Those skilled in the art will appreciate that the device configuration shown in fig. 3 is not intended to be limiting of the device itself and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes the components of the computer device in detail with reference to fig. 3:
RF circuitry 310 may be used for receiving and transmitting signals, and in particular, for receiving downlink information from base stations and processing the received downlink information to processor 370. In general, the RF circuit 310 includes, but is not limited to, at least one Amplifier, transceiver, coupler, Low Noise Amplifier (LNA), duplexer, and the like.
The memory 320 may be used to store software programs and modules, and the processor 370 may execute various functional applications of the computer device and data processing by operating the software programs and modules stored in the memory 320. The memory 320 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to use of the computer device, and the like. Further, the memory 320 may include a high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 330 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the computer apparatus. Specifically, the input unit 330 may include a keypad 331 and other input devices 332. The keyboard 331 can collect the input operation of the user thereon and drive the corresponding connection device according to a preset program. The keyboard 331 collects the output information and sends it to the processor 370. The input unit 330 may include other input devices 332 in addition to the keyboard 331. In particular, other input devices 332 may include, but are not limited to, one or more of a touch panel, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 340 may be used to display information input by a user or information provided to the user and various menus of the computer device. The Display unit 340 may include a Display panel 341, and optionally, the Display panel 341 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the keyboard 331 may cover the display panel 341, and when the keyboard 331 detects a touch operation thereon or nearby, the keyboard 331 transmits to the processor 370 to determine the type of the touch event, and then the processor 370 provides a corresponding visual output on the display panel 341 according to the type of the input event. Although the keyboard 331 and the display panel 341 are shown in fig. 3 as two separate components to implement input and output functions of the computer device, in some embodiments, the keyboard 331 and the display panel 341 may be integrated to implement input and output functions of the computer device.
Audio circuitry 350, speaker 351, microphone 352 may provide an audio interface between a user and a computer device. The audio circuit 350 may transmit the electrical signal converted from the received audio data to the speaker 351, and convert the electrical signal into a sound signal by the speaker 351 for output;
WiFi belongs to short-distance wireless transmission technology, and computer equipment can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 360, and provides wireless broadband internet access for the user. Although fig. 3 shows the WiFi module 360, it is understood that it does not belong to the essential constitution of the computer device, and can be omitted entirely within the scope not changing the essence of the invention as needed.
The processor 370 is a control center of the computer device, connects various parts of the entire computer device using various interfaces and lines, performs various functions of the computer device and processes data by operating or executing software programs and/or modules stored in the memory 320 and calling data stored in the memory 320, thereby monitoring the computer device as a whole. Alternatively, processor 370 may include one or more processing units; preferably, the processor 370 may be integrated with an application processor, wherein the application processor mainly handles operating systems, user interfaces, application programs, and the like.
The computer device also includes a power supply 380 (such as a power adapter) for powering the various components, which may preferably be logically connected to the processor 370 through a power management system.
Example four
Based on the same inventive concept, as shown in fig. 4, the fourth embodiment provides a computer-readable storage medium 400 on which a computer program 411 is stored; when the computer program 411 is executed by a processor, the steps S11, S12 and S13 described in the first embodiment are implemented.
In a specific implementation, the computer program 411 may implement any one of the first, second, and third embodiments when executed by a processor.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The technical scheme provided by the embodiment of the invention at least has the following technical effects or advantages:
the semi-supervised learning method based on the Dirichlet variational self-encoder is introduced into the modeling process of spoken language understanding, potential semantic features of original data are learned, high-quality new data are generated, the labeling cost is reduced, and the beneficial effect of improving the spoken language understanding model is achieved.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention are within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (10)

1. A method for spoken language understanding based on a dirichlet variational auto-encoder, the method comprising:
s11, sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set;
s12, enhancing data according to the sampling corpus;
and S13, generating a training corpus.
2. The spoken language understanding method based on a Dirichlet variational auto-encoder as claimed in claim 1, wherein sampling the training corpus with the Dirichlet variational auto-encoder to generate the sampling corpus set specifically comprises:
S111, giving the number n of sampled corpora, and initializing an empty corpus set M;
S112, while the number of corpora in M is less than n, looping S1121-S1124:
S1121, selecting a real word sequence w;
S1122, deducing the approximate posterior parameters α̂ and β̂ by an inverse gamma distribution function approximation method;
S1123, sampling ŵ from the variational distribution q_φ(w|z);
S1124, adding the sampled corpus ŵ to M;
and S113, generating the sampling corpus set.
3. The spoken language understanding method based on a Dirichlet variational auto-encoder as claimed in claim 2, wherein generating the training corpus specifically comprises the following steps:
S131, first sampling z ~ q_φ(z), and then approximating p_η(w|z);
S132, sampling from p_η(w|z) to obtain a generated word sequence ŵ;
S133, using the generated word sequence ŵ to train the joint model for spoken language understanding and perform inference;
S134, generating the slot filling and intent recognition results (ŝ, ŷ);
S135, combining ŵ and (ŝ, ŷ) into a new corpus (ŵ, ŝ, ŷ) and adding it to the generated corpus set.
4. The spoken language understanding method based on a Dirichlet variational auto-encoder as claimed in claim 3, wherein performing data enhancement specifically comprises:
performing data enhancement for the semantic slot filling and intent recognition tasks through the latent variable z and the sampled corpus ŵ.
5. A spoken language understanding system based on a dirichlet variational auto-encoder, the system comprising:
the system comprises a sampling corpus generating module, a data processing module and a data processing module, wherein the sampling corpus generating module is configured to sample a training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus;
the data enhancement module is configured to enhance data according to the sampling corpus;
and the corpus generating module is configured to generate corpus.
6. The spoken language understanding system of claim 5, wherein the sampling corpus generating module specifically comprises:
a first sub-module configured to give the number n of sampled corpora and initialize an empty corpus set M;
a second sub-module configured to loop S1121-S1124 while the number of corpora in M is less than n:
S1121, selecting a real word sequence w;
S1122, deducing the approximate posterior parameters α̂ and β̂ by an inverse gamma distribution function approximation method;
S1123, sampling ŵ from the variational distribution q_φ(w|z);
S1124, adding the sampled corpus ŵ to M;
and a third sub-module configured to generate the sampling corpus set.
7. The system according to claim 6, wherein the corpus generating module specifically comprises:
a first subunit configured to first sample z ~ q_φ(z) and then approximate p_η(w|z) with the Dirichlet variational auto-encoder;
a second subunit configured to sample from p_η(w|z) to obtain a generated word sequence ŵ;
a third subunit configured to use the generated word sequence ŵ to train the joint model for spoken language understanding and perform inference;
a fourth subunit configured to generate the slot filling and intent recognition results (ŝ, ŷ);
and a fifth subunit configured to combine ŵ and (ŝ, ŷ) into a new corpus (ŵ, ŝ, ŷ) and add it to the generated corpus set.
8. The system of claim 6, wherein the data enhancement module is further specifically configured to:
perform data enhancement for the semantic slot filling and intent recognition tasks through the latent variable z and the sampled corpus ŵ.
9. An apparatus for spoken language understanding based on a dirichlet variational auto-encoder, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of:
s11, sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set;
s12, enhancing data according to the sampling corpus;
and S13, generating a training corpus.
10. A computer-readable storage medium, on which a computer program is stored, which program, when executed by a processor, carries out the steps of:
s11, sampling the training corpus by using a Dirichlet variational self-encoder to generate a sampling corpus set;
s12, enhancing data according to the sampling corpus;
and S13, generating a training corpus.
CN201911247568.2A 2019-12-09 2019-12-09 Spoken language understanding method based on Dirichlet variation self-encoder and related equipment Active CN111724767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911247568.2A CN111724767B (en) 2019-12-09 2019-12-09 Spoken language understanding method based on Dirichlet variation self-encoder and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911247568.2A CN111724767B (en) 2019-12-09 2019-12-09 Spoken language understanding method based on Dirichlet variation self-encoder and related equipment

Publications (2)

Publication Number Publication Date
CN111724767A true CN111724767A (en) 2020-09-29
CN111724767B CN111724767B (en) 2023-06-02

Family

ID=72563990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911247568.2A Active CN111724767B (en) 2019-12-09 2019-12-09 Spoken language understanding method based on Dirichlet variation self-encoder and related equipment

Country Status (1)

Country Link
CN (1) CN111724767B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597769A (en) * 2020-12-15 2021-04-02 中山大学 Short text topic identification method based on Dirichlet variational self-encoder

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886388A (en) * 2019-01-09 2019-06-14 平安科技(深圳)有限公司 A kind of training sample data extending method and device based on variation self-encoding encoder
US10373055B1 (en) * 2016-05-20 2019-08-06 Deepmind Technologies Limited Training variational autoencoders to generate disentangled latent factors
CN110134951A (en) * 2019-04-29 2019-08-16 淮阴工学院 A kind of method and system for analyzing the potential theme phrase of text data
US20190258937A1 (en) * 2016-11-04 2019-08-22 Google Llc Training neural networks using a variational information bottleneck
CN110211575A (en) * 2019-06-13 2019-09-06 苏州思必驰信息科技有限公司 Voice for data enhancing adds method for de-noising and system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10373055B1 (en) * 2016-05-20 2019-08-06 Deepmind Technologies Limited Training variational autoencoders to generate disentangled latent factors
US20190258937A1 (en) * 2016-11-04 2019-08-22 Google Llc Training neural networks using a variational information bottleneck
CN109886388A (en) * 2019-01-09 2019-06-14 平安科技(深圳)有限公司 A kind of training sample data extending method and device based on variation self-encoding encoder
CN110134951A (en) * 2019-04-29 2019-08-16 淮阴工学院 A kind of method and system for analyzing the potential theme phrase of text data
CN110211575A (en) * 2019-06-13 2019-09-06 苏州思必驰信息科技有限公司 Voice for data enhancing adds method for de-noising and system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597769A (en) * 2020-12-15 2021-04-02 中山大学 Short text topic identification method based on Dirichlet variational self-encoder
CN112597769B (en) * 2020-12-15 2022-06-03 中山大学 Short text topic identification method based on Dirichlet variational self-encoder

Also Published As

Publication number Publication date
CN111724767B (en) 2023-06-02

Similar Documents

Publication Publication Date Title
US10937416B2 (en) Cross-domain multi-task learning for text classification
CN111755078B (en) Drug molecule attribute determination method, device and storage medium
US9965465B2 (en) Distributed server system for language understanding
EP3320490B1 (en) Transfer learning techniques for disparate label sets
WO2019022842A1 (en) Domain addition systems and methods for a language understanding system
EP3259713A1 (en) Pre-training and/or transfer learning for sequence taggers
WO2018039049A1 (en) Multi-turn cross-domain natural language understanding systems, building platforms, and methods
CN108920666A (en) Searching method, system, electronic equipment and storage medium based on semantic understanding
CN111428520A (en) Text translation method and device
US20220374776A1 (en) Method and system for federated learning, electronic device, and computer readable medium
US11423307B2 (en) Taxonomy construction via graph-based cross-domain knowledge transfer
CN107112009B (en) Method, system and computer-readable storage device for generating a confusion network
US20210248498A1 (en) Method and apparatus for training pre-trained knowledge model, and electronic device
CN113641830B (en) Model pre-training method, device, electronic equipment and storage medium
TWI741877B (en) Network model quantization method, device, and electronic apparatus
CN108415939A (en) Dialog process method, apparatus, equipment and computer readable storage medium based on artificial intelligence
EP4057283A2 (en) Method for detecting voice, method for training, apparatuses and smart speaker
US20220238098A1 (en) Voice recognition method and device
CN112528654A (en) Natural language processing method and device and electronic equipment
CN115062617A (en) Task processing method, device, equipment and medium based on prompt learning
US11361031B2 (en) Dynamic linguistic assessment and measurement
CN111724767A (en) Spoken language understanding method based on Dirichlet variational self-encoder and related equipment
WO2023174189A1 (en) Method and apparatus for classifying nodes of graph network model, and device and storage medium
US20220300717A1 (en) Method and apparatus for generating dialogue state
CN115687934A (en) Intention recognition method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant