CN116127051A - Dialogue generation method based on deep learning, electronic equipment and storage medium - Google Patents
Dialogue generation method based on deep learning, electronic equipment and storage medium
- Publication number
- CN116127051A (application CN202310428793.6A)
- Authority
- CN
- China
- Prior art keywords
- response
- interference
- generator
- skeleton
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a dialogue generation method based on deep learning, together with electronic equipment and a storage medium, wherein the method comprises the following steps: 1. constructing a dialogue generation data set based on retrieval and editing; 2. constructing and training a dialogue generation model consisting of a skeleton generator, a skeleton response generator, an interference response generator and a response fusion module; 3. using the trained model to generate a corresponding reply to any query input by the user. According to the invention, a template response is obtained through retrieval and a response skeleton is constructed from it to eliminate the interference of useless information in the template response; the response skeleton is then edited to generate the final response. The dialogue system can thereby generate responses that better fit the context and are semantically richer, alleviating the "safety response" problem.
Description
Technical Field
The invention belongs to the field of natural language processing, relates to the technical fields of dialogue systems and deep learning, and particularly relates to a dialogue generation method based on deep learning, electronic equipment and a storage medium.
Background
With the rapid development of artificial intelligence and human-computer interaction, dialogue systems (dialogue robots) are being applied in more and more service scenarios and have replaced human service to a certain extent. Current dialogue systems can be classified by usage scenario into open-domain dialogue systems and task-oriented dialogue systems. Task-oriented dialogue systems are designed to accomplish a specific task or goal, for example customer-service robots and intelligent assistants such as Siri; open-domain dialogue is generally chitchat, which does not aim at performing a specific task but at natural and fluent communication with humans.
Compared with task-oriented dialogue systems, the dialogue topics of an open-domain dialogue system are open, covering a wider range of topics and more complex sentence patterns. According to the construction method, existing open-domain dialogue systems can be classified into two types: generation-based and retrieval-based. The retrieval-based approach selects a response from an existing corpus, so its performance is severely limited by predefined indexing rules. With the development of deep learning, generation-based dialogue systems have become increasingly popular in recent years. Deep learning models based on the sequence-to-sequence (seq2seq) architecture have found wide application in single-turn dialogue generation. However, conventional seq2seq dialogue generation models often fail to generate responses that are fluent, content-rich and informative; in practice, such models tend to generate popular but tedious replies such as "I don't know" or "I think so too". This is known as the "safety response" problem.
Recent efforts have attempted to use information retrieval techniques to compensate for the lack of information in dialogue generation. In conventional retrieval-based dialogue systems, the data sets are constructed from human dialogue, so the replies retrieved from them are generally grammatically correct and semantically rich. For a given context, similar dialogues are retrieved from the corpus and treated as an additional information source for the generative dialogue system, introducing richer semantics and sentence patterns, so that the generated replies alleviate the "safety response" problem of the generative model to a certain extent. However, when the retrieved reply is similar to the original reply, the generative model tends to copy the reference reply without making the necessary modifications. In the opposite case, i.e. when the retrieved reply is unrelated to the original reply, much of the acquired information introduces interference irrelevant to the current dialogue context, resulting in unsatisfactory model performance.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a dialogue generation method based on deep learning, electronic equipment and a storage medium, which combine a retrieval dialogue model and a generative dialogue model so as to introduce external information into dialogue generation, alleviate the "safety response" problem of generative dialogue systems, and thereby obtain fluent, information-rich generated responses.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
The invention relates to a dialogue generation method based on deep learning, which is characterized by comprising the following steps:
Step 1, constructing a dialogue generation data set based on retrieval and editing;
Step 1.1, acquiring a query text set Q and the corresponding response text set R; let q denote any query in Q and let r denote the response corresponding to q;
Step 1.2, retrieving a template response r' similar to the response r and obtaining the template query q' corresponding to r', thereby composing a quadruple (r, q, r', q') in the dialogue data set D;
Step 2, constructing and training a dialogue generation model composed of a skeleton generator G, a skeleton response generator G_T, an interference response generator G_S and a response fusion module;
Step 2.1, using the skeleton generator G to separate the response skeleton t and the interference vocabulary s from the template response r', thereby obtaining the interference vocabularies of all template responses and forming their vector-representation set S; and randomly selecting from S the vector representation H_{s'} of an interference vocabulary s';
Step 2.2, based on the response r, the interference vocabulary s' and the query q, using the interference response generator G_S to obtain the interference response r_{s'} as a response generation result;
Step 2.3, based on the response r, the response skeleton t and the query q, using the skeleton response generator G_T, which has the same structure as the interference response generator G_S, to obtain the skeleton response r_t as a response generation result;
Step 2.4, the response fusion module utilizes (8) to respond to interferencer s’ And skeleton responser t Fusion is carried out to obtain fusion responser s,t :
r s,t = r t ⊙σ(r s’ ) (8)
In the formula (8), the amino acid sequence of the compound,σindicates a sigmoid function, ++indicates a dot product;
Step 2.5, constructing the loss function R_DIR of the skeleton generator G and the skeleton response generator G_T using formula (9):
R_DIR = E[L(r_{s,t}, r)] + λ Var[L(r_{s,t}, r)]    (9)
In formula (9), E denotes expectation, Var denotes variance, λ is a hyperparameter, and L(·) denotes the cross-entropy loss;
Step 2.6, constructing the loss function R_S of the interference response generator G_S using formula (10):
R_S = E[L(r_{s'}, r)]    (10)
Step 2.7, training the dialogue generation model by stochastic gradient descent and computing the loss functions R_S and R_DIR; when the loss functions converge or the maximum number of training iterations is reached, training stops and the dialogue generation model with optimal parameters is obtained, which is used to generate a corresponding reply to any query input by the user.
The dialogue generation method based on deep learning is also characterized in that the skeleton generator G in step 2.1 consists of a Transformer encoder and a cross-attention layer, and separates the response skeleton t and the interference vocabulary s according to the following process:
Step 2.1.1, the Transformer encoder processes the query q and the template response r' respectively to obtain the vector representation H^q = {h_1^q, …, h_i^q, …, h_m^q} of the query q and the vector representation H^{r'} = {h_1^{r'}, …, h_j^{r'}, …, h_n^{r'}} of the template response r', where h_i^q is the hidden vector of the i-th character in q, m is the number of characters in q, h_j^{r'} is the hidden vector of the j-th character in r', and n is the number of characters in r';
Step 2.1.2, the cross-attention layer calculates, using formula (1), the attention weight M_{i,j} of the j-th character in the template response r' to the i-th character in the query q:
M_{i,j} = exp(score(h_j^{r'}, h_i^q)) / Σ_{k=1}^{m} exp(score(h_j^{r'}, h_k^q))    (1)
In formula (1), h_k^q is the hidden vector of the k-th character in the query q, and score(·,·) is the attention score:
score(h_j^{r'}, h_k^q) = (h_j^{r'})^T W_att h_k^q    (2)
In formula (2), W_att is the learnable parameter matrix of the cross-attention layer and T denotes transposition;
Step 2.1.3, the cross-attention layer calculates, using formulas (3) and (4), the vector representation H^t = {h_1^t, …, h_j^t, …, h_n^t} of the response skeleton t and the vector representation H^s = {h_1^s, …, h_j^s, …, h_n^s} of the interference vocabulary s:
In formulas (3) and (4), h_j^t and h_j^s denote the hidden vectors of the j-th character in the response skeleton t and the interference vocabulary s, respectively.
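As a minimal illustration of the cross-attention scoring in formulas (1) and (2), the following sketch (an illustrative re-implementation, not the patented code; all names and shapes are assumptions) stacks the hidden vectors row-wise and normalizes the bilinear scores with a softmax over the query characters:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(H_rp, H_q, W_att):
    """M[j, i]: weight of the j-th template-response character toward
    the i-th query character, per formulas (1)-(2)."""
    scores = H_rp @ W_att @ H_q.T    # score(h_j^{r'}, h_k^q), shape (n, m)
    return softmax(scores, axis=-1)  # normalize over the query characters

rng = np.random.default_rng(0)
n, m, d = 4, 6, 8                    # template length, query length, hidden size
M = attention_weights(rng.normal(size=(n, d)),
                      rng.normal(size=(m, d)),
                      rng.normal(size=(d, d)))
```

Each row of M then sums to one, so it can be read as a distribution over query characters for one template character.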
The interference response generator G_S in step 2.2 consists of a Transformer decoder and a controller; the Transformer decoder is composed of an encoding layer, a position encoding layer, a self-attention layer, a cross-attention layer, two normalization layers, the controller and a response generator. The Transformer decoder obtains the interference response r_{s'} as a generation result according to the following process:
Step 2.2.1, the Transformer decoder processes the response r with the encoding layer, the position encoding layer, the self-attention layer and the first normalization layer to obtain the vector representation H^r of the response r;
Step 2.2.2, the Transformer decoder fuses the vector representation H^r of the response r with the query q through the cross-attention layer and the second normalization layer to obtain the fused response-query vector representation H_{r,q};
Step 2.2.3, the controller fuses the vector representation H_{s'} of the interference vocabulary s' with H_{r,q} using formula (5) to obtain the fused interference-fusion vector representation H^{s'}_{r,q}:
H^{s'}_{r,q} = β·LN(H_{s'}) + (1-β)·H_{r,q}    (5)
In formula (5), LN(·) denotes the normalization layer in the controller, and β denotes the fusion weight obtained by formula (6):
β = σ(W_s · [H_{s'}; H_{r,q}])    (6)
In formula (6), W_s denotes the learnable parameter of the controller and σ denotes the sigmoid function;
Step 2.2.4, the response generator of the Transformer decoder obtains the interference response r_{s'} as the generation result using formula (7):
r_{s'} = Linear(LN'(FFN(H^{s'}_{r,q}) + H^{s'}_{r,q}))    (7)
In formula (7), Linear(·) denotes the linear layer, LN'(·) the normalization layer, and FFN(·) the feed-forward layer in the response generator.
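A compact sketch of the controller fusion in formulas (5)-(6) might look as follows (shapes, the gate projection W_s, and the omission of LN's learnable gain/bias are all assumptions for brevity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_norm(x, eps=1e-5):
    # LN(.) of formula (5), without learnable gain/bias for brevity
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def controller_fuse(H_sp, H_rq, W_s):
    """Formulas (5)-(6): gate beta mixes the normalized interference
    vectors H_{s'} with the response-query representation H_{r,q}."""
    beta = sigmoid(np.concatenate([H_sp, H_rq], axis=-1) @ W_s)  # (n, 1)
    return beta * layer_norm(H_sp) + (1.0 - beta) * H_rq

n, d = 5, 8
H_sp, H_rq = np.ones((n, d)), np.zeros((n, d))
fused = controller_fuse(H_sp, H_rq, np.zeros((2 * d, 1)))
```

With a zero W_s the gate is 0.5 everywhere, so the output is an even blend of the two inputs; training moves W_s so that the gate admits more or less of the interference representation per position.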
The electronic device of the invention comprises a memory and a processor, wherein the memory is used to store a program that supports the processor in executing any of the above dialogue generation methods, and the processor is configured to execute the program stored in the memory.
The invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of any of the above dialogue generation methods.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention provides a new dialogue generation method: a template response is obtained through retrieval and response generation is then performed on its basis. Introducing the template response enables the dialogue system to utilize external information and take human historical dialogue as a reference, improving the fluency and information content of the generated result and alleviating the "safety response" problem to a certain extent.
2. The invention provides a two-stage dialogue generation model based on retrieval and editing: a response skeleton is constructed from the retrieved template response to eliminate the interference of useless information in it, and the response skeleton is then edited to generate the final reply. Compared with previous work, the invention maintains the flexibility of the generative model while inheriting the fluency and rich information content of the retrieval result. Meanwhile, by generating a response skeleton, the generative model can eliminate the interference of irrelevant information in the retrieved template response and generate a response that better fits the context.
3. The invention introduces a causal intervention method, so that the model can learn causal patterns that are invariant across environments and can extract a response skeleton from the retrieved template response to help response generation. This remedies the shortcoming of previous research that models could not properly utilize the information in template responses, and eliminates the influence of interference vocabulary in the template response, thereby making better use of external information and improving the relevance between the generated response and the query.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of an interference response generator according to the present invention;
FIG. 3 is a causal graph used by the response fusion module of the present invention.
Detailed Description
In this embodiment, as shown in FIG. 1, a dialogue generation method based on deep learning is performed according to the following steps:
Step 1, constructing a dialogue generation data set based on retrieval and editing;
Step 1.1, acquiring a query text set Q and the corresponding response text set R; let q denote any query in Q and let r denote the response corresponding to q. In this embodiment, the data sources are the large Chinese social platforms Douban and Weibo;
Step 1.2, retrieving a template response r' similar to the response r and obtaining the template query q' corresponding to r', thereby composing a quadruple (r, q, r', q') in the dialogue data set D.
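Step 1.2 leaves the retrieval mechanism open. As a hedged sketch, simple token-overlap (Jaccard) retrieval over a toy in-memory corpus can stand in for a real retrieval system; in practice a dedicated IR engine over Douban/Weibo-scale data would be used, and all names below are illustrative:

```python
def jaccard(a, b):
    """Token-overlap similarity between two responses."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def build_quadruples(pairs):
    """For each (q, r), retrieve the most similar other response r'
    and its query q', forming the quadruple (r, q, r', q')."""
    dataset = []
    for i, (q, r) in enumerate(pairs):
        j = max((k for k in range(len(pairs)) if k != i),
                key=lambda k: jaccard(r, pairs[k][1]))
        q_t, r_t = pairs[j]
        dataset.append((r, q, r_t, q_t))
    return dataset

corpus = [("how was the film", "i liked the film a lot"),
          ("did you enjoy it", "i liked the film"),
          ("what time is it", "around nine in the evening")]
D = build_quadruples(corpus)
```

For the first pair, the second response shares the most tokens, so its (q', r') is attached as the template; a production system would instead index millions of pairs and retrieve with dense or sparse search.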
Step 2, constructing a skeleton generatorGSkeleton response generatorG T Interference response generatorG S And a dialogue generating model formed by the response fusion module and training;
step 2.1, use skeleton GeneratorGResponse from templater’Middle separation response frameworktVocabulary of interferencesThereby obtaining the interference vocabulary of all template responses and forming a vector representation setSThe method comprises the steps of carrying out a first treatment on the surface of the And from a vector representation setSRandomly selecting interference wordss’Vector representation of (a)Hs ’ ;
The skeleton generator G consists of a Transformer encoder and a cross-attention layer, and separates the response skeleton t and the interference vocabulary s according to the following process:
Step 2.1.1, the Transformer encoder processes the query q and the template response r' respectively to obtain the vector representation H^q = {h_1^q, …, h_i^q, …, h_m^q} of the query q and the vector representation H^{r'} = {h_1^{r'}, …, h_j^{r'}, …, h_n^{r'}} of the template response r', where h_i^q is the hidden vector of the i-th character in q, m is the number of characters in q, h_j^{r'} is the hidden vector of the j-th character in r', and n is the number of characters in r';
Step 2.1.2, the cross-attention layer calculates, using formula (1), the attention weight M_{i,j} of the j-th character in the template response r' to the i-th character in the query q:
M_{i,j} = exp(score(h_j^{r'}, h_i^q)) / Σ_{k=1}^{m} exp(score(h_j^{r'}, h_k^q))    (1)
In formula (1), h_k^q is the hidden vector of the k-th character in the query q, and score(·,·) is the attention score:
score(h_j^{r'}, h_k^q) = (h_j^{r'})^T W_att h_k^q    (2)
In formula (2), W_att is the learnable parameter matrix of the cross-attention layer and T denotes transposition;
Step 2.1.3, the cross-attention layer calculates, using formulas (3) and (4), the vector representation H^t = {h_1^t, …, h_j^t, …, h_n^t} of the response skeleton t and the vector representation H^s = {h_1^s, …, h_j^s, …, h_n^s} of the interference vocabulary s:
In formulas (3) and (4), h_j^t and h_j^s denote the hidden vectors of the j-th character in the response skeleton t and the interference vocabulary s, respectively.
Step 2.2, as shown in FIG. 2, based on the response r, the interference vocabulary s' and the query q, using the interference response generator G_S to obtain the interference response r_{s'} as a response generation result;
The interference response generator G_S consists of a Transformer decoder and a controller; the Transformer decoder is composed of an encoding layer, a position encoding layer, a self-attention layer, a cross-attention layer, two normalization layers, the controller and a response generator. The Transformer decoder obtains the interference response r_{s'} as a generation result according to the following process:
Step 2.2.1, the Transformer decoder processes the response r with the encoding layer, the position encoding layer, the self-attention layer and the first normalization layer to obtain the vector representation H^r of the response r;
Step 2.2.2, the Transformer decoder fuses the vector representation H^r of the response r with the query q through the cross-attention layer and the second normalization layer to obtain the fused response-query vector representation H_{r,q};
Step 2.2.3, the controller fuses the vector representation H_{s'} of the interference vocabulary s' with H_{r,q} using formula (5) to obtain the fused interference-fusion vector representation H^{s'}_{r,q}:
H^{s'}_{r,q} = β·LN(H_{s'}) + (1-β)·H_{r,q}    (5)
In formula (5), LN(·) denotes the normalization layer in the controller, and β denotes the fusion weight obtained by formula (6):
β = σ(W_s · [H_{s'}; H_{r,q}])    (6)
In formula (6), W_s denotes the learnable parameter of the controller and σ denotes the sigmoid function;
Step 2.2.4, the response generator of the Transformer decoder obtains the interference response r_{s'} as the generation result using formula (7):
r_{s'} = Linear(LN'(FFN(H^{s'}_{r,q}) + H^{s'}_{r,q}))    (7)
In formula (7), Linear(·) denotes the linear layer, LN'(·) the normalization layer, and FFN(·) the feed-forward layer in the response generator.
Step 2.3, based on the response r, the response skeleton t and the query q, using the skeleton response generator G_T, which has the same structure as the interference response generator G_S, to obtain the skeleton response r_t as a response generation result.
Step 2.4, the response fusion module performs causal intervention on the skeleton response r_t based on the causal graph shown in FIG. 3; specifically, the response fusion module fuses the interference response r_{s'} and the skeleton response r_t using formula (8) to obtain the fused response r_{s,t}:
r_{s,t} = r_t ⊙ σ(r_{s'})    (8)
In formula (8), σ denotes the sigmoid function and ⊙ denotes element-wise product. The causal graph is a directed acyclic graph composed of nodes representing variables and edges representing the causal relationships between them; causal graphs are generally used to describe the interaction mechanism among a set of variables and reveal the causal relationships behind the data. The dialogue generation process of the invention can be represented by the causal graph shown in FIG. 3; in step 2.4, the interference response r_{s'} is given artificially through causal intervention, while the rest still follows the original data generation process shown in FIG. 3.
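Applied to the two generators' output score vectors (an assumption; the text does not pin down whether formula (8) acts on logits or on distributions), the fusion of formula (8) is a one-liner:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(r_t, r_sp):
    """Formula (8): r_{s,t} = r_t ⊙ σ(r_{s'}).
    σ(r_{s'}) acts as a soft gate in (0, 1) scaling the skeleton response."""
    return r_t * sigmoid(r_sp)

r_t = np.array([2.0, -4.0, 1.0])   # skeleton-response scores (illustrative)
r_sp = np.zeros(3)                  # neutral interference: gate = 0.5 everywhere
fused = fuse(r_t, r_sp)
```

Because the gate never reaches 0 or 1, the interference response modulates rather than overrides the skeleton response, which is what lets the subsequent variance penalty measure its influence.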
Step 2.5, constructing the loss function R_DIR of the skeleton generator G and the skeleton response generator G_T using formula (9):
R_DIR = E[L(r_{s,t}, r)] + λ Var[L(r_{s,t}, r)]    (9)
In formula (9), E denotes expectation, Var denotes variance, λ is a hyperparameter, and L(·) denotes the cross-entropy loss. The loss function R_DIR means that while reducing the error between the generated response r_{s,t} and the response r, the model also attempts to reduce the influence of external interference information on the generated result.
Step 2.6, constructing the loss function R_S of the interference response generator G_S using formula (10):
R_S = E[L(r_{s'}, r)]    (10)
The loss function R_S obtained by formula (10) updates only the parameters of the interference response generator G_S. Separating the training of this module from that of the other modules in the method prevents it from interfering with representation learning; at the same time, this parameter-update scheme encourages the interference response generator G_S to learn only non-causal features based on the given interference vocabulary.
Step 2.7, training the dialogue generation model by stochastic gradient descent and computing the loss functions R_S and R_DIR; when the loss functions converge or the maximum number of training iterations is reached, training stops and the dialogue generation model with optimal parameters is obtained, which is used to generate a corresponding reply to any query input by the user.
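The variance-regularized objective of formula (9) can be sketched as follows, given per-sample cross-entropy losses computed under different interference interventions (a simplified stand-in for the full stochastic-gradient training loop; λ is the hyperparameter of step 2.5):

```python
import numpy as np

def dir_loss(ce_losses, lam=1.0):
    """Formula (9): R_DIR = E[L] + λ·Var[L].
    Penalizing the variance of the loss across interventions pushes the
    model toward predictions that are invariant to the interference input."""
    L = np.asarray(ce_losses, dtype=float)
    return L.mean() + lam * L.var()

uniform = dir_loss([1.0, 1.0, 1.0])     # no spread: only the mean term remains
spread = dir_loss([0.0, 2.0], lam=1.0)  # same mean, but variance adds a penalty
```

Two intervention sets with the same average loss are thus ranked differently: the one whose loss varies with the chosen interference vocabulary is penalized more.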
The test results of the invention are further described below.
To verify the effectiveness of the proposed method, a comparative experiment was performed; the results are shown in Table 1. Here, Retrieval is a retrieval system, Seq2Seq is a basic sequence-to-sequence model, BART is a pre-trained language model, BART-cat is based on BART and additionally receives the template response as input, Ske Re is a dialogue generation model based on a retrieval system, TSLF is a Transformer-based dialogue generation model that introduces external knowledge, and DG is the method proposed by the invention. In Table 1, BLEU-1 is an evaluation metric based on word overlap, used to compare the character overlap between the generated text and the reference text; dist-1 and dist-2 are two metrics measuring lexical diversity.
TABLE 1
As can be seen from the experimental results in Table 1, the method of the invention outperforms the other models on all metrics, and can better acquire information from the retrieved template responses and generate more diverse texts.
Claims (5)
1. A dialogue generation method based on deep learning, characterized by comprising the following steps:
step 1, constructing a dialogue generation data set based on retrieval and editing;
step 1.1, acquiring a query text set Q and the corresponding response text set R; let q denote any query in Q and let r denote the response corresponding to q;
step 1.2, retrieving a template response r' similar to the response r and obtaining the template query q' corresponding to r', thereby composing a quadruple (r, q, r', q') in the dialogue data set D;
step 2, constructing and training a dialogue generation model composed of a skeleton generator G, a skeleton response generator G_T, an interference response generator G_S and a response fusion module;
step 2.1, using the skeleton generator G to separate the response skeleton t and the interference vocabulary s from the template response r', thereby obtaining the interference vocabularies of all template responses and forming their vector-representation set S; and randomly selecting from S the vector representation H_{s'} of an interference vocabulary s';
step 2.2, based on the response r, the interference vocabulary s' and the query q, using the interference response generator G_S to obtain the interference response r_{s'} as a response generation result;
step 2.3, based on the response r, the response skeleton t and the query q, using the skeleton response generator G_T, which has the same structure as the interference response generator G_S, to obtain the skeleton response r_t as a response generation result;
step 2.4, the response fusion module fuses the interference response r_{s'} and the skeleton response r_t using formula (8) to obtain the fused response r_{s,t}:
r_{s,t} = r_t ⊙ σ(r_{s'})    (8)
In formula (8), σ denotes the sigmoid function and ⊙ denotes element-wise product;
step 2.5, constructing the loss function R_DIR of the skeleton generator G and the skeleton response generator G_T using formula (9):
R_DIR = E[L(r_{s,t}, r)] + λ Var[L(r_{s,t}, r)]    (9)
In formula (9), E denotes expectation, Var denotes variance, λ is a hyperparameter, and L(·) denotes the cross-entropy loss;
step 2.6, constructing the loss function R_S of the interference response generator G_S using formula (10):
R_S = E[L(r_{s'}, r)]    (10)
step 2.7, training the dialogue generation model by stochastic gradient descent and computing the loss functions R_S and R_DIR to update the network parameters; when the loss functions converge or the maximum number of training iterations is reached, training stops and the dialogue generation model with optimal parameters is obtained, which is used to generate a corresponding reply to any query input by the user.
2. The deep-learning-based dialogue generation method of claim 1, wherein the skeleton generator G in step 2.1 consists of a Transformer encoder and a cross-attention layer, and separates the response skeleton t and the interference vocabulary s according to the following process:
step 2.1.1, the Transformer encoder processes the query q and the template response r' respectively to obtain the vector representation H^q = {h_1^q, …, h_i^q, …, h_m^q} of the query q and the vector representation H^{r'} = {h_1^{r'}, …, h_j^{r'}, …, h_n^{r'}} of the template response r', where h_i^q is the hidden vector of the i-th character in q, m is the number of characters in q, h_j^{r'} is the hidden vector of the j-th character in r', and n is the number of characters in r';
step 2.1.2, the cross-attention layer calculates, using formula (1), the attention weight M_{i,j} of the j-th character in the template response r' to the i-th character in the query q:
M_{i,j} = exp(score(h_j^{r'}, h_i^q)) / Σ_{k=1}^{m} exp(score(h_j^{r'}, h_k^q))    (1)
In formula (1), h_k^q is the hidden vector of the k-th character in the query q, and score(·,·) is the attention score:
score(h_j^{r'}, h_k^q) = (h_j^{r'})^T W_att h_k^q    (2)
In formula (2), W_att is the learnable parameter matrix of the cross-attention layer and T denotes transposition;
step 2.1.3, the cross-attention layer calculates, using formulas (3) and (4), the vector representation H^t = {h_1^t, …, h_j^t, …, h_n^t} of the response skeleton t and the vector representation H^s = {h_1^s, …, h_j^s, …, h_n^s} of the interference vocabulary s:
In formulas (3) and (4), h_j^t and h_j^s denote the hidden vectors of the j-th character in the response skeleton t and the interference vocabulary s, respectively.
3. The deep-learning-based dialogue generation method of claim 1, wherein the interference response generator G_S in step 2.2 consists of a Transformer decoder and a controller; the Transformer decoder is composed of an encoding layer, a position encoding layer, a self-attention layer, a cross-attention layer, two normalization layers, the controller and a response generator; the Transformer decoder obtains the interference response r_{s'} as a generation result according to the following process:
step 2.2.1, the Transformer decoder processes the response r with the encoding layer, the position encoding layer, the self-attention layer and the first normalization layer to obtain the vector representation H^r of the response r;
step 2.2.2, the Transformer decoder fuses the vector representation H^r of the response r with the query q through the cross-attention layer and the second normalization layer to obtain the fused response-query vector representation H_{r,q};
step 2.2.3, the controller fuses the vector representation H_{s'} of the interference vocabulary s' with H_{r,q} using formula (5) to obtain the fused interference-fusion vector representation H^{s'}_{r,q}:
H^{s'}_{r,q} = β·LN(H_{s'}) + (1-β)·H_{r,q}    (5)
In formula (5), LN(·) denotes the normalization layer in the controller, and β denotes the fusion weight obtained by formula (6):
β = σ(W_s · [H_{s'}; H_{r,q}])    (6)
In formula (6), W_s denotes the learnable parameter of the controller and σ denotes the sigmoid function;
Step 2.2.4, the response generator of the Transformer decoder uses equation (7) to generate the interference response r_s':
r s’ = Linear( LN’ ( FFN ( H s’ r,q ) + H s’ r,q )) (7)
In equation (7), Linear(·) denotes the linear layer in the response generator, LN'(·) denotes the normalization layer in the response generator, and FFN(·) denotes the forward propagation layer in the response generator.
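Equation (7) is the standard residual feed-forward head of a Transformer block followed by an output projection. The sketch below assumes a ReLU feed-forward layer and a vocabulary-sized linear output, which the claim does not specify.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # LN'(.) of equation (7)
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def ffn(x, W1, b1, W2, b2):
    # FFN(.): two-layer feed-forward network; ReLU is an assumed choice
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def response_logits(H, W1, b1, W2, b2, W_out):
    """Equation (7): Linear(LN'(FFN(H) + H)).

    H     : (n, d) fused representation H^{s'}_{r,q}
    W_out : (d, V) linear output projection to a vocabulary of size V
    Returns per-position vocabulary logits, from which r_s' is decoded.
    """
    return layer_norm(ffn(H, W1, b1, W2, b2) + H) @ W_out
```

Decoding r_s' from the logits (e.g. greedy argmax per position) is left outside the sketch, as the claim does not state the decoding strategy.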
4. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program that enables the processor to perform the dialogue generation method of any one of claims 1-3, and the processor is configured to execute the program stored in the memory.
5. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, performs the steps of the dialogue generation method of any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310428793.6A CN116127051B (en) | 2023-04-20 | 2023-04-20 | Dialogue generation method based on deep learning, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116127051A true CN116127051A (en) | 2023-05-16 |
CN116127051B CN116127051B (en) | 2023-07-11 |
Family
ID=86303166
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310428793.6A Active CN116127051B (en) | 2023-04-20 | 2023-04-20 | Dialogue generation method based on deep learning, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116127051B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106844335A (en) * | 2016-12-21 | 2017-06-13 | 海航生态科技集团有限公司 | Natural language processing method and device |
CN107506823A (en) * | 2017-08-22 | 2017-12-22 | 南京大学 | A kind of construction method for being used to talk with the hybrid production style of generation |
CN109829038A (en) * | 2018-12-11 | 2019-05-31 | 平安科技(深圳)有限公司 | Question and answer feedback method, device, equipment and storage medium based on deep learning |
US20200097814A1 (en) * | 2018-09-26 | 2020-03-26 | MedWhat.com Inc. | Method and system for enabling interactive dialogue session between user and virtual medical assistant |
US20200226475A1 (en) * | 2019-01-14 | 2020-07-16 | Cambia Health Solutions, Inc. | Systems and methods for continual updating of response generation by an artificial intelligence chatbot |
CN111858931A (en) * | 2020-07-08 | 2020-10-30 | 华中师范大学 | Text generation method based on deep learning |
WO2021077974A1 (en) * | 2019-10-24 | 2021-04-29 | 西北工业大学 | Personalized dialogue content generating method |
US20210141798A1 (en) * | 2019-11-08 | 2021-05-13 | PolyAI Limited | Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system |
Non-Patent Citations (2)
Title |
---|
CAI D et al.: "Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory", Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies * |
LU Xingwu: "Research on Open-Domain Multi-Turn Dialogue Systems Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bakhtin et al. | Real or fake? learning to discriminate machine from human generated text | |
CN109992657B (en) | Dialogue type problem generation method based on enhanced dynamic reasoning | |
CN111651557B (en) | Automatic text generation method and device and computer readable storage medium | |
CN104598611B (en) | The method and system being ranked up to search entry | |
CN108628935A (en) | A kind of answering method based on end-to-end memory network | |
CN107679225A (en) | A kind of reply generation method based on keyword | |
KR102654480B1 (en) | Knowledge based dialogue system and method for language learning | |
CN110334196A (en) | Neural network Chinese charater problem based on stroke and from attention mechanism generates system | |
CN116186216A (en) | Question generation method and system based on knowledge enhancement and double-graph interaction | |
CN110516053A (en) | Dialog process method, equipment and computer storage medium | |
CN114387537A (en) | Video question-answering method based on description text | |
CN112463935B (en) | Open domain dialogue generation method and system with generalized knowledge selection | |
CN111046157B (en) | Universal English man-machine conversation generation method and system based on balanced distribution | |
CN116127051B (en) | Dialogue generation method based on deep learning, electronic equipment and storage medium | |
Lin et al. | A hierarchical structured multi-head attention network for multi-turn response generation | |
CN111414466A (en) | Multi-round dialogue modeling method based on depth model fusion | |
CN116561251A (en) | Natural language processing method | |
CN115796187A (en) | Open domain dialogue method based on dialogue structure diagram constraint | |
CN113051897B (en) | GPT2 text automatic generation method based on Performer structure | |
CN113065324A (en) | Text generation method and device based on structured triples and anchor templates | |
CN113626566B (en) | Knowledge dialogue cross-domain learning method based on synthetic data | |
Szymanski et al. | Semantic memory knowledge acquisition through active dialogues | |
CN116244419B (en) | Knowledge enhancement dialogue generation method and system based on character attribute | |
CN117035064B (en) | Combined training method for retrieving enhanced language model and storage medium | |
Ma et al. | Cascaded LSTMs based deep reinforcement learning for goal-driven dialogue |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||