CN110866103A - Sentence diversity generation method and system in dialog system - Google Patents

Sentence diversity generation method and system in dialog system

Info

Publication number
CN110866103A
CN110866103A
Authority
CN
China
Prior art keywords
sentence
answer sentence
feature vector
answer
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911087246.6A
Other languages
Chinese (zh)
Other versions
CN110866103B (en)
Inventor
陈炳成 (Chen Bingcheng)
梁小丹 (Liang Xiaodan)
林倞 (Lin Jing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201911087246.6A
Publication of CN110866103A
Application granted
Publication of CN110866103B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a sentence diversity generation method and system in a dialogue system, wherein the method comprises the following steps: step S1, extracting the dependency tree of the answer sentence and converting the dependency tree into an undirected graph; step S2, inputting the answer sentence and the undirected graph obtained in step S1 into a graph structure converter to obtain a feature vector of the answer sentence; step S3, extracting the feature vector of the dialogue history of the answer sentence using a sequence structure converter; and step S4, inputting the feature vector of the answer sentence obtained in step S2 and the feature vector of the dialogue history obtained in step S3 into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.

Description

Sentence diversity generation method and system in dialog system
Technical Field
The invention relates to the technical field of dialogue systems, and in particular to a sentence diversity generation method and system that fuses the grammatical structure of sentences in a dialogue system.
Background
A dialogue system is a research direction of natural language processing; its goal is to generate the next sentence of a dialogue according to the dialogue history between a user and a dialogue robot. In the field of dialogue systems, a large number of related technologies have been developed, mainly including retrieval-based dialogue systems, generation-based dialogue systems, and dialogue systems that mix retrieval and generation.
In reality, the same dialogue history can have multiple different answers; this is the sentence diversity generation problem in dialogue systems. However, in prior-art dialogue systems, sentence generation does not use the grammatical structure information of the answer sentence, so the generated sentences are weakly relevant and a good dialogue effect cannot be achieved.
Disclosure of Invention
In order to overcome the above-mentioned deficiencies of the prior art, the present invention provides a method and a system for generating sentence diversity in a dialog system, so as to increase the diversity of sentence generation in the dialog system.
To achieve the above and other objects, the present invention provides a sentence diversity generating method in a dialog system, comprising the steps of:
step S1, extracting the dependency tree of the answer sentence, and converting the dependency tree into an undirected graph;
step S2, inputting the answer sentence and the undirected graph obtained in step S1 into a graph structure converter to obtain a feature vector of the answer sentence;
step S3, extracting the feature vector of the dialogue history of the answer sentence using a sequence structure converter;
step S4, inputting the feature vector of the answer sentence obtained in step S2 and the feature vector of the dialogue history obtained in step S3 into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
Preferably, the step S1 further includes:
step S100, extracting a dependency tree of the answer sentence by utilizing an open-source natural language processing tool;
step S101, representing the dependency tree by using a directed graph, wherein nodes in the dependency tree are words of sentences, and directed edges in the dependency tree represent syntactic relations among the words;
and step S102, changing the directed edge in the directed graph into an undirected edge to obtain an undirected graph of the answer sentence.
Preferably, in step S1, the undirected graph is represented by an adjacency matrix.
Preferably, if the answer sentence has n words, the adjacency matrix of the answer sentence is a matrix M of dimension n × n, and the value M_ij in the i-th row and j-th column of the adjacency matrix M is determined by the following condition:
M_ij = 1 if there is an edge between the i-th word and the j-th word in the undirected graph; otherwise M_ij = 0.
preferably, step S2 further includes
Step S200, performing a Graph Attention operation on the features V of the answer sentence and the adjacency matrix M of the undirected graph;
step S201, adding the result of the Graph Attention operation to the features V and performing layer normalization;
step S202, inputting the result of step S201 into a single-layer feedforward neural network and then performing layer normalization to obtain the feature vector of the answer sentence.
Preferably, in step S3, m sentences of the dialogue history are obtained, the m sentences being arranged in order; the m sentences are concatenated end to end in order into one sentence C, and the sentence C is input into the sequence structure converter to obtain the feature vector of the dialogue history.
Preferably, the conditional variational autoencoder is composed of an encoder and a decoder; the feature vector E' of the dialogue history obtained in step S3 is input into the encoder of the conditional variational autoencoder to obtain a normal distribution z', a plurality of samples are drawn from the normal distribution z', and the samples are then input into the decoder respectively to obtain a plurality of different answer sentences.
To achieve the above object, the present invention further provides a sentence diversity generating system in a dialog system, comprising:
the answer sentence processing unit is used for extracting a dependency tree of the answer sentence and converting the dependency tree into an undirected graph;
an answer sentence feature vector extraction unit operable to input the answer sentence and the undirected graph of the answer sentence obtained by the answer sentence processing unit into a graph structure converter to obtain a feature vector of the answer sentence;
a dialogue history feature extraction unit configured to extract a feature vector of a dialogue history of the answer sentence using a sequence structure converter;
a diversity sentence generating unit, configured to input the feature vector obtained by the answer sentence feature vector extraction unit and the feature vector obtained by the dialogue history feature extraction unit into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
Preferably, in the answer sentence processing unit, the dependency tree is converted into an undirected graph by changing the directed edges into undirected edges, and the undirected graph is represented by an adjacency matrix.
Preferably, in the dialogue history feature extraction unit, m sentences of the dialogue history are acquired, the m sentences being arranged in order; the m sentences are concatenated end to end in order into one sentence C, and the sentence C is input into the sequence structure converter to obtain the feature vector of the dialogue history.
Compared with the prior art, the sentence diversity generation method and system in a dialogue system of the present invention extract the dependency tree of the answer sentence and convert it into an undirected graph, input the answer sentence and the undirected graph into a graph structure converter to obtain the feature vector of the answer sentence, extract the feature vector of the dialogue history of the answer sentence using a sequence structure converter, and finally input the obtained feature vector of the answer sentence and the obtained feature vector of the dialogue history into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history, thereby improving the diversity of sentence generation in the dialogue system.
Drawings
FIG. 1 is a flowchart illustrating the steps of a sentence diversity generation method in a dialog system according to the present invention;
FIG. 2 is a diagram of a dependency tree in an embodiment of the present invention;
FIG. 3 is a diagram illustrating a dependency tree represented by a directed graph according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a dependency tree being transformed into an undirected graph according to an embodiment of the present invention;
FIG. 5 is a diagram of an adjacency matrix for an answer sentence in accordance with an embodiment of the present invention;
FIG. 6 is a block diagram of a graph structure converter (Graph Transformer) according to an embodiment of the present invention;
FIG. 7 is a block diagram of a conditional variational auto-encoder in accordance with an embodiment of the present invention;
FIG. 8 is a system architecture diagram of a sentence diversity generation system in a dialog system in accordance with the present invention.
Detailed Description
The embodiments of the present invention are described below by way of specific examples in conjunction with the accompanying drawings, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from this disclosure. The invention is capable of other and different embodiments, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.
FIG. 1 is a flowchart illustrating the steps of a sentence diversity generation method in a dialog system according to the present invention. As shown in fig. 1, the method for generating sentence diversity in a dialog system of the present invention includes the following steps:
in step S1, the dependency tree of the answer sentence is extracted, and the dependency tree is converted into an undirected graph, which is represented by the adjacency matrix M.
Specifically, the answer sentence is an answer to a question in the dialogue system, and its dependency tree can be extracted using an open-source natural language processing tool such as Stanford CoreNLP or AllenNLP. The dependency tree is a directed graph: its nodes are the words of the sentence, and its directed edges represent syntactic relations between the words. If two words have a syntactic relation, there is a directed edge between the nodes representing these two words in the directed graph.
In the present invention, the dependency tree is converted into an undirected graph by changing the directed edges of the dependency tree into undirected edges.
Specifically, assuming that the answer sentence has n words, the adjacency matrix of the answer sentence is a matrix M of dimension n × n, and the value M_ij in the i-th row and j-th column of the adjacency matrix M is determined by the following condition:
M_ij = 1 if there is an edge between the i-th word and the j-th word in the undirected graph; otherwise M_ij = 0.
An example of extracting the dependency tree of a sentence and computing its adjacency matrix is as follows. Consider the example sentence "The syntactic structure is fused into sentence feature extraction."
First, an open-source natural language processing tool is used to extract the dependency tree of the sentence, as shown in fig. 2.
The dependency tree is then represented by a directed graph, where the nodes are the words of the sentence and the directed edges represent the syntactic relations between the words, as shown in fig. 3.
The directed edges in the directed graph are changed into undirected edges to obtain the undirected graph of the sentence, as shown in fig. 4.
Finally, the adjacency matrix M of the example sentence "The syntactic structure is fused into sentence feature extraction." is obtained, as shown in fig. 5.
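A minimal Python sketch of step S1 is given below, assuming spaCy in place of the open-source tools named above (Stanford CoreNLP, AllenNLP); the self-loop on the diagonal is an added assumption rather than something the description specifies.

```python
# Sketch: extract a dependency tree and build the undirected adjacency matrix M of step S1.
# spaCy stands in for the open-source NLP tool; the diagonal self-loop is an assumption.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def sentence_adjacency(sentence: str) -> np.ndarray:
    """Return the n x n adjacency matrix M of the sentence's dependency graph,
    with directed dependency edges converted to undirected edges."""
    doc = nlp(sentence)
    n = len(doc)
    M = np.zeros((n, n), dtype=np.float32)
    for token in doc:
        if token.head.i != token.i:          # the root points to itself; skip that case
            M[token.i, token.head.i] = 1.0   # dependent -> head edge
            M[token.head.i, token.i] = 1.0   # make the edge undirected
        M[token.i, token.i] = 1.0            # self-loop (assumption)
    return M

M = sentence_adjacency("The syntactic structure is fused into sentence feature extraction.")
print(M.shape)  # (n, n) for the n tokens of the example sentence
```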
Step S2, inputting the answer sentence and the adjacency matrix M of the undirected graph from step S1 into a graph structure converter (Graph Transformer) to obtain the feature vector of the answer sentence;
fig. 6 is a structural diagram of the graph structure converter (Graph Transformer) according to an embodiment of the present invention; the feature extraction process of the Graph Transformer of the present invention is described below with reference to fig. 6:
specifically, assume that the answer sentence consists of n words and that the i-th word is represented by a k-dimensional feature vector V_i; the features of the answer sentence are then denoted V = (V_1, ..., V_n). The features V of the answer sentence and the adjacency matrix M of the undirected graph are input into the graph structure converter (Graph Transformer).
The feature extraction process of the graph structure converter is as follows:
1. A Graph Attention operation is performed on the features V of the answer sentence and the adjacency matrix M of the undirected graph. Specifically, for the feature vector V_i of the i-th word, Graph Attention computes attention weights over the words adjacent to word i and aggregates their features:
α_ij = M_ij · exp(e_ij) / Σ_l M_il · exp(e_il)
GraphAttention(V_i) = Σ_j α_ij · W · V_j
where e_ij is a learned attention score between V_i and V_j, W is a learnable weight matrix, and M_ij is the value in the i-th row and j-th column of the adjacency matrix M from step S1, so that each word attends only to the words connected to it in the undirected graph.
2. GraphAttention(V_i) is added to V_i and a layer normalization operation is performed:
V_i' = LayerNorm(GraphAttention(V_i) + V_i)
where LayerNorm is the layer normalization operation; since layer normalization is prior art, it is not described further here.
3. V_i' is input into a single-layer feedforward neural network, and layer normalization is performed again:
V_i'' = LayerNorm(FFN(V_i'))
where FFN is a single-layer feedforward neural network.
Thus, for the feature vector V_i of the i-th word, the Graph Transformer produces the transformed feature vector V_i'', and the transformed features of the answer sentence are V'' = (V_1'', ..., V_n''). Finally, the transformed features V'' of the answer sentence are aggregated over the n word positions to obtain the final answer sentence feature vector V'.
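A minimal PyTorch sketch of one layer of the graph structure converter described above is given below. The exact Graph Attention formulas appear only as images in the original, so the masked scaled dot-product attention and the layer sizes used here are assumptions.

```python
# Sketch of one Graph Transformer layer (step S2): masked attention over the dependency
# graph (S200), add & layer-norm (S201), then a single-layer FFN with layer-norm (S202).
import torch
import torch.nn as nn

class GraphTransformerLayer(nn.Module):
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, V: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
        # V: (n, dim) word features; M: (n, n) adjacency matrix of the undirected graph
        scores = self.q(V) @ self.k(V).t() / V.size(-1) ** 0.5
        scores = scores.masked_fill(M == 0, float("-inf"))  # attend only along graph edges
        alpha = torch.softmax(scores, dim=-1)                # Graph Attention weights
        attended = alpha @ self.v(V)                         # step S200
        h = self.norm1(attended + V)                         # step S201: add & layer-norm
        return self.norm2(self.ffn(h))                       # step S202: FFN + layer-norm

layer = GraphTransformerLayer(dim=64)
V = torch.randn(9, 64)    # 9 words with 64-dimensional features
M = torch.eye(9)          # placeholder adjacency matrix with self-loops
print(layer(V, M).shape)  # torch.Size([9, 64])
```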
step S3, extracting features of the dialogue history of the answer sentence using a sequence structure converter (Transformer);
specifically, in a dialogue system, a dialogue sample is generally composed of a dialogue history and an answer sentence, and an example is as follows:
Dialogue history (m sentences):
1. How is the weather today?
2. The weather is good today; it is sunny.
……
m. Do you think there will be a rainstorm next week?
The answer sentence is then the next sentence following the dialogue history, for example:
I think there will be a rainstorm next week.
It is assumed that, in the dialogue system, the dialogue history of the answer sentence consists of m sentences arranged in order; the m sentences are concatenated end to end in order into one sentence C, and the sentence C is input into a sequence structure converter (Transformer) to obtain the feature vector of the dialogue history.
Specifically, assume that sentence C consists of r words and that the i-th word in sentence C is represented by a k-dimensional feature vector E_i; the features of sentence C are then denoted E = (E_1, ..., E_r). After E is input into the Transformer, the transformed features E'' = (E_1'', ..., E_r'') of sentence C are obtained. The transformed features E'' are then aggregated over the r word positions to obtain the final dialogue history feature vector E'.
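A minimal PyTorch sketch of step S3 is given below, assuming a whitespace tokenizer, random stand-in word embeddings, and mean pooling over word positions as the aggregation into E' (the aggregation formula is not reproduced in the text).

```python
# Sketch of step S3: concatenate the m history sentences into one sentence C,
# encode C with a sequence Transformer, and pool the word features into E'.
import torch
import torch.nn as nn

dim = 64
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)

history = ["How is the weather today?",
           "The weather is good today; it is sunny.",
           "Do you think there will be a rainstorm next week?"]
C = " ".join(history)                  # splice the m sentences end to end into sentence C
tokens = C.split()                     # toy whitespace tokenizer (placeholder)
E = torch.randn(1, len(tokens), dim)   # (1, r, k) word embeddings, random stand-in lookup
E_transformed = encoder(E)             # transformed features of sentence C
E_prime = E_transformed.mean(dim=1)    # aggregate over the r positions -> E' (assumption)
print(E_prime.shape)                   # torch.Size([1, 64])
```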
in step S4, the feature vector V' of the answer sentence from step S2 and the feature vector E' of the dialogue history from step S3 are input into the conditional variational autoencoder, and a new answer sentence is generated.
The structure of the conditional variational autoencoder is shown in fig. 7. The conditional variational autoencoder is composed of an encoder and a decoder: the feature E' of the dialogue history is input into the encoder of the conditional variational autoencoder to obtain a normal distribution z'; a plurality of samples then need only be drawn from the normal distribution z' and input into the decoder respectively to obtain a plurality of different answer sentences. In other words, after the feature vector E' of the dialogue history is input into the encoder, the normal distribution z' is obtained, the distribution is sampled a plurality of times, and each sample is input into the decoder, which generates a different answer sentence, thereby realizing diversity generation of answer sentences.
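A minimal PyTorch sketch of the generation-time behaviour of the conditional variational autoencoder in step S4 is given below: the dialogue history vector E' is mapped to a normal distribution, several latent samples are drawn, and each sample is decoded separately. Mapping the decoder output to an actual word sequence is omitted, and all layer sizes are illustrative assumptions.

```python
# Sketch of step S4: E' -> normal distribution z'; each sample of z' is decoded into a
# different latent answer representation, realising diversity in the generated answers.
import torch
import torch.nn as nn

class ConditionalVAEHead(nn.Module):
    def __init__(self, dim: int = 64, latent: int = 32):
        super().__init__()
        self.to_mu = nn.Linear(dim, latent)       # E' -> mean of z'
        self.to_logvar = nn.Linear(dim, latent)   # E' -> log-variance of z'
        self.decoder = nn.Sequential(             # (z, E') -> decoder input state
            nn.Linear(latent + dim, dim), nn.Tanh())

    def generate(self, e_prime: torch.Tensor, num_samples: int = 3) -> torch.Tensor:
        mu, logvar = self.to_mu(e_prime), self.to_logvar(e_prime)
        std = (0.5 * logvar).exp()
        outputs = []
        for _ in range(num_samples):              # one sample -> one distinct answer
            z = mu + std * torch.randn_like(std)  # reparameterised sample from N(mu, std^2)
            outputs.append(self.decoder(torch.cat([z, e_prime], dim=-1)))
        return torch.stack(outputs)               # (num_samples, batch, dim)

head = ConditionalVAEHead()
e_prime = torch.randn(1, 64)          # dialogue history feature E'
print(head.generate(e_prime).shape)   # torch.Size([3, 1, 64])
```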
FIG. 8 is a system architecture diagram of a sentence diversity generation system in a dialog system in accordance with the present invention. As shown in fig. 8, the present invention provides a sentence diversity generation system in a dialog system, including:
an answer sentence processing unit 201, configured to extract the dependency tree of the answer sentence and convert the dependency tree into an undirected graph, the undirected graph being represented by an adjacency matrix;
specifically, the answer sentence is an answer to a question in the dialogue system, and its dependency tree can be extracted using an open-source natural language processing tool such as Stanford CoreNLP or AllenNLP. The dependency tree is a directed graph: its nodes are the words of the sentence, and its directed edges represent syntactic relations between the words. If two words have a syntactic relation, there is a directed edge between the nodes representing these two words in the directed graph.
In the present invention, the answer sentence processing unit 201 converts the dependency tree into an undirected graph by changing the directed edges of the dependency tree into undirected edges.
Specifically, assuming that the answer sentence has n words, the adjacency matrix of the answer sentence is a matrix M of dimension n × n, and the value M_ij in the i-th row and j-th column of the adjacency matrix M is determined by the following condition:
M_ij = 1 if there is an edge between the i-th word and the j-th word in the undirected graph; otherwise M_ij = 0.
an answer sentence feature vector extraction unit 202, configured to input the answer sentence and the undirected graph of the answer sentence obtained by the answer sentence processing unit 201 into a graph structure converter (Graph Transformer) to obtain the feature vector of the answer sentence.
Assume that the answer sentence consists of n words and that the i-th word is represented by a k-dimensional feature vector V_i; the features of the answer sentence are then denoted V = (V_1, ..., V_n). The features V of the answer sentence and the adjacency matrix M of the undirected graph are input into the graph structure converter (Graph Transformer).
The feature extraction process of the graph structure converter is as follows:
1. A Graph Attention operation is performed on the features V of the answer sentence and the adjacency matrix M of the undirected graph. Specifically, for the feature vector V_i of the i-th word, Graph Attention computes attention weights over the words adjacent to word i and aggregates their features:
α_ij = M_ij · exp(e_ij) / Σ_l M_il · exp(e_il)
GraphAttention(V_i) = Σ_j α_ij · W · V_j
where e_ij is a learned attention score between V_i and V_j, W is a learnable weight matrix, and M_ij is the value in the i-th row and j-th column of the adjacency matrix M obtained by the answer sentence processing unit 201, so that each word attends only to the words connected to it in the undirected graph.
2. GraphAttention(V_i) is added to V_i and a layer normalization operation is performed:
V_i' = LayerNorm(GraphAttention(V_i) + V_i)
where LayerNorm is the layer normalization operation; since layer normalization is prior art, it is not described further here.
3. V_i' is input into a single-layer feedforward neural network, and layer normalization is performed again:
V_i'' = LayerNorm(FFN(V_i'))
where FFN is a single-layer feedforward neural network.
Thus, for the feature vector V_i of the i-th word, the graph structure converter (Graph Transformer) produces the transformed feature vector V_i'', and the transformed features of the answer sentence are V'' = (V_1'', ..., V_n''). Finally, the transformed features V'' of the answer sentence are aggregated over the n word positions to obtain the final answer sentence feature vector V'.
a dialogue history feature extraction unit 203, configured to acquire the dialogue history of the answer sentence and to extract the feature vector of the dialogue history using a sequence structure converter (Transformer);
specifically, it is assumed that, in the dialogue system, the dialogue history of the answer sentence consists of m sentences arranged in order; the m sentences are concatenated end to end in order into one sentence C, and the sentence C is input into a sequence structure converter (Transformer) to obtain the feature vector of the dialogue history.
In particular, assume that sentence C consists of r words and that the i-th word in sentence C is represented by a k-dimensional feature vector E_i; the features of sentence C are then denoted E = (E_1, ..., E_r). After E is input into the sequence structure converter (Transformer), the transformed features E'' = (E_1'', ..., E_r'') of sentence C are obtained. The transformed features E'' are then aggregated over the r word positions to obtain the final dialogue history feature vector E'.
a diversity sentence generating unit 204, which inputs the feature vector of the answer sentence from the answer sentence feature vector extraction unit 202 and the feature vector of the dialogue history from the dialogue history feature extraction unit 203 into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
The conditional variational autoencoder is composed of an encoder and a decoder: the feature E' of the dialogue history is input into the encoder of the conditional variational autoencoder to obtain a normal distribution z'; a plurality of samples then need only be drawn from the normal distribution z' and input into the decoder respectively to obtain a plurality of different answer sentences.
In summary, the sentence diversity generation method and system in a dialogue system of the present invention extract the dependency tree of the answer sentence, convert the dependency tree into an undirected graph, input the answer sentence and the undirected graph into a graph structure converter to obtain the feature vector of the answer sentence, extract the feature vector of the dialogue history of the answer sentence using a sequence structure converter, and finally input the obtained feature vector of the answer sentence and the obtained feature vector of the dialogue history into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history, thereby improving the diversity of sentence generation in the dialogue system.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Modifications and variations can be made to the above-described embodiments by those skilled in the art without departing from the spirit and scope of the present invention. Therefore, the scope of the invention should be determined from the following claims.

Claims (10)

1. A sentence diversity generation method in a dialog system, comprising the steps of:
step S1, extracting the dependency tree of the answer sentence, and converting the dependency tree into an undirected graph;
step S2, inputting the answer sentence and the undirected graph obtained in step S1 into a graph structure converter to obtain a feature vector of the answer sentence;
step S3, extracting the feature vector of the dialogue history of the answer sentence using a sequence structure converter;
step S4, inputting the feature vector of the answer sentence obtained in step S2 and the feature vector of the dialogue history obtained in step S3 into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
2. The method for generating sentence diversity in a dialog system of claim 1, wherein the step S1 further comprises:
step S100, extracting a dependency tree of the answer sentence by utilizing an open-source natural language processing tool;
step S101, representing the dependency tree by using a directed graph, wherein nodes in the dependency tree are words of sentences, and directed edges in the dependency tree represent syntactic relations among the words;
and step S102, changing the directed edge in the directed graph into an undirected edge to obtain an undirected graph of the answer sentence.
3. The method of sentence diversity generation in a dialog system of claim 2, wherein: in step S1, the undirected graph is represented by an adjacency matrix.
4. A method as claimed in claim 3, wherein if said answer sentence has n words, said adjacency matrix of said answer sentence is a matrix M of dimension n × n, and the value M_ij in the i-th row and j-th column of said adjacency matrix M is determined by the following condition:
M_ij = 1 if there is an edge between the i-th word and the j-th word in the undirected graph; otherwise M_ij = 0.
5. the method of claim 4, wherein the step S2 further comprises
Step S200, performing GraphAttention operation on the characteristic V of the answer sentence and the adjacent matrix M of the undirected graph;
step S201, adding the result of the Graph Attention operation and the characteristic V, and performing layer normalization operation;
step S202, inputting the result of step S201 into a single-layer feedforward neural network and performing layer normalization to obtain the feature vector of the answer sentence.
6. The method of claim 5, wherein: in step S3, m sentences of the dialogue history are obtained, the m sentences being arranged in order; the m sentences are concatenated end to end in order into one sentence C, and the sentence C is input into the sequence structure converter to obtain the feature vector of the dialogue history.
7. The method of claim 6, wherein: the conditional variational autoencoder is composed of an encoder and a decoder; the feature vector E' of the dialogue history obtained in step S3 is input into the encoder of the conditional variational autoencoder to obtain a normal distribution z', a plurality of samples are drawn from the normal distribution z', and the samples are then input into the decoder respectively to obtain a plurality of different answer sentences.
8. A sentence diversity generation system in a dialog system, comprising:
the answer sentence processing unit is used for extracting a dependency tree of the answer sentence and converting the dependency tree into an undirected graph;
an answer sentence feature vector extraction unit operable to input the answer sentence and the undirected graph of the answer sentence obtained by the answer sentence processing unit into a graph structure converter to obtain a feature vector of the answer sentence;
a dialogue history feature extraction unit configured to extract a feature vector of a dialogue history of the answer sentence using a sequence structure converter;
a diversity sentence generating unit, configured to input the feature vector obtained by the answer sentence feature vector extraction unit and the feature vector obtained by the dialogue history feature extraction unit into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
9. The system of claim 8, wherein: in the answer sentence processing unit, the dependency tree is converted into an undirected graph by changing the directed edges into undirected edges, and the undirected graph is represented by an adjacency matrix.
10. The sentence diversity generation system in a dialog system of claim 8, wherein: in the dialogue history feature extraction unit, m sentences of the dialogue history are obtained, the m sentences being arranged in order; the m sentences are concatenated end to end in order into one sentence C, and the sentence C is input into the sequence structure converter to obtain the feature vector of the dialogue history.
CN201911087246.6A 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system Active CN110866103B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911087246.6A CN110866103B (en) 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911087246.6A CN110866103B (en) 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system

Publications (2)

Publication Number Publication Date
CN110866103A true CN110866103A (en) 2020-03-06
CN110866103B CN110866103B (en) 2023-07-07

Family

ID=69654516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911087246.6A Active CN110866103B (en) 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system

Country Status (1)

Country Link
CN (1) CN110866103B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543020A * 2018-11-27 2019-03-29 iFLYTEK Co., Ltd. Query processing method and system
CN109597876A * 2018-11-07 2019-04-09 Sun Yat-sen University A multi-turn dialogue answer selection model and method based on reinforcement learning
CN109726276A * 2018-12-29 2019-05-07 Sun Yat-sen University A task-oriented dialogue system based on deep network learning
CN110309287A * 2019-07-08 2019-10-08 Beijing University of Posts and Telecommunications A retrieval-based chit-chat dialogue scoring method modeling dialogue turn information

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109597876A * 2018-11-07 2019-04-09 Sun Yat-sen University A multi-turn dialogue answer selection model and method based on reinforcement learning
CN109543020A * 2018-11-27 2019-03-29 iFLYTEK Co., Ltd. Query processing method and system
CN109726276A * 2018-12-29 2019-05-07 Sun Yat-sen University A task-oriented dialogue system based on deep network learning
CN110309287A * 2019-07-08 2019-10-08 Beijing University of Posts and Telecommunications A retrieval-based chit-chat dialogue scoring method modeling dialogue turn information

Also Published As

Publication number Publication date
CN110866103B (en) 2023-07-07

Similar Documents

Publication Publication Date Title
CN108334487B (en) Missing semantic information completion method and device, computer equipment and storage medium
CN110418210B (en) Video description generation method based on bidirectional cyclic neural network and depth output
CN109492113B (en) Entity and relation combined extraction method for software defect knowledge
CN112069811A (en) Electronic text event extraction method with enhanced multi-task interaction
CN111460807A (en) Sequence labeling method and device, computer equipment and storage medium
CN114254660A (en) Multi-modal translation method and device, electronic equipment and computer-readable storage medium
CN111402861A (en) Voice recognition method, device, equipment and storage medium
CN111597342B (en) Multitasking intention classification method, device, equipment and storage medium
CN114118417A (en) Multi-mode pre-training method, device, equipment and medium
CN113886601B (en) Electronic text event extraction method, device, equipment and storage medium
CN112016275A (en) Intelligent error correction method and system for voice recognition text and electronic equipment
CN112527986A (en) Multi-round dialog text generation method, device, equipment and storage medium
CN114925195A (en) Standard content text abstract generation method integrating vocabulary coding and structure coding
CN111831783A (en) Chapter-level relation extraction method
CN113128206A (en) Question generation method based on word importance weighting
CN112307179A (en) Text matching method, device, equipment and storage medium
CN114691848A (en) Relational triple combined extraction method and automatic question-answering system construction method
CN114510576A (en) Entity relationship extraction method based on BERT and BiGRU fusion attention mechanism
CN113065352B (en) Method for identifying operation content of power grid dispatching work text
CN114117008A (en) Semantic understanding method, computer equipment and storage medium
CN117828024A (en) Plug-in retrieval method, device, storage medium and equipment
CN110866103A (en) Sentence diversity generation method and system in dialog system
CN111813907A (en) Question and sentence intention identification method in natural language question-answering technology
CN113342982B (en) Enterprise industry classification method integrating Roberta and external knowledge base
JP6550677B2 (en) Encoding device, decoding device, discrete sequence conversion device, method, and program

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Liang Xiaodan

Inventor after: Chen Bingcheng

Inventor after: Lin Jing

Inventor before: Chen Bingcheng

Inventor before: Liang Xiaodan

Inventor before: Lin Jing

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant