CN112199482B - Dialogue generation method, device, equipment and readable storage medium - Google Patents


Info

Publication number
CN112199482B
CN112199482B (application CN202011059826.7A)
Authority
CN
China
Prior art keywords
vector
reply
query
preset
common sense
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011059826.7A
Other languages
Chinese (zh)
Other versions
CN112199482A (en)
Inventor
李雅峥
杨海钦
姚晓远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011059826.7A priority Critical patent/CN112199482B/en
Publication of CN112199482A publication Critical patent/CN112199482A/en
Priority to PCT/CN2021/091292 priority patent/WO2022068197A1/en
Application granted granted Critical
Publication of CN112199482B publication Critical patent/CN112199482B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a dialogue generation method, device, equipment and readable storage medium, wherein the method comprises the following steps: acquiring question information, and converting the question information into a first query vector by using a preset first gated recurrent unit (GRU) model; determining a common sense vector associated with the first query vector by using a preset first end-to-end memory network (MemN2N) model, and forming a question vector from the first query vector and the common sense vector; converting the question vector into a plurality of second query vectors by using a preset second GRU model, and sequentially inputting each second query vector into a preset second MemN2N model to obtain a plurality of reply vectors; and converting each reply vector into a reply word, and combining all the reply words into reply information. The invention can quickly and accurately form reply information in a remote consultation dialogue and improves the user experience.

Description

Dialogue generation method, device, equipment and readable storage medium
Technical Field
The present invention relates to the field of remote medical consultation dialogues, and in particular to a dialogue generation method, apparatus, device and readable storage medium.
Background
With the continuous development of artificial intelligence, man-machine dialogue is applied to more and more scenarios. For example, in a customer service scenario, reply information corresponding to question information input by a user is formed by recognizing that question information, which reduces labor cost. However, a conventional open-domain man-machine dialogue system starts only from the dialogue data; without understanding the background knowledge and common sense behind the user's question, it tends to generate generic replies that lack effective information, which affects the readability of the reply information. Therefore, how to quickly and accurately form reply information from the user's question information is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a dialogue generation method, device, equipment and readable storage medium, which can quickly and accurately form reply information in a remote consultation dialogue and improve the user experience.
According to an aspect of the present invention, there is provided a dialog generation method, the method comprising:
acquiring question information, and converting the question information into a first query vector by using a preset first gated recurrent unit (GRU) model;
determining a common sense vector associated with the first query vector by using a preset first end-to-end memory network (MemN2N) model, and forming a question vector from the first query vector and the common sense vector;
converting the question vector into a plurality of second query vectors by using a preset second GRU model, and sequentially inputting each second query vector into a preset second MemN2N model to obtain a plurality of reply vectors;
converting each reply vector into a reply word, and combining all the reply words into reply information.
Optionally, the acquiring of the question information and the converting of it into a first query vector by using the preset first GRU model include:
performing word segmentation on the question information, and forming a word sequence from the keywords obtained after segmentation;
for a target keyword in the word sequence, calculating, with the first GRU model, the hidden influence factor that the target keyword passes to the keyword after it in the word sequence, according to the hidden influence factor passed to the target keyword by the keyword before it;
taking the hidden influence factor calculated for the last keyword in the word sequence as the first query vector u_1 corresponding to the question information.
Optionally, the determining of the common sense vector associated with the first query vector by using the preset first MemN2N model, and the forming of the question vector from the first query vector and the common sense vector, include:
in the 1st hop of the first MemN2N model, calculating the correlation value p_i between the first query vector u_1 and the i-th common sense head vector x_i in a preset common sense head group;
calculating the question sub-vector a_1 of the 1st hop from the correlation value p_i of the i-th common sense head vector x_i and the i-th common sense tail vector y_i in a preset common sense tail group;
adding the first query vector u_1 and the question sub-vector a_1 to obtain the first query vector u_2 of the 2nd hop;
recalculating, from the first query vector u_2 of the 2nd hop, the question sub-vector a_2 of the 2nd hop and the first query vector u_3 of the 3rd hop, and so on, until the question sub-vector a_M of the M-th hop is calculated;
taking the question sub-vector a_M of the M-th hop as the question vector.
Optionally, the method further comprises:
obtaining a common sense information base; wherein the common sense information base includes a plurality of common sense information represented in the form of a knowledge triplet, and the common sense information includes: a head portion, a relationship portion, and a tail portion;
converting the head in each piece of common sense information into a common sense head vector through a preset first hidden-layer matrix, thereby forming a common sense head group;
converting the tail in each piece of common sense information into a common sense tail vector through a preset second hidden-layer matrix, thereby forming a common sense tail group;
and establishing a corresponding relation between the common sense head vector and the common sense tail vector according to the relation part in each common sense information.
Optionally, the converting of the question vector into a plurality of second query vectors by using the preset second GRU model, and the sequential inputting of each second query vector into the preset second MemN2N model to obtain a plurality of reply vectors, include:
taking the question vector as the hidden influence factor h_0 of the first layer, and inputting a preset start-character vector s_0 into the second GRU model to obtain an output vector s_1 and a hidden influence factor h_1 passed to the second layer;
inputting the output vector s_1 as a second query vector into the second MemN2N model to obtain a reply vector r_1;
inputting the output vector s_1 and the hidden influence factor h_1 of the second layer back into the second GRU model to obtain an output vector s_2 and a hidden influence factor h_2 passed to the third layer, inputting the output vector s_2 into the second MemN2N model to obtain a reply vector r_2, and so on, until the output vector of the second GRU model is a preset end-character vector.
Optionally, the inputting of the output vector s_1 as a second query vector into the second MemN2N model to obtain the reply vector r_1 includes:
in the 1st hop of the second MemN2N model, calculating the correlation value p_i between the second query vector s_1 and the i-th reply head vector k_i in a preset reply head group;
calculating the reply sub-vector o_1 of the 1st hop from the correlation value p_i of the i-th reply head vector k_i and the i-th reply tail vector l_i in a preset reply tail group;
adding the second query vector s_1 and the reply sub-vector o_1 to obtain the second query vector s_2 of the 2nd hop;
recalculating, from the second query vector s_2 of the 2nd hop, the reply sub-vector o_2 of the 2nd hop and the second query vector s_3 of the 3rd hop, and so on, until the reply sub-vector o_N of the N-th hop is calculated;
taking the reply sub-vector o_N of the N-th hop as the reply vector r_1.
Optionally, the method further comprises:
obtaining a reply information base; wherein the reply information base includes a plurality of reply information expressed in the form of a knowledge triplet, and the reply information includes: a head portion, a relationship portion, and a tail portion;
converting the head in each piece of reply information into a reply head vector through a preset translation-embedding (TransE) algorithm, thereby forming a reply head group;
converting the tail in each piece of reply information into a reply tail vector through the preset TransE algorithm, thereby forming a reply tail group;
and establishing a corresponding relation between the reply head vector and the reply tail vector according to the relation part in each reply message.
In order to achieve the above object, the present invention also provides a dialog generating apparatus, including:
the acquisition module is used for acquiring question information and converting the question information into a first query vector by using a preset first gated recurrent unit (GRU) model;
the question module is used for determining, by using a preset first end-to-end memory network (MemN2N) model, a common sense vector associated with the first query vector, and for forming a question vector from the first query vector and the common sense vector;
the reply module is used for converting the question vector into a plurality of second query vectors by using a preset second GRU model, and for sequentially inputting each second query vector into a preset second MemN2N model to obtain a plurality of reply vectors;
and the conversion module is used for respectively converting each reply vector into reply words and combining all the reply words into reply information.
In order to achieve the above object, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the steps of the above dialogue generation method when executing the computer program.
In order to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the dialog generation method described above.
The dialogue generation method, device, equipment and readable storage medium combine the end-to-end memory network MemN2N architecture with the GRU network: common sense information related to the question information is found from the question information, and the question information and the common sense information are considered together to determine the reply information. When encoding the question information into the question vector, encoding is performed in the form GRU+MemN2N: for the question information input by the user, a GRU network replaces Embedding B in the MemN2N network, and the final hidden layer of the GRU network is input into the MemN2N network as the query vector. When decoding the question vector into the reply information, the reply information is likewise generated in the form GRU+MemN2N. The invention can quickly and accurately form reply information in a remote consultation dialogue and improves the user experience.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a schematic flow chart of an alternative dialog generation method according to the first embodiment;
fig. 2 is a schematic diagram of an alternative composition structure of a dialogue generating device according to the second embodiment;
fig. 3 is a schematic diagram of an alternative hardware architecture of a computer device according to the third embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
The embodiment of the invention provides a dialogue generating method, as shown in fig. 1, which specifically comprises the following steps:
step S101: question information is obtained and converted into a first query vector using a preset first GRU (Gate Recurrent Unit, gate recursion unit) model.
Specifically, step S101 includes:
Step A1: performing word segmentation on the question information, and forming a word sequence from the keywords obtained after segmentation; wherein the word sequence includes N keywords;
Step A2: for a target keyword in the word sequence, calculating, with the first GRU model, the hidden influence factor that the target keyword passes to the keyword after it in the word sequence, according to the hidden influence factor passed to the target keyword by the keyword before it;
Step A3: taking the hidden influence factor calculated for the last keyword in the word sequence as the first query vector u_1 corresponding to the question information.
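Steps A1-A3 can be sketched as follows. This is a minimal illustration only: the hidden size, random weights, and keyword embeddings are hypothetical stand-ins (a real system would use learned GRU parameters and an actual word-segmentation step).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # illustrative hidden/embedding size

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Randomly initialised GRU cell parameters (learned in practice).
Wz, Uz = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wr, Ur = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wh, Uh = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def gru_cell(x, h_prev):
    """One GRU step: combines the current keyword embedding x with the
    hidden influence factor h_prev passed from the preceding keyword."""
    z = sigmoid(Wz @ x + Uz @ h_prev)          # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)          # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))
    return (1 - z) * h_prev + z * h_tilde      # factor passed to the next keyword

# Word sequence after segmentation, as hypothetical keyword embeddings.
keywords = [rng.normal(size=d) for _ in range(5)]

h = np.zeros(d)
for x in keywords:
    h = gru_cell(x, h)

u1 = h  # the hidden factor after the last keyword is the first query vector u_1
```

The final hidden state `u1` is what step A3 hands to the MemN2N model as the query.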
Step S102: a common sense vector associated with the first query vector is determined using a preset first MemN2N (End-to-End Memory Network) model, and a question vector is formed from the first query vector and the common sense vector.
Specifically, step S102 includes:
Step B1: in the 1st hop of the first MemN2N model, calculating the correlation value p_i between the first query vector u_1 and the i-th common sense head vector x_i in the preset common sense head group;
wherein p_i = Softmax((u_1)^T * x_i), and ^T denotes the transpose.
Step B2: calculating the question sub-vector a_1 of the 1st hop from the correlation value p_i of the i-th common sense head vector x_i and the i-th common sense tail vector y_i in the preset common sense tail group;
wherein a_1 = Σ_i p_i * y_i.
Step B3: adding the first query vector u_1 and the question sub-vector a_1 to obtain the first query vector u_2 of the 2nd hop;
Step B4: repeating steps B1-B3 until the question sub-vector a_M of the M-th hop is calculated;
Step B5: taking the question sub-vector a_M of the M-th hop as the question vector.
Further, the method further comprises:
step C1: obtaining a common sense information base; wherein the common sense information base includes a plurality of common sense information represented in the form of a knowledge triplet, and the common sense information includes: a head portion, a relationship portion, and a tail portion;
Taking "a cat is an animal" as an example, the knowledge-triplet form is (h: cat, r: belongs to, t: animal), where h denotes the head, t denotes the tail, and r denotes the relationship between head and tail.
Step C2: converting the head in each piece of common sense information into a common sense head vector through a preset first hidden-layer matrix (Embedding A), thereby forming the common sense head group;
Step C3: converting the tail in each piece of common sense information into a common sense tail vector through a preset second hidden-layer matrix (Embedding C), thereby forming the common sense tail group;
step C4: and establishing a corresponding relation between the common sense head vector and the common sense tail vector according to the relation part in each common sense information.
In the Encoder stage, i.e., the process of encoding the question information into the question vector, encoding is performed in the form GRU+MemN2N: for the question information input by the user, a GRU network replaces Embedding B in the MemN2N network, and the final hidden layer of the GRU network is input into the MemN2N network as the query vector. The whole MemN2N network is stacked from several hops, and in each hop the correlation between the query vector and each piece of common sense information in the Memory is calculated. In this embodiment, implementing the Encoder with GRU+MemN2N means that, on the premise that the GRU has extracted the complete question information, common sense information highly relevant to the question as a whole can be added continuously, avoiding the information deviation caused by searching on a single entity word. In addition, the common sense information in the Memory is combined by weighted sum, which avoids selecting a single knowledge triplet as the compensation information and makes the acquired common sense information more comprehensive.
Step S103: according to the question vector, the question vector is converted into a plurality of second query vectors by using a preset second gate recursion unit GRU model, and each second query vector is sequentially input into a preset second end-to-end memory network MemN2N model to obtain a plurality of answer vectors.
Specifically, step S103 includes:
Step D1: taking the question vector as the hidden influence factor h_0 of the first layer, and inputting a preset start-character vector s_0 into the second GRU model to obtain an output vector s_1 and a hidden influence factor h_1 passed to the second layer;
wherein (s_1, h_1) = GRU(s_0, h_0).
Step D2: inputting the output vector s_1 as a second query vector into the second MemN2N model to obtain a reply vector r_1.
Further, step D2 includes:
Step D21: in the 1st hop of the second MemN2N model, calculating the correlation value p_i between the second query vector s_1 and the i-th reply head vector k_i in the preset reply head group;
wherein p_i = Softmax((s_1)^T * k_i), and ^T denotes the transpose;
Step D22: calculating the reply sub-vector o_1 of the 1st hop from the correlation value p_i of the i-th reply head vector k_i and the i-th reply tail vector l_i in the preset reply tail group;
wherein o_1 = Σ_i p_i * l_i.
Step D23: adding the second query vector s_1 and the reply sub-vector o_1 to obtain the second query vector s_2 of the 2nd hop;
Step D24: repeating steps D21-D23 until the reply sub-vector o_N of the N-th hop is calculated;
Step D25: taking the reply sub-vector o_N of the N-th hop as the reply vector r_1.
Still further, the method further comprises:
step E1: obtaining a reply information base; wherein the reply information base includes a plurality of reply information expressed in the form of a knowledge triplet, and the reply information includes: a head portion, a relationship portion, and a tail portion;
Step E2: converting the head in each piece of reply information into a reply head vector through a preset translation-embedding (TransE) algorithm, thereby forming the reply head group;
Step E3: converting the tail in each piece of reply information into a reply tail vector through the preset TransE algorithm, thereby forming the reply tail group;
wherein k = (h, r, t) = MLP(TransE(h, r, t)), with k_i = h and l_i = t; that is, the head component gives the reply head vector and the tail component gives the reply tail vector.
step E4: and establishing a corresponding relation between the reply head vector and the reply tail vector according to the relation part in each reply message.
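The TransE idea behind steps E2-E3 is that a valid triple should satisfy head + relation ≈ tail, so that the head embedding can serve as k_i and the tail embedding as l_i. The sketch below uses untrained, hypothetical embeddings purely to show the scoring convention; it is not the patent's trained model.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 8

h = rng.normal(size=d)                  # head embedding
r = rng.normal(size=d)                  # relation embedding
t = h + r + 0.01 * rng.normal(size=d)   # tail chosen to roughly satisfy TransE

def transe_score(h, r, t):
    """TransE plausibility: smaller ||h + r - t|| means a more credible triple."""
    return float(np.linalg.norm(h + r - t))

k_i, l_i = h, t  # reply head vector and reply tail vector for the memory

good = transe_score(h, r, t)                     # near-zero for a valid triple
bad = transe_score(h, r, rng.normal(size=d))     # a random tail scores worse
```

Training TransE pushes `good`-style scores down and `bad`-style scores up, which is what makes the resulting head/tail vectors usable as the reply memory.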
Step D3: inputting the output vector s_1 and the hidden influence factor h_1 of the second layer back into the second GRU model to obtain an output vector s_2 and a hidden influence factor h_2 passed to the third layer, inputting the output vector s_2 into the second MemN2N model to obtain a reply vector r_2, and so on, until the output vector of the second GRU model is a preset end-character vector.
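The decoder loop of steps D1-D3 can be sketched as follows. The recurrent step and one-hop memory read are simplified stand-ins, and the fixed step limit replaces the end-character test; all weights and vectors are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 8

W, U = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def gru_step(s_prev, h_prev):
    """Simplified recurrent step returning (output vector, hidden factor)."""
    h = np.tanh(W @ s_prev + U @ h_prev)
    return h, h  # the output, and the factor passed to the next layer

def memn2n_read(query, heads, tails):
    """One-hop memory read: attention over reply head/tail vectors."""
    p = np.exp(heads @ query)
    p /= p.sum()
    return p @ tails

heads = rng.normal(size=(5, d))  # reply head group k_i
tails = rng.normal(size=(5, d))  # reply tail group l_i

h = rng.normal(size=d)   # h_0: the question vector from the encoder
s = np.zeros(d)          # s_0: the preset start-character vector
replies, max_steps = [], 4

for _ in range(max_steps):                       # stand-in for the end-character test
    s, h = gru_step(s, h)                        # D1/D3: next output and hidden factor
    replies.append(memn2n_read(s, heads, tails)) # D2: reply vector r_i
```

Note that, as the description below emphasises, every decoder output `s` is used as a memory query, not just the final one.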
Step S104: each reply vector is converted into a reply word, and all the reply words are combined into reply information, respectively.
Specifically, step S104 includes:
obtaining the reply word w_i corresponding to the reply vector r_i according to the following formula:
P(r_i = w_i) = softmax(W * r_i);
wherein W is a preset matrix containing a plurality of reply words, and the word with the largest P value under the matrix W is taken as the reply word w_i corresponding to r_i.
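Step S104 is a projection onto the reply vocabulary followed by an argmax. A minimal sketch with a hypothetical four-word vocabulary and random weights:

```python
import numpy as np

rng = np.random.default_rng(5)
d = 8
vocab = ["yes", "no", "maybe", "hello"]  # hypothetical reply vocabulary

W = rng.normal(size=(len(vocab), d))  # preset matrix, one row per reply word
r_i = rng.normal(size=d)              # a reply vector from the decoder

logits = W @ r_i
probs = np.exp(logits - logits.max())
probs /= probs.sum()                  # P(r_i = w_i) = softmax(W r_i)

reply_word = vocab[int(np.argmax(probs))]  # word with the largest P value
```

Running this for every reply vector in order, then concatenating the chosen words, yields the final reply information.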
In the Decoder stage, i.e., the process of decoding the question vector into the reply information, the reply information is generated in the form GRU+MemN2N; the initial hidden state of the GRU network is the output of the Encoder part. For the Memory, unlike the Encoder part, the TransE algorithm replaces Embedding A and Embedding C in the MemN2N model to complete the encoding of the knowledge triples. Furthermore, unlike the Encoder, which takes the output at the last instant of the GRU network as the input to MemN2N, the Decoder part takes each hidden state of the GRU as the query vector of MemN2N.
In this embodiment, the implementation of the Decoder part avoids distinguishing entity words from common words when generating the reply, so that all reply words can be obtained from the vocabulary. In addition, borrowing the idea of the Key-Value Memory Network, this patent separates the similarity calculation between the Memory and the query from the weighted-sum output, so that the query is closer to the head entity of a knowledge triplet and the output is closer to the tail entity, which reduces the repetition rate between the generated reply and the question.
Example two
The embodiment of the invention provides a dialogue generating device, as shown in fig. 2, which specifically comprises the following components:
the acquisition module 201 is configured to acquire question information and convert the question information into a first query vector by using a preset first gated recurrent unit (GRU) model;
the question module 202 is configured to determine, by using a preset first end-to-end memory network (MemN2N) model, a common sense vector associated with the first query vector, and to form a question vector from the first query vector and the common sense vector;
the reply module 203 is configured to convert the question vector into a plurality of second query vectors by using a preset second GRU model, and to sequentially input each second query vector into a preset second MemN2N model to obtain a plurality of reply vectors;
the conversion module 204 is configured to convert each reply vector into a reply word, and combine all the reply words into reply information.
Specifically, the obtaining module 201 is configured to:
performing word segmentation on the question information, and forming a word sequence from the keywords obtained after segmentation; for a target keyword in the word sequence, calculating, with the first GRU model, the hidden influence factor that the target keyword passes to the keyword after it, according to the hidden influence factor passed to the target keyword by the keyword before it; and taking the hidden influence factor calculated for the last keyword in the word sequence as the first query vector u_1 corresponding to the question information.
Further, the questioning module 202 is specifically configured to:
in the 1st hop of the first MemN2N model, calculating the correlation value p_i between the first query vector u_1 and the i-th common sense head vector x_i in the preset common sense head group; calculating the question sub-vector a_1 of the 1st hop from the correlation value p_i of the i-th common sense head vector x_i and the i-th common sense tail vector y_i in the preset common sense tail group; adding the first query vector u_1 and the question sub-vector a_1 to obtain the first query vector u_2 of the 2nd hop; recalculating, from the first query vector u_2 of the 2nd hop, the question sub-vector a_2 of the 2nd hop and the first query vector u_3 of the 3rd hop, and so on, until the question sub-vector a_M of the M-th hop is calculated; and taking the question sub-vector a_M of the M-th hop as the question vector.
Further, the device further comprises:
the processing module is used for acquiring a common sense information base, wherein the common sense information base includes a plurality of pieces of common sense information represented in the form of knowledge triples, each including a head, a relationship and a tail; converting the head in each piece of common sense information into a common sense head vector through a preset first hidden-layer matrix, thereby forming a common sense head group; converting the tail in each piece of common sense information into a common sense tail vector through a preset second hidden-layer matrix, thereby forming a common sense tail group; and establishing the correspondence between the common sense head vectors and the common sense tail vectors according to the relationship part of each piece of common sense information.
Further, the reply module 203 is specifically configured to:
taking the question vector as the hidden influence factor h_0 of the first layer, and inputting a preset start-character vector s_0 into the second GRU model to obtain an output vector s_1 and a hidden influence factor h_1 passed to the second layer; inputting the output vector s_1 as a second query vector into the second MemN2N model to obtain a reply vector r_1; inputting the output vector s_1 and the hidden influence factor h_1 of the second layer back into the second GRU model to obtain an output vector s_2 and a hidden influence factor h_2 passed to the third layer, inputting the output vector s_2 into the second MemN2N model to obtain a reply vector r_2, and so on, until the output vector of the second GRU model is a preset end-character vector.
Further, the reply module 203 is configured to implement the step of outputting the vector s 1 Is input into the second end-to-end memory network MemN2N model as a second query vector to obtain a reply vector r 1 The method specifically comprises the following steps:
in the 1st cycle of the second end-to-end memory network MemN2N model, respectively calculating the correlation value pi between the second query vector s1 and the i-th reply head vector ki in the preset reply head group; calculating the reply sub-vector o1 of the 1st cycle according to the correlation value pi of the i-th reply head vector ki and the i-th reply tail vector li in the preset reply tail group; adding the second query vector s1 and the reply sub-vector o1 of the 1st cycle to obtain the second query vector s2 of the 2nd cycle; recalculating the reply sub-vector o2 of the 2nd cycle and the second query vector s3 of the 3rd cycle according to the second query vector s2 of the 2nd cycle, and so on, until the reply sub-vector oN of the N-th cycle is calculated; and taking the reply sub-vector oN of the N-th cycle as the reply vector r1.
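The multi-hop cycle inside the MemN2N model can be sketched as follows. The softmax normalization of the correlation values is an assumption borrowed from standard end-to-end memory networks; the patent itself only specifies a correlation value between the query and each head vector.

```python
import numpy as np

def memn2n_hops(s1, head_keys, tail_values, n_hops=3):
    """Multi-hop memory lookup: each cycle attends over the reply head group
    and folds the weighted reply tail vectors back into the query."""
    s = s1
    for _ in range(n_hops):
        p = np.exp(head_keys @ s)  # correlation of s with each head vector k_i
        p /= p.sum()               # softmax over memory slots (assumed normalization)
        o = p @ tail_values        # reply sub-vector o: weighted sum of tails l_i
        s = s + o                  # next cycle's query: s_{m+1} = s_m + o_m
    return o                       # o_N of the final cycle is the reply vector

rng = np.random.default_rng(2)
K = rng.normal(size=(5, 8))        # preset reply head group (5 slots, dim 8, both assumed)
L = rng.normal(size=(5, 8))        # preset reply tail group
r1 = memn2n_hops(rng.normal(size=8), K, L)
print(r1.shape)                    # (8,)
```

The same loop shape applies to the first MemN2N model, with the common sense head/tail groups in place of the reply groups.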
Still further, the processing module is further configured to:
obtaining a reply information base, wherein the reply information base includes a plurality of pieces of reply information expressed in the form of knowledge triplets, and each piece of reply information includes: a head portion, a relationship portion, and a tail portion; converting the head in each piece of reply information into a reply head vector through a preset translation embedding TransE algorithm, thereby forming a reply head group; converting the tail in each piece of reply information into a reply tail vector through the preset translation embedding TransE algorithm, thereby forming a reply tail group; and establishing a correspondence between the reply head vectors and the reply tail vectors according to the relationship part in each piece of reply information.
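TransE embeds each triplet so that head + relation ≈ tail in vector space. A minimal sketch follows; the entity and relation names are hypothetical, and the plain L2 pull below stands in for TransE's usual margin-based ranking loss with negative samples.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM = 8
# TransE learns embeddings such that head + relation ≈ tail.
ent = {e: rng.normal(size=DIM) for e in ("greeting", "reply", "apology", "comfort")}
rel = {r: rng.normal(size=DIM) for r in ("followed_by",)}

def transe_score(h, r, t):
    """Lower is better: distance between (head + relation) and tail."""
    return np.linalg.norm(ent[h] + rel[r] - ent[t])

def sgd_step(h, r, t, lr=0.05):
    """One gradient step pulling head + relation toward tail (simplified L2 objective)."""
    diff = ent[h] + rel[r] - ent[t]
    ent[h] -= lr * diff
    rel[r] -= lr * diff
    ent[t] += lr * diff

before = transe_score("greeting", "followed_by", "reply")
for _ in range(50):
    sgd_step("greeting", "followed_by", "reply")
after = transe_score("greeting", "followed_by", "reply")
assert after < before  # training moved head + relation toward tail
```

After training, the entity vectors play the role of the reply head and tail groups that the second MemN2N model attends over.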
Example III
The present embodiment also provides a computer device capable of executing a program, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack-mounted server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of a plurality of servers). As shown in fig. 3, the computer device 30 of the present embodiment includes at least, but is not limited to: a memory 301 and a processor 302, which may be communicatively connected to each other via a system bus. It is noted that FIG. 3 only shows the computer device 30 with components 301-302, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead.
In this embodiment, the memory 301 (i.e., the readable storage medium) includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 301 may be an internal storage unit of the computer device 30, such as a hard disk or memory of the computer device 30. In other embodiments, the memory 301 may also be an external storage device of the computer device 30, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash memory card (Flash Card) equipped on the computer device 30. Of course, the memory 301 may also include both an internal storage unit and an external storage device of the computer device 30. In this embodiment, the memory 301 is typically used to store the operating system and the various types of application software installed on the computer device 30. In addition, the memory 301 can also be used to temporarily store various types of data that have been output or are to be output.
The processor 302 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 302 is generally used to control the overall operation of the computer device 30.
Specifically, in the present embodiment, the processor 302 is configured to execute a program of a dialog generation method stored in the memory 301, and the program, when executed, implements the following steps:
acquiring questioning information, and converting the questioning information into a first query vector by using a preset first gate recursion unit GRU model;
determining a common sense vector associated with the first query vector by using a preset first end-to-end memory network MemN2N model according to the first query vector, and forming a question vector according to the first query vector and the common sense vector;
according to the question vector, converting the question vector into a plurality of second query vectors by using a preset second gate recursion unit GRU model, and sequentially inputting each second query vector into a preset second end-to-end memory network MemN2N model to obtain a plurality of reply vectors;
converting each reply vector into a reply word, and combining all the reply words into reply information.
The specific embodiment of the above method steps may refer to the first embodiment, and this embodiment is not repeated here.
Example IV
The present embodiment also provides a computer readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, or the like, having stored thereon a computer program which, when executed by a processor, performs the following method steps:
acquiring questioning information, and converting the questioning information into a first query vector by using a preset first gate recursion unit GRU model;
determining a common sense vector associated with the first query vector by using a preset first end-to-end memory network MemN2N model according to the first query vector, and forming a question vector according to the first query vector and the common sense vector;
according to the question vector, converting the question vector into a plurality of second query vectors by using a preset second gate recursion unit GRU model, and sequentially inputting each second query vector into a preset second end-to-end memory network MemN2N model to obtain a plurality of reply vectors;
converting each reply vector into a reply word, and combining all the reply words into reply information.
The specific embodiment of the above method steps may refer to the first embodiment, and this embodiment is not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware alone; in many cases, the former is the preferred implementation.
The foregoing description covers only the preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structure or equivalent process transformation made using the contents of this specification, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present invention.

Claims (9)

1. A method of dialog generation, the method comprising:
acquiring questioning information, and converting the questioning information into a first query vector by using a preset first gate recursion unit GRU model;
determining a common sense vector associated with the first query vector by using a preset first end-to-end memory network MemN2N model according to the first query vector, and forming a question vector according to the first query vector and the common sense vector;
according to the question vector, converting the question vector into a plurality of second query vectors by using a preset second gate recursion unit GRU model, and sequentially inputting each second query vector into a preset second end-to-end memory network MemN2N model to obtain a plurality of reply vectors;
converting each reply vector into reply words respectively, and combining all the reply words into reply information;
the method for obtaining the question vector comprises the steps of converting the question vector into a plurality of second query vectors by using a preset second gate recursion unit GRU model according to the question vector, sequentially inputting each second query vector into a preset second end-to-end memory network MemN2N model to obtain a plurality of answer vectors, and comprises the following steps:
taking the question vector as the hidden influence factor h0 of the first layer, and inputting a preset start character vector s0 into the second gate recursion unit GRU model to obtain an output vector s1 and a hidden influence factor h1 passed to the second layer;
inputting the output vector s1 into the second end-to-end memory network MemN2N model as a second query vector to obtain a reply vector r1;
inputting the output vector s1 and the hidden influence factor h1 of the second layer back into the second gate recursion unit GRU model to obtain an output vector s2 and a hidden influence factor h2 passed to the third layer, and inputting the output vector s2 back into the second end-to-end memory network MemN2N model to obtain a reply vector r2; and so on, until the output vector of the second gate recursion unit GRU model is a preset ending character vector.
2. The dialog generation method of claim 1, wherein the acquiring the question information and converting the question information into the first query vector using a preset first gate recursion unit GRU model includes:
performing word segmentation on the question information, and forming a word sequence from the plurality of keywords obtained after the word segmentation;
for a target keyword in the word sequence, calculating, by using the first gate recursion unit GRU model, the hidden influence factor that the target keyword passes to the keyword located after it in the word sequence, according to the hidden influence factor passed to the target keyword by the keyword located before it in the word sequence;
taking the hidden influence factor calculated for the last keyword in the word sequence as the first query vector corresponding to the question information.
3. The method of claim 2, wherein determining a common sense vector associated with the first query vector according to the first query vector using a preset first end-to-end memory network MemN2N model, and forming a question vector according to the first query vector and the common sense vector comprises:
in the 1st cycle of the first end-to-end memory network MemN2N model, respectively calculating the correlation value between the first query vector and the i-th common sense head vector in the preset common sense head group;
calculating the question sub-vector of the 1st cycle according to the correlation value of the i-th common sense head vector and the i-th common sense tail vector in the preset common sense tail group;
adding the first query vector and the question sub-vector of the 1st cycle to obtain the first query vector of the 2nd cycle;
recalculating the question sub-vector of the 2nd cycle and the first query vector of the 3rd cycle according to the first query vector of the 2nd cycle, and so on, until the question sub-vector of the M-th cycle is calculated; and
taking the question sub-vector of the M-th cycle as the question vector.
4. A dialog generation method according to claim 3, characterized in that the method further comprises:
obtaining a common sense information base; wherein the common sense information base includes a plurality of pieces of common sense information represented in the form of knowledge triplets, and each piece of common sense information includes: a head portion, a relationship portion, and a tail portion;
converting the head in each piece of common sense information into a common sense head vector through a preset first hidden layer matrix, thereby forming a common sense head group;
converting the tail in each piece of common sense information into a common sense tail vector through a preset second hidden layer matrix, thereby forming a common sense tail group;
and establishing a correspondence between the common sense head vectors and the common sense tail vectors according to the relationship part in each piece of common sense information.
5. The dialog generation method of claim 1, wherein the inputting the output vector s1 into the second end-to-end memory network MemN2N model as a second query vector to obtain the reply vector r1 comprises:
in the 1st cycle of the second end-to-end memory network MemN2N model, respectively calculating the correlation value pi between the second query vector s1 and the i-th reply head vector ki in the preset reply head group;
calculating the reply sub-vector o1 of the 1st cycle according to the correlation value pi of the i-th reply head vector ki and the i-th reply tail vector li in the preset reply tail group;
adding the second query vector s1 and the reply sub-vector o1 of the 1st cycle to obtain the second query vector s2 of the 2nd cycle;
recalculating the reply sub-vector o2 of the 2nd cycle and the second query vector s3 of the 3rd cycle according to the second query vector s2 of the 2nd cycle, and so on, until the reply sub-vector oN of the N-th cycle is calculated; and
taking the reply sub-vector oN of the N-th cycle as the reply vector r1.
6. The dialog generation method of claim 5, wherein the method further comprises:
obtaining a reply information base; wherein the reply information base includes a plurality of reply information expressed in the form of a knowledge triplet, and the reply information includes: a head portion, a relationship portion, and a tail portion;
converting the head in each piece of reply information into a reply head vector through a preset translation embedding TransE algorithm, thereby forming a reply head group;
converting the tail in each piece of reply information into a reply tail vector through the preset translation embedding TransE algorithm, thereby forming a reply tail group;
and establishing a corresponding relation between the reply head vector and the reply tail vector according to the relation part in each reply message.
7. A dialog generation device, the device comprising:
the acquisition module is used for acquiring the questioning information and converting the questioning information into a first query vector by utilizing a preset first gate recursion unit GRU model;
the questioning module is used for determining a common sense vector associated with the first query vector by using a preset first end-to-end memory network MemN2N model according to the first query vector, and forming a question vector according to the first query vector and the common sense vector;
the reply module is used for converting the question vector into a plurality of second query vectors by using a preset second gate recursion unit GRU model according to the question vector, and sequentially inputting each second query vector into a preset second end-to-end memory network MemN2N model to obtain a plurality of reply vectors;
the conversion module is used for respectively converting each reply vector into reply words and combining all the reply words into reply information;
wherein, the reply module is used for:
taking the question vector as the hidden influence factor h0 of the first layer, and inputting a preset start character vector s0 into the second gate recursion unit GRU model to obtain an output vector s1 and a hidden influence factor h1 passed to the second layer;
inputting the output vector s1 into the second end-to-end memory network MemN2N model as a second query vector to obtain a reply vector r1;
inputting the output vector s1 and the hidden influence factor h1 of the second layer back into the second gate recursion unit GRU model to obtain an output vector s2 and a hidden influence factor h2 passed to the third layer, and inputting the output vector s2 back into the second end-to-end memory network MemN2N model to obtain a reply vector r2; and so on, until the output vector of the second gate recursion unit GRU model is a preset ending character vector.
8. A computer device, the computer device comprising: memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
CN202011059826.7A 2020-09-30 2020-09-30 Dialogue generation method, device, equipment and readable storage medium Active CN112199482B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011059826.7A CN112199482B (en) 2020-09-30 2020-09-30 Dialogue generation method, device, equipment and readable storage medium
PCT/CN2021/091292 WO2022068197A1 (en) 2020-09-30 2021-04-30 Conversation generation method and apparatus, device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011059826.7A CN112199482B (en) 2020-09-30 2020-09-30 Dialogue generation method, device, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN112199482A CN112199482A (en) 2021-01-08
CN112199482B true CN112199482B (en) 2023-07-21

Family

ID=74007267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011059826.7A Active CN112199482B (en) 2020-09-30 2020-09-30 Dialogue generation method, device, equipment and readable storage medium

Country Status (2)

Country Link
CN (1) CN112199482B (en)
WO (1) WO2022068197A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199482B (en) * 2020-09-30 2023-07-21 平安科技(深圳)有限公司 Dialogue generation method, device, equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704588A (en) * 2019-09-04 2020-01-17 平安科技(深圳)有限公司 Multi-round dialogue semantic analysis method and system based on long-term and short-term memory network
CN111291534A (en) * 2020-02-03 2020-06-16 苏州科技大学 Global coding method for automatic summarization of Chinese long text
CN111400468A (en) * 2020-03-11 2020-07-10 苏州思必驰信息科技有限公司 Conversation state tracking system and method, and man-machine conversation device and method
CN111414460A (en) * 2019-02-03 2020-07-14 北京邮电大学 Multi-round dialogue management method and device combining memory storage and neural network

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844368B (en) * 2015-12-03 2020-06-16 华为技术有限公司 Method for man-machine conversation, neural network system and user equipment
US10692099B2 (en) * 2016-04-11 2020-06-23 International Business Machines Corporation Feature learning on customer journey using categorical sequence data
US10803252B2 (en) * 2018-06-30 2020-10-13 Wipro Limited Method and device for extracting attributes associated with centre of interest from natural language sentences
CN109840255B (en) * 2019-01-09 2023-09-19 平安科技(深圳)有限公司 Reply text generation method, device, equipment and storage medium
CN110377719B (en) * 2019-07-25 2022-02-15 广东工业大学 Medical question and answer method and device
CN111143530B (en) * 2019-12-24 2024-04-05 平安健康保险股份有限公司 Intelligent reply method and device
CN112199482B (en) * 2020-09-30 2023-07-21 平安科技(深圳)有限公司 Dialogue generation method, device, equipment and readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111414460A (en) * 2019-02-03 2020-07-14 北京邮电大学 Multi-round dialogue management method and device combining memory storage and neural network
CN110704588A (en) * 2019-09-04 2020-01-17 平安科技(深圳)有限公司 Multi-round dialogue semantic analysis method and system based on long-term and short-term memory network
CN111291534A (en) * 2020-02-03 2020-06-16 苏州科技大学 Global coding method for automatic summarization of Chinese long text
CN111400468A (en) * 2020-03-11 2020-07-10 苏州思必驰信息科技有限公司 Conversation state tracking system and method, and man-machine conversation device and method

Also Published As

Publication number Publication date
WO2022068197A1 (en) 2022-04-07
CN112199482A (en) 2021-01-08

Similar Documents

Publication Publication Date Title
CN113591902B (en) Cross-modal understanding and generating method and device based on multi-modal pre-training model
CN110704588A (en) Multi-round dialogue semantic analysis method and system based on long-term and short-term memory network
CN110825857B (en) Multi-round question and answer identification method and device, computer equipment and storage medium
CN109344242B (en) Dialogue question-answering method, device, equipment and storage medium
CN107832300A (en) Towards minimally invasive medical field text snippet generation method and device
CN110619124A (en) Named entity identification method and system combining attention mechanism and bidirectional LSTM
CN116737938A (en) Fine granularity emotion detection method and device based on fine tuning large model online data network
CN112084301B (en) Training method and device for text correction model, text correction method and device
CN112199482B (en) Dialogue generation method, device, equipment and readable storage medium
CN112669215A (en) Training text image generation model, text image generation method and device
CN114445832A (en) Character image recognition method and device based on global semantics and computer equipment
CN113435210A (en) Social image text recognition method and device, computer equipment and storage medium
CN112988967A (en) Dialog generation method and device based on two-stage decoding, medium and computing equipment
CN112364602B (en) Multi-style text generation method, device, equipment and readable storage medium
WO2023108981A1 (en) Method and apparatus for training text generation model, and storage medium and computer device
CN112509559B (en) Audio recognition method, model training method, device, equipment and storage medium
CN116469359A (en) Music style migration method, device, computer equipment and storage medium
CN113420869B (en) Translation method based on omnidirectional attention and related equipment thereof
JP6633556B2 (en) Acoustic model learning device, speech recognition device, acoustic model learning method, speech recognition method, and program
CN115204366A (en) Model generation method and device, computer equipment and storage medium
CN112990434B (en) Training method of machine translation model and related device
CN115048926A (en) Entity relationship extraction method and device, electronic equipment and storage medium
CN113283241B (en) Text recognition method and device, electronic equipment and computer readable storage medium
CN111325068B (en) Video description method and device based on convolutional neural network
CN113095435A (en) Video description generation method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant