CN113934836A - Question reply method and device and electronic equipment

Question reply method and device and electronic equipment

Info

Publication number: CN113934836A (application number CN202111565779.8A); granted as CN113934836B
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: vector, text, word segmentation, answer text, answer
Inventors: 郭俊廷, 林小俊, 支涛
Assignee: Beijing Yunji Technology Co., Ltd. (北京云迹科技有限公司)
Legal status: Granted; Active


Classifications

    • G06F16/3329: Natural language query formulation or dialogue systems
    • G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking


Abstract

The invention provides a question reply method, a question reply device, and an electronic device. An answer text carrying the character attributes of the robot is obtained by processing, and whether the character attributes carried in the answer text are consistent with the character attributes set for the robot is then judged. When the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradiction relationship, the character attributes carried in the answer text are determined to be inconsistent with those set for the robot, and the character attribute information together with the contradictory answer text to be determined is input into a second text generation model to obtain a final answer text for answering the question text. This ensures, as far as possible, that the character attributes carried in the answers generated when the robot answers questions posed by the user are consistent with the character attributes of the robot.

Description

Question reply method and device and electronic equipment
Technical Field
The invention relates to the technical field of computers, and in particular to a question reply method, a question reply device, and an electronic device.
Background
At present, besides answering questions, a robot with a chat function can be given character attributes (such as name, gender, occupation, age, and family relationships), so that when interacting with a user the robot can blend its character attributes into the sentences it uses.
Because both the character attributes and the dialogue texts are highly diverse, the differences between them are large. As a result, the reply-sentence model used by the robot often cannot learn enough about the character attributes during training, and the character attributes carried in the generated texts are sometimes inconsistent with the character attributes of the robot.
Disclosure of Invention
In order to solve the above problem, embodiments of the present invention provide a question reply method, a question reply device, and an electronic device.
In a first aspect, an embodiment of the present invention provides a question reply method, including:
acquiring character attribute information of the robot and a question text serving as a training corpus;
inputting the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, wherein the answer text to be determined carries the character attribute of the robot;
inputting the character attribute information of the robot and the answer text to be determined into a relational reasoning model to train the relational reasoning model, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attribute carried in the answer text to be determined; wherein the relations comprise an implication relation, a neutral relation and a contradiction relation;
when the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradiction relationship, inputting the character attribute information and the answer text to be determined, which is in the contradiction relationship with the character attribute information, into a second text generation model to train the second text generation model, so that the trained second text generation model can obtain a final answer text for answering the question text.
In a second aspect, an embodiment of the present invention further provides a question reply device, including:
the acquisition module is used for acquiring character attribute information of the robot and question texts serving as training corpora;
the first training module is used for inputting the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, wherein the answer text to be determined carries the character attribute of the robot;
the second training module is used for inputting the character attribute information of the robot and the answer text to be determined into a relational reasoning model to train the relational reasoning model, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attribute carried in the answer text to be determined; wherein the relations comprise an implication relation, a neutral relation and a contradiction relation;
and the third training module is used for inputting the character attribute information and the answer text to be determined, which is in a contradiction relationship with the character attribute information, into a second text generation model to train the second text generation model when the relationship between the character attribute information and the character attribute carried in the answer text to be determined is a contradiction relationship, so that the trained second text generation model can obtain the final answer text for answering the question text.
In a third aspect, the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the method in the first aspect.
In a fourth aspect, embodiments of the present invention also provide an electronic device, which includes a memory, a processor, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor to perform the steps of the method according to the first aspect.
In the solutions provided in the first to fourth aspects of the embodiments of the present invention, character attribute information and a question text are input into a first text generation model to train it, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, where the answer text to be determined carries the character attributes of the robot; the character attribute information of the robot and the answer text to be determined are input into a relational reasoning model to train it, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attributes carried in the answer text to be determined; and when that relationship is a contradiction relationship, the character attribute information and the contradictory answer text to be determined are input into a second text generation model to train it, so that the trained second text generation model can obtain a final answer text for answering the question text. In the related art, the reply-sentence model used by a robot often cannot learn enough character attributes during training, so the character attributes carried in its replies may be inconsistent with those set for the robot. By contrast, when a robot uses the models trained by the question reply method, device and electronic device provided by this application to reply to a user's question, after obtaining an answer text carrying the robot's character attributes, it judges whether the character attributes carried in the answer text are consistent with the character attributes set for the robot; when the relationship between the character attribute information and the attributes carried in the answer text to be determined is a contradiction relationship, the attributes carried in the answer text are determined to be inconsistent with those set for the robot, and the character attribute information together with the contradictory answer text to be determined is input into the second text generation model to obtain the final answer text for answering the question text. This ensures, as far as possible, that the character attributes carried in the answers the robot generates when answering questions posed by the user are consistent with the character attributes of the robot.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart illustrating a problem recovery method according to embodiment 1 of the present invention;
fig. 2 is a schematic diagram illustrating a self-attention model matrix of a one-way mask attention mechanism in the problem recovery method provided in embodiment 1 of the present invention;
fig. 3 is a schematic structural diagram illustrating a problem recovery device according to embodiment 2 of the present invention;
fig. 4 shows a schematic structural diagram of an electronic device provided in embodiment 3 of the present invention.
Detailed Description
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
At present, besides answering questions, a robot with a chat function can be given character attributes (such as name, gender, occupation, age, and family relationships), so that when interacting with a user the robot can blend its character attributes into the sentences it uses.
Because both the character attributes and the dialogue texts are highly diverse, the differences between them are large. As a result, the reply-sentence model used by the robot often cannot learn enough about the character attributes during training, and the character attributes carried in the generated texts are sometimes inconsistent with the character attributes of the robot.
Based on this, the present embodiment provides a question reply method, a question reply device, and an electronic device. Character attribute information and a question text are input into a first text generation model to train it, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, where the answer text to be determined carries the character attributes of the robot; the character attribute information of the robot and the answer text to be determined are input into a relational reasoning model to train it, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attributes carried in the answer text to be determined; and when that relationship is a contradiction relationship, the character attribute information and the contradictory answer text to be determined are input into a second text generation model to train it, so that the trained second text generation model can obtain a final answer text for answering the question text. When a robot uses the models trained in this way to reply to a user's question, after obtaining an answer text carrying the robot's character attributes, it judges whether the character attributes carried in the answer text are consistent with the character attributes set for the robot; when the relationship between the character attribute information and the attributes carried in the answer text to be determined is a contradiction relationship, the attributes carried in the answer text are determined to be inconsistent with those set for the robot, and the character attribute information together with the contradictory answer text to be determined is input into the second text generation model again to obtain the final answer text for answering the question text. This ensures, as far as possible, that the character attributes carried in the answers the robot generates when answering questions posed by the user are consistent with the character attributes of the robot.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
Example 1
This embodiment provides a question reply method; the execution subject is a robot capable of replying to questions posed by a user.
The robot is provided with a wireless network card and can access the Internet to acquire data from the Internet.
Referring to the flowchart of the question reply method shown in fig. 1, this embodiment provides a question reply method including the following specific steps:
and step 100, acquiring character attribute information of the robot and a question text serving as a training corpus.
In step 100, the question text as the corpus is a manually labeled question text.
Step 102, inputting the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, wherein the answer text to be determined carries the character attribute of the robot.
In the step 102, the first text generation model includes: an attribute fusion encoder and a unidirectional decoder.
In order to make the trained first text generation model obtain the answer text to be determined for answering the question text, the step 102 may perform the following steps (1) to (5):
(1) pre-training a BERT model, and pre-processing the character attribute information and the question text to obtain an attribute word segmentation vector of the character attribute information and a question word segmentation vector of the question text;
(2) obtaining the dimension of the question word segmentation vector and the dimension of the attribute word segmentation vector, determining the maximum and minimum values of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum and minimum values of the question word segmentation vector from the question word segmentation vector;
(3) inputting the dimension of the question word segmentation vector, the dimension of the attribute word segmentation vector, the maximum and minimum values of the attribute word segmentation vector, and the maximum and minimum values of the question word segmentation vector into the pre-trained BERT model, and executing the following operations:
(31) calculating the scaling coefficient used when the question word segmentation vector and the attribute word segmentation vector are fused. [formula image not reproduced] In the formula, λ represents the scaling coefficient; a_max and a_min represent the maximum and minimum values of the attribute word segmentation vector; d_a represents the dimension of the attribute word segmentation vector; q_max and q_min represent the maximum and minimum values of the question word segmentation vector; d_q represents the dimension of the question word segmentation vector;
(32) selecting a first vector to be fused from the attribute word segmentation vectors, and selecting a second vector to be fused from the question word segmentation vectors;
(33) calculating the fused vector obtained by fusing the first vector and the second vector. [formula image not reproduced] In the formula, F represents the fused vector after the first vector and the second vector are fused; v1 represents the first vector; v2 represents the second vector; v2^T represents the transpose of the second vector;
(34) calculating the question vector fused with the user attributes. [formula image not reproduced] In the formula, Q represents the question vector fused with the user attributes;
(35) when all the question word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the attribute fusion encoder;
(4) inputting the question vector fused with the user attributes into the pre-trained BERT model, and training the BERT model with a one-way mask attention mechanism to obtain a one-way decoder; the unidirectional decoder is used for outputting the answer vector of the answer text to be determined;
(5) inputting the answer vector of the answer text to be determined into a text generator to obtain the answer text to be determined for answering the question text.
In step (1), the process of pre-training the BERT model is prior art and is not described here again.
Preprocessing the character attribute information and the question text to obtain the attribute word segmentation vector of the character attribute information and the question word segmentation vector of the question text specifically comprises: performing a word segmentation operation on the character attribute information and the question text to obtain the participles of each; and then processing the participles of the character attribute information and the participles of the question text with a word2vec model to obtain the attribute word segmentation vector of the character attribute information and the question word segmentation vector of the question text.
After the attribute word segmentation vector of the character attribute information and the question word segmentation vector of the question text are obtained, the dimension of the question word segmentation vector and the dimension of the attribute word segmentation vector can be obtained; both dimensions are cached in the robot in advance.
In step (2), the dimension of the question word segmentation vector and the dimension of the attribute word segmentation vector are the same.
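As an illustration of this preprocessing, the following Python sketch segments the two texts and derives word2vec participle vectors. jieba as the segmenter, the example strings, and the 128-dimensional vectors are assumptions of this sketch; the patent names word2vec but specifies neither a segmenter nor a dimension.

```python
# Sketch of step (1)'s preprocessing: word segmentation followed by word2vec.
# jieba and the vector_size are assumptions, not details from the patent.
import jieba
from gensim.models import Word2Vec

def segment(text: str) -> list[str]:
    # Split a Chinese text into participles (word segments).
    return [tok for tok in jieba.cut(text) if tok.strip()]

attribute_text = "我叫小迹，今年3岁，是一名酒店服务机器人。"  # hypothetical character attributes
question_text = "你今年多大了？"                            # hypothetical user question

attr_tokens = segment(attribute_text)
ques_tokens = segment(question_text)

# Train (or load) a word2vec model over the participles; in practice the model
# would be trained on a much larger corpus.
w2v = Word2Vec(sentences=[attr_tokens, ques_tokens], vector_size=128, min_count=1)

attr_vectors = [w2v.wv[tok] for tok in attr_tokens]  # attribute word segmentation vectors
ques_vectors = [w2v.wv[tok] for tok in ques_tokens]  # question word segmentation vectors
```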
In step (4), referring to the schematic diagram of the self-attention model matrix of the one-way mask attention mechanism shown in fig. 2, the one-way mask attention mechanism may also be referred to as a left-to-right language model: when encoding each word, only the information to the left of the word and the word itself are used as input.
For example, when predicting a sequence that ends in [Mask] tokens, each [Mask] position can be encoded using only the words before it and the [Mask] itself. The specific implementation uses an upper triangular matrix as the mask matrix. The shaded portion in fig. 2 is minus infinity, indicating that this part of the information is ignored; the blank portion is 0, indicating that this part of the information is allowed to be used.
The specific process of inputting the question word segmentation vector into the BERT model and training the BERT model with the one-way mask attention mechanism to obtain the one-way decoder is prior art and is not described here again.
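A minimal sketch of the upper triangular mask matrix described above, assuming PyTorch; the 5-token sequence and random scores are illustrative only.

```python
# Upper triangular mask for left-to-right attention (fig. 2): minus infinity
# above the diagonal means "ignore", zero elsewhere means "may be used".
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    return torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

scores = torch.randn(5, 5)        # raw attention scores for a 5-token sequence
weights = torch.softmax(scores + causal_mask(5), dim=-1)
# After softmax, position i attends only to positions 0..i, i.e. to the word
# itself and the information to its left.
```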
In step (5), inputting the answer vector of the answer text to be determined into the text generator to obtain the answer text to be determined for answering the question text includes the following specific steps (51) to (55):
(51) processing the answer vector of the answer text to be determined so as to predict the participles that make up the answer text to be determined, and putting each predicted participle into a participle list;
(52) when, during participle prediction, a candidate word is found to be the same as a participle already in the participle list, determining the number of participles in the participle list and obtaining the prediction probability of each candidate word;
(53) adjusting the prediction probability of each candidate word for the participle. [formula image not reproduced] In the formula, p̂(w) represents the adjusted prediction probability of the candidate word; p(w) represents the prediction probability of the candidate word before adjustment; w represents a candidate word; n represents the number of participles in the participle list;
(54) determining the candidate word with the largest adjusted prediction probability as the predicted participle, and putting the predicted participle into the participle list;
(55) when the participle prediction operation is finished, splicing all the participles in the participle list in their order in the list to obtain the answer text to be determined.
In step (51), the answer vector of the answer text to be determined is processed with a beam search algorithm to predict the participles of the answer text to be determined; this is prior art and is not described here again.
The answer text to be determined is the answer text carrying the character attributes.
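The decoding loop of steps (51) to (55) can be sketched as follows. The patent's exact probability-adjustment formula is not reproduced in this text, so the exponential penalty below is only an assumed stand-in, and predict_candidates is a hypothetical placeholder for beam search over the decoder's answer vector.

```python
# Sketch of steps (51)-(55): word-by-word generation where a candidate that
# already appears in the participle list is down-weighted before selection.
import math

def adjust(prob: float, candidate: str, participles: list[str]) -> float:
    # Assumed penalty; the patent's adjustment formula may differ.
    if candidate in participles:
        return prob / math.exp(len(participles))
    return prob

def generate(predict_candidates, max_len: int = 32) -> str:
    # predict_candidates(participles) -> {candidate_word: prediction_probability}
    participles: list[str] = []
    for _ in range(max_len):
        candidates = predict_candidates(participles)
        if not candidates:
            break
        scored = {w: adjust(p, w, participles) for w, p in candidates.items()}
        best = max(scored, key=scored.get)  # step (54): largest adjusted probability
        participles.append(best)
    return "".join(participles)             # step (55): splice the participle list
```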
Step 104, inputting the character attribute information of the robot and the answer text to be determined into a relational reasoning model to train the relational reasoning model, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attribute carried in the answer text to be determined; wherein the relationship comprises an implication relationship, a neutral relationship and a contradiction relationship.
In the above step 104, the relational inference model includes: a first BERT network, a second BERT network, and a classifier.
In order to train the relational inference model, the step 104 may specifically perform the following steps (1) to (3):
(1) inputting the character attribute information of the robot into a first BERT network to obtain a robot attribute vector, and inputting the answer text to be determined into a second BERT network to obtain an answer text vector; wherein the first and second BERT networks are twin BERT networks having the same parameters;
(2) splicing the obtained robot attribute vector and the answer text vector to obtain a spliced vector;
(3) inputting the splicing vector into the classifier, and determining the relationship between character attribute information and character attributes carried in the answer text to be determined, thereby training to obtain the relational inference model.
In step (1), the specific processing procedure of inputting the character attribute information of the robot into the first BERT network to obtain the robot attribute vector, and of inputting the answer text to be determined into the second BERT network to obtain the answer text vector, is prior art.
In step (2), the specific process of splicing the robot attribute vector and the answer text vector to obtain the spliced vector is prior art and is not described here again.
In step (3), the classifier is obtained by training on sentences whose relationships include implication relationships, neutral relationships, and contradictory relationships. The specific training process is prior art and is not described here again.
Wherein each sentence may comprise at least two clauses. The relation reasoning model obtained by training is used for judging the relation between each clause in the answer text to be determined and the character attribute information set by the robot.
When the relationship between the character attribute information and the character attributes carried in the answer text to be determined is an implication relationship, the two are the same.
When the relationship is a neutral relationship, the character attribute information is unrelated to the character attributes carried in the answer text to be determined.
When the relationship is a contradictory relationship, the character attribute information and the character attributes carried in the answer text to be determined contradict each other.
The contradictory relationship between the character attribute information and the character attributes carried in the answer text to be determined can be illustrated by the following example: the character attribute information set for the robot describes a young person aged 20 to 25, but the character attribute carried in the answer text to be determined describes an old person aged 55 or older; it can then be determined that the two are in a contradictory relationship.
From the above description it can be determined that, when a clause in the answer text to be determined expresses a character attribute of the robot, the relationship between that clause and the character attribute information may be an implication relationship or a contradictory relationship; when a clause merely replies to the user's question, its relationship with the character attribute information is a neutral relationship.
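A minimal sketch of such a relational inference model, assuming the Hugging Face transformers library, the bert-base-chinese checkpoint, and the [CLS] vector as the sentence embedding; none of these specific choices are stated in the patent.

```python
# Twin BERT encoders with shared parameters, vector splicing, and a 3-way
# classifier over implication / neutral / contradiction.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class RelationalInference(nn.Module):
    def __init__(self, name: str = "bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(name)  # shared weights -> twin networks
        self.classifier = nn.Linear(2 * self.bert.config.hidden_size, 3)

    def encode(self, enc):
        # Use the [CLS] vector as the text embedding (an assumption).
        return self.bert(**enc).last_hidden_state[:, 0]

    def forward(self, attr_enc, answer_enc):
        attr_vec = self.encode(attr_enc)                   # robot attribute vector
        ans_vec = self.encode(answer_enc)                  # answer text vector
        spliced = torch.cat([attr_vec, ans_vec], dim=-1)   # spliced vector
        return self.classifier(spliced)                    # relation logits

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = RelationalInference()
attrs = tokenizer("我今年3岁", return_tensors="pt")       # hypothetical attribute
answer = tokenizer("我已经60岁了", return_tensors="pt")    # hypothetical answer clause
relation = model(attrs, answer).argmax(dim=-1)  # 0/1/2 under an assumed label order
```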
And 106, when the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradictory relationship, inputting the character attribute information and the answer text to be determined which is in the contradictory relationship with the character attribute information into a second text generation model to train the second text generation model, so that the trained second text generation model can obtain a final answer text for answering the question text.
In step 106, the second text generation model includes: a fusion encoder and a decoder.
Specifically, in order to make the trained second text generation model obtain the final answer text for answering the question text, the above step 106 may perform the following steps (20) to (28):
(20) when it is determined that the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradictory relationship, replacing, with a preset identifier, the participles in the clauses of the answer text to be determined that contradict the character attribute information, so as to obtain an answer text to be processed;
(21) acquiring a sentence set, selecting part of the sentences from the sentence set, inputting them into the BERT model, and performing the masking operation;
(22) shuffling the clauses in part of the sentences in the sentence set so that adjacent clauses in the shuffled sentences are discontinuous;
(23) inputting the sentences with shuffled clauses and the sentences with unshuffled clauses into the BERT model after the masking operation, to complete the pre-training of the BERT model;
(24) preprocessing the character attribute information and the answer text to be processed to obtain an attribute word segmentation vector of the character attribute information and an answer text word segmentation vector of the answer text to be processed;
(25) obtaining the dimensionality of the answer text word segmentation vector and the dimensionality of the attribute word segmentation vector, determining the maximum value of the attribute word segmentation vector and the minimum value of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector from the answer text word segmentation vector;
(26) inputting the dimensionality of the answer text word segmentation vector, the dimensionality of the attribute word segmentation vector, the maximum value of the attribute word segmentation vector, the minimum value of the attribute word segmentation vector, the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector into the pre-trained BERT model, and executing the following operations:
(261) calculating the scaling coefficient used when the answer text word segmentation vector and the attribute word segmentation vector are fused. [formula image not reproduced] In the formula, λ represents the scaling coefficient; a_max and a_min represent the maximum and minimum values of the attribute word segmentation vector; d_a represents the dimension of the attribute word segmentation vector; s_max and s_min represent the maximum and minimum values of the answer text word segmentation vector; d_s represents the dimension of the answer text word segmentation vector;
(262) selecting a third vector to be fused from the attribute word segmentation vectors, and selecting a fourth vector to be fused from the answer text word segmentation vectors;
(263) calculating the fused vector obtained by fusing the third vector and the fourth vector. [formula image not reproduced] In the formula, F' represents the fused vector after the third vector and the fourth vector are fused; v3 represents the third vector; v4 represents the fourth vector; v4^T represents the transpose of the fourth vector;
(264) calculating the fused answer text vector for answering the question text. [formula image not reproduced] In the formula, A represents the final answer text vector for answering the question text;
(265) when all the answer text word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the fusion encoder;
(27) inputting the fused answer text vector into the decoder for the decoding operation and training the decoder, so as to obtain the final answer text vector for answering the question text;
(28) inputting the final answer text vector for answering the question text into a text generator to obtain the final answer text for answering the question text.
In step (20), the preset identifier may be [mask].
For example, suppose the answer text to be determined has three clauses, clause 1, clause 2 and clause 3, of which only clause 2 is in a contradictory relationship with the character attribute information set for the robot, and clause 2 consists of 4 participles. Replacing the participles of the contradictory clause with the preset identifier then yields the answer text to be processed: clause 1, [mask] [mask] [mask] [mask], clause 3.
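A small sketch of step (20) under stated assumptions: clauses are split on Chinese punctuation and jieba provides the participles; neither detail is specified in the patent.

```python
# Replace every participle of the contradictory clause with the preset [mask]
# identifier, leaving the other clauses untouched.
import re
import jieba

def mask_contradictory(answer: str, contradictory_clause: str) -> str:
    clauses = [c for c in re.split(r"[，。！？]", answer) if c]  # assumed delimiters
    out = []
    for clause in clauses:
        if clause == contradictory_clause:
            out.append("".join("[mask]" for _ in jieba.cut(clause)))
        else:
            out.append(clause)
    return "，".join(out)  # note: re-joining normalizes the original punctuation
```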
In step (21), the sentence set is made up of sentences the robot randomly crawls from the network.
The specific process of selecting part of the sentences from the sentence set and inputting them into the BERT model for the masking operation is prior art and is not described here again.
In step (22) above, the order of the clauses in 50% of the sentences in the sentence set may be shuffled.
In step (23), the specific process of inputting the sentences with shuffled clauses and the sentences with unshuffled clauses into the BERT model after the masking operation, so as to complete the pre-training of the BERT model, is prior art and is not described here again.
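The clause shuffling of step (22) could look like the following sketch; the 50% sampling is taken from the text above, while the clause delimiter is an assumption, and the masking operation of step (21) is omitted since the patent treats it as prior art.

```python
# Prepare the pre-training corpus: shuffle the clause order in 50% of the
# sentences so that adjacent clauses become discontinuous.
import random
import re

def shuffle_clauses(sentence: str) -> str:
    clauses = [c for c in re.split(r"，", sentence) if c]  # assumed clause delimiter
    random.shuffle(clauses)
    return "，".join(clauses)

def prepare_pretraining_corpus(sentences: list[str]) -> list[str]:
    return [shuffle_clauses(s) if random.random() < 0.5 else s for s in sentences]
```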
In the step (24), the specific process of preprocessing the character attribute information and the answer text to be processed to obtain the attribute word segmentation vector of the character attribute information and the answer text word segmentation vector of the answer text to be processed is similar to the process of preprocessing the character attribute information and the question text to obtain the attribute word segmentation vector of the character attribute information and the question word segmentation vector of the question text, which is described in the step (1) in the process of training the first text generation model in the step 102, and is not repeated here.
After the attribute word segmentation vector of the character attribute information and the answer text word segmentation vector of the answer text to be processed are obtained, the dimension of the answer text word segmentation vector and the dimension of the attribute word segmentation vector can be determined. The dimension of the answer text word segmentation vector and the dimension of the attribute word segmentation vector are cached in the robot.
In the step (25), the dimension of the answer text word segmentation vector and the dimension of the attribute word segmentation vector are the same.
In the step (27), the fused answer text vector is input to the decoder for decoding operation, and a specific process of training the decoder is similar to the process of training the BERT model by using the one-way mask attention mechanism described in the step (4) in the process of training the first text generation model in the step 102 to obtain the one-way decoder, and is not described herein again.
In the step (28), the specific process of obtaining the final answer text for answering the question text is similar to the process of inputting the answer vector of the answer text to be determined into the text generator to obtain the answer text to be determined for answering the question text, which is described in the step (5) in the process of training the first text generation model in the step 102, and is not described herein again.
After the models have been trained through steps 102 to 106 described above, upon receiving a question posed by the user, the following steps (31) to (34) may be performed:
(31) inputting questions provided by a user and character attribute information of the robot into the first text generation model to obtain answer texts carrying character attributes of the robot;
(32) inputting the answer text carrying the character attributes of the robot and the character attribute information of the robot into the relational reasoning model, and determining the relation between the character attribute information and the character attributes carried in the answer text carrying the character attributes of the robot;
(33) when the relationship between the character attribute information and the character attributes carried in the answer text carrying the character attributes of the robot is a contradictory relationship, inputting the character attribute information and the answer text carrying the character attributes of the robot into a second text generation model to obtain a final answer text for answering the question text;
(34) and when the relation between the character attribute information and the character attribute carried in the answer text carrying the character attribute of the robot is an implication relation and/or a neutral relation, determining the answer text carrying the character attribute of the robot as a final answer text for answering the question text.
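Taken together, steps (31) to (34) amount to the following serving-time flow, sketched with the three trained models assumed to be exposed as simple callables and with an assumed contradiction label:

```python
# Sketch of the inference pipeline of steps (31)-(34).
CONTRADICTION = "contradiction"  # assumed label emitted by the relational model

def reply(question: str, attributes: str,
          first_model, relation_model, second_model) -> str:
    # (31) draft answer carrying the robot's character attributes
    answer = first_model(attributes, question)
    # (32) relation between the attribute information and the draft answer
    relation = relation_model(attributes, answer)
    # (33) on contradiction, regenerate with the second text generation model
    if relation == CONTRADICTION:
        return second_model(attributes, answer)
    # (34) implication or neutral: keep the draft as the final answer text
    return answer
```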
In summary, this embodiment provides a question reply method: character attribute information and a question text are input into a first text generation model to train it, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, where the answer text to be determined carries the character attributes of the robot; the character attribute information of the robot and the answer text to be determined are input into a relational reasoning model to train it, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attributes carried in the answer text to be determined; and when that relationship is a contradiction relationship, the character attribute information and the contradictory answer text to be determined are input into a second text generation model to train it, so that the trained second text generation model can obtain the final answer text for answering the question text. In the related art, the reply-sentence model used by a robot often cannot learn enough character attributes during training, so the character attributes carried in its replies may be inconsistent with those set for the robot. By contrast, when a robot uses the models trained by the method provided by this application to reply to a user's question, after obtaining an answer text carrying the robot's character attributes, it judges whether the character attributes carried in the answer text are consistent with the character attributes set for the robot; when the relationship between the character attribute information and the attributes carried in the answer text to be determined is a contradiction relationship, the attributes carried in the answer text are determined to be inconsistent with those set for the robot, and the character attribute information together with the contradictory answer text to be determined is input into the second text generation model to obtain the final answer text for answering the question text. This ensures, as far as possible, that the character attributes carried in the generated answers are consistent with the character attributes of the robot.
Example 2
This embodiment provides a question reply device for implementing the question reply method of embodiment 1.
Referring to fig. 3, which shows a schematic structural diagram of the question reply device, this embodiment provides a question reply device including:
an obtaining module 300, configured to obtain character attribute information of the robot and a question text as a training corpus;
a first training module 302, configured to input the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined, which answers the question text, where the answer text to be determined carries the character attribute of the robot;
the second training module 304 is configured to input the character attribute information of the robot and the answer text to be determined into a relational inference model to train the relational inference model, so that the trained relational inference model can obtain a relationship between the character attribute information and the character attribute carried in the answer text to be determined; wherein the relations comprise an implication relation, a neutral relation and a contradiction relation;
and a third training module 306, configured to, when the relationship between the character attribute information and the character attribute carried in the answer text to be determined is a contradiction relationship, input the character attribute information and the answer text to be determined, which is in a contradiction relationship with the character attribute information, into a second text generation model to train the second text generation model, so that the trained second text generation model can obtain a final answer text for answering the question text.
The first text generation model comprises: an attribute fusion encoder and a unidirectional decoder.
The first training module 302 is specifically configured to:
pre-training a BERT model, and pre-processing the character attribute information and the question text to obtain an attribute word segmentation vector of the character attribute information and a question word segmentation vector of the question text;
obtaining the dimension of the question word segmentation vector and the dimension of the attribute word segmentation vector, determining the maximum and minimum values of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum and minimum values of the question word segmentation vector from the question word segmentation vector;
inputting the dimension of the question word segmentation vector, the dimension of the attribute word segmentation vector, the maximum and minimum values of the attribute word segmentation vector, and the maximum and minimum values of the question word segmentation vector into the pre-trained BERT model, and executing the following operations:
calculating the scaling coefficient used when the question word segmentation vector and the attribute word segmentation vector are fused. [formula image not reproduced] In the formula, λ represents the scaling coefficient; a_max and a_min represent the maximum and minimum values of the attribute word segmentation vector; d_a represents the dimension of the attribute word segmentation vector; q_max and q_min represent the maximum and minimum values of the question word segmentation vector; d_q represents the dimension of the question word segmentation vector;
selecting a first vector to be fused from the attribute word segmentation vectors, and selecting a second vector to be fused from the question word segmentation vectors;
calculating the fused vector obtained by fusing the first vector and the second vector. [formula image not reproduced] In the formula, F represents the fused vector after the first vector and the second vector are fused; v1 represents the first vector; v2 represents the second vector; v2^T represents the transpose of the second vector;
calculating the question vector fused with the user attributes. [formula image not reproduced] In the formula, Q represents the question vector fused with the user attributes;
when all the question word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the attribute fusion encoder;
inputting the question vector fused with the user attributes into the pre-trained BERT model, and training the BERT model with a one-way mask attention mechanism to obtain a one-way decoder; the unidirectional decoder is used for outputting the answer vector of the answer text to be determined;
and inputting the answer vector of the answer text to be determined into a text generator to obtain the answer text to be determined for answering the question text.
The relational inference model comprises: a first BERT network, a second BERT network, and a classifier.
The second training module 304 is specifically configured to:
inputting the character attribute information of the robot into a first BERT network to obtain a robot attribute vector, and inputting the answer text to be determined into a second BERT network to obtain an answer text vector; wherein the first and second BERT networks are twin BERT networks having the same parameters;
splicing the obtained robot attribute vector and the answer text vector to obtain a spliced vector;
inputting the splicing vector into the classifier, and determining the relationship between character attribute information and character attributes carried in the answer text to be determined, thereby training to obtain the relational inference model.
The second text generation model comprises: a fusion encoder and a decoder.
The third training module 306 is specifically configured to:
when it is determined that the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradictory relationship, replacing, with a preset identifier, the participles in the clauses of the answer text to be determined that contradict the character attribute information, so as to obtain an answer text to be processed;
acquiring a sentence set, selecting part of the sentences from the sentence set, inputting them into the BERT model, and performing the masking operation;
shuffling the clauses in part of the sentences in the sentence set so that adjacent clauses in the shuffled sentences are discontinuous;
inputting the sentences with shuffled clauses and the sentences with unshuffled clauses into the BERT model after the masking operation, to complete the pre-training of the BERT model;
preprocessing the character attribute information and the answer text to be processed to obtain an attribute word segmentation vector of the character attribute information and an answer text word segmentation vector of the answer text to be processed;
obtaining the dimensionality of the answer text word segmentation vector and the dimensionality of the attribute word segmentation vector, determining the maximum value of the attribute word segmentation vector and the minimum value of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector from the answer text word segmentation vector;
inputting the dimensionality of the answer text word segmentation vector, the dimensionality of the attribute word segmentation vector, the maximum value of the attribute word segmentation vector, the minimum value of the attribute word segmentation vector, the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector into the pre-trained BERT model, and executing the following operations:
calculating the scaling coefficient used when the answer text word segmentation vector and the attribute word segmentation vector are fused. [formula image not reproduced] In the formula, λ represents the scaling coefficient; a_max and a_min represent the maximum and minimum values of the attribute word segmentation vector; d_a represents the dimension of the attribute word segmentation vector; s_max and s_min represent the maximum and minimum values of the answer text word segmentation vector; d_s represents the dimension of the answer text word segmentation vector;
selecting a third vector to be fused from the attribute word segmentation vectors, and selecting a fourth vector to be fused from the answer text word segmentation vectors;
calculating the fused vector obtained by fusing the third vector and the fourth vector. [formula image not reproduced] In the formula, F' represents the fused vector after the third vector and the fourth vector are fused; v3 represents the third vector; v4 represents the fourth vector; v4^T represents the transpose of the fourth vector;
calculating the fused answer text vector for answering the question text. [formula image not reproduced] In the formula, A represents the final answer text vector for answering the question text;
when all the answer text word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the fusion encoder;
inputting the fused answer text vector into the decoder for a decoding operation to train the decoder and obtain the final answer text vector for answering the question text;
and inputting the final answer text vector for answering the question text into a text generator to obtain the final answer text for answering the question text.
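The generation path just described (fusion encoder output, then decoder, then text generator) can be wired up as in the following sketch; the single linear decoder and nearest-neighbour text generator are stand-ins, since the application does not fix their internals:

    import numpy as np

    def decode(fused_vec, decoder_weight):
        # Stand-in for the trained decoder: one linear projection from the
        # fused answer text vector to the final answer text vector.
        return fused_vec @ decoder_weight

    def generate_text(final_vec, vocab_embeddings, vocab):
        # Stand-in text generator: pick the vocabulary entry whose embedding
        # is closest (by dot product) to the final answer text vector.
        scores = vocab_embeddings @ final_vec
        return vocab[int(np.argmax(scores))]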
In summary, the present embodiment provides a question reply device. The character attribute information and the question text are input into a first text generation model to train it, so that the trained first text generation model can output an answer text to be determined that answers the question text and carries the character attributes of the robot. The character attribute information of the robot and the answer text to be determined are then input into a relational reasoning model to train it, so that the trained relational reasoning model can determine the relationship between the character attribute information and the character attributes carried in the answer text to be determined. When that relationship is a contradiction, the character attribute information and the contradictory answer text to be determined are input into a second text generation model to train it, so that the trained second text generation model can output the final answer text for answering the question text. In the related art, the answer sentence model used by a robot often cannot learn enough character attributes during training, so the character attributes carried in the robot's answers are inconsistent with the attributes set for the robot. By contrast, when a robot using a model trained by the question reply method, device and electronic equipment provided by the present application answers a user's question, it first obtains an answer text carrying the robot's character attributes and then judges whether those attributes are consistent with the attributes set for the robot. When the relationship between the character attribute information and the attributes carried in the answer text to be determined is a contradiction, the attributes in the answer are judged inconsistent, and the character attribute information together with the inconsistent answer text to be determined is input into the second text generation model to obtain the final answer text for answering the question text. This ensures, as far as possible, that the character attributes carried in the generated answer are consistent with the robot's character attributes when the robot answers a question posed by the user.
Example 3
This embodiment proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the question reply method described in Embodiment 1 above. For a specific implementation, refer to method Embodiment 1; details are not repeated here.
In addition, referring to the schematic structural diagram of an electronic device shown in Fig. 4, the present embodiment also provides an electronic device, which includes a bus 51, a processor 52, a transceiver 53, a bus interface 54, a memory 55 and a user interface 56.
In this embodiment, the electronic device further includes one or more programs stored in the memory 55 and executable on the processor 52, configured to be executed by the processor 52 to perform the following steps (1) to (4):
(1) acquiring character attribute information of the robot and a question text serving as a training corpus;
(2) inputting the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, wherein the answer text to be determined carries the character attribute of the robot;
(3) inputting the character attribute information of the robot and the answer text to be determined into a relational reasoning model to train the relational reasoning model, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attribute carried in the answer text to be determined; wherein the relations comprise an implication relation, a neutral relation and a contradiction relation;
(4) when the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradiction relationship, inputting the character attribute information and the answer text to be determined, which is in the contradiction relationship with the character attribute information, into a second text generation model to train the second text generation model, so that the trained second text generation model can obtain a final answer text for answering the question text.
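Steps (1) to (4) amount to the following control flow; the model objects and their train_step methods below are placeholders for the three models described in this application, not an interface the application defines:

    def train_pipeline(attr_info, question, first_model, relation_model, second_model):
        # (2) answer text to be determined, carrying the robot's character attributes
        pending = first_model.train_step(attr_info, question)
        # (3) relation between the set attributes and those carried in the answer
        relation = relation_model.train_step(attr_info, pending)
        # (4) only a contradictory answer is rewritten by the second model
        if relation == "contradiction":
            return second_model.train_step(attr_info, pending)
        return pending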
The transceiver 53 is used for receiving and transmitting data under the control of the processor 52.
Where a bus architecture (represented by the bus 51) is used, the bus 51 may include any number of interconnected buses and bridges, linking together various circuits including one or more processors, represented by the processor 52, and memory, represented by the memory 55. The bus 51 may also link various other circuits, such as peripherals, voltage regulators and power management circuits, which are well known in the art and therefore are not described further in this embodiment. The bus interface 54 provides an interface between the bus 51 and the transceiver 53. The transceiver 53 may be one element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium; for example, it receives external data from other devices and transmits data processed by the processor 52 to other devices. Depending on the nature of the computing system, a user interface 56, such as a keypad, display, speaker, microphone or joystick, may also be provided.
The processor 52 is responsible for managing the bus 51 and general processing, and runs a general-purpose operating system. The memory 55 may be used to store data used by the processor 52 when performing operations.
Alternatively, the processor 52 may be, but is not limited to, a central processing unit, a single-chip microcomputer, a microprocessor or a programmable logic device.
It will be appreciated that the memory 55 in embodiments of the invention may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM) or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM) and Direct Rambus RAM (DRRAM). The memory 55 of the systems and methods described in this embodiment is intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, memory 55 stores elements, executable modules or data structures, or a subset thereof, or an expanded set thereof as follows: an operating system 551 and application programs 552.
The operating system 551 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application 552 includes various applications, such as a Media Player (Media Player), a Browser (Browser), and the like, for implementing various application services. A program implementing the method of an embodiment of the present invention may be included in the application 552.
In summary, the present embodiment provides a computer-readable storage medium and an electronic device. As in the method and device embodiments above, the character attribute information and the question text are input into a first text generation model to train it, so that the trained model can output an answer text to be determined that answers the question text and carries the robot's character attributes; the character attribute information of the robot and the answer text to be determined are input into a relational reasoning model to train it, so that the trained model can determine the relationship between the character attribute information and the character attributes carried in the answer text to be determined; and when that relationship is a contradiction, the character attribute information and the contradictory answer text to be determined are input into a second text generation model to train it, so that the trained model can output the final answer text for answering the question text. For the reasons given in the summary of the device embodiment above, this ensures as far as possible that the character attributes carried in the generated answer are consistent with the robot's character attributes when the robot answers a question posed by the user.
The above description covers only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that a person skilled in the art could easily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A question reply method, comprising:
acquiring character attribute information of the robot and a question text serving as a training corpus;
inputting the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, wherein the answer text to be determined carries the character attribute of the robot;
inputting the character attribute information of the robot and the answer text to be determined into a relational reasoning model to train the relational reasoning model, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attribute carried in the answer text to be determined; wherein the relations comprise an implication relation, a neutral relation and a contradiction relation;
when the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradiction relationship, inputting the character attribute information and the answer text to be determined, which is in the contradiction relationship with the character attribute information, into a second text generation model to train the second text generation model, so that the trained second text generation model can obtain a final answer text for answering the question text.
2. The method of claim 1, wherein the first text generation model comprises: an attribute fusion encoder and a unidirectional decoder;
the inputting the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, and the method comprises the following steps:
pre-training a BERT model, and pre-processing the character attribute information and the question text to obtain an attribute word segmentation vector of the character attribute information and a question word segmentation vector of the question text;
obtaining the dimensionality of the question word segmentation vector and the dimensionality of the attribute word segmentation vector, determining the maximum value and the minimum value of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum value and the minimum value of the question word segmentation vector from the question word segmentation vector;
inputting the dimensionality of the question word segmentation vector, the dimensionality of the attribute word segmentation vector, the maximum value of the attribute word segmentation vector, the minimum value of the attribute word segmentation vector, the maximum value of the question word segmentation vector and the minimum value of the question word segmentation vector into the pre-trained BERT model, and executing the following operations:
calculating a scaling factor used when the question word segmentation vector and the attribute word segmentation vector are fused by the following formula:

[scaling-factor formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the scaling factor (the result); the maximum value, the minimum value and the dimension of the attribute word segmentation vector; and the maximum value, the minimum value and the dimension of the question word segmentation vector;
selecting a first vector to be fused from the attribute word segmentation vectors, and selecting a second vector to be fused from the question word segmentation vectors;
calculating a fused vector obtained by fusing the first vector and the second vector by the following formula:
[fusion formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the fused vector after the first vector and the second vector are fused (the result); the first vector; the second vector; and the transpose of the second vector;
calculating a question vector fused with the user attributes by the following formula:

[formula reproduced only as an image in the original filing; not recoverable here]

wherein the result is the question vector fused with the user attributes;
when all the question word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the attribute fusion encoder;
inputting the question vector fused with the user attributes into the pre-trained BERT model, and training the BERT model with a one-way mask attention mechanism to obtain the unidirectional decoder, the unidirectional decoder being used for outputting an answer vector of the answer text to be determined;
and inputting the answer vector of the answer text to be determined into a text generator to obtain the answer text to be determined for answering the question text.
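As a non-limiting illustration of the one-way mask attention mechanism recited in claim 2, under which position i may attend only to positions at or before i, a minimal NumPy sketch (array shapes and naming are assumptions, not part of the claim):

    import numpy as np

    def one_way_mask_attention(q, k, v):
        # q, k, v: (sequence length, hidden size) arrays.
        n, d = q.shape
        scores = q @ k.T / np.sqrt(d)
        scores[np.triu_indices(n, k=1)] = -np.inf  # mask all future positions
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v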
3. The method of claim 2, wherein the relational reasoning model comprises: a first BERT network, a second BERT network and a classifier;
inputting the character attribute information of the robot and the answer text to be determined into a relational reasoning model to train the relational reasoning model, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attribute carried in the answer text to be determined, and the method comprises the following steps:
inputting the character attribute information of the robot into a first BERT network to obtain a robot attribute vector, and inputting the answer text to be determined into a second BERT network to obtain an answer text vector; wherein the first and second BERT networks are twin BERT networks having the same parameters;
splicing the obtained robot attribute vector and the answer text vector to obtain a spliced vector;
inputting the spliced vector into the classifier, and determining the relationship between the character attribute information and the character attributes carried in the answer text to be determined, thereby training the relational reasoning model.
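Claim 3's relational reasoning model, twin BERT networks with shared parameters followed by vector splicing and a three-way classifier (implication, neutral, contradiction), might be sketched as follows; the Hugging Face API and checkpoint name are illustrative assumptions, as the application names no library:

    import torch
    import torch.nn as nn
    from transformers import BertModel

    class RelationInference(nn.Module):
        def __init__(self, name="bert-base-chinese"):
            super().__init__()
            # One shared encoder = twin BERT networks with the same parameters.
            self.bert = BertModel.from_pretrained(name)
            self.classifier = nn.Linear(2 * self.bert.config.hidden_size, 3)

        def forward(self, attr_inputs, answer_inputs):
            attr_vec = self.bert(**attr_inputs).pooler_output    # robot attribute vector
            ans_vec = self.bert(**answer_inputs).pooler_output   # answer text vector
            spliced = torch.cat([attr_vec, ans_vec], dim=-1)     # spliced vector
            return self.classifier(spliced)  # implication / neutral / contradiction logits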
4. The method of claim 3, wherein the second text generation model comprises: fusing an encoder and a decoder;
when the relationship between the character attribute information and the character attribute carried in the answer text to be determined is a contradictory relationship, inputting the character attribute information and the answer text to be determined which is in the contradictory relationship with the character attribute information into a second text generation model to train the second text generation model, so that the trained second text generation model can obtain a final answer text for answering the question text, and the method comprises the following steps:
when it is determined that the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradictory relationship, replacing the segmented words in the clauses of the answer text to be determined that contradict the character attribute information with a preset identifier, so as to obtain an answer text to be processed;
acquiring a sentence set, selecting some sentences from the sentence set, inputting them into the BERT model, and performing a mask operation;
shuffling the clauses of some sentences in the sentence set so that originally adjacent clauses in the shuffled sentences are no longer continuous;
inputting both the clause-shuffled sentences and the unshuffled sentences into the masked BERT model to complete pre-training of the BERT model;
preprocessing the character attribute information and the answer text to be processed to obtain an attribute word segmentation vector of the character attribute information and an answer text word segmentation vector of the answer text to be processed;
obtaining the dimensionality of the answer text word segmentation vector and the dimensionality of the attribute word segmentation vector, determining the maximum value of the attribute word segmentation vector and the minimum value of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector from the answer text word segmentation vector;
inputting the dimensionality of the answer text word segmentation vector, the dimensionality of the attribute word segmentation vector, the maximum value of the attribute word segmentation vector, the minimum value of the attribute word segmentation vector, the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector into the pre-trained BERT model, and executing the following operations:
calculating a scaling factor used when the answer text word segmentation vector and the attribute word segmentation vector are fused by the following formula:

[scaling-factor formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the scaling factor (the result); the maximum value, the minimum value and the dimension of the attribute word segmentation vector; and the maximum value, the minimum value and the dimension of the answer text word segmentation vector;
selecting a third vector to be fused from the attribute word segmentation vectors, and selecting a fourth vector to be fused from the answer text word segmentation vectors;
calculating a fused vector obtained by fusing the third vector and the fourth vector by the following formula:
[fusion formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the fused vector after the third vector and the fourth vector are fused (the result); the third vector; the fourth vector; and the transpose of the fourth vector;
calculating a fused answer text vector for answering the question text by the following formula:
[formula reproduced only as an image in the original filing; not recoverable here]

wherein the result is the final answer text vector for answering the question text;
when all the answer text word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the fusion encoder;
inputting the fused answer text vector into the decoder for a decoding operation to train the decoder and obtain the final answer text vector for answering the question text;
and inputting the final answer text vector for answering the question text into a text generator to obtain the final answer text for answering the question text.
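The first step of claim 4, replacing the segmented words of clauses that contradict the attribute information with a preset identifier, can be sketched as below; the contradiction test, segmentation function, identifier and clause delimiter are all illustrative assumptions:

    def mask_contradictory_clauses(answer, contradicts, segment, ident="[X]", sep=","):
        # contradicts(clause) -> bool would come from the relational reasoning
        # model; segment(clause) -> list of words is any word segmenter.
        out = []
        for clause in answer.split(sep):
            if contradicts(clause):
                # Replace every segmented word of the clause with the identifier.
                clause = "".join(ident for _ in segment(clause))
            out.append(clause)
        return sep.join(out)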
5. A question reply device, comprising:
the acquisition module is used for acquiring character attribute information of the robot and question texts serving as training corpora;
the first training module is used for inputting the character attribute information and the question text into a first text generation model to train the first text generation model, so that the trained first text generation model can obtain an answer text to be determined for answering the question text, wherein the answer text to be determined carries the character attribute of the robot;
the second training module is used for inputting the character attribute information of the robot and the answer text to be determined into a relational reasoning model to train the relational reasoning model, so that the trained relational reasoning model can obtain the relationship between the character attribute information and the character attribute carried in the answer text to be determined; wherein the relations comprise an implication relation, a neutral relation and a contradiction relation;
and the third training module is used for inputting the character attribute information and the answer text to be determined, which is in a contradiction relationship with the character attribute information, into a second text generation model to train the second text generation model when the relationship between the character attribute information and the character attribute carried in the answer text to be determined is a contradiction relationship, so that the trained second text generation model can obtain the final answer text for answering the question text.
6. The apparatus of claim 5, wherein the first text generation model comprises: an attribute fusion encoder and a unidirectional decoder;
the first training module is specifically configured to:
pre-training a BERT model, and pre-processing the character attribute information and the question text to obtain an attribute word segmentation vector of the character attribute information and a question word segmentation vector of the question text;
obtaining the dimensionality of the question word segmentation vector and the dimensionality of the attribute word segmentation vector, determining the maximum value and the minimum value of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum value and the minimum value of the question word segmentation vector from the question word segmentation vector;
inputting the dimensionality of the question word segmentation vector, the dimensionality of the attribute word segmentation vector, the maximum value of the attribute word segmentation vector, the minimum value of the attribute word segmentation vector, the maximum value of the question word segmentation vector and the minimum value of the question word segmentation vector into the pre-trained BERT model, and executing the following operations:
calculating a scaling factor used when the question word segmentation vector and the attribute word segmentation vector are fused by the following formula:

[scaling-factor formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the scaling factor (the result); the maximum value, the minimum value and the dimension of the attribute word segmentation vector; and the maximum value, the minimum value and the dimension of the question word segmentation vector;
selecting a first vector to be fused from the attribute word segmentation vectors, and selecting a second vector to be fused from the question word segmentation vectors;
calculating a fused vector obtained by fusing the first vector and the second vector by the following formula:
[fusion formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the fused vector after the first vector and the second vector are fused (the result); the first vector; the second vector; and the transpose of the second vector;
calculating a question vector fused with the user attributes by the following formula:

[formula reproduced only as an image in the original filing; not recoverable here]

wherein the result is the question vector fused with the user attributes;
when all the question word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the attribute fusion encoder;
inputting the question vector fused with the user attributes into the pre-trained BERT model, and training the BERT model with a one-way mask attention mechanism to obtain the unidirectional decoder, the unidirectional decoder being used for outputting an answer vector of the answer text to be determined;
and inputting the answer vector of the answer text to be determined into a text generator to obtain the answer text to be determined for answering the question text.
7. The apparatus of claim 6, wherein the relational reasoning model comprises: a first BERT network, a second BERT network and a classifier;
the second training module is specifically configured to:
inputting the character attribute information of the robot into a first BERT network to obtain a robot attribute vector, and inputting the answer text to be determined into a second BERT network to obtain an answer text vector; wherein the first and second BERT networks are twin BERT networks having the same parameters;
splicing the obtained robot attribute vector and the answer text vector to obtain a spliced vector;
inputting the spliced vector into the classifier, and determining the relationship between the character attribute information and the character attributes carried in the answer text to be determined, thereby training the relational reasoning model.
8. The apparatus of claim 7, wherein the second text generation model comprises: fusing an encoder and a decoder;
the third training module is specifically configured to:
when it is determined that the relationship between the character attribute information and the character attributes carried in the answer text to be determined is a contradictory relationship, replacing the segmented words in the clauses of the answer text to be determined that contradict the character attribute information with a preset identifier, so as to obtain an answer text to be processed;
acquiring a sentence set, selecting some sentences from the sentence set, inputting them into the BERT model, and performing a mask operation;
shuffling the clauses of some sentences in the sentence set so that originally adjacent clauses in the shuffled sentences are no longer continuous;
inputting both the clause-shuffled sentences and the unshuffled sentences into the masked BERT model to complete pre-training of the BERT model;
preprocessing the character attribute information and the answer text to be processed to obtain an attribute word segmentation vector of the character attribute information and an answer text word segmentation vector of the answer text to be processed;
obtaining the dimensionality of the answer text word segmentation vector and the dimensionality of the attribute word segmentation vector, determining the maximum value of the attribute word segmentation vector and the minimum value of the attribute word segmentation vector from the attribute word segmentation vector, and determining the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector from the answer text word segmentation vector;
inputting the dimensionality of the answer text word segmentation vector, the dimensionality of the attribute word segmentation vector, the maximum value of the attribute word segmentation vector, the minimum value of the attribute word segmentation vector, the maximum value of the answer text word segmentation vector and the minimum value of the answer text word segmentation vector into the pre-trained BERT model, and executing the following operations:
calculating a scaling factor used when the answer text word segmentation vector and the attribute word segmentation vector are fused by the following formula:

[scaling-factor formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the scaling factor (the result); the maximum value, the minimum value and the dimension of the attribute word segmentation vector; and the maximum value, the minimum value and the dimension of the answer text word segmentation vector;
selecting a third vector to be fused from the attribute word segmentation vectors, and selecting a fourth vector to be fused from the answer text word segmentation vectors;
calculating a fused vector obtained by fusing the third vector and the fourth vector by the following formula:
[fusion formula reproduced only as an image in the original filing; not recoverable here]

wherein the terms of the formula are: the fused vector after the third vector and the fourth vector are fused (the result); the third vector; the fourth vector; and the transpose of the fourth vector;
calculating a fused answer text vector for answering the question text by the following formula:
[formula reproduced only as an image in the original filing; not recoverable here]

wherein the result is the final answer text vector for answering the question text;
when all the answer text word segmentation vectors and all the attribute word segmentation vectors have undergone the fusion operation in the BERT model, obtaining the fusion encoder;
inputting the fused answer text vector into the decoder for a decoding operation to train the decoder and obtain the final answer text vector for answering the question text;
and inputting the final answer text vector for answering the question text into a text generator to obtain the final answer text for answering the question text.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the claims 1 to 4.
10. An electronic device comprising a memory, a processor, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor to perform the steps of the method of any of claims 1-4.
CN202111565779.8A 2021-12-21 2021-12-21 Question reply method and device and electronic equipment Active CN113934836B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111565779.8A CN113934836B (en) 2021-12-21 2021-12-21 Question reply method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111565779.8A CN113934836B (en) 2021-12-21 2021-12-21 Question reply method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113934836A true CN113934836A (en) 2022-01-14
CN113934836B CN113934836B (en) 2022-03-01

Family

ID=79289358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111565779.8A Active CN113934836B (en) 2021-12-21 2021-12-21 Question reply method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113934836B (en)


Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829777A (en) * 2018-05-30 2018-11-16 出门问问信息科技有限公司 A kind of the problem of chat robots, replies method and device
CN111611355A (en) * 2019-02-25 2020-09-01 北京嘀嘀无限科技发展有限公司 Dialog reply method, device, server and storage medium
US20200334334A1 (en) * 2019-04-18 2020-10-22 Salesforce.Com, Inc. Systems and methods for unifying question answering and text classification via span extraction
CN111860083A (en) * 2019-04-30 2020-10-30 广东小天才科技有限公司 Character relation completion method and device
CN111079418A (en) * 2019-11-06 2020-04-28 科大讯飞股份有限公司 Named body recognition method and device, electronic equipment and storage medium
CN111209384A (en) * 2020-01-08 2020-05-29 腾讯科技(深圳)有限公司 Question and answer data processing method and device based on artificial intelligence and electronic equipment
CN111143540A (en) * 2020-04-03 2020-05-12 腾讯科技(深圳)有限公司 Intelligent question and answer method, device, equipment and storage medium
CN111831789A (en) * 2020-06-17 2020-10-27 广东工业大学 Question-answer text matching method based on multilayer semantic feature extraction structure
CN112256847A (en) * 2020-09-30 2021-01-22 昆明理工大学 Knowledge base question-answering method integrating fact texts
CN111966812A (en) * 2020-10-20 2020-11-20 中国人民解放军国防科技大学 Automatic question answering method based on dynamic word vector and storage medium
CN112417877A (en) * 2020-11-24 2021-02-26 广州平云信息科技有限公司 Text inclusion relation recognition method based on improved BERT
CN112905744A (en) * 2021-02-25 2021-06-04 华侨大学 Qiaoqing question and answer method, device, equipment and storage device
CN112667799A (en) * 2021-03-15 2021-04-16 四川大学 Medical question-answering system construction method based on language model and entity matching

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
吴炎 et al.: "Application of a BERT-based semantic matching algorithm in question answering systems", Instrument Technology (《仪表技术》) *
张笑 et al.: "Research on a tomato question answering model based on BERT multi-feature fusion", Information & Computer (《信息与电脑》) *
李景玉: "Computing sentence semantic similarity with a BERT-based siamese network", Science & Technology Information (《科技资讯》) *
王智悦 et al.: "A survey of research on intelligent question answering based on knowledge graphs", Computer Engineering and Applications (《计算机工程与应用》) *

Also Published As

Publication number Publication date
CN113934836B (en) 2022-03-01

Similar Documents

Publication Publication Date Title
JP7087938B2 (en) Question generator, question generation method and program
CN112214604A (en) Training method of text classification model, text classification method, device and equipment
CN113297366B (en) Emotion recognition model training method, device, equipment and medium for multi-round dialogue
CN109344242B (en) Dialogue question-answering method, device, equipment and storage medium
CN110399454B (en) Text coding representation method based on transformer model and multiple reference systems
CN113326374B (en) Short text emotion classification method and system based on feature enhancement
CN112101042A (en) Text emotion recognition method and device, terminal device and storage medium
CN113536795A (en) Method, system, electronic device and storage medium for entity relation extraction
CN115759042A (en) Sentence-level problem generation method based on syntax perception prompt learning
CN113934836B (en) Question reply method and device and electronic equipment
CN117033796A (en) Intelligent reply method, device, equipment and medium based on user expression preference
CN116432705A (en) Text generation model construction method, text generation device, equipment and medium
CN113934825B (en) Question answering method and device and electronic equipment
CN112016281B (en) Method and device for generating wrong medical text and storage medium
CN113704466B (en) Text multi-label classification method and device based on iterative network and electronic equipment
CN114998041A (en) Method and device for training claim settlement prediction model, electronic equipment and storage medium
KR102354898B1 (en) Vocabulary list generation method and device for Korean based neural network language model
CN115730568A (en) Method and device for generating abstract semantics from text, electronic equipment and storage medium
CN113807512A (en) Training method and device of machine reading understanding model and readable storage medium
CN112749556A (en) Multi-language model training method and device, storage medium and electronic equipment
CN115935195B (en) Text matching method and device, computer readable storage medium and terminal
CN113886556B (en) Question answering method and device and electronic equipment
CN116432663B (en) Controllable diversity professional text generation method and system based on element diagram
CN111914560B (en) Text inclusion relation recognition method, device, equipment and storage medium
CN112000777A (en) Text generation method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: Room 702, 7th floor, NO.67, Beisihuan West Road, Haidian District, Beijing 100080

Patentee after: Beijing Yunji Technology Co.,Ltd.

Address before: Room 702, 7th floor, NO.67, Beisihuan West Road, Haidian District, Beijing 100080

Patentee before: BEIJING YUNJI TECHNOLOGY Co.,Ltd.