CN114443824A - Data processing method and device, electronic equipment and computer storage medium - Google Patents

Data processing method and device, electronic equipment and computer storage medium Download PDF

Info

Publication number
CN114443824A
Authority
CN
China
Prior art keywords
node
vector corresponding
question
user
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210078998.1A
Other languages
Chinese (zh)
Other versions
CN114443824B (en
Inventor
高莘
张寓弛
王永亮
董扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210078998.1A priority Critical patent/CN114443824B/en
Priority claimed from CN202210078998.1A external-priority patent/CN114443824B/en
Publication of CN114443824A publication Critical patent/CN114443824A/en
Application granted granted Critical
Publication of CN114443824B publication Critical patent/CN114443824B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3347Query execution using vector based model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the specification disclose a data processing method and apparatus, an electronic device, and a computer storage medium. The method comprises: inputting a received user question and N information sources related to the user question, obtained by querying a preset question-answer data set, into a question-answering model to obtain a target answer to the user question. N is a positive integer greater than or equal to 2; at least two of the N information sources are related; and the question-answering model is obtained by training based on a plurality of user questions, N information sources corresponding to each user question, and a plurality of standard answers corresponding to the user questions.

Description

Data processing method and device, electronic equipment and computer storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a data processing method and apparatus, an electronic device, and a computer storage medium.
Background
Community question-answering systems are mostly constructed by manually writing answers, or by information-extraction-based methods that select the correct answer from given candidates or extract a sentence or passage from a given article as the answer. These methods generate answers from only a single data source, and the answers they give are generally less relevant to the question.
Disclosure of Invention
The embodiments of the specification provide a data processing method and apparatus, an electronic device, and a computer storage medium. The knowledge contained in multiple information sources is understood by combining the associations among the sources, and the learned knowledge is then integrated into the process of answering the user question, so that reasoning is realized across multiple information sources, the consistency between the generated answer and the user question is improved, and user stickiness and user experience are improved. The technical scheme is as follows:
in a first aspect, an embodiment of the present specification provides a data processing method, including:
receiving a user question input by a user;
querying a preset question-answer data set to obtain N information sources based on the user question and N preset information source types; N is a positive integer greater than or equal to 2; at least two of the N information sources are related;
inputting the user question and the N information sources into a question-answering model, and outputting a target answer; the question-answering model is obtained by training based on a plurality of user questions, N information sources corresponding to each user question, and a plurality of standard answers corresponding to the user questions.
In one possible implementation manner, the inputting the user question and the N information sources into a question-answering model and outputting a target answer includes:
inputting the user question and the N information sources;
constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph comprises a user question node and N information source nodes; the heterogeneous graph characterizes the relationships between the user question and the N information sources;
encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node;
and decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
In a possible implementation manner, the text information corresponding to each node in the heterogeneous graph includes at least one word;
the encoding the text information corresponding to each node in the heterogeneous graph to obtain the vector corresponding to the text information of each node includes:
encoding each word in the text information corresponding to each node to obtain a vector corresponding to each word;
and average-pooling the vectors corresponding to the words in each node to obtain the vector corresponding to the text information of each node.
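The per-node encoding step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the word vectors would in practice come from a trained node encoder, whereas here `embedding` is a hypothetical lookup table.

```python
def encode_node(words, embedding):
    """Encode a node's text: look up one vector per word, then
    average-pool the word vectors into a single node vector."""
    vectors = [embedding[w] for w in words]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

For example, with two 2-dimensional word vectors `[1, 2]` and `[3, 4]`, the node vector is their element-wise mean `[2, 3]`.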
In a possible implementation manner, the updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node includes:
calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node;
readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score;
and determining the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
In a possible implementation manner, the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node includes:
projecting the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node; the first vector and the second vector corresponding to each node are in one-to-one correspondence;
and calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In a possible implementation manner, before the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node, the method further includes:
determining a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node includes:
projecting the vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and projecting the vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node; a mapping relationship exists between the first vector and the second vector;
determining a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node, and the edge type between the source node and the target node.
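A plausible reading of the two projections is a key/query attention score with an edge-type-dependent weight, sketched below. The scaled-dot-product form and the scalar `edge_weight` are assumptions for illustration; the patent does not fix the exact parameterisation.

```python
import math

def matvec(W, v):
    # multiply matrix W (a list of rows) by vector v
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def first_attention_score(h_src, h_tgt, W_first, W_second, edge_weight):
    """Project the source vector into a first ("key") space and the
    target vector into a second ("query") space, then take their
    scaled dot product weighted by an edge-type-specific scalar."""
    key = matvec(W_first, h_src)
    query = matvec(W_second, h_tgt)
    dot = sum(k * q for k, q in zip(key, query))
    return edge_weight * dot / math.sqrt(len(key))
```

With identity projections and `edge_weight = 1.0`, the score reduces to the plain scaled dot product of the two node vectors.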
In one possible implementation manner, the readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score includes:
determining a correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the edge type between the question node and the source node;
and determining a second attention score based on the correlation between the source node and the question node and the first attention score.
In a possible implementation manner, the target node corresponds to M source nodes; M is a positive integer;
the determining the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node includes:
determining a first message transmitted from the source node to the target node based on the vector corresponding to the text information of the source node and the edge type between the source node and the target node;
determining a second message to be delivered by the source node to the target node based on the first message and the second attention score;
performing weighted summation on the M second messages transmitted to the target node by the M source nodes corresponding to the target node to obtain a third message transmitted to the target node by all the source nodes;
and performing residually connected nonlinear activation and linear projection based on the third message and the vector corresponding to the text information of the target node to obtain the updated vector corresponding to each node.
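The four message-passing steps above can be sketched for one target node as follows. This is a hedged reconstruction under stated assumptions: attention scores are normalised by their sum, the edge-type-specific first message is a linear projection per edge type, and the output nonlinearity is a ReLU; the real model's choices may differ.

```python
def update_target_vector(h_tgt, incoming, W_msg, W_out):
    """incoming: list of (h_src, edge_type, second_score) triples.
    Builds the first/second/third messages, then applies a
    residually connected linear projection with a ReLU."""
    def matvec(W, v):
        return [sum(w * x for w, x in zip(row, v)) for row in W]

    total = sum(score for _, _, score in incoming) or 1.0
    third = [0.0] * len(h_tgt)
    for h_src, edge_type, score in incoming:
        first = matvec(W_msg[edge_type], h_src)        # edge-type-specific projection
        second = [score * x for x in first]            # scale by attention score
        third = [t + s / total for t, s in zip(third, second)]  # weighted sum over M sources
    projected = matvec(W_out, third)
    # residual connection followed by the nonlinearity
    return [max(0.0, h + p) for h, p in zip(h_tgt, projected)]
```

Running this over every target node in the graph yields the updated vector corresponding to each node.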
In a possible implementation manner, the decoding, based on the updated vector corresponding to each node, the vector corresponding to the text information of the user question node to obtain a target answer includes:
respectively decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node to obtain a question decoding vector and an updated encoding vector corresponding to each node;
fusing the question decoding vector with the updated encoding vector corresponding to each node to obtain a target vector;
and querying from a preset vocabulary according to the target vector to obtain a target answer.
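The fuse-then-query step can be sketched as below. The additive mean fusion and the dot-product vocabulary lookup are illustrative assumptions; the answer generator described later uses attention layers for the fusion, and a real decoder would emit tokens autoregressively rather than a single word.

```python
def generate_token(question_vec, node_vecs, vocab_embedding):
    """Fuse the question decoding vector with the updated node
    encoding vectors into a target vector, then return the
    vocabulary word whose embedding best matches it."""
    fused = list(question_vec)
    for nv in node_vecs:
        fused = [f + x / len(node_vecs) for f, x in zip(fused, nv)]

    def score(word):
        return sum(f * e for f, e in zip(fused, vocab_embedding[word]))

    return max(vocab_embedding, key=score)
```

With a vocabulary of two embedded words, the word whose embedding aligns with the fused target vector is selected as the next answer token.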
In a second aspect, an embodiment of the present specification provides a data processing apparatus, including:
the receiving module is used for receiving a user question input by a user;
the query module is used for querying a preset question-answer data set to obtain N information sources based on the user question and N preset information source types; N is a positive integer greater than or equal to 2; at least two of the N information sources are related;
and the question-answering module is used for inputting the user question and the N information sources into a question-answering model and outputting a target answer; the question-answering model is obtained by training based on a plurality of user questions, N information sources corresponding to each user question, and a plurality of standard answers corresponding to the user questions.
In a possible implementation manner, the question-answering module includes:
an input unit for inputting the user question and the N information sources;
a construction unit for constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph comprises a user question node and N information source nodes; the heterogeneous graph characterizes the relationships between the user question and the N information sources;
an encoding unit for encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
an updating unit for updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node;
and a decoding unit for decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
In a possible implementation manner, the text information corresponding to each node in the heterogeneous graph includes at least one word;
the encoding unit includes:
an encoding subunit, configured to encode each word in the text information corresponding to each node to obtain a vector corresponding to each word;
and an average-pooling subunit, configured to average-pool the vectors corresponding to the words in each node to obtain the vector corresponding to the text information of each node.
In a possible implementation manner, the updating unit includes:
a calculation subunit, configured to calculate a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node;
an adjusting subunit, configured to readjust the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node, to obtain a second attention score;
and a first determining subunit, configured to determine the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
In a possible implementation manner, the calculation subunit is specifically configured to:
project the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node, the first vector and the second vector corresponding to each node being in one-to-one correspondence;
and calculate a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In a possible implementation manner, the updating unit further includes:
a second determining subunit, configured to determine a source node and a target node based on the heterogeneous graph, the target node being adjacent to the source node;
the calculation subunit is specifically configured to:
project the vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and project the vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node, a mapping relationship existing between the first vector and the second vector;
and determine a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node, and the edge type between the source node and the target node.
In a possible implementation manner, the adjusting subunit is specifically configured to:
determine a correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the edge type between the question node and the source node;
and determine a second attention score based on the correlation between the source node and the question node and the first attention score.
In a possible implementation manner, the target node corresponds to M source nodes; M is a positive integer;
the first determining subunit is specifically configured to:
determine a first message transmitted from the source node to the target node based on the vector corresponding to the text information of the source node and the edge type between the source node and the target node;
determine a second message to be delivered by the source node to the target node based on the first message and the second attention score;
perform weighted summation on the M second messages transmitted to the target node by the M source nodes corresponding to the target node to obtain a third message transmitted to the target node by all the source nodes;
and perform residually connected nonlinear activation and linear projection based on the third message and the vector corresponding to the text information of the target node to obtain the updated vector corresponding to each node.
In a possible implementation manner, the decoding unit includes:
a decoding subunit, configured to respectively decode the vector corresponding to the text information of the question node and the updated vector corresponding to each node, to obtain a question decoding vector and an updated encoding vector corresponding to each node;
a fusion subunit, configured to fuse the question decoding vector with the updated encoding vector corresponding to each node to obtain a target vector;
and a query subunit, configured to query from a preset vocabulary according to the target vector to obtain a target answer.
In a third aspect, an embodiment of the present specification provides an electronic device, including: a processor and a memory;
the processor is connected with the memory;
the memory is used for storing executable program codes;
the processor executes a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the method provided by the first aspect of the embodiments of the present specification or any one of the possible implementation manners of the first aspect.
In a fourth aspect, the present specification provides a computer storage medium storing a plurality of instructions, where the instructions are adapted to be loaded by a processor and to execute a method provided by the first aspect of the present specification or any one of the possible implementation manners of the first aspect.
In the embodiments of the specification, the received user question and the N information sources related to the user question, obtained by querying a preset question-answer data set, are input into a question-answering model to obtain a target answer to the user question. N is a positive integer greater than or equal to 2; at least two of the N information sources are related; the question-answering model is obtained by training based on a plurality of user questions, N information sources corresponding to each user question, and a plurality of standard answers corresponding to the user questions. The embodiments of the specification understand the information contained in multiple information sources by combining the associations among the sources, and then integrate the learned information into the process of answering the user question, so that reasoning is realized across multiple heterogeneous information sources and the consistency between the generated target answer and the user question, namely the accuracy of answering the user question, is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed in the embodiments are briefly described below. It is apparent that the drawings in the following description show only some embodiments of the present disclosure, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is an architectural diagram of a data processing system provided in an exemplary embodiment of the present description;
FIG. 2 is a schematic structural diagram of a question-answering model provided in an exemplary embodiment of the present specification;
FIG. 3 is a schematic structural diagram of a heterogeneous graph provided by an exemplary embodiment of the present description;
FIG. 4 is a flow chart illustrating a data processing method according to an exemplary embodiment of the present disclosure;
FIG. 5A is a schematic diagram of a terminal interface provided in an exemplary embodiment of the present description;
FIG. 5B is a diagram illustrating an information source queried according to a user question according to an exemplary embodiment of the present disclosure;
FIG. 5C is a schematic diagram of another terminal interface provided in an exemplary embodiment of the present description;
FIG. 6 is a schematic diagram illustrating a flow chart of implementing a question-answering model according to an exemplary embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of another heterogeneous graph provided in an exemplary embodiment of the present disclosure;
fig. 8 is a flowchart illustrating an implementation process of decoding according to an exemplary embodiment of the present disclosure;
FIG. 9 is a flowchart illustrating an implementation of updating a heterogeneous graph according to an exemplary embodiment of the present disclosure;
fig. 10 is a schematic diagram illustrating a specific implementation flow of updating a heterogeneous graph according to an exemplary embodiment of the present specification;
fig. 11 is a schematic structural diagram of a data processing apparatus according to an exemplary embodiment of the present specification;
fig. 12 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure.
The terms "first," "second," "third," and the like in the description and in the claims, and in the drawings described above, are used for distinguishing between different objects and not necessarily for describing a particular sequential order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an architecture of a data processing system according to an exemplary embodiment of the present disclosure. As shown in fig. 1, a data processing system may include: a cluster of terminals, and a server 120. Wherein:
the endpoint cluster may be a user endpoint, and specifically includes one or more user endpoints, where the plurality of user endpoints may include a user endpoint 110a, a user endpoint 110b, a user endpoint 110c …, and the like. User software can be installed in the terminal cluster and used for realizing functions of inputting questions and checking target answers corresponding to the questions on line by a user and the like. Any user end in the terminal cluster can establish a data relationship with the network, and establish a data connection relationship with the server 120 through the network, for example, sending a user question, receiving a target answer, and the like. Any user side in the terminal cluster can be, but is not limited to, a mobile phone, a tablet computer, a notebook computer and other devices with user version software.
The server 120 may be a server capable of providing various data processing, and may receive data such as a user question sent by any user side in a network or a terminal cluster, query a preset question-answer data set based on the user question and N preset information source types to obtain N information sources, and input the data such as the user question and the N information sources into a question-answer model to obtain a target answer corresponding to the user question. The server 120 may also send data such as a corresponding target answer to any user terminal in the network or the terminal cluster. The server 120 may be, but is not limited to, a hardware server, a virtual server, a cloud server, and the like.
The network may be a medium that provides a communication link between the server 120 and any user end in the terminal cluster, or may be the internet including network devices and transmission media, but is not limited thereto. The transmission medium may be a wired link (such as, but not limited to, coaxial cable, fiber optic cable, and Digital Subscriber Line (DSL), etc.) or a wireless link (such as, but not limited to, wireless fidelity (WIFI), bluetooth, and mobile device network, etc.).
It will be appreciated that the number of end clusters and servers 120 in the data processing system shown in FIG. 1 is by way of example only, and that any number of clients, and servers, may be included in the data processing system in a particular implementation. The examples in this specification are not particularly limited thereto. For example, but not limiting of, server 120 may be a server cluster of multiple servers.
The question-answering model provided by the embodiments of the present specification will be described with reference to fig. 1. Specifically, refer to fig. 2, which is a schematic structural diagram of a question-answering model according to an exemplary embodiment of the present specification. As shown in fig. 2, the question-answering model includes: an input module 210, a graph construction module 220, an encoding module 230, a question-aware graph transformer 240, an answer generator 250, and an output module 260. Wherein:
the input module 210 may be a network interface, and is specifically configured to input text information such as a user question received by the server 120 through a network and N information sources queried from a preset question and answer data set according to the user question and N preset information source types. N is a positive integer of 2 or more. At least two information sources are related in the N information sources. The preset information source types include user articles, article comments, related questions, answers to the related questions, and the like, which are not limited in this specification. The input module 210 can also input the voice message sent by the user terminal and received by the server 120 through the network, and convert the voice message into the corresponding text message such as the user question.
The graph construction module 220 constructs a heterogeneous graph according to a preset rule from the user question input by the input module 210 and the N information sources related to the user question. The heterogeneous graph comprises a user question node and N information source nodes, and characterizes the relationships between the user question and the N information sources.
Illustratively, when N = 4, the 4 information sources input by the input module 210 are a user article, article comments, a related question, and the answer to the related question, respectively. The article comments are comments corresponding to the user article, that is, the article comments are associated with the user article; the answer to the related question is the answer corresponding to the related question, that is, the answer is associated with the related question. As shown in fig. 3, the graph construction module 220 may take the 4 information sources and the user question input by the input module 210 as nodes, and connect the user question node 310 and the user article node 320 according to a preset edge type α1, where the preset edge type α1 is user article to user question; connect the user question node 310 with the related question node 330 according to a preset edge type α2, where the preset edge type α2 is related question to user question; connect the article comment node 340 with the user article node 320 according to a preset edge type α3, where the preset edge type α3 is article comment to user article; and connect the answer node 350 of the related question to the related question node 330 according to a preset edge type α4, where the preset edge type α4 is answer to related question. After the graph construction module 220 connects the 5 nodes according to the 4 preset edge types, the heterogeneous graph shown in fig. 3 is obtained, from which the relationships between the user question and the 4 information sources can be seen intuitively.
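The five-node graph described above can be sketched as a plain edge list. The node labels and edge-type names (`alpha_1` … `alpha_4`) are illustrative stand-ins for the preset edge types α1–α4; a real implementation would likely use a graph library's heterogeneous-graph structure instead.

```python
def build_heterogeneous_graph(question, article, comments, related_q, related_a):
    """Return the 5-node heterogeneous graph of Fig. 3 as a node map
    plus a list of (source, target, edge_type) triples."""
    nodes = {
        "user_question": question,
        "user_article": article,
        "article_comment": comments,
        "related_question": related_q,
        "related_answer": related_a,
    }
    edges = [
        ("user_article", "user_question", "alpha_1"),      # user article -> user question
        ("related_question", "user_question", "alpha_2"),  # related question -> user question
        ("article_comment", "user_article", "alpha_3"),    # article comment -> user article
        ("related_answer", "related_question", "alpha_4"), # answer -> related question
    ]
    return nodes, edges
```

Each directed edge carries its edge type, which the later attention and message-passing steps condition on.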
The encoding module 230 may include at least one node encoder, configured to encode the text information corresponding to each node in the heterogeneous graph constructed by the graph construction module 220, so as to obtain a vector corresponding to the text information of each node.
A Question-aware Graph Transformer (QGT) 240, configured to update the vector corresponding to the text information of each node according to the correlations between nodes in the heterogeneous graph, obtaining an updated vector for each node. By passing messages between heterogeneous nodes, the question-aware graph transformer 240 aggregates information related to the user question from the different types of information source nodes; after the update, each node aggregates all of the question-related information passed to it from the nodes that can reach it.
The answer generator 250 includes a masked output embedding layer (Output-Embedding), a self-attention layer (Self-Attention), a question attention layer (Question-Attention), a graph attention layer (Graph-Attention), and a feed-forward layer (Feed-Forward), and is configured to decode the vector corresponding to the text information of the user question node based on the updated vector of each node, thereby generating a target answer.
And an output module 260, configured to output the target answer generated by the answer generator 250.
Next, a data processing method provided by the embodiment of the present specification is described with reference to fig. 1 to 3. Specifically, refer to fig. 4, which is a flowchart illustrating a data processing method according to an exemplary embodiment of the present disclosure. As shown in fig. 4, the data processing method includes the following steps:
step 402, receiving a user question input by a user.
Specifically, after a user enters a user question in the client software installed on the terminal, the terminal sends the user question to the server through the network, so that the server receives, through the network, the user question entered by the user in the client software.
Illustratively, as shown in fig. 5A, the server may receive, via the network, a user question 510: "How to repay money borrowed from JD Finance?"
Optionally, when the user provides voice input in the client software installed on the terminal, the terminal may convert the voice into text information such as a user question through a speech-to-text device and then send the text to the server through the network, so that the server receives the text of the user question corresponding to the voice input by the user in the client software.

Optionally, the terminal may also send the voice directly to the server through the network; after receiving the voice, the server converts it into text information such as a user question through the input module 210 in the question-answering model.
Step 404, obtaining N information sources from a preset question and answer data set based on the user question and N preset information source types.
Specifically, the user question may be used to query N information sources related to it from a preset question-and-answer data set (database) that has been constructed according to the N preset information source types. That is, a preset retrieval algorithm may be adopted to retrieve, from the preset question-and-answer data set and according to the N preset information source types, N information sources whose relevance to the user question is greater than a preset threshold. N is a positive integer greater than or equal to 2, and at least two of the N information sources are associated with each other; messages may be passed between two associated information sources, for example between a user article and its corresponding article comments, or between a related question and its corresponding answer, which is not limited in this specification. The preset information source types may include user articles, article comments, related questions, answers to related questions, and the like, which is not limited in this specification. The preset question-and-answer data set may include the MS-MARCO question-and-answer data set, the AntQA question-and-answer data set, etc.; the preset retrieval algorithm may include the BM25 algorithm, an edit distance algorithm, and the like; neither is limited in this specification. Each of the N information sources includes at least one piece of information, and the relevance of each piece of information to the user question is greater than the preset threshold. The preset threshold may be 0.99, 0.90, 0.80, 0.60, etc., which is not limited in this specification.
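As a rough sketch of this retrieval step, the following implements plain BM25 scoring over a toy tokenized corpus and keeps only the sources whose relevance exceeds a preset threshold. The token lists, threshold value, and parameter choices are illustrative assumptions only, not values taken from this specification:

```python
import math

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    # document frequency of each query term
    df = {t: sum(1 for d in docs_tokens if t in d) for t in set(query_tokens)}
    scores = []
    for d in docs_tokens:
        s = 0.0
        for t in query_tokens:
            tf = d.count(t)
            if tf == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            s += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

# toy corpus standing in for the preset question-and-answer data set
corpus = [
    ["repay", "jd", "finance", "baitiao"],
    ["weather", "today", "sunny"],
    ["jd", "finance", "fee", "policy"],
]
query = ["jd", "finance", "repay"]
scores = bm25_scores(query, corpus)
# keep only sources whose relevance exceeds a preset threshold
threshold = 0.1
retrieved = [i for i, s in enumerate(scores) if s > threshold]
```

In a real system the scores would typically be normalized before being compared with a threshold such as 0.5; the raw-score comparison above is only a sketch.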
Illustratively, given the user question 510 shown in fig. 5A and the four preset information source types of user article, article comment, related question, and answer to related question, the BM25 algorithm may be used to retrieve, from the constructed MS-MARCO and AntQA question-and-answer data sets, the six pieces of information shown in fig. 5B whose relevance to the user question is greater than the preset threshold 0.5: a user article 521 ("From March 26, JD Finance credit card repayments of more than 2000 yuan will be charged a 0.1% service fee. The news has been boiling on every platform since yesterday; presumably everyone was psychologically prepared from the first moment."); an article comment 522 corresponding to the user article 521 ("I have been repaying through JD Finance ever since the other apps started charging"); a related question 523 ("How to repay a JD Finance Baitiao loan?"); an answer 524 to the related question ("Open JD Finance Baitiao and you can see the borrowings and repayments; click repayment to see the amount due, and store enough money in the Baitiao balance in advance"); a related question 525 ("Is there a fee for repaying a Baitiao loan ahead of time?"); and an answer 526 to the related question ("Repaying a Baitiao loan ahead of time currently incurs no service fee; simply operate on the Baitiao page: log in to JD Finance, click the lower right corner, select Borrow on the home page, and follow the prompts on the page").
Step 406, inputting the user question and the N information sources into the question-answering model, and outputting the target answer.
Specifically, after the user question input by the user is received and the N information sources related to it are queried from the preset question-and-answer data set, the user question and the N information sources may be input into the question-answering model, which outputs the target answer corresponding to the user question. The question-answering model is trained on a plurality of user questions, the N information sources corresponding to each user question, and the standard answer corresponding to each user question. After the question-answering model outputs the target answer, the server may send the target answer to the terminal, thereby answering the user question entered by the user in the client software installed on the terminal.
Illustratively, the server may input the user question 510 shown in fig. 5A and the 4 information sources shown in fig. 5B into the question-answering model, which may output a target answer such as "You can repay in JD Finance." for the user question 510. As shown in fig. 5C, the server then feeds the target answer output by the question-answering model back to the terminal, i.e., the client software QA in fig. 5C displays the target answer 530.
Optionally, after the question-answering model outputs the target answer corresponding to the user question, it also outputs the cross-entropy loss of the current question-answering round. Using this cross-entropy loss as a training objective, each parameter in the question-answering model can be further optimized, thereby further improving the accuracy with which the model answers user questions.
Next, with reference to fig. 2 to fig. 3, the specific implementation process by which the question-answering model outputs the target answer from the input user question and N information sources (step 406 of the data processing method provided in the embodiment of the present specification) is described. Specifically, refer to fig. 6, which is a schematic diagram illustrating an implementation flow of a question-answering model according to an exemplary embodiment of the present specification. As shown in fig. 6, the implementation process of the question-answering model includes the following steps:
step 602, user questions and N information sources are input.
Specifically, the text information such as the user question received by the server through the network, together with the N information sources queried from the preset question-and-answer data set according to the user question and the N preset information source types, may be used as the input of the question-answering model. N is a positive integer greater than or equal to 2, and at least two of the N information sources are associated. The preset information source types include user articles, article comments, related questions, answers to the related questions, and the like, which is not limited in this specification.
Step 604, constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources.
Specifically, the input user question and each piece of information in the N information sources related to it may be used as nodes, and the nodes connected according to preset edge types, thereby completing the construction of the heterogeneous graph. The heterogeneous graph comprises a user question node and N information source nodes, and characterizes the relationships between the user question and the N information sources. A preset edge type represents the connection relationship between two associated nodes in the heterogeneous graph, and may include user article to user question, article comment to user article, related question to user question, answer to related question, and the like, which is not limited in this specification.
Illustratively, with N = 4, the 4 information sources input are user articles, article comments, related questions, and answers to the related questions. The input user articles include user article A and user article B; article comment a1 and article comment a2 are comments corresponding to user article A; the input related questions include related question C and related question D; and answer c1 and answer c2 are answers corresponding to related question C. As shown in fig. 7, the graph construction module 220 in the question-answering model may take the 4 information sources and the input user question as nodes, and connect the user question node 710 with the user article A node 720 and the user article B node 730 according to the preset edge type α1 (user article to user question); connect the user question node 710 with the related question C node 740 and the related question D node 750 according to the preset edge type α2 (related question to user question); connect the article comment a1 node 760 and the article comment a2 node 770 with the user article A node 720 according to the preset edge type α3 (article comment to user article); and connect the answer c1 node 780 and the answer c2 node 790 with the related question C node 740 according to the preset edge type α4 (answer to related question). After the graph construction module 220 connects these 9 nodes according to the 4 preset edge types, the heterogeneous graph shown in fig. 7 is obtained. The relationships between the user question and the 4 information sources can be seen intuitively from the heterogeneous graph shown in fig. 7.
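The node-and-edge bookkeeping described above can be sketched as follows. The function name, node labels, and edge-type keys (alpha1 to alpha4) are illustrative stand-ins for the preset edge types, not an implementation taken from this specification:

```python
# Directed edge types, each read as "source -> target"
EDGE_TYPES = {
    "alpha1": "user_article -> user_question",
    "alpha2": "related_question -> user_question",
    "alpha3": "article_comment -> user_article",
    "alpha4": "related_answer -> related_question",
}

def build_hetero_graph(question, articles, comments, rel_questions, rel_answers):
    """Return (nodes, edges); edges are (source, edge_type, target) triples."""
    nodes = {"q": question}
    edges = []
    for name, text in articles.items():          # articles point to the question
        nodes[name] = text
        edges.append((name, "alpha1", "q"))
    for name, (text, article) in comments.items():  # comments point to their article
        nodes[name] = text
        edges.append((name, "alpha3", article))
    for name, text in rel_questions.items():     # related questions point to the question
        nodes[name] = text
        edges.append((name, "alpha2", "q"))
    for name, (text, rq) in rel_answers.items(): # answers point to their related question
        nodes[name] = text
        edges.append((name, "alpha4", rq))
    return nodes, edges

nodes, edges = build_hetero_graph(
    question="user question",
    articles={"A": "user article A", "B": "user article B"},
    comments={"a1": ("comment a1", "A"), "a2": ("comment a2", "A")},
    rel_questions={"C": "related question C", "D": "related question D"},
    rel_answers={"c1": ("answer c1", "C"), "c2": ("answer c2", "C")},
)
```

This reproduces the 9-node, 8-edge graph of fig. 7, with the direction of each edge matching the "source to target" message-passing direction described later.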
Step 606, encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node.
Specifically, the text information corresponding to each node in the heterogeneous graph includes at least one word. Each word in the text information corresponding to each node may be encoded by the encoding module 230 in the question-answering model, obtaining a vector for each word. Average pooling is then performed over the vectors of the words of each node, thereby obtaining the vector corresponding to the text information of that node.
Illustratively, let a node in the heterogeneous graph be u = d_j^i = {w_1, ..., w_{L_d}}, where L_d denotes the number of words included in node u, and d_j^i denotes the j-th node of the i-th type. A pre-trained node encoder (e.g., BART) may be used to encode each word w_{l_d} of node u, obtaining a vector for each word of node u. An average pooling operation is then performed over the word vectors of node u, obtaining the vector D[u] corresponding to the text information of node u, which may be used as the initialized node representation of node u.
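A minimal sketch of this encoding step, with deterministic random vectors standing in for a pre-trained BART encoder; the function names and the dimension 8 are assumptions made for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_words(words, dim=8):
    """Stand-in for a pre-trained encoder: one vector per word.
    (Here: random vectors from a fixed-seed generator.)"""
    return rng.standard_normal((len(words), dim))

def node_representation(words, dim=8):
    """D[u]: average-pool the word vectors of node u."""
    word_vecs = encode_words(words, dim)
    return word_vecs.mean(axis=0)

words = "how to repay jd finance".split()
D_u = node_representation(words)  # initialized node representation of node u
```

Whatever encoder is used, the pooled vector has the encoder's hidden dimension regardless of the number of words in the node.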
Step 608, updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain the updated vector corresponding to each node.
Specifically, the vector corresponding to the text information of each node in the heterogeneous graph, i.e., the initialized node representation, may be updated by the question-aware graph transformer 240 in the question-answering model, obtaining an updated vector for each node. After the update, each node aggregates all of the question-related information passed to it from the nodes that can reach it.
And step 610, decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
Specifically, the answer generator 250 shown in fig. 2 may be adopted to decode the vector corresponding to the text information of the user question node using the updated vector corresponding to each node, thereby generating the target answer corresponding to the user question.
Specifically, the answer generator 250 includes a masked output embedding layer (Output-Embedding), a self-attention layer (Self-Attention), a question attention layer (Question-Attention), a graph attention layer (Graph-Attention), and a feed-forward layer (Feed-Forward). As shown in fig. 8, the specific implementation process of decoding the vector corresponding to the text information of the user question node includes the following steps:
Step 802, decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and a graph decoding vector.
Specifically, self-attention over the masked output embeddings is first applied, yielding an output p_s for each decoding step. The vector D[q] corresponding to the text information of the question node q is then fed, together with p_s, into the question attention layer (Question-Attention), obtaining the question decoding vector p_q. Useful knowledge is then aggregated from the nodes of the updated heterogeneous graph according to the state of each decoding step: the updated vectors {H^1, ..., H^T} corresponding to the nodes of the T node types are fed into the graph attention layer (Graph-Attention), obtaining the graph decoding vector p_g, where H^i denotes the updated vectors corresponding to the nodes of the i-th type.
Step 804, fusing the question decoding vector with the graph decoding vector to obtain a target vector.
Specifically, the information included in the user question and the information included in the updated nodes may be dynamically fused through the feed-forward layer (Feed-Forward), i.e., the question decoding vector and the graph decoding vector are fused into a target vector. A gating probability ρ is obtained through a Sigmoid function by fusing the user question with the N information sources, and a weighted sum of the question decoding vector and the graph decoding vector is then taken according to this probability, obtaining the target vector p_f = ρ·p_q + (1 − ρ)·p_g.
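The gating fusion may be sketched as below. The gate input (a concatenation of p_q and p_g passed through a linear map) and the zero-initialized weights are assumptions made for illustration, since the exact form of the Sigmoid gate is not fixed above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(p_q, p_g, w_f, b_f=0.0):
    """p_f = rho * p_q + (1 - rho) * p_g, with rho from a sigmoid gate."""
    rho = sigmoid(w_f @ np.concatenate([p_q, p_g]) + b_f)
    return rho * p_q + (1.0 - rho) * p_g, rho

dim = 4
p_q = np.ones(dim)        # question decoding vector
p_g = np.zeros(dim)       # graph decoding vector
w_f = np.zeros(2 * dim)   # gate weights (learned in practice; zeros here)
p_f, rho = fuse(p_q, p_g, w_f)
```

With zero gate weights the gate is exactly 0.5, so the target vector is the midpoint of the two decoding vectors; trained weights would shift ρ per decoding step.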
Step 806, obtaining a target answer by querying from a preset vocabulary according to the target vector.
Specifically, the answer generator 250 may query a preset vocabulary according to the target vector to obtain the words corresponding to the target vector, and order those words according to the word positions given by the target vector, thereby obtaining the target answer.
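A toy sketch of this vocabulary lookup, with one-hot "embeddings" standing in for a learned output vocabulary; all names and data are illustrative assumptions:

```python
import numpy as np

def decode_answer(target_vectors, vocab_emb, vocab):
    """At each step, pick the vocabulary word most similar to the target vector."""
    out = []
    for v in target_vectors:
        idx = int(np.argmax(vocab_emb @ v))  # dot-product similarity
        out.append(vocab[idx])
    return " ".join(out)

vocab = ["repay", "in", "jd", "finance"]
vocab_emb = np.eye(4)  # toy one-hot "embeddings", one row per word
targets = [vocab_emb[0], vocab_emb[2], vocab_emb[3]]  # one target vector per step
answer = decode_answer(targets, vocab_emb, vocab)
```

In practice the similarity scores would be turned into a softmax distribution over the vocabulary rather than a hard argmax.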
As shown in fig. 9, the specific implementation process by which the question-aware graph transformer 240 updates the vector corresponding to the text information of each node in step 608 includes the following steps:
Step 902, calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node.
Specifically, the two adjacent nodes comprise a source node and a target node connected by a preset edge type, so that the source node can pass a message to the target node. The question-aware graph transformer 240 may calculate the correlation between every two adjacent nodes, i.e., the first attention score, according to their connection relationship in the heterogeneous graph and the vectors corresponding to the text information of the nodes.
Optionally, the question-aware graph transformer 240 may first project the vector corresponding to the text information of each node in the heterogeneous graph into a first space and a second space respectively, obtaining a first vector and a second vector for each node, and then calculate the first attention score between every two adjacent nodes according to the heterogeneous graph and the first and second vectors of each node. Concretely, the correlation between a target node and each of its source nodes is computed from the second vector of the target node and the first vector of the source node through an edge-type-specific parameter matrix; the resulting scores are then passed through a normalized exponential (softmax) function, which maps each score to a non-negative value and divides it by the sum of all mapped scores, yielding the first attention score between the target node and each source node. Each first attention score is a number between 0 and 1 inclusive, and the scores over all source nodes of a target node sum to 1. The first space and the second space correspond to each other, and the first vector and second vector of each node are in one-to-one correspondence; for example, the first vector may be a key vector and the second vector a value vector, which is not limited in this specification.
Optionally, before the first attention score between two adjacent nodes is calculated, a source node and a target node are first determined from the heterogeneous graph; the target node is adjacent to the source node. That is, the node at the starting end of an arrow (an edge connecting a target node and a source node) in the heterogeneous graph is determined as the source node, and the node at the tail end of the arrow, i.e., the node the arrow points to, is determined as the target node. The vector corresponding to the text information of the source node is then projected into the first space to obtain the first vector of the source node, and the vector corresponding to the text information of the target node is projected into the second space to obtain the second vector of the target node. Finally, the first attention score between the source node and the target node, a number between 0 and 1 inclusive, is determined from the first vector of the source node, the second vector of the target node, and the edge type between the source node and the target node.
That is, if the first vector corresponding to the source node s is v(s) = MLP_{γ(s)}(D[s]) and the second vector corresponding to the target node t is k(t) = MLP_{γ(t)}(D[t]), the first attention score between the source node s and the target node t may be computed as

α(s, e, t) = softmax_{∀s∈N(t)} ( v(s) · W_δ(e) · k(t) )

In the above formula, γ(s) denotes the type of the source node, γ(t) denotes the type of the target node, D[s] is the initialized node representation of the source node s, i.e., the vector corresponding to its text information, D[t] is the initialized node representation of the target node t, e denotes the edge type between s and t, W_δ(e) is a parameter matrix specific to edge type e, and N(t) is the set of neighbor nodes of the target node t, i.e., all source nodes corresponding to t. The method for calculating the first attention score between two adjacent nodes in the embodiment of the present specification is not limited to the above; other methods may be used in a specific implementation.
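The formula above may be sketched as follows, with identity mappings standing in for the type-specific MLPs and identity matrices for the edge-specific parameter matrices W_δ(e); all of these stand-ins, and the dimension 4, are assumptions made for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def first_attention_scores(D, target, sources, edge_types, W_edge, mlp_v, mlp_k):
    """alpha(s, e, t) = softmax over s in N(t) of v(s) . W_delta(e) . k(t)."""
    k_t = mlp_k(D[target])
    logits = np.array([
        mlp_v(D[s]) @ W_edge[e] @ k_t
        for s, e in zip(sources, edge_types)
    ])
    return softmax(logits)

dim = 4
rng = np.random.default_rng(1)
D = {"q": rng.standard_normal(dim),
     "A": rng.standard_normal(dim),
     "C": rng.standard_normal(dim)}
W_edge = {"alpha1": np.eye(dim), "alpha2": np.eye(dim)}
identity = lambda x: x  # type-specific MLPs, taken as identity for the sketch
alpha = first_attention_scores(D, "q", ["A", "C"], ["alpha1", "alpha2"],
                               W_edge, identity, identity)
```

The softmax guarantees the property stated above: each score lies in (0, 1) and the scores over all source nodes of a target node sum to 1.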
Step 904, re-adjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node, to obtain a second attention score.
Specifically, since the purpose of passing messages between nodes is to extract, from the heterogeneous information sources, knowledge useful for answering the user question, only question-related knowledge should be retained during message passing. The first attention score between the source node and the target node therefore needs to be re-weighted by the relation score, i.e., the relevance, between the source node and the user question node. The semantic relevance between the source node and the user question node may be determined from the vector corresponding to the text information of the user question node and the vector corresponding to the text information of the source node. That is, the semantic relevance between the source node s and the user question node q may be computed through a bilinear layer as β(s) = D[q] · W_r · D[s], where W_r is a trainable parameter and D[q] is the initialized node representation of the user question node q, i.e., the vector corresponding to its text information. After computing the semantic relevance between the source node and the user question node, the question-aware graph transformer 240 determines, from β(s) and the first attention score α(s, e, t) mentioned above, the question-aware second attention score between the source node and the target node:

α′(s, e, t) = softmax_{∀s∈N(t)} ( β(s) · α(s, e, t) )
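A sketch of this question-aware re-weighting, assuming the rescored values are renormalized with a softmax over the source nodes of the target node; the names, the identity W_r, and the dimension are illustrative assumptions:

```python
import numpy as np

def question_aware_scores(alpha, D_q, D_sources, W_r):
    """beta(s) = D[q] . W_r . D[s]; rescale alpha by beta and renormalize."""
    beta = np.array([D_q @ W_r @ D_s for D_s in D_sources])
    rescored = beta * alpha
    e = np.exp(rescored - rescored.max())
    return e / e.sum()

dim = 4
rng = np.random.default_rng(2)
alpha = np.array([0.5, 0.5])  # first attention scores over two source nodes
D_q = rng.standard_normal(dim)
D_sources = [rng.standard_normal(dim), rng.standard_normal(dim)]
W_r = np.eye(dim)  # trainable bilinear parameter, identity for the sketch
alpha2 = question_aware_scores(alpha, D_q, D_sources, W_r)
```

A source node whose representation aligns poorly with the question vector gets a small β(s) and thus a reduced share of the attention mass.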
Step 906, determining a vector corresponding to each updated node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
Specifically, the message-passing behavior should differ across edge types. For example, the message passed from the answer node of a related question to the related question node should be the information, extracted from the answer according to the user question, that is related to both the related question and the user question. Therefore, the message passed from a source node that is related to both the target node and the user question node needs to be computed through edge-type-specific transformations. One target node corresponds to M source nodes, where M is a positive integer.
Specifically, as shown in fig. 10, the specific implementation process of determining the updated vector corresponding to each node according to the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node includes the following steps:
step 1002, determining a first message transmitted from the source node to the target node based on a vector corresponding to the text information of the source node and an edge type between the source node and the target node.
Specifically, the messages passed from different source nodes to their target nodes should differ, so a useful message, i.e., the first message passed from the source node to the target node, needs to be extracted from the source node according to the edge type. That is, the first message passed from the source node s to the target node t may be obtained through the output of a multilayer perceptron:

M1(s, e, t) = W_M(e) · MLP_{γ(s)}(D[s])

where W_M(e) is a parameter matrix specific to edge type e, and M1(s, e, t) is the first message, related to the target node t, that the source node s passes to t through the edge e.
Step 1004, determining a second message passed from the source node to the target node based on the first message and the second attention score.
Specifically, the first message M1(s, e, t), which the source node s passes to the target node t through the edge e and which is related to t, and the question-aware second attention score between the source node and the target node may be combined according to a preset formula, obtaining the second message, related to both the target node t and the user question node, passed from the source node to the target node. The preset formula may be

M2(s, e, t) = α′(s, e, t) · M1(s, e, t)

The method for calculating the second message is not limited to the above preset formula; other methods may be used in a specific implementation, which is not limited in this specification.
Step 1006, performing a weighted summation over the M second messages passed from the M source nodes corresponding to the target node, to obtain the third message passed from all the source nodes to the target node.
Specifically, since one target node corresponds to M source nodes, when the target node is updated, the M second messages passed to it by its M source nodes are summed (the attention weights having already been applied in step 1004), obtaining the third message passed to the target node by all of its corresponding source nodes:

M3(t) = Σ_{s∈N(t)} M2(s, e, t)
Step 1008, applying a nonlinear activation and a linear projection with a residual connection based on the third message and the vector corresponding to the text information of the target node, to obtain the updated vector corresponding to each node.
Specifically, after the third message passed to the target node by all its source nodes is obtained, a linear projection with a nonlinear activation is applied to the third message, i.e., the sum of the messages from all source nodes, and the result is connected, through a residual connection, with the vector corresponding to the text information of the target node, so that the target node fuses the question-related information from its neighboring source nodes, yielding the updated vector corresponding to the target node:

H[t] = W_O · σ(M3(t)) + D[t]

where σ denotes the nonlinear activation and W_O the projection matrix.
The method for calculating the updated vector corresponding to each node is not limited to the above; other calculation methods may be used in a specific implementation, which is not limited in the embodiment of the present disclosure. When some nodes in the heterogeneous graph cannot serve as target nodes, i.e., no source node passes messages to them (for example, the user article B node 730 and the answer c2 node 790 of the related question in fig. 7), the vector corresponding to the text information of such a node, i.e., its initialized representation, may be directly used as its updated vector, which is equivalent to leaving such nodes un-updated.
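Steps 1002 to 1008 can be sketched end to end as follows. Identity matrices stand in for the edge-specific message matrices and the output projection, ReLU stands in for the nonlinear activation, and the second attention scores are given directly; all of these are assumptions made for illustration:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def update_target(D, target, sources, edge_types, scores, W_msg, W_out):
    """Steps 1002-1008: edge-specific messages, attention-weighted sum,
    then a nonlinearly-activated projection with a residual connection."""
    # step 1002: first message M1(s, e, t) per source, via an edge-specific matrix
    m1 = [W_msg[e] @ D[s] for s, e in zip(sources, edge_types)]
    # steps 1004/1006: weight each message by its second attention score and sum
    m3 = sum(a * m for a, m in zip(scores, m1))
    # step 1008: nonlinear activation, linear projection, residual connection
    return W_out @ relu(m3) + D[target]

dim = 3
D = {"q": np.ones(dim), "A": np.full(dim, 2.0), "C": np.full(dim, 3.0)}
W_msg = {"alpha1": np.eye(dim), "alpha2": np.eye(dim)}
W_out = np.eye(dim)
scores = np.array([0.4, 0.6])  # second attention scores, summing to 1
H_q = update_target(D, "q", ["A", "C"], ["alpha1", "alpha2"], scores, W_msg, W_out)
# a node with no incoming edges keeps its initialized representation
H_A = D["A"]
```

With these toy values the weighted message sum is 0.4·2 + 0.6·3 = 2.6 per coordinate, and the residual adds the target's own representation of 1.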
In the embodiment of the present specification, the received user question and the N related information sources queried from the preset question-and-answer data set are input into the question-answering model provided herein, thereby obtaining the target answer to the user question. N is a positive integer greater than or equal to 2; at least two of the N information sources are associated; and the question-answering model is trained on a plurality of user questions, the N information sources corresponding to each user question, and the standard answer corresponding to each user question. In other words, the embodiment of the present specification understands and fuses the information contained in multiple heterogeneous information sources by exploiting the associations between them, and folds that information into the process of answering the user question, so that reasoning is performed across the multiple heterogeneous information sources and the consistency between the generated target answer and the user question, i.e., the accuracy of answering the user question, is improved.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a data processing apparatus according to an exemplary embodiment of the present disclosure. The data processing apparatus 1100 includes:
a receiving module 1110, configured to receive a user question input by a user;
a query module 1120, configured to query a preset question-answer data set for N information sources based on the user question and N preset information source types; N is a positive integer greater than or equal to 2; at least two of the N information sources are related to each other;
a question-answer module 1130, configured to input the user question and the N information sources into a question-answer model, and output a target answer; the question-answering model is obtained by training based on a plurality of user questions, N information sources corresponding to the user questions and a plurality of standard answers corresponding to the user questions.
In one possible implementation, the question-answering module 1130 includes:
an input unit for inputting the user question and the N information sources;
a construction unit, configured to construct a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph comprises a user question node and N information source nodes; the heterogeneous graph characterizes the relationship between the user question and the N information sources;
an encoding unit, configured to encode the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
an updating unit, configured to update the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node;
and the decoding unit is used for decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
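A minimal sketch of the construction unit's preset rule follows; the node-type and edge-type names used here are illustrative assumptions, since the specification does not fix a naming scheme:

```python
def build_hetero_graph(user_question, sources):
    """sources: list of (node_id, node_type, text, related) tuples, where
    `related` lists (other_node_id, edge_type) pairs among the sources.
    Returns typed nodes plus typed edges linking the question node to every
    information-source node and related sources to each other."""
    nodes = {"q": {"type": "question", "text": user_question}}
    edges = []
    for nid, ntype, text, related in sources:
        nodes[nid] = {"type": ntype, "text": text}
        edges.append(("q", nid, "question-" + ntype))  # question <-> source edge
        for other, etype in related:
            edges.append((nid, other, etype))          # source <-> source edge
    return nodes, edges
```

The returned edge triples carry an edge type, which the later attention and message-passing steps condition on.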
In a possible implementation manner, the text information corresponding to each node in the heterogeneous graph includes at least one word;
the encoding unit includes:
a coding subunit, configured to code each word in the text information corresponding to each node to obtain a vector corresponding to each word;
and the average pooling subunit is used for performing average pooling on the vector corresponding to each word in each node to obtain the vector corresponding to the text information of each node.
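The encoding subunit and the average-pooling subunit can be sketched as below; the word-embedding function `embed` is an assumed stand-in for the actual word encoder:

```python
import numpy as np

def encode_node_text(words, embed):
    # Encode each word of the node's text to a vector, then average-pool
    # the word vectors to obtain the single vector for the node's text.
    word_vecs = np.stack([embed(w) for w in words])
    return word_vecs.mean(axis=0)
```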
In a possible implementation manner, the update unit includes:
a calculation subunit, configured to calculate a first attention score between every two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node;
an adjusting subunit, configured to readjust the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node, to obtain a second attention score;
a first determining subunit, configured to determine the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
In a possible implementation manner, the calculating subunit is specifically configured to:
projecting a vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node; the first vector and the second vector corresponding to each node correspond to each other one by one;
and calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In a possible implementation manner, the update unit further includes:
a second determining subunit, configured to determine a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
the calculation subunit is specifically configured to:
projecting the vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and projecting the vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node; a mapping relation exists between the first vector and the second vector;
determining a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node, and the edge type between the source node and the target node.
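The edge-type-aware first attention score can be sketched as below, with the source vector projected to a first space and the target vector to a second space; the scaled bilinear form and the per-edge-type matrix `W_edge` are assumptions about one reasonable instantiation:

```python
import numpy as np

def first_attention_score(src_vec, tgt_vec, W_first, W_second, W_edge):
    first = W_first @ src_vec      # first vector: source projected to the first space
    second = W_second @ tgt_vec    # second vector: target projected to the second space
    # The edge-type-specific matrix W_edge relates the two spaces; the score
    # is scaled by sqrt(d), as in standard scaled attention.
    return float(second @ W_edge @ first) / np.sqrt(len(second))
```

Using a distinct `W_edge` per edge type lets the same pair of node vectors score differently depending on how the nodes are related.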
In a possible implementation manner, the adjusting subunit is specifically configured to:
determine the correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the edge type between the question node and the source node;
and determine the second attention score based on the correlation between the source node and the question node and the first attention score.
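A sketch of the adjusting subunit: the source node's relevance to the question rescales the first attention score. The sigmoid-gated bilinear relevance used here (with an edge-type matrix `W_edge`) is an assumed form, not the patent's exact formula:

```python
import numpy as np

def second_attention_score(first_score, question_vec, src_vec, W_edge):
    # Correlation between the question node and the source node, squashed
    # to (0, 1), then used to rescale the first attention score.
    relevance = 1.0 / (1.0 + np.exp(-(question_vec @ W_edge @ src_vec)))
    return first_score * relevance
```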
In a possible implementation manner, the target node corresponds to M source nodes; m is a positive integer;
the first determining subunit is specifically configured to:
determining a first message transmitted from the source node to the target node based on a vector corresponding to the text information of the source node and an edge type between the source node and the target node;
determining a second message to be delivered by the source node to the target node based on the first message and the second attention score;
performing weighted summation on the M second messages transmitted to the target node by the M source nodes corresponding to the target node to obtain third messages transmitted to the target node by all the source nodes;
and applying a linear projection with nonlinear activation and a residual connection based on the third message and the vector corresponding to the text information of the target node, to obtain the updated vector corresponding to each node.
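The three messages above can be sketched as follows; normalizing the second attention scores with a softmax over the M source nodes is an assumption, as the specification only states a weighted summation:

```python
import numpy as np

def third_message(target_id, edges, node_vecs, scores, edge_mats):
    # first message: edge-type projection of each source node's vector;
    # second message: first message weighted by its (softmaxed) attention score;
    # third message: the weighted sum over the target's M source nodes.
    incoming = [(src, etype) for src, tgt, etype in edges if tgt == target_id]
    raw = np.array([scores[(src, target_id)] for src, _ in incoming])
    w = np.exp(raw - raw.max())
    w /= w.sum()
    total = np.zeros_like(node_vecs[target_id])
    for (src, etype), wi in zip(incoming, w):
        total += wi * (edge_mats[etype] @ node_vecs[src])
    return total
```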
In a possible implementation manner, the decoding unit includes:
a decoding subunit, configured to decode the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and an updated encoding vector corresponding to each node;
a fusion subunit, configured to fuse the question decoding vector with the updated encoding vector corresponding to each node to obtain a target vector;
and the query subunit is used for querying from a preset vocabulary according to the target vector to obtain a target answer.
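A sketch of the decoding unit follows. Simplifying assumptions: the updated node vectors are fused by a mean plus a concatenation-based projection, and the vocabulary query is a nearest-embedding lookup; the real decoder would typically attend over the nodes instead:

```python
import numpy as np

def generate_answer_token(question_vec, node_vecs, W_fuse, vocab_embeds, vocab_words):
    # Fuse the question decoding vector with the updated node encoding vectors
    # into a target vector, then query the preset vocabulary for the best match.
    context = np.mean(list(node_vecs.values()), axis=0)
    target = np.tanh(W_fuse @ np.concatenate([question_vec, context]))
    scores = vocab_embeds @ target          # similarity to each vocabulary entry
    return vocab_words[int(np.argmax(scores))]
```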
The division of the modules in the data processing apparatus is only for illustration, and in other embodiments, the data processing apparatus may be divided into different modules as needed to complete all or part of the functions of the data processing apparatus. The implementation of each module in the data processing apparatus provided in the embodiments of the present specification may be in the form of a computer program. The computer program may be run on a terminal or a server. The program modules constituted by the computer program may be stored on the memory of the terminal or the server. The computer program, when executed by a processor, implements all or part of the steps of the data processing method described in the embodiments of the present specification.
Referring to fig. 12, fig. 12 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure. As shown in fig. 12, the electronic device 1200 may include: at least one processor 1210, at least one communication bus 1220, a user interface 1230, at least one network interface 1240, and a memory 1250. The communication bus 1220 may be used for implementing connection communication of the above components.
The user interface 1230 may include a display screen (Display) and a camera (Camera); optionally, the user interface 1230 may also include a standard wired interface and a wireless interface.
The network interface 1240 may optionally include a bluetooth module, a Near Field Communication (NFC) module, a Wireless Fidelity (Wi-Fi) module, and the like.
Processor 1210 may include one or more processing cores. The processor 1210 connects various parts of the electronic device 1200 through various interfaces and lines, and performs the various functions of the electronic device 1200 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 1250 and invoking the data stored in the memory 1250. Optionally, the processor 1210 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 1210 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing the content to be displayed by the display screen; the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 1210 and may instead be implemented by a separate chip.
The Memory 1250 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1250 includes a non-transitory computer readable medium. The memory 1250 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 1250 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a receiving function, a querying function, a question and answer function, etc.), instructions for implementing the various method embodiments described above, and the like; the storage data area may store data and the like referred to in the above respective method embodiments. Memory 1250 can also optionally be at least one memory device located remotely from the aforementioned processor 1210. As shown in fig. 12, the memory 1250, which is a type of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and program instructions.
In particular, processor 1210 may be used to call program instructions stored in memory 1250 and perform the following operations in particular:
a user question input by a user is received.
Querying a preset question-answer data set to obtain N information sources based on the user question and N preset information source types; N is a positive integer greater than or equal to 2; at least two of the N information sources are related to each other.
Inputting the user question and the N information sources into a question-answering model, and outputting a target answer; the question-answering model is obtained by training based on a plurality of user questions, N information sources corresponding to the user questions and a plurality of standard answers corresponding to the user questions.
In some possible embodiments, the processor 1210 inputs the user question and the N information sources into a question-answering model, and when outputting a target answer, is specifically configured to:
and inputting the user question and the N information sources.
Constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph comprises a user question node and N information source nodes; the heterogeneous graph characterizes the relationship between the user question and the N information sources.
Encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node.
Updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node.
And decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
In some possible embodiments, the text information corresponding to each node in the heterogeneous graph includes at least one word;
the processor 1210 is specifically configured to, when encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node, perform:
and coding each word in the text information corresponding to each node to obtain a vector corresponding to each word.
And averagely pooling the vector corresponding to each word in each node to obtain the vector corresponding to the text information of each node.
In some possible embodiments, the processor 1210 is configured to update the vector corresponding to the text information of each node based on the heterogeneous graph, and when obtaining the updated vector corresponding to each node, specifically perform:
Calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node.
Readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score.
And determining a vector corresponding to each updated node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
In some possible embodiments, when the processor 1210 calculates the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node, the processor is specifically configured to:
Project the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node; the first vector and the second vector corresponding to each node correspond to each other one by one.
Calculate a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In some possible embodiments, before the processor 1210 calculates the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node, the processor is further configured to:
Determine a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
the processor 1210 is specifically configured to perform, when calculating the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node:
projecting the vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and projecting the vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node; there is a mapping relationship between the first vector and the second vector.
Determining a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node, and the edge type between the source node and the target node.
In some possible embodiments, the processor 1210 is specifically configured to, when readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain the second attention score, perform:
Determining the correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the edge type between the question node and the source node.
Determining the second attention score based on the correlation between the source node and the question node and the first attention score.
In some possible embodiments, the target node corresponds to M source nodes; m is a positive integer;
the processor 1210 is specifically configured to, when determining a vector corresponding to each updated node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node, execute:
and determining a first message transmitted from the source node to the target node based on a vector corresponding to the text information of the source node and the edge type between the source node and the target node.
Determining a second message to be delivered by the source node to the target node based on the first message and the second attention score.
And performing weighted summation on the M second messages transmitted to the target node by the M source nodes corresponding to the target node to obtain third messages transmitted to the target node by all the source nodes.
And applying a linear projection with nonlinear activation and a residual connection based on the third message and the vector corresponding to the text information of the target node to obtain the updated vector corresponding to each node.
In some possible embodiments, the processor 1210 is configured to decode the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node, and when obtaining the target answer, is specifically configured to:
Decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and an updated encoding vector corresponding to each node.
Fusing the question decoding vector with the updated encoding vector corresponding to each node to obtain a target vector.
And inquiring from a preset vocabulary according to the target vector to obtain a target answer.
The present specification also provides a computer readable storage medium having stored therein instructions, which when run on a computer or processor, cause the computer or processor to perform one or more of the steps of the above embodiments. The respective constituent modules of the data processing apparatus may be stored in the computer-readable storage medium if they are implemented in the form of software functional units and sold or used as independent products.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The processes or functions described above in accordance with the embodiments of this specification are all or partially performed when the computer program instructions described above are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that includes one or more of the available media. The usable medium may be a magnetic medium (e.g., a flexible Disk, a hard Disk, a magnetic tape), an optical medium (e.g., a Digital Versatile Disk (DVD)), a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and can include the processes of the embodiments of the methods described above when the program is executed. And the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks. The technical features in the present examples and embodiments may be arbitrarily combined without conflict.
The above-described embodiments are merely preferred embodiments of the present disclosure, and are not intended to limit the scope of the present disclosure, and various modifications and improvements made to the technical solutions of the present disclosure by those skilled in the art without departing from the design spirit of the present disclosure should fall within the protection scope defined by the claims.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Claims (12)

1. A method of data processing, the method comprising:
receiving a user question input by a user;
querying a preset question-answer data set to obtain N information sources based on the user question and N preset information source types; N is a positive integer greater than or equal to 2; at least two of the N information sources are related to each other;
inputting the user question and the N information sources into a question-answering model, and outputting a target answer; the question-answering model is obtained by training based on a plurality of user questions, N information sources corresponding to the user questions and a plurality of standard answers corresponding to the user questions.
2. The method of claim 1, wherein inputting the user question and the N information sources into a question-answering model and outputting a target answer comprises:
inputting the user question and the N information sources;
constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph comprises a user question node and N information source nodes; the heterogeneous graph characterizes the relationship between the user question and the N information sources;
encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node;
and decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
3. The method of claim 2, wherein the text information corresponding to each node in the heterogeneous graph comprises at least one word;
the encoding the text information corresponding to each node in the heterogeneous graph to obtain the vector corresponding to the text information of each node comprises:
coding each word in the text information corresponding to each node to obtain a vector corresponding to each word;
and averagely pooling the vector corresponding to each word in each node to obtain the vector corresponding to the text information of each node.
4. The method according to claim 2, wherein the updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node comprises:
calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node;
readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score;
and determining a vector corresponding to each updated node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node and the edge type between the source node and the target node.
5. The method of claim 4, wherein the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node comprises:
projecting the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node; the first vector and the second vector corresponding to each node correspond to each other one by one;
and calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
6. The method of claim 4, wherein before the calculating of the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node, the method further comprises:
determining a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node comprises:
projecting the vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and projecting the vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node; a mapping relation exists between the first vector and the second vector;
determining a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node, and the edge type between the source node and the target node.
7. The method of claim 4, wherein the readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score comprises:
determining the correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the edge type between the question node and the source node;
determining the second attention score based on the correlation between the source node and the question node and the first attention score.
8. The method of claim 4, the target node corresponding to M source nodes; m is a positive integer;
the determining a vector corresponding to each updated node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node includes:
determining a first message transferred to the target node by the source node based on a vector corresponding to the text information of the source node and an edge type between the source node and the target node;
determining a second message that the source node delivers to the target node based on the first message and the second attention score;
carrying out weighted summation on M second messages transmitted to the target node by the M source nodes corresponding to the target node to obtain third messages transmitted to the target node by all the source nodes;
and applying a linear projection with nonlinear activation and a residual connection based on the third message and the vector corresponding to the text information of the target node to obtain an updated vector corresponding to each node.
9. The method of claim 2, wherein the decoding, based on the updated vector corresponding to each node, a vector corresponding to the text information of the user question node to obtain a target answer comprises:
decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and an updated encoding vector corresponding to each node;
fusing the question decoding vector with the updated encoding vector corresponding to each node to obtain a target vector;
and inquiring from a preset vocabulary according to the target vector to obtain a target answer.
10. A data processing apparatus, the apparatus comprising:
the receiving module is used for receiving a user question input by a user;
the query module is used for querying a preset question-answer data set to obtain N information sources based on the user question and N preset information source types; N is a positive integer greater than or equal to 2; at least two of the N information sources are related to each other;
the question-answering module is used for inputting the user question and the N information sources into a question-answering model and outputting a target answer; the question-answer model is obtained by training based on a plurality of user questions, N information sources corresponding to the user questions and a plurality of standard answers corresponding to the user questions.
11. An electronic device, comprising: a processor and a memory;
the processor is connected with the memory;
the memory for storing executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory for performing the method of any one of claims 1-9.
12. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method steps according to any of claims 1-9.
CN202210078998.1A 2022-01-24 Data processing method, device, electronic equipment and computer storage medium Active CN114443824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210078998.1A CN114443824B (en) 2022-01-24 Data processing method, device, electronic equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN114443824A true CN114443824A (en) 2022-05-06
CN114443824B CN114443824B (en) 2024-08-02


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114817512A (en) * 2022-06-28 2022-07-29 Tsinghua University Question-answer reasoning method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104216913A (en) * 2013-06-04 2014-12-17 SAP SE Question answering framework
US20160217209A1 (en) * 2015-01-22 2016-07-28 International Business Machines Corporation Measuring Corpus Authority for the Answer to a Question
US20200004875A1 (en) * 2018-06-29 2020-01-02 International Business Machines Corporation Query expansion using a graph of question and answer vocabulary
US20200134414A1 (en) * 2018-10-29 2020-04-30 International Business Machines Corporation Determining rationale of cognitive system output
US20200348659A1 (en) * 2019-05-03 2020-11-05 Chevron U.S.A. Inc. Automated model building and updating environment
US20200403945A1 (en) * 2019-06-19 2020-12-24 International Business Machines Corporation Methods and systems for managing chatbots with tiered social domain adaptation
CN112749265A (en) * 2021-01-08 2021-05-04 Harbin Institute of Technology Intelligent question-answering system based on multiple information sources
CN112948546A (en) * 2021-01-15 2021-06-11 Aerospace Information Research Institute, Chinese Academy of Sciences Intelligent question-answering method and device for multi-source heterogeneous data sources
US20210217109A1 (en) * 2020-09-28 2021-07-15 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus of constructing a fused relationship network, electronic device and medium
CN113449038A (en) * 2021-06-29 2021-09-28 Northeastern University Mine intelligent question-answering system and method based on autoencoder


Similar Documents

Publication Publication Date Title
CN111695674B (en) Federated learning method, apparatus, computer device, and computer-readable storage medium
CN110147435B (en) Dialogue generation method, device, equipment and storage medium
CN114330312A (en) Title text processing method, apparatus, storage medium, and program
CN114330474B (en) Data processing method, device, computer equipment and storage medium
CN111241850B (en) Method and device for providing business model
CN115310611B (en) Person intention reasoning method and related device
CN111291170A (en) Session recommendation method based on intelligent customer service and related device
CN113569017A (en) Model processing method and device, electronic equipment and storage medium
CN112906361A (en) Text data labeling method and device, electronic equipment and storage medium
CN114358023B (en) Intelligent question-answer recall method, intelligent question-answer recall device, computer equipment and storage medium
WO2022188534A1 (en) Information pushing method and apparatus
CN115098700A (en) Knowledge graph embedding and representing method and device
CN113362852A (en) User attribute identification method and device
CN111460113A (en) Data interaction method and related equipment
CN116958738A (en) Training method and device of picture recognition model, storage medium and electronic equipment
CN114443824A (en) Data processing method and device, electronic equipment and computer storage medium
CN114443824B (en) Data processing method, device, electronic equipment and computer storage medium
CN112052680B (en) Question generation method, device, equipment and storage medium
CN115438164A (en) Question answering method, system, equipment and storage medium
CN112346737B (en) Method, device and equipment for training programming language translation model and storage medium
CN111310460A (en) Sentence adjustment method and device
CN115510203B (en) Method, device, equipment, storage medium and program product for determining answers to questions
CN115102852B (en) Internet of things service opening method and device, electronic equipment and computer medium
US20220286416A1 (en) Method and apparatus for generating account intimacy
CN116776870B (en) Intention recognition method, device, computer equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant