CN116383354A

CN116383354A - Automatic visual question-answering method based on knowledge graph

Info

Publication number: CN116383354A
Application number: CN202310277208.7A
Authority: CN
Inventors: 宋思程; 陈俊潼; 李晨辉; 王长波
Original assignee: East China Normal University
Current assignee: East China Normal University
Priority date: 2023-03-21
Filing date: 2023-03-21
Publication date: 2023-07-04

Abstract

The invention discloses a knowledge-graph-based automatic question-answering method for graph visualizations. A knowledge base expansion module extends the semantic and graph attributes of the knowledge graph; shallow semantic parsing and named entity recognition extract the topic entities in a question; an automatic question-answering model based on reinforcement learning and BERT outputs a knowledge path from the topic entities to the answer entities; and text answers and visual answers are output according to the answer type. Compared with the prior art, the method addresses automatic question answering for graph visualization, a complex visualization type, constructs the knowledge graph from the data characteristics of the graph visualization, and uses reinforcement learning and BERT to improve the robustness and generalization of question answering. It can automatically answer questions about network graph visualizations in practical application scenarios such as research on character relationships in literature, intelligent education, commodity and cargo analysis, and research hotspot analysis, and therefore has high practical value and good development prospects.

Description

Automatic visual question-answering method based on knowledge graph

Technical Field

The invention relates to the technical field of knowledge graphs for graph visualization in natural language processing, and in particular to a knowledge-graph-based automatic question-answering method for graph visualizations.

Background

Graph visualization is a common type of visualization that users rely on to analyze topological data and answer questions when facing decision tasks. However, answering complex analytical questions about a graph visualization is not easy. The user must first understand the question and then, while reading the visualization, carry out steps that vary with the question: interpreting the legend and text, perceiving the graph data, analyzing communities, and computing various graph attributes.

Some natural language systems are already used for automatic chart question answering, helping users analyze charts and answer questions faster. However, most existing work focuses on simple charts such as scatter plots, line charts, and bar charts, which are easily converted into tabular data and are therefore well suited to table question-answering systems. Graph visualization is typically used to represent topological relationships between nodes, such as the relationships between characters in a novel, and people analyze the data and answer questions by reading the graph. In practice, the questions users ask about a graph visualization involve not only the topological relations of nodes but also properties specific to graphs, such as node degree and graph communities. Because a graph visualization carries more attributes than a simple chart and graph data are not two-dimensional in structure, it is difficult to convert a graph visualization into tabular data and apply existing chart question-answering methods. In addition, most existing automatic question-answering systems answer with text only, which is tedious to read and departs from the original purpose of visualizing the data. Pairing a visual answer with the text answer makes the answer more useful, more convincing, and more interpretable.

Disclosure of Invention

The object of the invention is to provide, in view of the shortcomings of the prior art, a knowledge-graph-based automatic question-answering method for graph visualizations. The method stores the information of a graph visualization in a knowledge graph and extends the semantic and graph attributes of the knowledge graph through a knowledge base expansion module. It extracts the topic entities in a question through shallow semantic parsing and named entity recognition, outputs a knowledge path from the topic entities to the answer entities through an automatic question-answering model based on reinforcement learning and BERT, and finally outputs text answers and visual answers according to the answer type. The method addresses automatic question answering for graph visualization, a complex visualization type; it constructs the knowledge graph from the data characteristics of the graph visualization, improves the robustness and generalization of question answering through reinforcement learning and BERT, and can automatically answer questions about network graphs in practical application scenarios such as research on character relationships in literature, intelligent education, commodity and cargo analysis, and research hotspot analysis.

The specific technical solution for achieving the object of the invention is a knowledge-graph-based automatic question-answering method for graph visualizations, characterized in that a framework, GVGA (graph visualization question answering), is used to answer questions about a graph visualization and to generate visual answers automatically. The graph visualization is first converted into the standard GML format, and the graph data and visual attributes are extracted from the GML. A data structure for the graph is designed according to the characteristics of the graph, and the graph data structure is converted into a knowledge graph. An expansion module is designed for the knowledge base to add semantic information and graph analysis data. Shallow semantic parsing and named entity recognition extract the topic entities in the question, an automatic question-answering model based on reinforcement learning and BERT outputs the knowledge path from the topic entities to the answer entities, and text answers and visual answers are finally output according to the answer type. The method comprises the following steps:

Step a: input a graph visualization G and a question Q, and convert G into the standard GML format.
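As an illustration of step a, the sketch below serializes a small node-link graph to GML text using only the Python standard library. The helper name `to_gml`, the input layout, and the use of a `value` key for the edge weight are assumptions made for this example; the patent does not specify how the conversion is implemented.

```python
def to_gml(nodes, edges, directed=False):
    """Serialize a node/edge list into GML text (hypothetical helper)."""
    lines = ["graph [", f"  directed {1 if directed else 0}"]
    for node_id, label in nodes:
        lines += ["  node [", f"    id {node_id}", f'    label "{label}"', "  ]"]
    for source, target, weight in edges:
        lines += [
            "  edge [",
            f"    source {source}",
            f"    target {target}",
            f"    value {weight}",  # assumed key name for the edge weight
            "  ]",
        ]
    lines.append("]")
    return "\n".join(lines)


# Two characters and one weighted edge, as in the worked example.
nodes = [(0, "Myriel"), (1, "Napoleon")]
edges = [(0, 1, 1)]
gml_text = to_gml(nodes, edges)
print(gml_text)
```

A real pipeline would also have to recover the nodes and edges from the rendered visualization (for instance from a D3 or ECharts specification), which this sketch takes as given.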

Step b: convert the native data in the graph visualization into entities and native triples and store them in a knowledge graph $\mathcal{K} = \{(h, r, t) \mid h, t \in E,\ r \in R\}$, where $h$ denotes a head entity, $r$ a relation, $t$ a tail entity, $E$ the entity set, and $R$ the relation set. The native data comprise the node information, edge information, weights, and other data originally contained in the graph visualization.
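The conversion of native data into (head, relation, tail) triples can be sketched as follows. The identifier scheme (`node:0`, `edge:0`) and the relation names (`name`, `source`, `target`, `weight`) are illustrative assumptions, not the schema used in the patent.

```python
def native_triples(nodes, edges):
    """Build (head, relation, tail) triples from raw graph data.

    Entity and relation names are illustrative, not the patent's schema.
    """
    triples = []
    for node_id, label in nodes:
        triples.append((f"node:{node_id}", "name", label))
    for index, (source, target, weight) in enumerate(edges):
        edge_id = f"edge:{index}"
        triples.append((edge_id, "source", f"node:{source}"))
        triples.append((edge_id, "target", f"node:{target}"))
        triples.append((edge_id, "weight", weight))
    return triples


triples = native_triples([(0, "Myriel"), (1, "Napoleon")], [(0, 1, 1)])
entities = {h for h, _, _ in triples}   # head entities that appear
relations = {r for _, r, _ in triples}  # relation set R
```

Treating each edge as an entity in its own right, as the description does, is what lets edge attributes such as weight be stored as ordinary triples.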

Step c: expand the knowledge graph $\mathcal{K}$ and store the derived triples obtained from the expansion back into the knowledge graph $\mathcal{K}$. The expansion comprises semantic information expansion and graph attribute expansion. Derived triples are computed from native triples or from other derived triples. The expanded knowledge graph contains the graph topology information, graph attribute information, and graph semantic information of the graph visualization.

Graph topology information: each node and each edge of the network graph is treated as an entity, and the connection relations of edges and the neighbor relations of nodes serve as relation predicates from which triples are constructed. Entity names are added as metadata (annotation) triples. A graph entity is introduced to represent the whole graph, together with triples stating that the edges and nodes belong to the graph.

Graph attribute information: the indicators commonly used in network graph analysis, including node degree (in-degree and out-degree), node degree centrality, node clustering coefficient (aggregation degree), edge weight, and community information. Communities of the graph are detected with the Louvain algorithm; an entity is created for each community and given attributes such as average degree, average edge weight, and average clustering coefficient; the membership of communities in the graph and of nodes and edges in communities is also added to the knowledge graph.

Graph semantic information: a predicate alias mechanism replaces the relation $r$ in an already-added triple with a semantically informative alias $r_s$ and adds the resulting triple to the knowledge graph again; a single relation may be re-added several times. The relations that may have predicate aliases include node degree, node degree centrality, node clustering coefficient, edge weight, community membership, community average weight, community average degree, community average clustering coefficient, edge connection relation, and node neighbor relation.
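A minimal sketch of the graph attribute expansion and the predicate alias mechanism, restricted to degree and degree centrality on an undirected edge list. The function names, relation names, and alias strings are hypothetical; community detection with the Louvain algorithm and the other indicators are omitted here.

```python
from collections import defaultdict


def derive_degree_triples(edge_list):
    """Compute degree and degree centrality for an undirected edge list."""
    degree = defaultdict(int)
    for source, target in edge_list:
        degree[source] += 1
        degree[target] += 1
    n = len(degree)  # only nodes that appear in at least one edge
    derived = []
    for node, d in sorted(degree.items()):
        derived.append((node, "degree", d))
        # Degree centrality: degree divided by the maximum possible degree.
        centrality = d / (n - 1) if n > 1 else 0.0
        derived.append((node, "degree_centrality", centrality))
    return derived


def add_predicate_aliases(triples, aliases):
    """Re-add each triple once per alias of its relation (r replaced by r_s)."""
    extra = [(h, alias, t) for h, r, t in triples for alias in aliases.get(r, [])]
    return triples + extra


edge_list = [("A", "B"), ("A", "C")]
derived = derive_degree_triples(edge_list)
# Hypothetical alias table; the patent does not list the alias strings.
aliases = {"degree": ["number of connections", "number of neighbors"]}
expanded = add_predicate_aliases(derived, aliases)
```

Storing the aliases as extra triples means a question phrased as "number of neighbors" can be matched against the same value as one phrased as "degree".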

Step d: extract the topic entities in the question Q input in step a through a topic entity extraction module, and output a candidate set of topic entities $\mathcal{T} = \{\langle t_1, c_1\rangle, \langle t_2, c_2\rangle, \ldots\}$, where $t_i$ is an entity name and $c_i$ is a confidence score in the interval [0, 1]; the higher the score, the more reliable the predicted entity.

The topic entity extraction module comprises named entity recognition, multi-token entity recognition, edge entity matching, and graph entity addition, as follows:

1) Named entity recognition: the input question is tokenized, named entity recognition is performed on each token in turn using the NER model in Flair, and the POS information of the sentence is obtained with a SequenceTagger. Tokens whose POS tag is a noun are matched against all entity names in the knowledge graph $\mathcal{K}$; if a corresponding match exists, it is added to the candidate set $\mathcal{T}$ as $\langle t_{NER}, c_{NER}\rangle$, where $t_{NER}$ is the entity output by the NER model and $c_{NER}$ is the score output by the NER model.

2) Multi-token entity recognition: entities whose names contain spaces, that is, entities corresponding to several tokens, are matched with an N-gram method, with N taking the values 1 to 4 in turn. If a match succeeds, $\langle t_{NGram}, c = \max(c_n)\rangle$ $(c_n \in t)$ is added to the candidate set $\mathcal{T}$.

3) Edge entity recognition: if several node entities are recognized, check whether the token between two entities is a conjunction; if so, check whether the two entities are connected in the graph; if they are, add the edge entity to the candidate set $\mathcal{T}$. With the confidence scores of the two entities denoted $c_a$ and $c_b$, the added entity is $\langle t_{edge}, c = \max(c_a, c_b)\rangle$.

4) Graph entity addition: since every question relates to the graph, if $\mathcal{T} = \emptyset$ after the above steps, the graph entity $\langle t_{graph}, c = 1\rangle$ is added to the candidate set $\mathcal{T}$; if instead $\mathcal{T} \neq \emptyset$, $\langle t_{graph}, c = 0.4\rangle$ is added to the candidate set $\mathcal{T}$.
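The matching logic of sub-steps 2) to 4) can be sketched without the Flair models by passing per-token scores in directly. Everything named here (`extract_candidates`, the `edges` lookup, the entity labels) is hypothetical, the conjunction check is reduced to the single word "and", and trying the longest N-gram first is a design choice of this sketch rather than something the patent specifies.

```python
def extract_candidates(tokens, scores, entity_names, edges, max_n=4):
    """Return a list of (entity, confidence) pairs for one question.

    tokens/scores stand in for the NER output; edges maps a frozenset of two
    node names to an edge entity name. All names are illustrative.
    """
    candidates = []
    positions = []  # (start, end, name, confidence) of matched node entities
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest n-gram first so multi-token names win.
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            name = " ".join(tokens[i:i + n])
            if name in entity_names:
                confidence = max(scores[i:i + n])
                candidates.append((name, confidence))
                positions.append((i, i + n, name, confidence))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    # Edge entity: two consecutive matches separated by a conjunction.
    for (_, end_a, a, c_a), (start_b, _, b, c_b) in zip(positions, positions[1:]):
        between = [tok.lower() for tok in tokens[end_a:start_b]]
        if between == ["and"] and frozenset((a, b)) in edges:
            candidates.append((edges[frozenset((a, b))], max(c_a, c_b)))
    # Graph entity: confidence 1 if nothing else matched, otherwise 0.4.
    candidates.append(("graph", 1.0 if not candidates else 0.4))
    return candidates


tokens = "What is the weight of the edge between Myriel and Napoleon ?".split()
scores = [1.0 if tok in {"Myriel", "Napoleon"} else 0.0 for tok in tokens]
edges = {frozenset(("Myriel", "Napoleon")): "edge(Myriel, Napoleon)"}
result = extract_candidates(tokens, scores, {"Myriel", "Napoleon"}, edges)
fallback = extract_candidates(["How", "many", "nodes"], [0.0, 0.0, 0.0], set(), {})
```

With these inputs, `result` holds the two node entities, the edge entity, and the graph entity at confidence 0.4, while `fallback` holds only the graph entity at confidence 1.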

Step e: construct candidate query graphs for the candidate set $\mathcal{T}$. A query graph contains a core relation path from the topic entity to the answer entity and is defined as QG = {N, E}, where N is the node set with four node types, $n_i \in \{n_g, n_{ug}, n_{ag}, n_{an}\}$: $n_g$ is a grounded node, representing an entity that exists in the knowledge graph; $n_{ug}$ is an ungrounded node, which can represent several entities or an intermediate query result; $n_{ag}$ is an aggregation node, used to perform an aggregation operation; and $n_{an}$ is an answer node, representing the answer to the query. E is the edge set $\{e_1, e_2, \ldots\}$, where each $e_i$ is a relation $r$ from a triple of the knowledge graph.

Step f: iteratively expand the query graphs. Let $\mathcal{Q}_t$ denote the set of query graphs at generation $t$, starting from $t = 0$. In each round of selection, for every query graph in $\mathcal{Q}_t$, an attempt is made to attach a feasible relation or a keyword-based aggregation operation to the end of the query graph. A feasible relation is a relation that exists in the knowledge graph $\mathcal{K}$ and whose entity is the tail node of the query graph. If the current G contains only one topic entity, the tail is designated as $n_{an}$ after the feasible relation is attached; if an $n_{an}$ already exists, the original $n_{an}$ is changed to $n_{ug}$ and the new $n_{an}$ created by the attachment becomes the new tail node.
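The query graph structure and the tail-node rule of the expansion step can be sketched with a small data class. The class and method names, and the relation names used in the usage lines, are assumptions for illustration; aggregation operations are not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeType(Enum):
    GROUNDED = "n_g"      # entity that exists in the knowledge graph
    UNGROUNDED = "n_ug"   # several entities or an intermediate result
    AGGREGATION = "n_ag"  # aggregation operation
    ANSWER = "n_an"       # answer to the query


@dataclass
class QueryGraph:
    """QG = {N, E}: typed nodes joined by knowledge-graph relations."""
    nodes: list = field(default_factory=list)  # list of (name, NodeType)
    edges: list = field(default_factory=list)  # relation names along the path

    def attach(self, relation, name):
        """Attach a relation; the new node becomes the answer node and tail."""
        if self.nodes and self.nodes[-1][1] is NodeType.ANSWER:
            # An answer node already exists: demote it to an ungrounded node.
            old_name, _ = self.nodes[-1]
            self.nodes[-1] = (old_name, NodeType.UNGROUNDED)
        self.edges.append(relation)
        self.nodes.append((name, NodeType.ANSWER))
        return self


# Hypothetical two-hop path: a node, its neighbors, then their degree.
qg = QueryGraph(nodes=[("m.node@myriel", NodeType.GROUNDED)])
qg.attach("node.neighbor", "?x")
qg.attach("node.property.degree", "?y")
```

After the second call the first variable `?x` has been demoted to an ungrounded node and `?y` is the only answer node, which mirrors the rule stated above.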

Step g: compute the feature vector of each query graph and rank the query graphs with a reinforcement-learning-based candidate query graph ranking model. Step f is repeated with beam search, keeping the 3 highest-scoring query graphs in each round, until no query graph can be expanded further; the query graph with the highest score is then output as the optimal query graph.
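The search loop can be sketched as a generic beam search with width 3. The `expand` and `score` callables, the toy relation names, and the toy scores are placeholders for the expansion step and the learned ranking model. Returning the best candidate seen in any round, rather than only in the last one, is an assumption of this sketch, as is a non-empty initial set.

```python
def beam_search(initial, expand, score, beam_width=3):
    """Expand candidates round by round, keeping the top `beam_width`.

    Returns the highest-scoring candidate seen once nothing can be expanded.
    """
    beam = sorted(initial, key=score, reverse=True)[:beam_width]
    best = max(beam, key=score)
    while True:
        expanded = [child for cand in beam for child in expand(cand)]
        if not expanded:
            return best
        beam = sorted(expanded, key=score, reverse=True)[:beam_width]
        best = max([best] + beam, key=score)


# Toy problem: a "query graph" is a tuple of relation names.
next_relations = {
    (): ["neighbor", "degree", "community"],
    ("neighbor",): ["degree", "weight"],
    ("community",): ["average_degree"],
}
toy_scores = {
    ("neighbor",): 0.5, ("degree",): 0.4, ("community",): 0.3,
    ("neighbor", "degree"): 0.9, ("neighbor", "weight"): 0.2,
    ("community", "average_degree"): 0.6,
}


def expand(path):
    return [path + (rel,) for rel in next_relations.get(path, [])]


def score(path):
    return toy_scores.get(path, 0.0)


best_path = beam_search([()], expand, score)
```

In the toy data the search stops after two rounds of expansion and selects the path `("neighbor", "degree")`.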

The reinforcement-learning-based candidate query graph ranking model generates a 6-dimensional feature vector $v$ for each candidate query graph $q$, composed as follows:

1) BERT-based semantic matching: a standard BERT model is used to measure the semantic similarity between the tokenized question sequence $s_q$ and the query graph sequence $s_g$. $s_g$ is generated by concatenating, in order, the grounded entity names and relation names along the core relation path; the sequence [CLS] $s_q$ [SEP] $s_g$ is fed to the BERT model to compute their semantic similarity;

2) Entity confidence: the cumulative confidence score $c$ of all topic entities;

3) Number of entities: the number of $n_g$ nodes in the query graph;

4) Number of entity types: the number of distinct entity types;

5) Number of answer entities: the number of $n_{an}$ nodes in the query graph;

6) Number of aggregations: the number of $n_{ag}$ nodes in the query graph.
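The six features listed above can be assembled as in the following sketch. The BERT similarity is passed in as a placeholder number rather than computed, "cumulative confidence" is read as a plain sum, and all function and argument names are assumptions. In practice a BERT tokenizer normally inserts the [CLS] and [SEP] tokens itself when given a sentence pair; the string built by `bert_input` only shows the intended layout.

```python
def feature_vector(semantic_score, topic_confidences, node_types, entity_types):
    """Assemble the 6-dimensional feature vector of one candidate query graph.

    semantic_score stands in for the BERT similarity of [CLS] s_q [SEP] s_g.
    node_types lists the type tag of each query-graph node.
    """
    return [
        semantic_score,            # 1) BERT-based semantic matching
        sum(topic_confidences),    # 2) cumulative entity confidence
        node_types.count("n_g"),   # 3) number of grounded entities
        len(set(entity_types)),    # 4) number of entity types
        node_types.count("n_an"),  # 5) number of answer nodes
        node_types.count("n_ag"),  # 6) number of aggregation nodes
    ]


def bert_input(question_tokens, path_names):
    """Lay out the text sequence described for the BERT model."""
    s_q = " ".join(question_tokens)
    s_g = " ".join(path_names)
    return f"[CLS] {s_q} [SEP] {s_g}"


v = feature_vector(
    semantic_score=0.87,  # placeholder value, not a real model output
    topic_confidences=[1.0, 0.4],
    node_types=["n_g", "n_an"],
    entity_types=["edge"],
)
sequence = bert_input(
    ["weight", "of", "edge"], ["m.edge@myranap", "edge.property.weight"]
)
```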

The feature vector $v$ of each candidate query graph $q$ is fed into the fully connected layer of the reinforcement learning model to obtain $p(q \mid Q)$. The training objective is to learn a policy function $p_\theta(q \mid Q)$, where $\theta$ denotes the parameters of the model, using the F1 score between the predicted answer and the correct answer label as the reward.
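A minimal sketch of the two quantities named above: a policy that turns feature vectors into probabilities over candidates, and the F1 reward. Modeling the fully connected layer as a single linear score followed by a softmax over the candidates is an assumption of this sketch, and the parameter values are made up.

```python
import math


def f1_score(predicted, gold):
    """F1 between a predicted answer set and the gold answer set."""
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def policy(feature_vectors, weights, bias=0.0):
    """Softmax over one linear (fully connected) score per candidate."""
    logits = [sum(w * x for w, x in zip(weights, v)) + bias for v in feature_vectors]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


candidates = [[0.9, 1.4, 1, 1, 1, 0], [0.2, 1.4, 2, 1, 1, 1]]
theta = [1.0, 0.5, 0.0, 0.0, 0.0, -0.5]  # hypothetical parameters
probabilities = policy(candidates, theta)
reward = f1_score(predicted={"1"}, gold={"1"})
```

During training, the reward obtained by executing a sampled query graph would be used to adjust `theta`; the update rule itself is not described in the text and is left out.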

Step h: generate a SPARQL statement S from the head node to the tail node of the optimal query graph and execute it on the knowledge graph to obtain the text answer $A_t$, which may be a literal answer, an entity answer, or a statistical answer. According to the answer type, highlight the relevant elements in the graph visualization and draw an auxiliary information card, comprising a statistical bar chart, related communities, related nodes, and related edges, to generate the visual answer $A_v$.
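Generating the query from a core relation path can be sketched as a simple chain of triple patterns. The helper `to_sparql` is hypothetical and assumes at least one relation; the entity and relation identifiers follow the notation of the worked example and would need to be written as IRIs or prefixed names before a real SPARQL endpoint would accept them. Aggregation nodes are not handled.

```python
def to_sparql(head_entity, relations):
    """Chain relations from the head node to the tail (answer) node."""
    patterns = []
    subject = head_entity
    for index, relation in enumerate(relations, start=1):
        variable = f"?e{index}"
        patterns.append(f"{subject} {relation} {variable} .")
        subject = variable  # the next hop starts from this variable
    answer = f"?e{len(relations)}"
    return f"SELECT {answer} WHERE {{ {' '.join(patterns)} }}"


query = to_sparql("m.edge@myranap", ["edge.property.weight"])
print(query)
```

For a single-hop path this produces the same shape of statement as the example in the embodiment: one triple pattern whose object is the selected variable.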

Step i: output the text answer $A_t$ together with the visual answer $A_v$, completing the automatic question answering for the graph visualization.

Compared with the prior art, the invention addresses automatic question answering for graph visualization, a complex visualization type, constructs the knowledge graph from the data characteristics of the graph visualization, and improves the robustness and generalization of question answering by using reinforcement learning and BERT. The method can automatically answer questions about network graph visualizations in practical application scenarios such as research on character relationships in literature, intelligent education, commodity and cargo analysis, and research hotspot analysis, and therefore has high practical value and good development prospects.

Drawings

FIG. 1 is a schematic flow chart of the present invention;

FIG. 2 is a schematic diagram of the graph visualization of Example 1;

FIG. 3 is a schematic diagram of converting the graph visualization of Example 1 into GML;

FIG. 4 is a schematic diagram of the knowledge graph constructed in Example 1;

FIG. 5 is a schematic diagram of topic entity candidate set extraction in Example 1.

Detailed Description

Referring to FIG. 1, the data extraction and automatic question answering for a network graph visualization according to the invention comprise the following steps:

Step one: input a graph visualization G and a question Q;

Step two: convert G into the standard GML format and construct the knowledge graph $\mathcal{K}$;

Step three: generate derived triples and expand the knowledge graph $\mathcal{K}$;

Step four: extract topic entities from Q to obtain the topic entity candidate set $\mathcal{T}$;

Step five: input the knowledge graph $\mathcal{K}$ and the candidate set $\mathcal{T}$ into the automatic question-answering module based on BERT and reinforcement learning, which outputs the text answer $A_t$;

Step six: generate the visual answer $A_v$ and output it together with the text answer, completing the automatic question answering for the graph visualization.

The invention is described in further detail below, taking question answering over a visualization of the character relationship graph of a novel as an example.

Example 1

Referring to FIG. 2, step 1: input a graph visualization G of the characters of a novel and a set of questions Q.

Referring to FIG. 3, step 2: convert the graph visualization G into the standard GML format and construct the knowledge graph $\mathcal{K}$.

Referring to FIG. 4, the native triples containing the graph topology data are added to the knowledge graph $\mathcal{K}$ (b in FIG. 4), together with the entity of the overall graph and its membership triples (e in FIG. 4).

Step 3: the derived triples produced by semantic expansion (a and d in FIG. 4) and by graph attribute analysis expansion (e in FIG. 4) are added to the knowledge graph $\mathcal{K}$.

Referring to FIG. 5, step 4: extract a topic entity candidate set $\mathcal{T}$ for each question in the input Q. For example, the topic entity candidate set of the question "What is the weight of the edge between Myriel and Napoleon?" is {<"m.node@myriel":1>, <"m.node@napoleo":1>, <"m.edge@myranap":1>, <"m.grap@2y96ai":0.4>}.

Step 5: input the constructed knowledge graph and the topic entity candidate set into the automatic question-answering module, which outputs the optimal query graph and the corresponding SPARQL statement SELECT ?e1 WHERE { m.edge@myranap edge.property.weight ?e1 }; executing this query yields the text answer "1".

Step 6: the text answer "1" is a literal answer, and the associated graph visualization element is the edge connecting the nodes Myriel and Napoleon. The visual answer generation module therefore highlights the edge connecting the two nodes in the graph visualization and outputs an information card of its related attributes as auxiliary information.

Step 7: output the text answer together with the visual answer, completing the automatic question answering for the graph visualization.

The above description further illustrates the invention and does not limit the scope of the patent, which is defined by the appended claims and their equivalents, without departing from the spirit and scope of the inventive concept.

Claims

1. A knowledge-graph-based automatic question-answering method for graph visualizations, characterized in that a graph visualization is converted into the standard GML format, graph data and visual attributes are extracted from the GML, a data structure for the graph is designed according to the characteristics of the graph, and the graph data are converted into a knowledge graph on which automatic question answering for the graph visualization is performed, the method specifically comprising the following steps:

Step a: input a graph visualization G and a question Q, and convert G into the standard GML format;

Step b: convert the native data in the graph visualization into entities and native triples and store them in a knowledge graph $\mathcal{K} = \{(h, r, t) \mid h, t \in E,\ r \in R\}$, where $h$ denotes a head entity, $r$ a relation, $t$ a tail entity, $E$ the entity set, and $R$ the relation set, and the native data comprise the node information, edge information, and weights originally contained in the graph visualization;

Step c: expand the knowledge graph $\mathcal{K}$ and store the derived triples into the knowledge graph $\mathcal{K}$, wherein the knowledge graph expansion comprises semantic information expansion and graph attribute expansion, and the derived triples are computed from native triples or from other derived triples;

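Step d: extract the topic entities in the question Q input in step a through a topic entity extraction module, and output a candidate set of topic entities $\mathcal{T} = \{\langle t_1, c_1\rangle, \langle t_2, c_2\rangle, \ldots\}$, where $t_i$ is an entity name and $c_i$ is a confidence score in the interval [0, 1]; the higher the score, the more reliable the predicted entity;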

and e, step e: as a candidate set

Constructing a candidate query graph, wherein the query graph comprises a core relation path from the topic entity to the answer entity, and the query graph is defined as QG= { N, E }, wherein N is a node set, and four types of nodes are all arranged: n is n _i ∈{n _g ，n _ug ，n _ag ，n _an }，n _g A grounding node for representing an entity existing in the knowledge graph; n is n _ug A non-grounded node representing a plurality of entities or an intermediate query result; n is n _ag Aggregation nodes for performing aggregation operations; n is n _an Answer nodes representing answers to the query; for edge set { e, e ₂ ,..}, wherein e _i Is the relationship in the knowledge graph triplet;

Step f: iteratively expand the query graphs, letting $\mathcal{Q}_t$ denote the set of query graphs at generation $t$, starting from $t = 0$; in each round of selection, for every query graph in $\mathcal{Q}_t$, an attempt is made to attach a feasible relation or a keyword-based aggregation operation to the end of the query graph; a feasible relation is a relation that exists in the knowledge graph $\mathcal{K}$ and whose entity is the tail node of the query graph; if only one topic entity exists in the current visualization G, the tail is designated as $n_{an}$ after the feasible relation is attached; if an $n_{an}$ already exists, the original $n_{an}$ is changed to $n_{ug}$ and the new $n_{an}$ created by the attachment becomes the new tail node;

Step g: compute the feature vector of each query graph and rank the query graphs with a reinforcement-learning-based candidate query graph ranking model; step f is repeated with beam search, keeping the 3 highest-scoring query graphs in each round, until no query graph can be expanded further, and the query graph with the highest score is then output as the optimal query graph;

and h, step h: generating SPARQL statement S from head node to tail node according to the optimal query graph, and executing the query statement on the knowledge graph to obtain text answer A _t The text answer A _t For literal answers, entity answers and statistical answers, according to different answer types, relevant elements in the visualization of the graph are highlighted, and auxiliary information cards comprising statistical bar graphs, relevant communities, relevant nodes and relevant edges are drawn to generate visual answer A _v ；

2. The knowledge-graph-based automatic question-answering method for graph visualizations according to claim 1, characterized in that the graph visualization includes various network graphs, as vector graphics or bitmaps, drawn with the open-source visualization frameworks D3, ECharts, matplotlib, or scipy.

3. The knowledge-graph-based automatic question-answering method for graph visualizations according to claim 1, characterized in that the knowledge graph expansion covers the graph topology information, graph attribute information, and graph semantic information of the graph visualization; for the graph topology information, each node and each edge of the network graph is treated as an entity, the connection relations of edges and the neighbor relations of nodes serve as relation predicates from which triples are constructed, entity names are added as metadata (annotation) triples, and a graph entity is introduced to represent the whole graph, together with triples stating that the edges and nodes belong to the graph; the graph attribute information includes node degree, node degree centrality, node clustering coefficient (aggregation degree), edge weight, and community information; for the community information, communities of the graph are detected with the Louvain algorithm, an entity is created for each community, and the average degree, average edge weight, and average clustering coefficient of each community are added as attributes; for the graph semantic information, a predicate alias mechanism replaces the relation $r$ in an already-added triple with a semantically informative alias $r_s$ and adds the triple to the knowledge graph again, and a single relation may be re-added several times; the relations that may have predicate aliases include: node degree, node degree centrality, node clustering coefficient, edge weight, community membership, community average weight, community average degree, community average clustering coefficient, edge connection relation, and node neighbor relation.

4. The knowledge-graph-based automatic question-answering method for graph visualizations according to claim 1, characterized in that the topic entity extraction module comprises named entity recognition, multi-token entity recognition, edge entity matching, and graph entity addition; the named entity recognition tokenizes the input question, performs named entity recognition on each token in turn using the NER model in Flair, and obtains the POS information of the sentence with a SequenceTagger; if a corresponding match exists in the knowledge graph $\mathcal{K}$, it is added to the candidate set $\mathcal{T}$ as $\langle t_{NER}, c_{NER}\rangle$, where $t_{NER}$ is the entity output by the NER model and $c_{NER}$ is the score output by the NER model;

the multi-token entity recognition uses an N-gram method to match entities whose names contain spaces, that is, entities corresponding to several tokens, taking N = 1 to 4 in turn; if a match succeeds, $\langle t_{NGram}, c = \max(c_n)\rangle$ $(c_n \in t)$ is added to the candidate set $\mathcal{T}$, where $t_{NGram}$ is the entity identified by the N-gram method and $c_n$ is a confidence score;

the edge entity recognition, when several node entities have been recognized, checks whether the token between two entities is a conjunction; if so, it checks whether the two entities are connected in the graph; if they are, the edge entity is added to the candidate set $\mathcal{T}$; with the confidence scores of the two entities denoted $c_a$ and $c_b$, the added entity is $\langle t_{edge}, c = \max(c_a, c_b)\rangle$;

the graph entity addition is performed after the named entity and edge entity recognition: if $\mathcal{T} = \emptyset$, the graph entity $\langle t_{graph}, c = 1\rangle$ is added to the candidate set $\mathcal{T}$; if $\mathcal{T} \neq \emptyset$, $\langle t_{graph}, c = 0.4\rangle$ is added to the candidate set $\mathcal{T}$, where $\emptyset$ denotes the empty set.

5. The knowledge-graph-based automatic question-answering method for graph visualizations according to claim 1, characterized in that the reinforcement-learning-based candidate query graph ranking model generates a 6-dimensional feature vector $v$ for each candidate query graph $q$, composed of: BERT-based semantic matching, entity confidence, number of entities, number of entity types, number of answer entities, and number of aggregations; the feature vector $v$ of each candidate query graph $q$ is fed into the fully connected layer of the reinforcement learning model to obtain $p(q \mid Q)$; the training objective is to learn a policy function $p_\theta(q \mid Q)$, where $\theta$ denotes the parameters of the model and $p$ is the policy function, using the F1 score between the predicted answer and the correct answer label as the reward; the BERT-based semantic matching uses a standard BERT model to measure the semantic similarity between the tokenized question sequence $s_q$ and the query graph sequence $s_g$, where $s_g$ is generated by concatenating, in order, the grounded entity names and relation names along the core relation path, and the sequence [CLS] $s_q$ [SEP] $s_g$ is fed to the BERT model to compute their semantic similarity, [CLS] being the sentence start tag and [SEP] the tag separating the two sentences; the number of entities is the number of $n_g$ nodes in the query graph; the number of entity types is the number of distinct entity types; the number of answer entities is the number of $n_{an}$ nodes in the query graph; and the number of aggregations is the number of $n_{ag}$ nodes in the query graph.