CN112949421B - Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science - Google Patents

Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science

Info

Publication number
CN112949421B
Authority
CN
China
Prior art keywords
graph
relation
text
matching
science
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110167638.4A
Other languages
Chinese (zh)
Other versions
CN112949421A (en)
Inventor
余新国
彭饶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central China Normal University
Original Assignee
Central China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central China Normal University
Priority to CN202110167638.4A
Publication of CN112949421A
Application granted
Publication of CN112949421B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Tourism & Hospitality (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Technology (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method, a device, equipment and a storage medium for solving artificial intelligence science image-text questions. A science image-text question is recognized to obtain an electronic text and a vectorized image; the electronic text is converted into a vector sequence; a text model pool and a relational sub-graph model pool are selected according to the category information of the question; vector calculation matching is performed on the vector sequence according to the text model pool to obtain directly stated relations and/or implicit relations; relational sub-graph matching is performed on the vectorized image according to the relational sub-graph model pool to obtain relational sub-graphs, from which graph relations are obtained; a subset is selected from the resulting relation group as the question understanding result; and the question understanding result is solved to obtain a solving process. Based on relation evolution, model pools, graph relation sub-graphs, directly stated relations and implicit relations, the invention broadens the range of science questions that can be solved and improves solving efficiency.

Description

Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a method, device, equipment and storage medium for solving artificial intelligence science image-text questions.
Background
Science in basic education comprises three subjects: mathematics, physics and chemistry. An image-text question is a question that is described in natural language and accompanied by a diagram. Algebraic-relation science questions are questions that involve the calculation of algebraic relations, including physics algebra questions, chemistry algebra questions, arithmetic questions, algebra questions in mathematics, plane geometry calculation questions, plane geometry proof questions and the like. Science image-text questions are questions that involve the calculation of both algebraic relations and graph relations, including algebra questions, plane geometry proof questions and the like. Machine solving of mathematics questions in basic education has repeatedly been an active research topic. In recent years, driven jointly by technical progress in related fields and by the demand for intelligent education, it has again become a research hotspot in the form of machine solving of science questions in basic education, and a number of geometry expert systems, question solving systems and online tutoring systems serving intelligent education have been put into practical use.
At present, machine solving of science image-text questions in basic education mainly relies on the following five types of technology:
1. Double-frame method for machine solving of arithmetic image-text questions
The double-frame method establishes a solution frame and a knowledge frame in advance. When solving a question, it first identifies the question type, then selects the corresponding solution frame according to that type, extracts the knowledge in the question and fills it into the knowledge frame. The relations among the frame slots are inferred jointly from the knowledge frame and the solution frame, and the unknown quantity is calculated to form a solving process. Kintsch et al (1995) proposed a question solving theory and a double-frame method for automatically solving arithmetic questions, but could only solve one-step arithmetic questions. Ma Yuhui et al (2012) extended the knowledge frame representation of Kintsch et al, enabling machine solving of multi-step primary school math word problems. Hosseini et al (2014) at the University of Washington used verb categorization and solving-process blocks to solve arithmetic word problems, which is another implementation of Kintsch's solving theory. Because there is no recognized method or system for classifying questions, it is difficult for this approach to use the question type to select a suitable knowledge frame and solution frame for more complex questions.
2. Machine understanding of geometry image-text questions in basic education based on formal language
Machine understanding of geometry questions based on formal language expresses the geometry question to be understood in a formal language, and then converts the formal language into geometric relations that represent the result of understanding the question. Guo Haiyan et al (2012) proposed understanding geometry image-text questions by template matching: geometric sentences are matched against designed geometric sentence templates so that the question is converted into formalized, restricted geometric propositions. The method aims to use the restricted geometric propositions as an intermediate language for generating a sequence of drawing commands that automatically constructs the geometric figure. It does not provide a concrete form for machine understanding of geometry image-text questions, and it is difficult to extend to other categories of questions.
3. Machine solving of arithmetic image-text questions based on formal language
Machine solving of arithmetic image-text questions based on formal language expresses the question to be solved in a formal language that is simpler than natural language, and establishes a method for converting natural language into the formal language so that the question can then be reasoned about and solved. Shi et al (2015) developed the Dolphin system to automate semantic parsing and reasoning for arithmetic image-text questions. They created the DOL language, which has structured semantics, to represent the question text, used a semantic parser to convert the text of a math question into a DOL tree, and then derived the quantity relations contained in the question by analyzing the DOL tree, thereby completing question understanding. Liang et al (2016) presented a semantics- and tag-based method for solving simple arithmetic image-text questions. It converts the question into a fixed semantic structure in order to understand it, uses an inference module to select the relevant parts of the question for reasoning, and finally produces a human-like solution expression. These approaches design a specific formal language representation for each specific category; that is, they lack a unified formal language representation and are difficult to extend to other categories of questions.
4. Understanding of arithmetic image-text questions based on machine learning
Kushman et al (2014) proposed a machine-learning-based method for understanding arithmetic questions. A template library of linear equation systems is first established, and a statistical model is used to learn the correspondence between the variables and parameters in the question and the parameters of an equation template, so that the linear equation system required to solve the question can be obtained by instantiating the template. At present the equation system templates of this method can only consist of linear equations and the number of templates is limited, so the range of questions it can understand is limited. In addition, the method performs no linguistic analysis of the question, so it is sensitive to irrelevant information in the question, and its performance degrades severely on more complex questions.
5. Sequence-to-sequence solving of arithmetic image-text questions
Wang et al (2017) first proposed a sequence-to-sequence (Seq2Seq) method for solving arithmetic questions. The method designs a deep neural network that converts an input sequence into an output sequence, where the input sequence is the question text and the output sequence is an answer expression consisting of numbers and operators. The numbers appearing in the answer expression are numbers that appear in the question text or variants of them, together with some numbers converted from the question text. The main disadvantage of this approach is that it cannot generate a readable solving process, because the entire process takes place inside the black box of the deep neural network. In addition, its solving capability and range are very limited: it can only solve math word problems with a single unknown.
In summary, existing methods for machine solving of image-text questions focus mainly on the calculation of algebraic relations in the question, and the calculation of graph relations has not been studied in depth. Machine solving of science image-text questions therefore still needs further improvement in solving methods, depth of question understanding, large-scale deployment and other respects.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The main object of the invention is to provide a method, device, equipment and storage medium for solving artificial intelligence science image-text questions, so as to address the technical problems of the limited solving range and low solving efficiency for science image-text questions in the prior art.
In order to achieve the above object, the invention provides an artificial intelligence science image-text question solving method, which comprises the following steps:
acquiring a science image-text question, performing recognition and extraction on the science image-text question, and obtaining an electronic text and a vectorized image corresponding to the science image-text question;
classifying the electronic text and the vectorized image through a trained classifier to obtain category information of the science image-text question;
performing word segmentation and part-of-speech tagging on the electronic text through a word segmentation tool, tagging keywords of the question content according to a keyword list, and converting the parts of speech and the words into vectors according to part-of-speech-to-vector and word-to-vector correspondence tables, so as to obtain a vector sequence of the electronic text of the science image-text question;
selecting a corresponding text model pool and a corresponding relational sub-graph model pool according to the category information;
performing vector calculation matching on the vector sequence according to the text model pool to obtain a directly stated relation and/or an implicit relation in the science image-text question;
performing relational sub-graph matching on the vectorized image according to the relational sub-graph model pool to obtain a relational sub-graph, and obtaining a graph relation in the science image-text question from the relational sub-graph;
forming a relation group from the graph relation, the directly stated relation and/or the implicit relation in the science image-text question, and selecting a subset from the relation group as a question understanding result according to a selection rule corresponding to the category information;
and solving the question understanding result to obtain a solving process corresponding to the science image-text question.
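The vector-conversion step in the method above can be illustrated with a minimal sketch. It assumes the text has already been segmented and tagged by some word segmentation tool, and the two lookup tables, their contents and the name `to_vector_sequence` are hypothetical: the patent does not disclose the actual keyword list or correspondence tables.

```python
from typing import Dict, List, Tuple

# Hypothetical correspondence tables (not from the patent): each part-of-speech
# tag and each keyword is mapped to an integer code; 0 means "not in the table".
POS_TO_ID: Dict[str, int] = {"n": 1, "v": 2, "m": 3}
KEYWORD_TO_ID: Dict[str, int] = {"total": 101, "times": 102}


def to_vector_sequence(tagged: List[Tuple[str, str]]) -> List[Tuple[int, int]]:
    """Convert (word, pos_tag) pairs into (keyword_id, pos_id) vectors."""
    return [(KEYWORD_TO_ID.get(word, 0), POS_TO_ID.get(pos, 0))
            for word, pos in tagged]


# Output of an assumed segmentation/tagging step for a short question fragment.
tagged_words = [("apples", "n"), ("total", "n"), ("5", "m")]
vector_sequence = to_vector_sequence(tagged_words)
```

Under these assumptions, non-keyword words keep only their part-of-speech code, so later matching can rely on keywords and part-of-speech patterns rather than on the surface wording.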
Preferably, the text model pool comprises a syntactic-semantic model pool and an implicit relation model pool;
before acquiring the science image-text question, the method further comprises the following steps:
collecting all image-text questions in each teaching sub-field of the science subjects as an image-text question set;
constructing a vectorized syntactic-semantic model pool and a vectorized implicit relation model pool for each teaching sub-field according to the text set in the image-text question set;
and constructing a relational sub-graph model pool according to the diagram set in the image-text question set.
Preferably, performing relational sub-graph matching on the vectorized image according to the relational sub-graph model pool to obtain a relational sub-graph, and obtaining the graph relation in the science image-text question from the relational sub-graph, includes:
generating a corresponding mining procedure for each model sub-graph in the relational sub-graph model pool;
performing relational sub-graph matching on the vectorized image according to the mining procedure of each model sub-graph to obtain the relational sub-graph corresponding to the vectorized image;
and mining graph relations from the relational sub-graph to obtain the graph relation in the science image-text question.
Preferably, performing vector calculation matching on the vector sequence according to the text model pool to obtain a directly stated relation and/or an implicit relation in the science image-text question includes:
according to the syntactic-semantic model pool, performing calculation matching on the vector sequence with a matching network based on an inference graph in which the syntactic-semantic models are embedded, to obtain the directly stated relation in the science image-text question;
and/or,
according to the implicit relation model pool, performing calculation matching on the vector sequence with a matching network based on an inference graph in which the implicit relation models are embedded, to obtain the implicit relation in the science image-text question.
Preferably, obtaining the directly stated relation in the science image-text question by performing calculation matching on the vector sequence with the matching network based on the inference graph in which the syntactic-semantic models are embedded includes:
taking each word of the vector sequence as a starting point, matching the vector sequence against the vectorized syntactic-semantic models in the syntactic-semantic model pool according to the matching rule of the syntactic-semantic model, to obtain a first matching confidence and a first relation;
if the matching succeeds, recording the positions in the science image-text question of the entities that correspond to the entities in the syntactic-semantic model, recording the first matching confidence and the first relation in a node of the next layer of the inference graph of the syntactic-semantic model, and, if the next layer has no vacant node, eliminating the match with the lowest first matching confidence;
repeating the matching until every matching starting point has been matched against every syntactic-semantic model in the syntactic-semantic model pool, to obtain the directly stated relation in the science image-text question;
wherein the syntactic-semantic model is a quadruple M = (K, P, V, R), in which K is the keyword element, P is the pattern of part-of-speech (POS) tags and punctuation marks, V is the calculation matching procedure, and R is the relation between the related entities; the syntactic-semantic model pool is Σ = {M_i = (K_i, P_i, V_i, R_i) | i = 1, 2, …, m}.
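As an illustration of the matching loop and the quadruple M = (K, P, V, R) described above, the following sketch treats the next layer of the inference graph as a fixed-capacity min-heap, so that the lowest-confidence match is dropped when no node is vacant. The class name, the `(keyword_id, pos_id)` token encoding and the confidence formula (fraction of model keywords found in the matched window) are assumptions made for the example, not details given in the patent.

```python
import heapq
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Token = Tuple[int, int]  # assumed encoding: (keyword_id, pos_id)


@dataclass(frozen=True)
class SyntacticSemanticModel:
    keywords: Tuple[int, ...]     # K: keyword elements
    pos_pattern: Tuple[int, ...]  # P: part-of-speech / punctuation pattern
    relation: str                 # R: relation between the related entities

    def match(self, seq: Sequence[Token], start: int) -> float:
        """V: return a confidence in [0, 1]; 0.0 means no match (assumed rule)."""
        window = seq[start:start + len(self.pos_pattern)]
        if len(window) < len(self.pos_pattern) or not self.keywords:
            return 0.0
        if any(tok[1] != pos for tok, pos in zip(window, self.pos_pattern)):
            return 0.0
        hits = sum(1 for kw in self.keywords
                   if any(tok[0] == kw for tok in window))
        return hits / len(self.keywords)


def extract_relations(seq: Sequence[Token],
                      pool: List[SyntacticSemanticModel],
                      capacity: int) -> List[Tuple[float, int, str]]:
    """Try every start point against every model; keep the best `capacity` matches."""
    layer: List[Tuple[float, int, str]] = []  # min-heap ordered by confidence
    for start in range(len(seq)):
        for model in pool:
            confidence = model.match(seq, start)
            if confidence <= 0.0:
                continue
            item = (confidence, start, model.relation)
            if len(layer) < capacity:
                heapq.heappush(layer, item)
            else:
                # No vacant node: push, then drop the lowest-confidence match.
                heapq.heappushpop(layer, item)
    return sorted(layer, reverse=True)


sequence = [(0, 1), (101, 1), (0, 3)]
pool = [SyntacticSemanticModel((101,), (1, 3), "total = n")]
relations = extract_relations(sequence, pool, capacity=4)
```

Each result records the confidence, the start position of the matched entities, and the relation, mirroring the three items the claim says are recorded on a successful match.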
Preferably, obtaining the implicit relation in the science image-text question by performing calculation matching on the vector sequence with the matching network based on the inference graph in which the implicit relation models are embedded includes:
taking each word of the vector sequence as a starting point, matching the vector sequence against the implicit relation models in the implicit relation model pool according to the matching rule of the implicit relation model, to obtain a second matching confidence and a second relation;
if the matching succeeds, recording the positions in the science image-text question of the entities that correspond to the entities in the implicit relation model, recording the second matching confidence and the second relation in a node of the next layer of the inference graph of the implicit relation model, and, if the next layer has no vacant node, eliminating the match with the lowest second matching confidence;
repeating the matching until every matching starting point has been matched against every implicit relation model in the implicit relation model pool;
wherein the implicit relation model is a triple H = (F, V, R), in which F is a feature set, V is the calculation matching procedure, and R is the relation between the related entities; the implicit relation model pool is Π = {H_i = (F_i, V_i, R_i) | i = 1, 2, …, m}.
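The triple H = (F, V, R) can be pictured as a feature-triggered piece of background knowledge: when the features in F are present in the question, the relation R is added even though the question never states it. The sketch below is only an assumption-laden illustration; the feature words, the formulas, the overlap-based confidence and the function name `implied_relations` are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class ImplicitRelationModel:
    features: FrozenSet[str]  # F: feature set that triggers the model
    relation: str             # R: relation contributed when triggered

    def match(self, tokens: Set[str]) -> float:
        """V: fraction of the model's features found in the question (assumed rule)."""
        if not self.features:
            return 0.0
        return len(self.features & tokens) / len(self.features)


# Hypothetical pool contents; the patent does not enumerate its models.
IMPLICIT_POOL = [
    ImplicitRelationModel(frozenset({"rectangle", "perimeter"}),
                          "perimeter = 2 * (length + width)"),
    ImplicitRelationModel(frozenset({"speed", "time", "distance"}),
                          "distance = speed * time"),
]


def implied_relations(tokens: Set[str],
                      pool: List[ImplicitRelationModel],
                      threshold: float = 1.0) -> List[Tuple[float, str]]:
    """Return (confidence, relation) for every model at or above the threshold."""
    scored = [(model.match(tokens), model.relation) for model in pool]
    return [(conf, rel) for conf, rel in scored if conf >= threshold]


question_tokens = {"a", "rectangle", "has", "perimeter", "20"}
found = implied_relations(question_tokens, IMPLICIT_POOL)
```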
Preferably, solving the question understanding result to obtain the solving process corresponding to the science image-text question includes:
if the category information of the science image-text question is a plane geometry proof image-text question, proving the question understanding result through a geometry proving system to obtain the solving process corresponding to the science image-text question;
if the category information of the science image-text question is an algebra image-text question, finding all quantity entities in the relation group according to the question understanding result, assigning a variable to each quantity entity, converting the algebraic relation group into a system of algebraic equations, recording a lookup table of entities and variables, solving the solvable part of the system, substituting the partial solution back into the system to obtain a new solvable part, and repeating this process until the system is solved, thereby obtaining the solving process corresponding to the science image-text question.
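The "solve the solvable part, substitute, repeat" loop for algebra questions can be sketched as follows. This is a deliberately simplified illustration, assuming every equation is already written in the explicit form `target = f(inputs)`; a real system of equations would generally need a symbolic or numeric solver. The variable names x1 to x5 and the rectangle example are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Assumed equation form: (target variable, input variables, function of the inputs).
Equation = Tuple[str, Tuple[str, ...], Callable[..., float]]


def solve_by_substitution(equations: List[Equation],
                          known: Dict[str, float]
                          ) -> Tuple[Dict[str, float], List[str]]:
    """Repeatedly solve every equation whose inputs are all known."""
    values = dict(known)
    steps: List[str] = []
    pending = list(equations)
    progress = True
    while pending and progress:
        progress = False
        for equation in list(pending):
            target, inputs, fn = equation
            if all(name in values for name in inputs):
                # This equation is part of the currently solvable subset.
                values[target] = fn(*(values[name] for name in inputs))
                steps.append(f"{target} = {values[target]}")
                pending.remove(equation)
                progress = True
    return values, steps


# Entity/variable table for a hypothetical rectangle question:
# x1 = length, x2 = width, x3 = perimeter, x4 = area, x5 = perimeter + area.
system: List[Equation] = [
    ("x5", ("x3", "x4"), lambda p, a: p + a),
    ("x3", ("x1", "x2"), lambda l, w: 2 * (l + w)),
    ("x4", ("x1", "x2"), lambda l, w: l * w),
]
solution, solving_steps = solve_by_substitution(system, {"x1": 6, "x2": 4})
```

In this example x5 cannot be computed in the first pass; it becomes solvable only after x3 and x4 have been substituted, and the recorded steps form a readable solving process.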
In addition, in order to achieve the above object, the invention further provides an artificial intelligence science image-text question solving device, which includes a memory, a processor, and an artificial intelligence science image-text question solving program stored in the memory and executable on the processor, the program being configured to implement the steps of the artificial intelligence science image-text question solving method described above.
In addition, in order to achieve the above object, the invention further provides a storage medium on which an artificial intelligence science image-text question solving program is stored; when executed by a processor, the program implements the steps of the artificial intelligence science image-text question solving method described above.
In addition, in order to achieve the above object, the invention also provides an artificial intelligence science image-text question solving device, which comprises:
the recognition and extraction module, used for acquiring the science image-text question, performing recognition and extraction on it, and obtaining the electronic text and the vectorized image corresponding to the science image-text question;
the classification module, used for classifying the electronic text and the vectorized image through a trained classifier to obtain the category information of the science image-text question;
the vector conversion module, used for performing word segmentation and part-of-speech tagging on the electronic text through a word segmentation tool, tagging keywords of the question content according to a keyword list, and converting the parts of speech and the words into vectors according to part-of-speech-to-vector and word-to-vector correspondence tables, to obtain the vector sequence of the electronic text;
the selection module, used for selecting a corresponding text model pool and a corresponding relational sub-graph model pool according to the category information;
the matching module, used for performing vector calculation matching on the vector sequence according to the text model pool to obtain the directly stated relation and/or the implicit relation in the science image-text question;
the matching module being further used for performing relational sub-graph matching on the vectorized image according to the relational sub-graph model pool to obtain a relational sub-graph, and obtaining the graph relation in the science image-text question from the relational sub-graph;
the selection module being further used for forming a relation group from the graph relation, the directly stated relation and/or the implicit relation in the science image-text question, and selecting a subset from the relation group as the question understanding result according to the selection rule corresponding to the category information;
and the solving module, used for solving the question understanding result to obtain the solving process corresponding to the science image-text question.
In the invention, a science image-text question is acquired and recognized to obtain the corresponding electronic text and vectorized image. The electronic text and the vectorized image are classified by a trained classifier to obtain the category information of the question. The electronic text is segmented into words and tagged with parts of speech by a word segmentation tool, keywords of the question content are tagged according to a keyword list, and the parts of speech and words are converted into vectors according to part-of-speech-to-vector and word-to-vector correspondence tables to obtain the vector sequence of the electronic text. A corresponding text model pool and relational sub-graph model pool are selected according to the category information. Vector calculation matching is performed on the vector sequence according to the text model pool to obtain the directly stated relations and/or implicit relations in the question, and relational sub-graph matching is performed on the vectorized image according to the relational sub-graph model pool to obtain relational sub-graphs, from which the graph relations in the question are obtained. A relation group is formed from the graph relations, the directly stated relations and/or the implicit relations, a subset is selected from the relation group as the question understanding result according to the selection rule corresponding to the category information, and the question understanding result is solved to obtain the solving process corresponding to the question. Based on relation evolution, model pools, graph relation sub-graphs, directly stated relations and implicit relations, the range and efficiency of solving science questions are improved.
Drawings
FIG. 1 is a schematic structural diagram of the artificial intelligence science image-text question solving device in the hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flow chart of a first embodiment of the artificial intelligence science image-text question solving method of the present invention;
FIG. 3 is a diagram illustrating the conversion of question text into vectors in an embodiment of the present invention;
FIG. 4 is a flow chart of a second embodiment of the artificial intelligence science image-text question solving method of the present invention;
FIG. 5 is a block diagram of a first embodiment of the artificial intelligence science image-text question solving device of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of the artificial intelligence science image-text question solving device in the hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the artificial intelligence science image-text question solving device may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display, and optionally may also include a standard wired interface and a wireless interface; in the present invention the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory (NVM), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
It will be appreciated by those skilled in the art that the structure shown in fig. 1 does not limit the artificial intelligence science image-text question solving device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.
As shown in fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and an artificial intelligence science image-text question solving program.
In the artificial intelligence science image-text question solving device shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server and performing data communication with it; the user interface 1003 is mainly used for connecting to user equipment; and the device calls, through the processor 1001, the artificial intelligence science image-text question solving program stored in the memory 1005 and executes the artificial intelligence science image-text question solving method provided by the embodiments of the invention.
Based on the above hardware structure, embodiments of the artificial intelligence science image-text question solving method are provided.
Referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the artificial intelligence science image-text question solving method according to the present invention.
In the first embodiment, the artificial intelligence science image-text question solving method comprises the following steps:
Step S10: acquire a science image-text question, perform recognition and extraction on it, and obtain the electronic text and the vectorized image corresponding to the science image-text question.
In a specific implementation, the execution body of this embodiment is the artificial intelligence science image-text question solving apparatus, which may be an electronic device such as a personal computer or a server; this embodiment does not limit it. For an input question image, OCR (optical character recognition) is used to recognize all question content in the image and obtain the text of the science question. For input speech, STT (speech-to-text) is used to recognize all question content in the speech and obtain the question text. This converts the question equivalently from its input form into question text, which is a natural-language description of the question. The text part of the science image-text question is converted into electronic text using an existing algorithm; the electronic text may be ASCII (American Standard Code for Information Interchange) text. The figure in the science image-text question is converted into a vectorized image, that is, each element in the figure is represented by a set of numbers together with a feature description.
Step S20: classify the electronic text and the vectorized image with a trained classifier to obtain the category information of the science image-text question.
It can be understood that a set of feature word vectors is established for each type of science image-text question, and the feature word vector sets of all types together are called the science image-text question feature vector group. For an input question text, the science image-text question feature vector group is selected first, and a trained classifier is then used for classification, yielding the category information of the science image-text question. The classifier may be an SVM classifier or another classifier; this embodiment does not limit it. The category information includes physics image-text questions, chemistry image-text questions, arithmetic image-text questions, mathematics image-text questions, plane geometry calculation image-text questions, plane geometry proof image-text questions, and the like.
Step S30: perform word segmentation and part-of-speech tagging on the electronic text with a word segmentation tool, mark the keywords of the question content according to a keyword table, and convert the parts of speech and the words into vectors according to a correspondence table from parts of speech and words to vectors, obtaining the vector sequence of the electronic text.
It should be noted that, the existing word segmentation tool is used to complete word segmentation and part of speech tagging of the topic text, and the part of speech is converted into a vector according to a mapping table from part of speech to vector, as shown in fig. 3.
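As an illustration of step S30, the following minimal sketch tags each word, marks keywords, and looks up a vector per part of speech. The lexicon, keyword table, one-hot vectors, and the function name `vectorize` are hypothetical stand-ins introduced for this example; whitespace splitting replaces a real word segmentation tool, and the tags only loosely follow the q/m notation used later in this description.

```python
# Hypothetical lookup tables standing in for a word segmentation tool,
# a keyword table, and a part-of-speech-to-vector mapping table.
POS_LEXICON = {"each": "r", "row": "q", "has": "v", "62": "m", "pieces": "q"}
KEYWORDS = {"each", "has"}
POS_VECTORS = {
    "r": [1.0, 0.0, 0.0, 0.0],
    "q": [0.0, 1.0, 0.0, 0.0],
    "v": [0.0, 0.0, 1.0, 0.0],
    "m": [0.0, 0.0, 0.0, 1.0],
}
UNKNOWN_VECTOR = [0.0, 0.0, 0.0, 0.0]


def vectorize(text):
    """Return one record per word: the word, its tag, a keyword flag, a vector."""
    sequence = []
    for word in text.lower().split():  # whitespace split replaces real segmentation
        pos = POS_LEXICON.get(word, "x")  # "x" marks a word missing from the lexicon
        sequence.append({
            "word": word,
            "pos": pos,
            "is_keyword": word in KEYWORDS,
            "vector": POS_VECTORS.get(pos, UNKNOWN_VECTOR),
        })
    return sequence


sequence = vectorize("Each row has 62 pieces")
for token in sequence:
    print(token)
```

A production system would replace the one-hot vectors with learned embeddings; the sketch only shows the shape of the resulting vector sequence.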
Step S40: select the corresponding text model pool and relational subgraph model pool according to the category information.
In a specific implementation, the category information indicates which sub-field of the science subjects the science image-text question belongs to. The text model pool includes a syntactic-semantic model pool and an implicit relationship model pool. A relational subgraph model pool is prepared for the graph set of each image-text question set.
Step S50: perform vector calculation matching on the vector sequence according to the text model pool to obtain the direct-statement relationships and/or implicit relationships in the science image-text question.
It can be understood that extracting the set of relationships contained in the question includes obtaining direct-statement relationships and implicit relationships. Vector calculation matching is performed on the vector sequence according to the text model pool to obtain the direct-statement relationships and/or implicit relationships in the vector sequence, which are the direct-statement relationships and/or implicit relationships in the science image-text question. The question is represented by a sequence of words and punctuation marks, each of which is a vector, so the question can equally be said to be represented by a vector sequence in which the keywords are marked. A graph inference process takes this vector sequence as input and produces a set of direct-statement relationships. The key and characteristic step of graph inference is to match the syntactic-semantic models against the question to extract relationships. The matching is carried out by operations between vectors, and the vectorized syntactic-semantic models are embedded in the inference graph, so the matching process is performed entirely on the graph.
In extracting direct-statement relationships, a starting point and a syntactic-semantic model are first determined, matching calculation is performed against the vector sequence according to the matching rule of the model, and the resulting direct-statement relationship and its matching value are placed in the direct-statement relationship candidate set. Here, the syntactic-semantic model matching process forms the direct-statement matching layer of the inference graph, and the nodes that store the direct-statement relationship candidate sets constitute the direct-statement part of the candidate set layer of the inference graph.
In obtaining implicit relationships, a starting point and an implicit relationship model are first determined, matching calculation is performed against the vector sequence according to the matching rule of the model to obtain an implicit relationship and its matching value, and the obtained implicit relationship is placed in the implicit relationship candidate set. The implicit relationship model matching process forms the implicit matching layer of the inference graph, and the nodes that store the implicit relationship candidate sets form the implicit part of the candidate set layer of the inference graph. The core of the method disclosed in this application for obtaining implicit relationships is to convert the implicit relationship models into vectors and merge them into the graph calculation inference network, where they are matched by calculation against the vectors of the question text. Because the vectorized implicit relationship models are embedded in the inference graph and the matching is performed entirely on the graph, the method is called a graph-inference vectorized implicit relationship model matching method.
Further, the step S50 includes:
according to the syntactic-semantic model pool, performing matching on the vector sequence through a graph calculation matching network based on an inference graph in which the syntactic-semantic models are embedded, to obtain the direct-statement relationships in the science image-text question;
and/or,
according to the implicit relationship model pool, performing matching on the vector sequence through a graph calculation matching network based on an inference graph in which the implicit relationship models are embedded, to obtain the implicit relationships in the science image-text question.
It should be noted that this embodiment proposes a question understanding method based on graph calculation matching with embedded models. Specifically, for direct-statement relationships, it provides a method of extracting them through a graph calculation matching network with embedded syntactic-semantic models; for implicit relationships, it provides a method of obtaining them through a graph calculation matching network with embedded implicit relationship models. A syntactic-semantic model consists of a semantic part, a syntactic part, a matching rule, and an output relationship. The semantic part is the keywords, the syntactic part is a pattern of part-of-speech symbols, and the matching rule specifies which vectors of the model are matched against which vectors of the question and how the matching is computed. A syntactic-semantic model is defined as a tetrad M = (K, P, V, R), where K is the keyword elements, P is the pattern of part-of-speech (POS) tags and punctuation marks, V is the rule for matching and calculating between vectors in the model and vectors in the question, and R is the relationship between the related entities that is output once the model matches. Let Σ = {Mi = (Ki, Pi, Vi, Ri) | i = 1, 2, …, m} denote the syntactic-semantic model pool prepared for one type of science question in basic education. Establishing such a model pool for the question type to be solved is the key problem in extracting direct-statement relationships. The graph calculation matching network module embeds the models in a graph calculation network and converts text-level matching into calculation on vectors in the graph network. It mainly comprises an input layer, a coding layer, a model layer, a selection layer, and an output layer.
Further, the obtaining of the direct-statement relationships in the science image-text question, according to the syntactic-semantic model pool and through a graph calculation matching network based on an inference graph in which the syntactic-semantic models are embedded, includes:
taking each word as a starting point, matching the vector sequence against the vectorized syntactic-semantic models in the syntactic-semantic model pool according to the matching rule of each model, to obtain a first matching confidence and a first relationship;
if the matching succeeds, recording the positions in the science image-text question of the entities corresponding to the entities in the syntactic-semantic model, and recording the first matching confidence and the first relationship in a next-layer node of the inference graph of the syntactic-semantic model; if the next layer has no vacant node, discarding the match with the smallest first matching confidence;
repeating the matching until all matching starting points have been matched against all syntactic-semantic models in the syntactic-semantic model pool, to obtain the direct-statement relationships in the science image-text question;
wherein the syntactic-semantic model is a tetrad M = (K, P, V, R), K is the keyword elements, P is the pattern of part-of-speech (POS) tags and punctuation marks, V is the calculation matching process, and R is the relationship between related entities; the syntactic-semantic model pool is Σ = {Mi = (Ki, Pi, Vi, Ri) | i = 1, 2, …, m}.
It should be understood that the first matching confidence is the degree of matching between the syntactic-semantic model currently being matched and the vector sequence, and the first relationship is the quantity relationship obtained when that model matches the vector sequence.
The syntactic-semantic models are used to extract direct-statement relationships. The specific implementation process is as follows:
a. According to the category information of the question, load the syntactic-semantic model pool Σ = {Mi = (Ki, Pi, Vi, Ri) | i = 1, 2, …, m}.
b. Taking each word of the question as a starting point, perform calculation matching between the model Mi and the question according to the matching rule of Mi. If the matching succeeds, record the positions in the question of the entities corresponding to the entities in the model, and record the matching confidence and the obtained relationship in a next-layer node of the graph; if the next layer has no vacant node, discard the match with the smallest confidence. For example, the model (per q there is m q; a = b × c; q m q) is used to extract the mathematical relationship in "there are 62 pieces per row". Here "per q there is m q" is a mixture of the syntax P and the semantics K: the part-of-speech tags q, m, q denote a noun, a numeral, and a measure word, respectively, and "per" and "there is" are mathematical keywords. "a = b × c" is the mathematical relationship R matched by the model, where a, b, and c are its variables. "q m q" is the correspondence table between the variables of the mathematical relationship and the entities in the sentence, and it is what ties the sentence to the mathematical relationship: the word at the first q corresponds to the variable a, the word at m corresponds to the variable b, and the word at the second q corresponds to the variable c. Matching proceeds in a loop over the matching starting points defined by the matching rule V and yields the algebraic relationship "row = 62 × piece".
c. Repeat until all matching starting points have been matched against all models.
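Steps a to c above can be sketched as follows. This is a simplified illustration and not the patented network: matching is done on symbolic tags and keywords instead of vectors, the confidence score (pattern length divided by sentence length) is a hypothetical placeholder, and all class and function names are assumptions made for the example. A bounded min-heap plays the role of the next-layer nodes, so the lowest-confidence match is discarded when no slot is vacant.

```python
import heapq
from dataclasses import dataclass


@dataclass
class SyntaxSemanticModel:
    keywords: tuple   # K: keyword elements
    pattern: tuple    # P: items ("kw", word) or ("pos", tag)
    relation: str     # R: relation template over the variables
    variables: tuple  # variable names bound to the "pos" slots, in order


def match_at(model, tokens, start):
    """V: match model.pattern at tokens[start:]; return a match tuple or None."""
    if start + len(model.pattern) > len(tokens):
        return None
    bound, positions = [], []
    for offset, (kind, value) in enumerate(model.pattern):
        word, pos = tokens[start + offset]
        if kind == "kw" and word != value:
            return None
        if kind == "pos":
            if pos != value:
                return None
            bound.append(word)
            positions.append(start + offset)
    confidence = len(model.pattern) / len(tokens)  # hypothetical score
    relation = model.relation.format(**dict(zip(model.variables, bound)))
    return confidence, relation, positions


def extract(models, tokens, capacity=4):
    """Try every model at every starting point; keep the best `capacity` matches."""
    candidates = []  # min-heap ordered by confidence
    for model in models:
        for start in range(len(tokens)):
            result = match_at(model, tokens, start)
            if result is None:
                continue
            heapq.heappush(candidates, result)
            if len(candidates) > capacity:
                heapq.heappop(candidates)  # no vacant node: drop the weakest match
    return sorted(candidates, reverse=True)


MODEL = SyntaxSemanticModel(
    keywords=("each", "has"),
    pattern=(("kw", "each"), ("pos", "q"), ("kw", "has"), ("pos", "m"), ("pos", "q")),
    relation="{a} = {b} * {c}",
    variables=("a", "b", "c"),
)
TOKENS = [("each", "r"), ("row", "q"), ("has", "v"), ("62", "m"), ("pieces", "q")]
results = extract([MODEL], TOKENS)
print(results)
```

In the method described here the comparison is carried out between vectors inside the inference graph; the exact tag comparison above only mirrors the control flow of the loop.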
Further, the obtaining of the implicit relationships in the science image-text question, according to the implicit relationship model pool and through a graph calculation matching network based on an inference graph in which the implicit relationship models are embedded, includes:
taking each word as a starting point, matching the vector sequence against the implicit relationship models in the implicit relationship model pool according to the matching rule of each model, to obtain a second matching confidence and a second relationship;
if the matching succeeds, recording the positions in the science image-text question of the entities corresponding to the entities in the implicit relationship model, and recording the second matching confidence and the second relationship in a next-layer node of the graph of the implicit relationship model; if the next layer has no vacant node, discarding the match with the smallest second matching confidence;
repeating the matching until all matching starting points have been matched against all implicit relationship models in the implicit relationship model pool;
wherein the implicit relationship model is a triplet H = (F, V, R), F is the feature set, V is the calculation matching process, and R is the relationship between related entities; the implicit relationship model pool is Π = {Hi = (Fi, Vi, Ri) | i = 1, 2, …, m}.
In a specific implementation, the second matching confidence is the degree of matching between the implicit relationship model currently being matched and the vector sequence, and the second relationship is the quantity relationship obtained when that model matches the vector sequence.
The corresponding implicit relationship model pool is selected according to the implicit relationship question type to obtain the implicit relationships. The specific implementation process is as follows:
a. According to the category information of the question, enable the corresponding implicit relationship model pool Π = {Hi = (Fi, Vi, Ri) | i = 1, 2, …, m}.
b. Taking each word of the question as a starting point, perform calculation matching between the model Hi and the question according to the matching rule of Hi. If the matching succeeds, record the positions in the question of the entities corresponding to the entities in the model, and record the matching confidence and the obtained relationship in a next-layer node of the graph; if the next layer has no vacant node, discard the match with the smallest confidence. For example, an implicit relationship question type identification network identifies that the implicit relationship to be added for the question "A square vegetable field has a side of 12 meters. What is its area?" is the square area formula "s = a × a", where s is the area and a is the side length. The entities in the question text that correspond to the variables of the implicit relationship are then extracted according to the model that maps the variables of the square area formula to entities, giving "s = how much" and "a = 12 × m". From the area formula "s = a × a", the algebraic relationship "how much = (12 × m) × (12 × m)" is then obtained.
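The square-area example can be sketched as below. The feature-overlap confidence, the binding function `bind_square`, the `threshold` parameter, and the placeholder `how_much` for the unknown are hypothetical choices for illustration; the description itself performs the matching on vectors inside the inference graph.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ImplicitModel:
    features: frozenset  # F: feature words that signal the implicit relation
    formula: str         # R: relation template over the variables
    bind: Callable       # part of V: maps question tokens to variable bindings


def bind_square(tokens):
    """Bind the side length to the first numeral and its unit; the area is the unknown."""
    for i, tok in enumerate(tokens):
        if tok.isdigit() and i + 1 < len(tokens):
            return {"a": f"{tok} * {tokens[i + 1]}", "s": "how_much"}
    return None


def apply_implicit(models, tokens, threshold=1.0):
    """Return (confidence, relation, binding) for every model that matches."""
    words = set(tokens)
    found = []
    for model in models:
        # Hypothetical confidence: share of the model's feature words present.
        confidence = len(model.features & words) / len(model.features)
        if confidence < threshold:
            continue
        binding = model.bind(tokens)
        if binding is None:
            continue
        found.append((confidence, model.formula.format(**binding), binding))
    return found


SQUARE_AREA = ImplicitModel(
    features=frozenset({"square", "side", "area"}),
    formula="{s} = ({a}) * ({a})",
    bind=bind_square,
)
tokens = "a square vegetable field has a side of 12 meters what is its area".split()
found = apply_implicit([SQUARE_AREA], tokens)
print(found[0][1])
```

The recorded binding serves as the correspondence table between formula variables and question entities mentioned in step b.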
Step S60: perform relational subgraph matching on the vectorized image according to the relational subgraph model pool to obtain relational subgraphs, and obtain the graph relationships in the science image-text question from the relational subgraphs.
In a specific implementation, when mining relationships from a graph, a mining process is first determined for each model subgraph. For each input graph, this mining process finds all subgraphs of the input graph that are defined by the model subgraph; a second process then obtains the graph relationships from the mined subgraphs. The obtained graph relationships are placed in the graph relationship candidate set.
Step S70: form a relation group from the graph relationships, the direct-statement relationships, and/or the implicit relationships in the science image-text question, and select a subset of the relation group as the question understanding result according to the selection rule corresponding to the category information.
It should be understood that the direct-statement relationships and implicit relationships obtained from the science question are text relationships obtained from the text. The text relationships and the graph relationships mined from the graph together form the candidate relationship set, and the set of relationships needed to solve the question is selected from this candidate set as the question understanding result. The selection proceeds as follows: first, the unknown quantities to be solved for are identified and a relation-connected forest is built starting from the relationships containing them; then, several relationship sets that satisfy the constraints on a question-understanding relationship set are selected as the question understanding results.
Step S80: solve the question understanding result to obtain the solving process corresponding to the science image-text question.
It will be appreciated that if the question understanding result belongs to a geometric theorem proving question, it is input to the geometric proof module. If it is an algebraic relation group, all quantity entities in the relation group are first found and a variable is assigned to each of them, so that the algebraic relation group is converted into a system of algebraic equations while the correspondence table between entities and variables is recorded. The machine then solves the system automatically as follows: the solvable part of the system is solved, the partial solution is substituted back to expose a new solvable part, and the process is repeated until the whole system is solved. In this embodiment, the step S80 specifically includes: if the category information of the science image-text question is a plane geometry proof image-text question, proving the question understanding result through a geometric proof system to obtain the solving process corresponding to the science image-text question; if the category information of the science image-text question is an algebraic image-text question, finding all quantity entities in the relation group according to the question understanding result, assigning a variable to each quantity entity, converting the algebraic relation group into a system of algebraic equations, recording the correspondence table between entities and variables, solving the solvable part of the system, substituting the partial solution into the system to obtain a new solvable part, and repeating the solving process until the system is solved, to obtain the solving process corresponding to the science image-text question.
For example, when the mathematical question is of a type that contains a part-whole implicit relationship, its equivalent algebraic representation is a second-order system of algebraic equations, and the machine solution of the resulting algebraic relation group can be obtained by solving the equations of the system in sequence.
In this embodiment, the core of extracting direct-statement relationships is to convert the syntactic-semantic models into vectors and merge them into the graph calculation inference network, where they are matched by calculation against the vectors of the question text. This differs from approaches that extract direct-statement relationships by text and symbol matching based on syntactic-semantic models: here every operation step is carried out on the graph, which is why the method is called a graph-inference vectorized syntactic-semantic model matching method. Relational subgraph matching is performed on the vectorized image according to the relational subgraph model pool to obtain the graph relationships in the science image-text question, and these are combined with the direct-statement and implicit relationships of the text to form the relation group. This improves the accuracy with which science image-text questions are understood, and, by relying on relationship evolution, the model pools, the relational subgraphs, and the direct-statement and implicit relationships, it broadens the range of science questions that can be solved and improves solving efficiency.
Based on the first embodiment of the above method and referring to fig. 4, a second embodiment of the artificial intelligence science image-text question solving method is provided.
In the second embodiment, before the step S10, the method further includes:
Step S01: acquire all image-text questions in each teaching sub-field of the science subjects as an image-text question set.
Step S02: construct a vectorized syntactic-semantic model pool and a vectorized implicit relationship model pool for each teaching sub-field according to the text set in the image-text question set.
It should be noted that, in the stage of preparing the vectorized model pools, for each defined image-text question set (which contains all questions of one teaching sub-field of the science subjects), a vectorized syntactic-semantic model pool and a vectorized implicit relationship model pool are prepared for each natural language in which the questions are stated.
A syntactic-semantic model pool is prepared for each natural-language question set of each sub-field of basic science education, such as the text part of primary school mathematics image-text questions, of primary school plane geometry image-text questions, of primary school algebra image-text questions, of primary school mechanics image-text questions, of plane geometry proof image-text questions, and so on. A syntactic-semantic model is a tetrad M = (K, P, V, R), where K is the keyword elements, P is the pattern of part-of-speech (POS) tags and punctuation marks, V is the calculation matching process, and R is the relationship between related entities; a syntactic-semantic model pool is Σ = {Mi = (Ki, Pi, Vi, Ri) | i = 1, 2, …, m}.
An implicit relationship model pool is also prepared. It is used to acquire the relationships expressed by formulas and scenarios. The characteristic of the method of finding and adding implicit relationships is that it obtains both the implicit relationship and the mapping between the variables of the implicit relationship and the entities in the question text. This benefits from the use of implicit relationship model matching, for which the implicit relationship model pool is prepared in advance. An implicit relationship model is a triplet H = (F, V, R), where F is the feature set, V is the calculation matching process, and R is the relationship between related entities; an implicit relationship model pool is Π = {Hi = (Fi, Vi, Ri) | i = 1, 2, …, m}.
For each science teaching sub-field, a word segmentation tool is established for each question set stated in a given natural language. For example, NLPIR can be used as the word segmentation tool for Chinese and English question sets. A vectorization tool is further established to vectorize all words, parts of speech, and punctuation marks of the question set, that is, each word, each part of speech, and each punctuation mark in the set is represented by a vector. For example, the BERT model may be used as the vectorization tool for Chinese and English question sets. The two model pools described above are converted into the corresponding vectorized model pools using the word segmentation tool and the vectorization tool.
A matching process is defined for each syntactic-semantic model. Its input is a vector sequence and a matching starting point, and it mainly consists of a matching rule and a calculation function. Its output is a relationship, the positions in the question of the entities in that relationship, and a matching quality value.
A matching process is likewise defined for each implicit relationship model. Its input is a vector sequence and a matching starting point, and it mainly consists of a matching rule and a calculation function. Its output is a relationship, the positions in the question of the entities in that relationship, and a matching quality value.
Step S03: construct a relational subgraph model pool according to the graph set in the image-text question set.
In a specific implementation, a relational subgraph model pool is prepared for the graph set of each image-text question set. A relational subgraph is a subgraph occurring in the graph set for which a deterministic method exists to obtain a set of relationships, and which some questions in the question set cannot be solved without. In the graph understanding process, all relational subgraphs are found for each input graph, and the relationships obtained from them form the graph relation group.
In this embodiment, the selecting of a subset from the relation group as the question understanding result according to the selection rule corresponding to the category information specifically includes:
identifying the unknown quantities of the science image-text question from the relation group, and constructing a relation-connected forest by adding points and edges step by step, starting from the relationships that contain the unknown quantities;
selecting a subset from the relation group, based on the relation-connected forest, as the question understanding result according to the selection rule corresponding to the category information.
It should be understood that the question understanding result is obtained by selecting the subset, and the specific process is as follows:
a. The graph relationships, direct-statement relationships, and/or implicit relationships obtained from the science image-text question form a relation group, which serves as the complete candidate relationship set. The unknown quantities of the question are identified from this complete set, and points and edges are added step by step starting from the relationships that contain the unknown quantities: if a relationship has an entity that is already in the forest, a point is added for each of its entities that is not yet in the forest and is connected to the point of the entity that is already in the forest.
b. The known quantities of the question are identified from the same complete candidate relationship set. The relationships in the part of the forest that connects the unknown quantities to the known quantities constitute a question understanding, so one question may have several question understanding results.
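The forest construction in steps a and b can be sketched as a breadth-first expansion from the unknown quantity. Representing each relationship as a label plus the set of entities it mentions, and the function name `select_relations`, are assumptions made for this example; the selection rules specific to each question category are not modeled.

```python
from collections import deque


def select_relations(relations, unknown):
    """Grow a forest from the unknown; return the relations reached and the edges added.

    relations: list of (label, set_of_entities) pairs forming the candidate set.
    """
    in_forest = {unknown}
    edges = []     # (entity already in the forest, newly added entity)
    selected = []  # labels of relations connected to the unknown
    used = set()
    queue = deque([unknown])
    while queue:
        entity = queue.popleft()
        for index, (label, entities) in enumerate(relations):
            if index in used or entity not in entities:
                continue
            used.add(index)
            selected.append(label)
            # Add a point for every entity of this relation not yet in the forest.
            for other in sorted(entities - in_forest):
                in_forest.add(other)
                edges.append((entity, other))
                queue.append(other)
    return selected, edges


candidates = [
    ("area = side * side", {"area", "side"}),
    ("side = 12", {"side"}),
    ("price = 3 * weight", {"price", "weight"}),  # unrelated to the unknown
]
selected, edges = select_relations(candidates, "area")
print(selected)
print(edges)
```

In this toy candidate set the price relationship is never reached from the unknown, so it is left out of the question understanding result.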
It should be understood that the values of the unknown quantities of an algebraic science image-text question are obtained by solving the relation group, and the specific process is as follows:
a. All text entities in the relation group obtained by question understanding are collected into a list of entities, and a variable is assigned to each entity, so that each relationship is converted into an equation and the whole relation group is converted into a system of equations.
b. The solvable part is solved repeatedly to obtain the values of all unknowns. Specifically, the system of equations is divided into a solvable part and a remaining part, and the solvable part is solved. The solution of the solvable part is substituted into the remaining part, and the process is repeated until every unknown has a value.
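The loop in step b can be sketched as follows. To stay self-contained, the sketch assumes every equation is already written as "target = f(inputs)", so the solvable part is simply the set of equations whose inputs are all known; this restriction, the tuple layout, and the names `solve` and `entity_to_variable` are assumptions of the example, and a general system would need a symbolic solver.

```python
def solve(equations):
    """Repeatedly solve the solvable part and substitute it into the rest.

    equations: list of (target, inputs, function) with target = function(*inputs).
    """
    known = {}
    remaining = list(equations)
    while remaining:
        # Solvable part: equations whose inputs have all been determined.
        solvable = [eq for eq in remaining if all(v in known for v in eq[1])]
        if not solvable:
            raise ValueError("the remaining equations cannot be solved one at a time")
        for target, inputs, function in solvable:
            known[target] = function(*(known[v] for v in inputs))
        remaining = [eq for eq in remaining if eq[0] not in known]
    return known


# Correspondence table between question entities and variables (step a).
entity_to_variable = {"side length": "a", "area": "s", "perimeter": "p"}

system = [
    ("s", ("a",), lambda a: a * a),  # s = a * a
    ("p", ("a",), lambda a: 4 * a),  # p = 4 * a
    ("a", (), lambda: 12),           # a = 12
]
solution = solve(system)
answers = {entity: solution[var] for entity, var in entity_to_variable.items()}
print(answers)
```

The first pass solves only a = 12; substituting it makes the other two equations solvable in the second pass.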
Solving the solvable part of the system of algebraic equations first, reducing and simplifying the system with the partial solution, and repeating this process until the whole system is solved improves solving efficiency. Identifying the unknown quantities of the question from the complete candidate set and taking the relationships in the forest that connect the unknown quantities to the known quantities as the question understanding improves the accuracy of question understanding and therefore of question solving.
In this embodiment, the step S60 specifically includes:
generating a corresponding mining process for each model subgraph in the relational subgraph model pool;
performing relational subgraph matching on the vectorized image according to the mining process of each model subgraph to obtain the relational subgraphs corresponding to the vectorized image;
mining graph relationships from the relational subgraphs to obtain the graph relationships in the science image-text question.
The relational subgraph model pool contains a number of model subgraphs, each of which corresponds to a graph relationship. The mining process of each model subgraph is determined first, and relational subgraph matching is performed on the input vectorized image using that process, so that the relational subgraphs contained in the vectorized image are mined. The relational subgraphs are then decomposed and mined to obtain the corresponding graph relationships, or the corresponding graph relationships are looked up in a pre-established mapping table, giving the graph relationships of the vectorized image, which are the graph relationships in the science image-text question.
That is, when mining relationships from a graph, a mining process is first determined for each model subgraph. For each input graph, this mining process finds all subgraphs of the input graph that are defined by the model subgraph; a second process then obtains the graph relationships from the mined subgraphs. The obtained graph relationships are placed in the graph relationship candidate set.
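As a small illustration of the two processes, the sketch below takes a vectorized figure given as labeled points and segments, mines every occurrence of one model subgraph (a triangle), and then derives one relationship per mined subgraph (the angle sum). The choice of the triangle, the relationship text, and the function names are hypothetical examples; the sketch also assumes that three mutually connected points are not collinear.

```python
from itertools import combinations


def mine_triangles(points, segments):
    """Mining process: find every triple of points that is pairwise connected."""
    edges = {frozenset(segment) for segment in segments}
    return [
        triple
        for triple in combinations(sorted(points), 3)
        if all(frozenset(pair) in edges for pair in combinations(triple, 2))
    ]


def triangle_relations(triangle):
    """Relation process: derive the angle-sum relationship of one mined triangle."""
    a, b, c = triangle
    return [f"angle {b}{a}{c} + angle {a}{b}{c} + angle {a}{c}{b} = 180"]


# Hypothetical vectorized figure: triangle ABC with an extra segment CD.
points = ["A", "B", "C", "D"]
segments = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]

graph_relations = []  # graph relationship candidate set
for triangle in mine_triangles(points, segments):
    graph_relations.extend(triangle_relations(triangle))
print(graph_relations)
```

A fuller model pool would hold one such pair of processes per model subgraph, for example for parallel lines or circles with inscribed angles.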
In this embodiment, all graphic topics in each teaching sub-field in the science subjects are obtained and used as graphic topic sets, a vectorized syntactic semantic model pool and a vectorized implicit relation model pool are built for each teaching sub-field according to text sets in the graphic topic sets, a relation sub-model pool is built according to graphic sets in the graphic topic sets, and direct-old relations, implicit relations and graphic relations in the science topics are accurately matched based on abundant syntactic semantic models in the syntactic semantic model pool, abundant implicit relation models in the implicit relation model pool and abundant model sub-graphs in the relation sub-model pool, so that the solving range and accuracy of the science topics are improved.
In addition, an embodiment of the invention further provides a storage medium storing an artificial intelligence science image-text question solving program; when executed by a processor, the program implements the steps of the artificial intelligence science image-text question solving method described above.
In addition, referring to fig. 5, an embodiment of the invention further provides an artificial intelligence science image-text question solving device, which comprises:
a recognition and extraction module 10, configured to acquire a science image-text question, recognize and extract the science image-text question, and obtain the electronic text and the vectorized image corresponding to the science image-text question;
a classification module 20, configured to classify the electronic text and the vectorized image using a trained classifier, so as to obtain category information of the science image-text question;
a vector conversion module 30, configured to perform word segmentation and part-of-speech tagging on the electronic text using a word segmentation tool, tag keywords of the question content according to a keyword table, and convert the parts of speech and the words into vectors according to a correspondence table that maps parts of speech and words to vectors, so as to obtain a vector sequence of the electronic text of the science image-text question;
a selecting module 40, configured to select a corresponding text model pool and a corresponding relation subgraph model pool according to the category information;
a matching module 50, configured to perform vector calculation matching on the vector sequence according to the text model pool, so as to obtain a direct-display relationship and/or an implicit relationship in the science image-text question;
the matching module 50 being further configured to perform relation subgraph matching on the vectorized image according to the relation subgraph model pool to obtain a relation subgraph, and to obtain a graph relationship in the science image-text question from the relation subgraph;
a selection module 60, configured to form a relation group from the graph relationship, the direct-display relationship and/or the implicit relationship in the science image-text question, and to select a subset from the relation group as a question understanding result according to a selection rule corresponding to the category information;
and a solving module 70, configured to solve the question understanding result to obtain a solving process corresponding to the science image-text question.
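By way of illustration only, the step of forming a relation group and selecting a subset from it might look like the sketch below. The embodiment does not specify what the selection rules are; the confidence thresholds, the category names and the identifiers `Relation`, `SELECTION_RULES` and `select_understanding` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str        # "graph", "direct" or "implicit"
    expr: str          # textual form of the relation
    confidence: float  # matching confidence in [0, 1]


# Hypothetical selection rules, one per question category (assumed, not
# taken from the embodiment).
SELECTION_RULES = {
    "circuit": lambda rel: rel.confidence >= 0.5,
    "plane_geometry": lambda rel: rel.source != "implicit" or rel.confidence >= 0.8,
}


def select_understanding(category, graph_rels, direct_rels, implicit_rels):
    """Merge the three kinds of relations into one relation group, then keep
    the subset allowed by the rule for this category (duplicates dropped)."""
    group = [*graph_rels, *direct_rels, *implicit_rels]
    keep = SELECTION_RULES[category]
    seen, result = set(), []
    for rel in sorted(group, key=lambda r: -r.confidence):
        if keep(rel) and rel.expr not in seen:
            seen.add(rel.expr)
            result.append(rel)
    return result


understanding = select_understanding(
    "circuit",
    graph_rels=[Relation("graph", "series(R1, R2)", 0.9)],
    direct_rels=[Relation("direct", "U = 6", 0.95), Relation("direct", "R1 = 2", 0.4)],
    implicit_rels=[Relation("implicit", "U = I * R", 0.7)],
)
print([rel.expr for rel in understanding])
```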
For other embodiments or specific implementations of the artificial intelligence science image-text question solving device of the invention, reference may be made to the method embodiments above; they are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.
The embodiment numbers above are for description only and do not indicate that any embodiment is better or worse than another. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The terms first, second, third, etc. do not denote any order; they are to be interpreted as labels.
From the above description of the embodiments, it will be clear to those skilled in the art that the method of the above embodiments may be implemented by means of software plus a necessary general hardware platform, or by means of hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. read-only memory (Read Only Memory, ROM)/random access memory (Random Access Memory, RAM), magnetic disk, optical disk), comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the invention.
The foregoing describes only preferred embodiments of the invention and does not limit the scope of the invention; any equivalent structure or equivalent process derived from the content disclosed herein, whether employed directly or indirectly in other related technical fields, is likewise covered by the scope of the invention.

Claims (8)

1. A method for solving artificial intelligence science image-text questions, characterized by comprising the following steps:
acquiring a science image-text question, recognizing and extracting the science image-text question, and obtaining an electronic text and a vectorized image corresponding to the science image-text question;
classifying the electronic text and the vectorized image using a trained classifier to obtain category information of the science image-text question;
performing word segmentation and part-of-speech tagging on the electronic text using a word segmentation tool, tagging keywords of the question content according to a keyword table, and converting the parts of speech and the words into vectors according to a correspondence table that maps parts of speech and words to vectors, to obtain a vector sequence of the electronic text;
selecting a corresponding text model pool and a corresponding relation subgraph model pool according to the category information;
performing vector calculation matching on the vector sequence according to the text model pool to obtain a direct-display relationship and/or an implicit relationship in the science image-text question;
performing relation subgraph matching on the vectorized image according to the relation subgraph model pool to obtain a relation subgraph, and obtaining a graph relationship in the science image-text question from the relation subgraph;
forming a relation group from the graph relationship, the direct-display relationship and/or the implicit relationship in the science image-text question, and selecting a subset from the relation group as a question understanding result according to a selection rule corresponding to the category information;
solving the question understanding result to obtain a solving process corresponding to the science image-text question;
wherein the text model pool comprises a syntactic-semantic model pool and an implicit relation model pool;
before acquiring the science image-text question, the method further comprises:
acquiring all image-text questions in each teaching sub-field of the science subjects as an image-text question set;
constructing a vectorized syntactic-semantic model pool and a vectorized implicit relation model pool for each teaching sub-field according to the text set in the image-text question set;
constructing a relation subgraph model pool according to the graph set in the image-text question set;
the step of performing vector calculation matching on the vector sequence according to the text model pool to obtain a direct-display relationship and/or an implicit relationship in the science image-text question comprises:
according to the syntactic-semantic model pool, matching the vector sequence with a computational matching network based on an inference graph in which the syntactic-semantic model is embedded, to obtain the direct-display relationship in the science image-text question;
and/or,
according to the implicit relation model pool, matching the vector sequence with a computational matching network based on an inference graph in which the implicit relation model is embedded, to obtain the implicit relationship in the science image-text question.
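For illustration, the vector conversion step of claim 1 (segmentation, part-of-speech tagging, keyword tagging and table lookup) can be sketched as below. The tables `KEYWORDS`, `POS_TO_ID` and `WORD_TO_ID` are hypothetical, and `toy_segment` is a deliberately naive stand-in for a real word segmentation and tagging tool, which the claim does not name.

```python
# Hypothetical lookup tables; a real system would build these from the
# image-text question set of each teaching sub-field.
KEYWORDS = {"resistance", "voltage", "current"}
POS_TO_ID = {"OTHER": 0, "NOUN": 1, "VERB": 2, "NUM": 3}
WORD_TO_ID = {"<unk>": 0, "resistance": 1, "voltage": 2, "is": 3}


def toy_segment(text):
    """Naive stand-in for a segmentation and part-of-speech tagging tool."""
    tagged = []
    for token in text.lower().replace(",", " ").split():
        if token.replace(".", "", 1).isdigit():
            tagged.append((token, "NUM"))
        elif token in {"is", "are", "find"}:
            tagged.append((token, "VERB"))
        elif token in KEYWORDS:
            tagged.append((token, "NOUN"))
        else:
            tagged.append((token, "OTHER"))
    return tagged


def to_vector_sequence(text):
    """Map each token to (part-of-speech id, word id, keyword flag)."""
    sequence = []
    for word, pos in toy_segment(text):
        word_id = WORD_TO_ID.get(word, WORD_TO_ID["<unk>"])
        sequence.append((POS_TO_ID[pos], word_id, int(word in KEYWORDS)))
    return sequence


print(to_vector_sequence("The resistance is 10"))
```

Each element of the sequence here is a small integer tuple; the actual vector representation used by the method is not specified in the claim.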
2. The method for solving artificial intelligence science image-text questions according to claim 1, wherein performing relation subgraph matching on the vectorized image according to the relation subgraph model pool to obtain a relation subgraph, and obtaining the graph relationship in the science image-text question from the relation subgraph, comprises:
generating a corresponding mining process for each model subgraph in the relation subgraph model pool;
performing relation subgraph matching on the vectorized image according to the mining process of each model subgraph to obtain the relation subgraph corresponding to the vectorized image;
and mining graph relations from the relation subgraph to obtain the graph relationship in the science image-text question.
3. The method for solving artificial intelligence science image-text questions according to claim 1, wherein matching the vector sequence, according to the syntactic-semantic model pool, with a computational matching network based on an inference graph in which the syntactic-semantic model is embedded, to obtain the direct-display relationship in the science image-text question, comprises:
taking each word of the vector sequence as a starting point, matching the vector sequence against the vectorized syntactic-semantic models in the syntactic-semantic model pool according to the matching rule of the syntactic-semantic model, to obtain a first matching confidence and a first relation;
if the matching succeeds, recording the positions in the science image-text question of the entities corresponding to the entities in the syntactic-semantic model, and recording the first matching confidence and the first relation in the next layer of nodes of the inference graph of the syntactic-semantic model; if the next layer has no vacant node, eliminating the match with the minimum first matching confidence;
repeating the matching until all matching starting points have been matched against all syntactic-semantic models in the syntactic-semantic model pool, to obtain the direct-display relationship in the science image-text question;
wherein the syntactic-semantic model is a quadruple $M = (K, P, V, R)$, in which $K$ represents the keyword element, $P$ is a pattern of part-of-speech (POS) tags and punctuation marks, $V$ is the calculation matching process, and $R$ is the relation between the related entities; the syntactic-semantic model pool is $\Sigma = \{ M_i = (K_i, P_i, V_i, R_i) \mid i = 1, 2, \ldots, m \}$.
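The matching loop of claim 3 can be illustrated with the following sketch, which is not the claimed matching network itself. It assumes that the confidence is the fraction of part-of-speech tags in the pattern that agree with the text, and that the "next layer of nodes" is a fixed-capacity buffer from which the lowest-confidence match is evicted. `SyntaxSemanticModel`, `match_confidence`, `match_pool`, the threshold and the capacity are all hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntaxSemanticModel:
    keyword: str        # K: keyword element
    pos_pattern: tuple  # P: pattern of part-of-speech tags
    relation: str       # R: relation produced on a successful match


def match_confidence(tokens, start, model):
    """V (assumed form): fraction of pattern tags matched from `start`,
    or None if the window is too short or lacks the keyword."""
    window = tokens[start:start + len(model.pos_pattern)]
    if len(window) < len(model.pos_pattern):
        return None
    if model.keyword not in (word for word, _ in window):
        return None
    hits = sum(pos == want for (_, pos), want in zip(window, model.pos_pattern))
    return hits / len(model.pos_pattern)


def match_pool(tokens, pool, capacity=2, threshold=0.6):
    """Try every starting point against every model; keep at most `capacity`
    matches, evicting the one with minimum confidence when the layer is full."""
    layer = []  # min-heap of (confidence, insertion order, start, relation)
    order = 0
    for start in range(len(tokens)):
        for model in pool:
            confidence = match_confidence(tokens, start, model)
            if confidence is None or confidence < threshold:
                continue
            entry = (confidence, order, start, model.relation)
            order += 1
            if len(layer) < capacity:
                heapq.heappush(layer, entry)
            else:
                heapq.heappushpop(layer, entry)  # drops the current minimum
    return sorted(layer, reverse=True)


tokens = [("voltage", "NOUN"), ("is", "VERB"), ("6", "NUM"),
          ("current", "NOUN"), ("is", "VERB"), ("2", "NUM")]
pool = [
    SyntaxSemanticModel("voltage", ("NOUN", "VERB", "NUM"), "value_of(voltage)"),
    SyntaxSemanticModel("current", ("NOUN", "VERB", "NUM"), "value_of(current)"),
]
print(match_pool(tokens, pool))
```

A min-heap is used so that the minimum-confidence match can be found and evicted without rescanning the layer.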
4. The method for solving artificial intelligence science image-text questions according to claim 1, wherein matching the vector sequence, according to the implicit relation model pool, with a computational matching network based on an inference graph in which the implicit relation model is embedded, to obtain the implicit relationship in the science image-text question, comprises:
taking each word of the vector sequence as a starting point, matching the vector sequence against the implicit relation models in the implicit relation model pool according to the matching rule of the implicit relation model, to obtain a second matching confidence and a second relation;
if the matching succeeds, recording the positions in the science image-text question of the entities corresponding to the entities in the implicit relation model, and recording the second matching confidence and the second relation in the next layer of nodes of the graph of the implicit relation model; if the next layer has no vacant node, eliminating the match with the minimum second matching confidence;
repeating the matching until all matching starting points have been matched against all implicit relation models in the implicit relation model pool;
wherein the implicit relation model is a triple $H = (F, V, R)$, in which $F$ represents a feature set, $V$ is the calculation matching process, and $R$ is the relation between the related entities; the implicit relation model pool is $\Pi = \{ H_i = (F_i, V_i, R_i) \mid i = 1, 2, \ldots, m \}$.
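As a simplified illustration of the triple $H = (F, V, R)$, the sketch below treats $F$ as a set of cue words, $V$ as the fraction of cues present in the question, and $R$ as a formula that the question implies without stating. This overlap measure, the threshold and the names `IMPLICIT_POOL` and `implicit_relations` are assumptions for the example; the claim itself matches from each starting point in the vector sequence.

```python
# Hypothetical implicit relation models: feature set F and relation R.
IMPLICIT_POOL = [
    {"features": {"series", "circuit"}, "relation": "I_total = I_1 = I_2"},
    {"features": {"uniform", "speed", "distance"}, "relation": "s = v * t"},
]


def implicit_relations(words, pool, threshold=0.6):
    """V (assumed form): confidence is the share of a model's features that
    occur in the question text; models above the threshold contribute R."""
    present = set(words)
    found = []
    for model in pool:
        confidence = len(model["features"] & present) / len(model["features"])
        if confidence >= threshold:
            found.append((round(confidence, 2), model["relation"]))
    return sorted(found, reverse=True)


question = "two resistors are connected in a series circuit".split()
print(implicit_relations(question, IMPLICIT_POOL))
```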
5. The method for solving artificial intelligence science image-text questions according to any one of claims 1 to 4, wherein solving the question understanding result to obtain the solving process corresponding to the science image-text question comprises:
if the category information of the science image-text question is a plane geometry proof question, proving the question understanding result with a geometry proving system to obtain the solving process corresponding to the science image-text question;
if the category information of the science image-text question is an algebraic question, finding all numeric entities in the relation group according to the question understanding result, assigning a variable to each numeric entity, converting the algebraic relation group into a system of algebraic equations, recording a table mapping entities to variables, solving the solvable part of the system, substituting the solved part back into the system to obtain a new solvable part, and repeating this process until the system is solved, to obtain the solving process corresponding to the science image-text question.
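The "solve the solvable part, substitute, repeat" loop for algebraic questions can be sketched as follows. The sketch makes a strong simplifying assumption: every equation is already oriented as one unknown computed from other variables, so that it becomes solvable as soon as its inputs are known. A general system would need a symbolic or numeric solver. The names `ENTITY_TABLE` and `solve_by_substitution` and the sample values are hypothetical.

```python
# Hypothetical entity-to-variable table recorded while building the system.
ENTITY_TABLE = {
    "voltage across R1": "x1",
    "resistance of R1": "x2",
    "current through R1": "x3",
    "resistance of R2": "x5",
    "voltage across R2": "x4",
}


def solve_by_substitution(equations, known):
    """Repeatedly solve every equation whose inputs are all known, then feed
    the new values back in, until nothing more can be solved.

    equations: list of (target variable, function, input variables).
    Returns (values, recorded steps, equations left unsolved)."""
    values = dict(known)
    steps = []
    pending = list(equations)
    progress = True
    while pending and progress:
        progress = False
        for equation in list(pending):
            target, func, inputs = equation
            if all(name in values for name in inputs):
                values[target] = func(*(values[name] for name in inputs))
                steps.append(f"{target} = {values[target]}")
                pending.remove(equation)
                progress = True
    return values, steps, pending


equations = [
    ("x4", lambda i, r: i * r, ("x3", "x5")),  # U2 = I * R2, not yet solvable
    ("x3", lambda u, r: u / r, ("x1", "x2")),  # I = U1 / R1, solvable at once
]
values, steps, unsolved = solve_by_substitution(equations, {"x1": 6, "x2": 3, "x5": 2})
print(steps)
```

The recorded `steps` list plays the role of the solving process, and the entity table allows each variable to be reported under its original entity name.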
6. An artificial intelligence science image-text question solving device, characterized in that the device comprises:
a recognition and extraction module, configured to acquire a science image-text question, recognize and extract the science image-text question, and obtain an electronic text and a vectorized image corresponding to the science image-text question;
a classification module, configured to classify the electronic text and the vectorized image using a trained classifier to obtain category information of the science image-text question;
a vector conversion module, configured to perform word segmentation and part-of-speech tagging on the electronic text using a word segmentation tool, tag keywords of the question content according to a keyword table, and convert the parts of speech and the words into vectors according to a correspondence table that maps parts of speech and words to vectors, to obtain a vector sequence of the electronic text;
a selection module, configured to select a corresponding text model pool and a corresponding relation subgraph model pool according to the category information;
a matching module, configured to perform relation matching on the vector sequence according to the text model pool to obtain a direct-display relationship and/or an implicit relationship in the science image-text question;
the matching module being further configured to perform relation subgraph matching on the vectorized image according to the relation subgraph model pool to obtain a relation subgraph, and to obtain a graph relationship in the science image-text question from the relation subgraph;
the selection module being further configured to form a relation group from the graph relationship, the direct-display relationship and/or the implicit relationship in the science image-text question, and to select a subset from the relation group as a question understanding result according to a selection rule corresponding to the category information;
a solving module, configured to solve the question understanding result to obtain a solving process corresponding to the science image-text question;
wherein the text model pool comprises a syntactic-semantic model pool and an implicit relation model pool;
the device further comprises:
an acquisition module, configured to acquire, before the science image-text question is acquired, all image-text questions in each teaching sub-field of the science subjects as an image-text question set;
a construction module, configured to construct a vectorized syntactic-semantic model pool and a vectorized implicit relation model pool for each teaching sub-field according to the text set in the image-text question set;
the construction module being further configured to construct a relation subgraph model pool according to the graph set in the image-text question set;
the matching module being further configured to match the vector sequence, according to the syntactic-semantic model pool, with a computational matching network based on an inference graph in which the syntactic-semantic model is embedded, to obtain the direct-display relationship in the science image-text question; and/or,
to match the vector sequence, according to the implicit relation model pool, with a computational matching network based on an inference graph in which the implicit relation model is embedded, to obtain the implicit relationship in the science image-text question.
7. Artificial intelligence science image-text question solving equipment, characterized in that the equipment comprises: a memory, a processor, and an artificial intelligence science image-text question solving program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the method for solving artificial intelligence science image-text questions according to any one of claims 1 to 5.
8. A storage medium, wherein an artificial intelligence science image-text question solving program is stored on the storage medium, and the program, when executed by a processor, implements the steps of the method for solving artificial intelligence science image-text questions according to any one of claims 1 to 5.
CN202110167638.4A 2021-02-05 2021-02-05 Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science Active CN112949421B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110167638.4A CN112949421B (en) 2021-02-05 2021-02-05 Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110167638.4A CN112949421B (en) 2021-02-05 2021-02-05 Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science

Publications (2)

Publication Number Publication Date
CN112949421A (en) 2021-06-11
CN112949421B (en) 2023-07-25

Family

ID=76243147

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110167638.4A Active CN112949421B (en) 2021-02-05 2021-02-05 Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science

Country Status (1)

Country Link
CN (1) CN112949421B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107423286A (en) * 2017-07-05 2017-12-01 华中师范大学 The method and system that elementary mathematics algebraically type topic is answered automatically
CN109886851A (en) * 2019-02-22 2019-06-14 科大讯飞股份有限公司 Mathematical problem corrects method and device
CN111428525A (en) * 2020-06-15 2020-07-17 华东交通大学 Implicit discourse relation identification method and system and readable storage medium
JP2020161111A (en) * 2019-03-27 2020-10-01 ワールド ヴァーテックス カンパニー リミテッド Method for providing prediction service of mathematical problem concept type using neural machine translation and math corpus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
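An automatic method for solving circuit problems based on image-text understanding; 菅朋朋; 何彬; 王彦丽; 夏盟; 通信技术 (Communications Technology), No. 03; full text *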

Also Published As

Publication number Publication date
CN112949421A (en) 2021-06-11

Similar Documents

Publication Publication Date Title
CN111914568B (en) Method, device and equipment for generating text sentence and readable storage medium
CN110851596B (en) Text classification method, apparatus and computer readable storage medium
CN110781276A (en) Text extraction method, device, equipment and storage medium
CN109614620B (en) HowNet-based graph model word sense disambiguation method and system
CN111597356B (en) Intelligent education knowledge map construction system and method
CN110781681B (en) Automatic first-class mathematic application problem solving method and system based on translation model
CN109933792A (en) Viewpoint type problem based on multi-layer biaxially oriented LSTM and verifying model reads understanding method
CN110175334A (en) Text knowledge's extraction system and method based on customized knowledge slot structure
CN112149427B (en) Verb phrase implication map construction method and related equipment
CN116661805B (en) Code representation generation method and device, storage medium and electronic equipment
CN115858750A (en) Power grid technical standard intelligent question-answering method and system based on natural language processing
CN113220854B (en) Intelligent dialogue method and device for machine reading and understanding
CN112949410B (en) Method, device, equipment and storage medium for solving problems of character questions in artificial intelligence science
CN112559691B (en) Semantic similarity determining method and device and electronic equipment
CN114417785A (en) Knowledge point annotation method, model training method, computer device, and storage medium
CN113065352B (en) Method for identifying operation content of power grid dispatching work text
CN113901224A (en) Knowledge distillation-based secret-related text recognition model training method, system and device
CN116720520B (en) Text data-oriented alias entity rapid identification method and system
CN117313850A (en) Information extraction and knowledge graph construction system and method
CN112949421B (en) Method, device, equipment and storage medium for solving image-text questions of artificial intelligence science
CN115906818A (en) Grammar knowledge prediction method, grammar knowledge prediction device, electronic equipment and storage medium
CN116483314A (en) Automatic intelligent activity diagram generation method
CN113779202B (en) Named entity recognition method and device, computer equipment and storage medium
CN115658845A (en) Intelligent question-answering method and device suitable for open-source software supply chain
CN115358227A (en) Open domain relation joint extraction method and system based on phrase enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant