CN117633643A - Automatic middle school geometric problem solving method based on contrast learning - Google Patents
- Publication number
- Publication CN117633643A; application CN202410109877.8A
- Authority
- CN
- China
- Prior art keywords
- geometric
- feature
- middle school
- graphic
- questions
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/2411 — Pattern recognition; classification techniques relating to the classification model, based on the proximity to a decision surface, e.g. support vector machines
- G06F18/15 — Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
- G06F18/213 — Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/25 — Fusion techniques
- G06N3/0442 — Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/0455 — Auto-encoder networks; encoder-decoder networks
- G06Q50/205 — Education administration or guidance
- Y02T10/40 — Engine management systems (climate change mitigation technologies related to transportation)
Abstract
The invention discloses a contrast-learning-based method for automatically solving middle school geometry problems, comprising the following steps: collecting a number of middle school geometry problems and answers to build the required middle school geometry dataset; dividing the data into a no-diagram geometry problem set and a with-diagram geometry problem set; feeding the no-diagram problems into a geometric image generator to obtain highly accurate geometric figures; feeding the encoded problem text and geometric figures into a graphical problem solver to obtain the final multimodal feature vectors; using the program decoder in the graphical problem solver to obtain highly accurate answers; and finally testing the geometric image generator and the graphical problem solver together to form a unified large model that solves both the self-drawn-figure and the with-diagram geometry problem types. The beneficial effects of the invention are as follows: from a brand-new viewpoint, middle school geometry problems are divided into two problem types, each solved by a corresponding module, and the two modules are fused into one model capable of solving middle school geometry problems.
Description
Technical Field
The invention relates to the field of machine learning systems, in particular to a contrast-learning-based method for automatically solving middle school geometry problems.
Background
In recent years, machine learning systems developed by researchers have automatically solved math word problems (MWPs), attracting increasing attention for their high academic value and great application potential in intelligent education. Most existing methods, including traditional machine learning approaches and neural-network-based models, focus on solving arithmetic and algebra problems, whereas geometric problem solving remains comparatively under-studied. Geometry is a classical mathematical topic and a major part of secondary education. Owing to its challenges and the nature of its data, it can also serve as a multimodal numerical reasoning benchmark, requiring joint reasoning over both diagrams and text.
In general, a typical geometry problem consists mainly of text and a geometric figure. Compared with math word problems involving only text, geometry problems raise the following new challenges. First, the accompanying diagram supplies basic information absent from the text, such as the relative positions of lines and points; the solver must therefore be able to parse the diagram. Second, solving a geometry problem requires understanding and aligning the semantics of the text and the diagram simultaneously; however, the problem text often contains ambiguous references and implicit relationships to the primitives, which increases the difficulty of joint text-diagram reasoning. Third, many geometry problems require additional theorem knowledge during solving. Although some methods have attempted to address these issues, the performance of geometric problem solving systems remains far from satisfactory: they depend heavily on limited hand-crafted rules and are verified only on small-scale datasets, making it difficult to generalize to more complex, real-world situations. Furthermore, their solution processes are complex, which makes it hard for a human to understand and verify their reliability.
Recently, much work has presented unified models for various vision-language reasoning and generation tasks, since the underlying visual and language understanding and reasoning capabilities are largely shared. Inspired by this mainstream trend, we consider a unified geometric problem solving model equally necessary. Geometry problems divide into those that come with a geometric figure, commonly called with-diagram geometry problems, and those whose text has no figure and requires drawing one to assist solving, commonly called no-diagram geometry problems. Whatever the problem type, certain basic skills and knowledge of geometric inference are shared in the solving process. Exploring the general understanding and reasoning capability of a unified neural network is therefore a significant topic in mathematics. Furthermore, a unified model needs no auxiliary model to determine whether a problem is a with-diagram or a no-diagram geometry problem, which greatly improves solving efficiency and reduces errors caused by misclassifying problem types, allowing the model to complete the solving task better. For these reasons, building a framework that handles geometry problems uniformly at both the data and model levels would be valuable and desirable.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a contrast-learning-based method for automatically solving middle school geometry problems, which divides geometry problems into two types from the viewpoint of the problem itself: with-diagram and no-diagram geometry problems. Each type is given a corresponding solution, and the two are finally combined into a unified large model of middle school geometry capable of solving both problem types.
The technical scheme adopted by the invention is as follows: the contrast-learning-based automatic middle school geometry problem solving method comprises the following steps:
Step S1, constructing the dataset: collect a number of middle school geometry problems and answers, and divide them into a training set, a validation set and a test set to obtain the required middle school geometry dataset;
Step S2, task formalization: given a middle school geometry dataset containing N problems, divide it into a no-diagram geometry problem set B and a with-diagram geometry problem set C;
Step S3, input the y no-diagram problems b_y in set B, which require drawing a figure, and the z with-diagram problems c_z in set C, which come with their own figure, into the BERT feature encoder of the automatic geometry problem solving model, obtaining the embedded feature vector of every word in each problem stem;
Step S4, input the word embedding feature vectors of the no-diagram problem stems produced by the BERT feature encoder into the geometric image generator; the generator, obtained by training and fine-tuning a contrastive learning model, generates the geometric figures the problems require; the contrastive model is trained with manual supervision, and the graphic generation loss L_prior, computed with a mean squared error loss function, is used to optimize and update the parameters of the BERT feature encoder and the geometric image generator to obtain the geometric figures;
Step S5, input the word embedding feature vectors of the with-diagram problem stems and the figures generated for the no-diagram problems by the geometric image generator into the graphical problem solver; its graphic encoder then encodes the figures, extracts features, and aligns them with the word embedding feature vectors of the problem stems to obtain the final multimodal feature vectors;
Step S6, the program decoder in the graphical problem solver sequentially generates the solution program under the guidance of the multimodal feature vectors; the generation loss L_g of solution errors is computed with a negative log-likelihood loss function, yielding highly accurate answers;
Step S7, test the geometric image generator and the graphical problem solver together, forming a unified large model that can solve both no-diagram problems requiring a self-drawn figure and with-diagram problems.
Further, in step S1, the dataset is constructed by collecting a number of middle school geometry problems and answers and carrying out the following tasks:
Step S11, remove duplicate geometry problems and answers;
Step S12, classify the problems and answers into two problem types: problems with figures are automatically classified as with-diagram geometry problems, and problems without figures as no-diagram geometry problems;
Step S13, inspect the classification: the classification results are checked manually;
Step S14, after manual inspection and verification, divide the problems and answers at a training : validation : test ratio of 8 : 1 : 1;
Step S15, after the division, manually annotate the problems of the training and validation sets: the solution steps are extracted from the answers and annotated, by hand, as a program language recognizable by a computer.
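Steps S11–S14 above can be sketched as a small data-preparation routine. This is a minimal illustration, not the patent's actual tooling: the field names `text`, `answer` and `diagram` are hypothetical, and duplicates are detected by exact question/answer match.

```python
import random

def build_dataset(problems, seed=0):
    """Sketch of steps S11-S14: dedupe, classify by diagram, split 8:1:1.

    `problems` is assumed to be a list of dicts with hypothetical keys
    'text', 'answer', and 'diagram' (None when the problem has no figure).
    """
    # S11: remove duplicate question/answer pairs
    seen, unique = set(), []
    for p in problems:
        key = (p["text"], p["answer"])
        if key not in seen:
            seen.add(key)
            unique.append(p)
    # S12: split into with-diagram (set C) and no-diagram (set B) subsets
    with_diagram = [p for p in unique if p.get("diagram") is not None]
    without_diagram = [p for p in unique if p.get("diagram") is None]
    # S14: 8:1:1 train / validation / test split
    def split(items):
        items = items[:]
        random.Random(seed).shuffle(items)
        n = len(items)
        a, b = int(0.8 * n), int(0.9 * n)
        return items[:a], items[a:b], items[b:]
    return split(without_diagram), split(with_diagram)
```

Step S13 (manual inspection) has no code counterpart; in practice the automatic classification would be reviewed by annotators before the split.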
Further, the BERT feature encoder in step S3 consists of a multi-layer bidirectional encoder built from the encoder module of the Transformer architecture; the calculation is shown in formula (1):
e_i^w = BERT(w_i) (1);
where e_i^w is the word embedding feature vector obtained through the BERT feature encoder for the i-th word token w_i.
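Formula (1) maps each token w_i to an embedding vector e_i^w. As a stand-in (no real BERT weights are loaded here), the sketch below replaces the multi-layer bidirectional Transformer with a fixed random lookup table, just to show the token-to-vector mapping:

```python
import numpy as np

def toy_encoder(tokens, dim=8, seed=0):
    """Stand-in for the BERT feature encoder of formula (1):
    e_i^w = BERT(w_i). A fixed random lookup table replaces the
    multi-layer bidirectional Transformer encoder (assumption for
    illustration only)."""
    rng = np.random.default_rng(seed)
    vocab = {w: i for i, w in enumerate(sorted(set(tokens)))}
    table = rng.standard_normal((len(vocab), dim))
    # one embedding row per token, in stem order
    return np.stack([table[vocab[w]] for w in tokens])

emb = toy_encoder("in triangle ABC angle A is 60".split())
```

A real implementation would obtain contextual embeddings from a pretrained BERT model instead of a static table.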
Further, the geometric image generator in step S4 specifically comprises:
Step S41, input data into the geometric image generator: the input is the word embedding feature vectors of the no-diagram problem stems;
Step S42, the geometric image generator is an adapted contrastive learning model; the contrastive learning model is a text-image pair classification model, and it is trained and fine-tuned to accomplish the downstream task of generating geometric figures;
Step S43, training the contrastive learning model:
collecting and organizing the with-diagram geometry problem dataset to obtain problem stems and their corresponding geometric figures;
extracting features from the problem stems and from the corresponding figures respectively, obtaining word embedding feature vectors e_i^w and graphic features h_CNN, which form text-image pairs;
inputting the features of the text-image pairs into the text-image classification model for contrastive learning; under manual supervision, matched text-image pairs are labeled as positive samples and unmatched pairs as negative samples;
through the positive and negative samples, the text-image classification model learns the correspondence between problem stems and figures, i.e. given a word embedding feature vector e_i^w it can find the corresponding graphic feature h_CNN;
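The positive/negative pairing of step S43 is commonly trained with an InfoNCE-style symmetric loss over a similarity matrix, where matched text-image pairs sit on the diagonal and every other entry acts as a negative. The patent does not spell out the exact objective, so the sketch below is an assumed CLIP-like formulation:

```python
import numpy as np

def contrastive_loss(text_feats, image_feats, temperature=0.07):
    """Sketch of the text-image contrastive objective of step S43.
    Matched pairs lie on the diagonal of the cosine-similarity matrix;
    off-diagonal entries are negatives (InfoNCE-style loss, assumed)."""
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    v = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    logits = t @ v.T / temperature                   # pairwise similarities
    # text -> image direction: log-softmax over each row
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss_t2i = -np.mean(np.diag(logp))
    # image -> text direction: log-softmax over each column
    logp2 = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    loss_i2t = -np.mean(np.diag(logp2))
    return (loss_t2i + loss_i2t) / 2
```

Correctly matched pairs drive the loss toward zero, while mismatched pairings raise it, which is exactly the signal used to link a stem's e_i^w to its figure's h_CNN.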
Step S44, fine-tuning the contrastive learning model:
denote the text as x and the geometric figure as y, and introduce the prior that generates the graphic code; the calculation is shown in formula (2), where the graphic code produced by the prior is trained against the graphic code of the contrastive learning model as ground truth:
P(y|x) = P(y | h_CNN, x) · P(h_CNN | x) (2);
where P(y|x) denotes generating the figure y from the text x; P(h_CNN | x) denotes the prior that, after contrastive training, maps a word embedding feature vector e_i^w to a graphic feature h_CNN; and P(y | h_CNN, x) denotes decoding the graphic feature h_CNN found from x into the corresponding figure y.
Step S45, graphic generation loss L_prior: following the prior in the fine-tuning of the contrastive learning model, the predicted figures are supervised with a mean squared error loss function; the calculation is shown in formula (3):
L_prior = (1/T) Σ_{i=1}^{T} λ ‖ h_CNN^{(i)}(x) − h_CNN ‖² (3);
where L_prior is the graphic generation loss accumulated over the previous i predictions, T is the number of predictions, h_CNN^{(i)}(x) is the graphic feature generated at the i-th time from text x, h_CNN is the ground-truth graphic feature, and λ is an adjustable parameter.
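The mean-squared-error loss of formula (3) reduces to a few lines of array arithmetic. A minimal sketch, with `lam` standing for the adjustable parameter λ:

```python
import numpy as np

def prior_loss(predicted_feats, target_feat, lam=1.0):
    """Mean-squared-error graphic-generation loss of formula (3):
    averages lam * ||h_CNN^(i)(x) - h_CNN||^2 over the T predictions."""
    preds = np.asarray(predicted_feats, dtype=float)    # shape (T, d)
    diff = preds - np.asarray(target_feat, dtype=float)
    return lam * np.mean(np.sum(diff ** 2, axis=1))
```

During fine-tuning this scalar would be backpropagated to update both the BERT feature encoder and the geometric image generator, as step S4 describes.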
Further, the graphical problem solver in step S5 comprises a bidirectional LSTM layer, a graphic encoder, a joint reasoning module and a program decoder; specifically:
Step S51, input data into the graphical problem solver: the input comprises the word embedding feature vectors of the with-diagram problem stems obtained by the BERT feature encoder, together with the corresponding figures generated for the no-diagram problems by the geometric image generator;
Step S52, bidirectional LSTM layer: the word embedding feature vector e_i^w of the problem stem is fed into the bidirectional LSTM layer to obtain the context semantic feature vector of the i-th word in the geometry text, i.e. e_i^w is fed into the forward LSTM layer and the backward LSTM layer respectively, as shown in formula (4):
h_i^LSTM = [LSTM_f(e_i^w) ; LSTM_b(e_i^w)] (4);
where h_i^LSTM is the context semantic feature vector of the i-th word in the geometry text, LSTM_f and LSTM_b are the output vectors of the forward and backward LSTM layers respectively, and [· ; ·] denotes the concatenation operation.
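Formula (4) runs the sequence in both directions and concatenates the per-token states. The sketch below uses a plain tanh-RNN cell as a stand-in for the LSTM cell (an assumption to keep the example short); the forward pass, backward pass, and concatenation are the point:

```python
import numpy as np

def bidirectional_encode(embeddings, hidden=4, seed=0):
    """Sketch of formula (4): h_i^LSTM = [LSTM_f(e_i^w) ; LSTM_b(e_i^w)].
    A tanh-RNN cell stands in for the LSTM cell."""
    rng = np.random.default_rng(seed)
    d = embeddings.shape[1]
    def run(seq, Wx, Wh):
        h, out = np.zeros(hidden), []
        for e in seq:                       # recurrent state update
            h = np.tanh(Wx @ e + Wh @ h)
            out.append(h)
        return out
    Wx_f, Wh_f = rng.standard_normal((hidden, d)), rng.standard_normal((hidden, hidden))
    Wx_b, Wh_b = rng.standard_normal((hidden, d)), rng.standard_normal((hidden, hidden))
    fwd = run(embeddings, Wx_f, Wh_f)                 # forward direction
    bwd = run(embeddings[::-1], Wx_b, Wh_b)[::-1]     # backward, re-aligned
    # concatenate forward and backward states per token
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])
```

Each output row has twice the hidden size, matching the concatenation in formula (4).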
Step S53, graphic encoder: a CNN extracts the graphic features h_CNN. The CNN comprises convolution layers, nonlinear activation functions and pooling components. The convolution layers slide convolution kernels over the geometric image to capture local features; the nonlinear activation functions increase the expressive capacity of the network; and the pooling components reduce the dimensionality of the feature maps while preserving the key features of the figure. Stacking multiple convolution layers lets the CNN progressively extract higher-level geometric features, and a fully connected layer finally produces the graphic feature vector h_CNN.
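The convolution / activation / pooling pipeline of step S53 can be demonstrated with a single layer of each. This is a bare numpy sketch (no learned weights, one channel), with flattening standing in for the fully connected layer:

```python
import numpy as np

def cnn_features(image, kernel, pool=2):
    """Sketch of the graphic encoder of step S53: one valid 2-D
    convolution, a ReLU activation, and max pooling, followed by
    flattening (a stand-in for the fully connected layer producing
    h_CNN)."""
    H, W = image.shape
    k = kernel.shape[0]
    conv = np.empty((H - k + 1, W - k + 1))
    for i in range(conv.shape[0]):          # slide the kernel (valid mode)
        for j in range(conv.shape[1]):
            conv[i, j] = np.sum(image[i:i+k, j:j+k] * kernel)
    act = np.maximum(conv, 0.0)             # nonlinear activation (ReLU)
    ph, pw = act.shape[0] // pool, act.shape[1] // pool
    pooled = act[:ph*pool, :pw*pool].reshape(ph, pool, pw, pool).max(axis=(1, 3))
    return pooled.ravel()                   # flattened feature vector
```

Stacking several such layers, as the text describes, would extract progressively higher-level features before the final fully connected projection.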
Step S54, a joint reasoning module: contextual semantic feature vector h corresponding to the ith word through an attention mechanism i LSTM And geometric figure feature h CNN Fusion is carried out to realize cross-boundary semantic fusion and alignment, and a context semantic feature vector h corresponding to the ith word containing an attention mechanism is obtained i LSTM And geometric figure feature h CNN Multimodal feature vector M corresponding to the ith word of information i The calculation process is as formula (5) and formula (6);
(5);
(6);
where Attention represents the Attention mechanism, Q, K, V represents the query vector, key vector and value vector, respectively, softmax is the normalized exponential function, dd is the second dimension of the query vector Q, key vector K,、、projection parameter matrixes of a query vector Q, a key vector K and a value vector V corresponding to the ith word in a self-attention mechanism are respectively represented; order the、WhereinA parameter matrix learned for a linear layer, D representing a transpose;
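Formulas (5) and (6) can be sketched directly: scaled dot-product attention, with the text states projected to queries and the graphic features projected to keys and values. The projection matrices below are random stand-ins for the learned parameters W^Q, W^K, W^V:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention of formula (5):
    softmax(Q K^T / sqrt(d)) V, with d the second dimension of Q, K."""
    d = Q.shape[1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

def fuse(text_feats, graph_feats, d=4, seed=0):
    """Sketch of formula (6): text states as queries, graphic features
    as keys/values. Random matrices stand in for learned W^Q, W^K, W^V."""
    rng = np.random.default_rng(seed)
    W_Q = rng.standard_normal((text_feats.shape[1], d))
    W_K = rng.standard_normal((graph_feats.shape[1], d))
    W_V = rng.standard_normal((graph_feats.shape[1], d))
    return attention(text_feats @ W_Q, graph_feats @ W_K, graph_feats @ W_V)
```

Each row of the result is a multimodal vector M_i: a graphic-feature summary weighted by how strongly the i-th word attends to the figure.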
Step S55, program decoder: the multimodal feature vector M_i is fed into a linear layer to obtain the initial state s_0; at time step t, the hidden state s_t of the bidirectional LSTM layer is concatenated with the attention result and fed into a linear layer with a Softmax function to predict the distribution of the next program token P_t.
Further, the generation loss L_g in step S6 adopts the negative log-likelihood of the target program; the calculation is shown in formula (7):
L_g = − Σ_t log P(y_t | y_{t−1}, M_i ; θ) (7);
where θ denotes the parameters of the loss function, P_t is a program token, y_t is the target program token generated at time t, y_{t−1} is the one generated at time t−1, and M_i is the multimodal feature vector.
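Given the decoder's probability of the correct program token at each step, formula (7) is a straightforward sum of negative log-probabilities. A minimal sketch:

```python
import numpy as np

def generation_loss(step_probs):
    """Negative log-likelihood loss of formula (7):
    L_g = -sum_t log P(y_t | y_{t-1}, M_i; theta).
    `step_probs` holds the decoder's probability of the correct
    program token at each time step t."""
    p = np.asarray(step_probs, dtype=float)
    return -np.sum(np.log(p))
```

Confident correct predictions (probabilities near 1) drive L_g toward zero; uncertain ones inflate it, which is the training signal the program decoder minimizes.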
Furthermore, the automatic middle school geometry problem solving model is divided into four modules: the BERT feature encoder, the geometric image generator, the graphical problem solver and the unified large model. The BERT feature encoder is connected in series to both the geometric image generator and the graphical problem solver, which sit in parallel with each other and are then connected in series into the large model.
The beneficial effects of the invention are as follows: (1) First, the dataset is collected from People's Education Press junior middle school mathematics textbooks and exam papers, and the problems and answers are cleaned and normalized, yielding the middle school geometry dataset. The dataset is then split into the two problem types by a program, after which manual supervision and inspection verify the soundness of the classification. Second, the two problem types are handled separately. With-diagram problems, whose figure is supplied with the problem, are solved with the graphical problem solver; no-diagram problems, which require drawing a figure, rely on training and fine-tuning the contrastive learning model so that the geometric figures required by the problem text can be generated well. The figures produced by the geometric image generator are then fed, together with the no-diagram problems, into the graphical problem solver for model testing. Finally, the two trained modules are fused into a unified large model capable of solving both geometry problem types simultaneously.
(2) For the no-diagram problems that require a figure, the invention fine-tunes a pre-trained contrastive learning model to generate the figures corresponding to the problems, realizing the step from no figure to figure and laying the foundation for the subsequent solving.
(3) For the with-diagram problems that come with their own figure, the invention first extracts features from the text and the figure with encoders, then performs cross-modal semantic fusion and alignment with a co-attention mechanism, and finally accomplishes cross-modal joint reasoning.
(4) From a brand-new viewpoint, the invention divides middle school geometry problems into two problem types, solves each correspondingly, and fuses the two into a unified large model capable of solving middle school geometry problems.
Drawings
FIG. 1 is a flow chart of the overall model structure of the present invention.
Detailed Description
The invention works and is implemented as follows. A contrast-learning-based automatic middle school geometry problem solving method comprises these steps:
Step S1, constructing the dataset: collect a number of middle school geometry problems and answers, and divide them into a training set, a validation set and a test set to obtain the required middle school geometry dataset;
Step S2, task formalization: given a middle school geometry dataset containing N problems, divide it into a no-diagram geometry problem set B and a with-diagram geometry problem set C;
Step S3, input the y no-diagram problems b_y in set B, which require drawing a figure, and the z with-diagram problems c_z in set C, which come with their own figure, into the BERT feature encoder of the automatic geometry problem solving model, obtaining the embedded feature vector of every word in each problem stem;
Step S4, input the word embedding feature vectors of the no-diagram problem stems produced by the BERT feature encoder into the geometric image generator; the generator, obtained by training and fine-tuning a contrastive learning model, generates the geometric figures the problems require; the contrastive model is trained with manual supervision, and the graphic generation loss L_prior, computed with a mean squared error loss function, optimizes and updates the parameters of the BERT feature encoder and the geometric image generator to obtain the geometric figures;
Step S5, input the word embedding feature vectors of the with-diagram problem stems and the figures generated for the no-diagram problems by the geometric image generator into the graphical problem solver; its graphic encoder encodes the figures, extracts features, and aligns them with the word embedding feature vectors of the problem stems to obtain the final multimodal feature vectors;
Step S6, the program decoder in the graphical problem solver sequentially generates the solution program under the guidance of the multimodal feature vectors, and the generation loss L_g of solution errors is computed with a negative log-likelihood loss function, yielding highly accurate answers;
Step S7, test the geometric image generator and the graphical problem solver together, forming a unified large model that can solve both no-diagram problems requiring a self-drawn figure and with-diagram problems.
Further, in step S1 the data set is constructed: 16201 middle school geometric questions and answers are collected manually from new-curriculum textbooks, examination papers and lesson-plan materials, and the following tasks are performed:
step S11, removing duplicate geometric questions and answers;
step S12, classifying the geometric questions and answers into two types: questions that come with a geometric figure are classified as graphic geometric questions, and questions without any geometric figure as figure-free geometric questions;
step S13, checking the classification, i.e. manually reviewing the classification result of each geometric question to ensure that the classification is reasonable;
step S14, after manual inspection and verification, retaining 14334 geometric questions and answers, of which the 9922 carrying figures are classified as graphic geometric questions and the 4412 without figures as figure-free geometric questions, and dividing the questions and answers according to a training set : verification set : test set ratio of 8 : 1 : 1;
step S15, after the division, manually annotating the geometric questions of the training set and the verification set: the solving steps are extracted from the answers and then manually annotated into a program language that a computer can recognize.
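The 8 : 1 : 1 division of step S14 can be sketched as follows; the function name and the fixed shuffle seed are illustrative assumptions, not part of the patent:

```python
import random

def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle and split items into train/validation/test partitions,
    following the 8:1:1 ratio described in step S14."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Splitting the 14334 retained questions of step S14:
train, val, test = split_dataset(range(14334))
```

With 14334 problems this yields 11467 training, 1433 verification and 1434 test items.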
Further, the BERT feature encoder in step S3 is composed of multi-layer bidirectional encoders built from the encoder module of the Transformer model architecture; the calculation process is shown in formula (1):

e_i^w = BERT(w_i)    (1)

where e_i^w is the word-embedded feature vector obtained through the BERT feature encoder for the i-th word token w_i.
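As an illustrative sketch (not the patent's implementation), formula (1) can be read as a per-token mapping from word tokens to embedding vectors; here a fixed random lookup table stands in for the pretrained BERT encoder, and the toy vocabulary and the 768-dimensional hidden size are assumptions:

```python
import numpy as np

# Toy stand-in for the BERT feature encoder of formula (1): each token w_i
# is mapped to an embedding e_i^w. A fixed random table replaces the real
# pretrained multi-layer bidirectional Transformer encoder.
rng = np.random.default_rng(0)
VOCAB = {"triangle": 0, "ABC": 1, "angle": 2, "bisector": 3}
EMB = rng.standard_normal((len(VOCAB), 768))  # 768 = BERT-base hidden size

def encode_stem(tokens):
    """Return the matrix [e_1^w, ..., e_n^w] for a problem stem."""
    return np.stack([EMB[VOCAB[t]] for t in tokens])

E = encode_stem(["triangle", "ABC", "angle", "bisector"])
```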
Further, the geometric image generator in step S4 specifically comprises:
step S41, inputting data into the geometric image generator, the input data being the word-embedded feature vectors corresponding to the figure-free geometric question stems;
step S42, the geometric image generator is an adapted contrastive learning model; the contrastive learning model is a classification model over text-image pairs, and it is trained and fine-tuned to achieve the downstream task of generating geometric images;
step S43, training the contrastive learning model:
collecting and organizing the graphic geometric question data set to obtain 9922 geometric question stems and their corresponding geometric figures;
extracting features from the 9922 geometric question stems and the corresponding geometric figures respectively to obtain word-embedded feature vectors e_i^w and graphic features h_CNN, i.e. forming 9922 text-image pairs;
inputting the features of the 9922 text-image pairs into the text-image classification model for contrastive learning, where, under manual supervision, matched text-image pairs are labelled as positive samples and unmatched pairs as negative samples;
through the positive and negative samples, the text-image classification model learns the correspondence between geometric question stems and geometric figures, i.e. given a word-embedded feature vector e_i^w it can find the corresponding graphic feature h_CNN;
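The positive/negative labelling of step S43 corresponds to the symmetric InfoNCE objective used by CLIP-style contrastive models. A minimal NumPy sketch under that assumption (the temperature value and function name are illustrative, not from the patent):

```python
import numpy as np

def info_nce_loss(text_feats, image_feats, temperature=0.07):
    """Symmetric contrastive loss over a batch of text-image pairs.
    Row i of each matrix is a matched (positive) pair; every other
    combination in the batch acts as a negative sample."""
    # L2-normalise both modalities, then take cosine-similarity logits
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    v = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    logits = t @ v.T / temperature
    labels = np.arange(len(logits))      # pair i matches pair i

    def xent(l):
        # cross-entropy with the diagonal as the true class
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average of text-to-image and image-to-text directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Matched batches should score a much lower loss than mismatched ones, which is what drives the stem-to-figure retrieval of step S43.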
step S44, fine-tuning the contrastive learning model:
defining the text as x and the geometric figure as y, a prior P(h_CNN | x) that generates the graphic feature is introduced; the calculation process is shown in formula (2), where the graphic feature produced by the prior is trained against the graphic feature of the contrastive learning model as the ground truth:

P(y | x) = P(y, h_CNN | x) = P(y | h_CNN, x) · P(h_CNN | x)    (2)

where P(y | x) denotes generating the geometric figure y from the text x; P(h_CNN | x) denotes the prior that, after contrastive training, produces the graphic feature h_CNN from the word-embedded feature vector e_i^w; P(y | h_CNN, x) denotes decoding the graphic feature h_CNN found from the text x into the geometric figure y; and P(y, h_CNN | x) denotes jointly finding, from the text x, the graphic feature h_CNN and its decoded corresponding geometric figure y;
step S45, the geometric figure generation loss L_prior: according to the prior in the fine-tuning of the contrastive learning model, the geometric figure is predicted with a mean-squared-error loss function; the calculation process is shown in formula (3):

L_prior = E_{i~[1,T]} || f_θ(h_CNN^(i), x) − h_CNN ||²    (3)

where L_prior denotes the geometric figure generation loss, E_{i~[1,T]} denotes averaging the generation losses over the T generation steps, h_CNN^(i) denotes the graphic feature generated at the i-th step, f_θ(h_CNN^(i), x) denotes the graphic feature predicted at the i-th step from the text x, f_θ(h_CNN^(i), x) − h_CNN denotes the difference between the i-th predicted graphic feature and the true graphic feature h_CNN, and θ is an adjustable parameter.
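A minimal sketch of the mean-squared-error prior loss of formula (3), averaging the squared distance between the T predicted graphic features and the true contrastively-learned feature h_CNN (the function name is an assumption):

```python
import numpy as np

def prior_loss(pred_feats, target_feat):
    """MSE prior loss: mean over the T predictions of the squared
    distance between each predicted h_CNN^(i) and the true h_CNN."""
    diffs = pred_feats - target_feat        # broadcasts over the T rows
    return float(np.mean(np.sum(diffs ** 2, axis=1)))
```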
Further, the graphical problem solver in step S5 comprises a bidirectional LSTM layer, a graphic encoder, a joint reasoning module and a program decoder; the specific contents include:
step S51, inputting data into the graphical problem solver, the input data comprising the word-embedded feature vectors of the graphic geometric question stems obtained by the BERT feature encoder and the corresponding geometric figures generated by the geometric image generator for the figure-free geometric questions;
step S52, the bidirectional LSTM layer: the word-embedded feature vectors e_i^w of the graphic geometric question stems obtained by the BERT feature encoder are input into the bidirectional LSTM layer, which produces the context semantic feature vector corresponding to the i-th word of the mathematical geometry text, i.e. the word-embedded feature vector e_i^w is fed into a forward LSTM layer and a backward LSTM layer respectively, as shown in formula (4):

h_i^LSTM = [LSTM_f(e_i^w) ; LSTM_b(e_i^w)]    (4)

where h_i^LSTM is the context semantic feature vector corresponding to the i-th word of the mathematical geometry text, LSTM_f and LSTM_b denote the output vectors of the forward and backward LSTM layers respectively, and [· ; ·] denotes the concatenation operation;
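The forward/backward concatenation of formula (4) can be sketched as follows; a simple tanh recurrence stands in for the LSTM cells (a real model would use `torch.nn.LSTM(bidirectional=True)`), and the hidden size and weight scales are assumptions:

```python
import numpy as np

def bilstm_like(embeddings, hidden=8, seed=0):
    """Minimal stand-in for the bidirectional LSTM of formula (4): a plain
    tanh RNN run forward and backward over the word embeddings, with the
    two hidden states concatenated per position."""
    rng = np.random.default_rng(seed)
    d = embeddings.shape[1]
    Wx = rng.standard_normal((d, hidden)) * 0.1
    Wh = rng.standard_normal((hidden, hidden)) * 0.1

    def run(seq):
        h, out = np.zeros(hidden), []
        for x in seq:
            h = np.tanh(x @ Wx + h @ Wh)
            out.append(h)
        return out

    fwd = run(embeddings)               # LSTM_f direction
    bwd = run(embeddings[::-1])[::-1]   # LSTM_b direction, re-aligned
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

H = bilstm_like(np.random.default_rng(1).standard_normal((5, 16)))
```

Each row of H is one h_i^LSTM: the forward half summarizes the left context of word i, the backward half its right context.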
step S53, the graphic encoder: the image features h_CNN are extracted with a CNN (convolutional neural network), specifically through convolution layers, nonlinear activation functions and pooling layers. In the convolution layers, a convolution kernel slides over the geometric image to capture local features such as the lines and points of edges. A nonlinear activation function is introduced to increase the expressive capacity of the network. The pooling layers reduce the dimensionality of the feature maps while retaining the key features of the geometry. Stacking multiple convolution layers allows the network to progressively extract higher-level geometric features such as shapes and the positional relationships of points and lines. Finally, the graphic feature h_CNN is obtained through a fully connected layer;
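The convolution, nonlinearity and pooling stages of step S53 can be illustrated with a hand-rolled NumPy sketch (the kernels and image are toy assumptions; a real encoder would stack learned `Conv2d` layers):

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D convolution: slide the kernel over the image to
    capture local features such as edges (the convolution layer of S53)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """2x2 max pooling: shrink the feature map while keeping key features."""
    H, W = x.shape[0] // size, x.shape[1] // size
    return x[:H * size, :W * size].reshape(H, size, W, size).max(axis=(1, 3))

img = np.zeros((8, 8)); img[:, 4] = 1.0      # a vertical line segment
edge = conv2d(img, np.array([[-1.0, 1.0]]))  # horizontal-gradient kernel
feat = np.maximum(max_pool(edge), 0)         # ReLU non-linearity
```

The gradient kernel responds to the vertical line, and pooling halves each spatial dimension while keeping that response.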
step S54, the joint reasoning module: through an attention mechanism, the context semantic feature vector h_i^LSTM corresponding to the i-th word is fused with the graphic feature h_CNN to realize cross-modal semantic fusion and alignment, yielding the multimodal feature vector M_i corresponding to the i-th word, which combines the attended context semantic feature vector h_i^LSTM with the information of the graphic feature h_CNN; the calculation process is given by formulas (5) and (6):

Attention(Q, K, V) = softmax(Q K^T / √d) V    (5)

M_i = Attention(h_i^LSTM W_i^Q, h_CNN W_i^K, h_CNN W_i^V)    (6)

where Attention denotes the attention mechanism; Q, K and V denote the query, key and value vectors respectively; softmax is the normalized exponential function; d is the second dimension of the query vector Q and the key vector K; W_i^Q, W_i^K and W_i^V denote the projection parameter matrices of the query, key and value vectors corresponding to the i-th word in the self-attention mechanism, each a parameter matrix learned by a linear layer; and the superscript T denotes the transpose;
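Formula (5) is standard scaled dot-product attention; a minimal sketch, with the text-side vectors as queries and the graphic features as keys and values (the feature dimensions are assumptions, and the learned projections W^Q, W^K, W^V are omitted for brevity):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention of formula (5):
    softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
h_lstm = rng.standard_normal((5, 32))  # context vectors h_i^LSTM (queries)
h_cnn = rng.standard_normal((10, 32))  # graphic features h_CNN (keys/values)
M = attention(h_lstm, h_cnn, h_cnn)    # multimodal vectors M_i
```

Each M_i is a graphic-feature summary weighted by how strongly word i attends to each part of the figure, which is the cross-modal alignment of step S54.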
step S55, the program decoder: the multimodal feature vector M_i is fed into a linear layer to obtain the initial state s_0; the hidden state s_t of the bidirectional LSTM layer at time step t is concatenated with the attention result and fed into a linear layer with a Softmax function to predict the distribution of the next program token P_t.
Further, the generation loss L_g in step S6 adopts the negative log-likelihood of the target program; the calculation formula is shown in (7):

L_g = − Σ_t log P(y_t | y_{t−1}, M_i; θ)    (7)

where θ is a parameter of the loss function, P_t is the program token distribution, y_t is the target program token generated at time t, y_{t−1} is the target program token generated at time t−1, and M_i is the multimodal feature vector.
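Given the probabilities the decoder assigns to each gold program token (teacher forcing), the negative log-likelihood of formula (7) is simply the sum of per-token −log P terms; a minimal sketch with an assumed function name:

```python
import math

def generation_loss(step_probs):
    """Negative log-likelihood of formula (7): sum over program tokens of
    -log P(y_t | y_{t-1}, M_i). The loss is 0 only when every gold token
    receives probability 1."""
    return sum(-math.log(p) for p in step_probs)

# Probabilities the decoder assigned to each gold program token:
loss = generation_loss([0.9, 0.8, 0.95])
```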
Furthermore, the automatic middle school geometric problem solving model is divided into four modules: a BERT feature encoder, a geometric image generator, a graphical problem solver and a unified large model; the BERT feature encoder is connected in series to the geometric image generator and to the graphical problem solver respectively, the geometric image generator and the graphical problem solver form a parallel structure, and they are then connected in series into the unified large model.
Further, the unified large model is obtained by training the geometric image generator and the graphical problem solver and then testing the large model: when a middle school geometric question is input, whether graphic or figure-free, if it carries a figure it is passed through the feature encoder directly into the graphical problem solver for solving; if it is figure-free, the geometric image generator generates a geometric figure so that it becomes a graphic geometric question, which is then input into the graphical problem solver for solving. A unified large model is thus obtained that can handle both geometric questions carrying figures and figure-free geometric questions without geometric figures.
As shown in FIG. 1, which is a flow chart of model prediction: a data set of N middle school geometric questions is input into the automatic middle school geometric problem solving model and classified into figure-free geometric questions and graphic geometric questions. For the figure-free geometric questions, features are first extracted by the BERT feature encoder and then input into the geometric image generator, which, through contrastive learning, training and fine-tuning, generates the geometric figure described in the question; figure prediction uses a mean-squared-error loss function to obtain the prediction loss L_prior, with which the parameters of the geometric image generator are optimized and updated to improve the accuracy of the generated figures; finally the geometric figure generated by the geometric image generator and the figure-free geometric question are input into the graphical problem solver for solving and testing. For the graphic geometric questions, features are first extracted by the BERT feature encoder and passed directly into the graphical problem solver for solving; during training, the generation loss L_g is the negative log-likelihood of the target program, which improves the accuracy of the solver. In the training phase, the automatic middle school geometric problem solving model computes the joint total loss L = L_prior + L_g to optimize the parameters of both the figure-free and graphic branches and strengthen the information interaction between the two modules; in the test phase, the geometric image generator and the graphical problem solver are fused together to form a unified large model capable of solving the two types of geometric questions.
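The test-phase dispatch of the unified large model can be sketched as follows; the four callables are hypothetical stand-ins for the trained modules, not APIs defined in the patent:

```python
def solve(problem, encode, generate_figure, solve_with_figure):
    """Dispatch logic of the unified model (step S7): questions that
    already carry a figure go straight to the graphical problem solver;
    figure-free questions first pass through the geometric image
    generator so they become graphic questions."""
    feats = encode(problem["stem"])
    figure = problem.get("figure")
    if figure is None:                 # figure-free branch: draw first
        figure = generate_figure(feats)
    return solve_with_figure(feats, figure)

# Toy modules standing in for the trained networks:
program = solve(
    {"stem": "In triangle ABC ..."},  # no "figure" key -> generated path
    encode=lambda s: s.split(),
    generate_figure=lambda f: "generated-figure",
    solve_with_figure=lambda f, g: ["step1", "step2"],
)
```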
Claims (7)
1. An automatic middle school geometric problem solving method based on contrastive learning, characterized by comprising the following steps:
step S1, constructing a data set: collecting a plurality of middle school geometric questions and answers, and dividing them into a training set, a verification set and a test set to obtain the required middle school geometric data set;
step S2, task formalization: given a middle school geometric data set containing N questions, it is divided into a figure-free geometric question data set B and a graphic geometric question data set C;
step S3, inputting the y figure-free geometric questions B_y in data set B that require figures to be drawn and the z graphic geometric questions C_z in data set C that carry their own figures into the BERT feature encoder of the automatic middle school geometric problem solving model, obtaining the embedded feature vectors of all words in the middle school geometric question stems;
step S4, inputting the word-embedded feature vectors of the figure-free middle school geometric question stems obtained by the BERT feature encoder into a geometric image generator, which is obtained by training and fine-tuning a contrastive learning model and is used to generate the geometric figures required by the middle school geometric questions; the contrastive learning model is trained under manual supervision, and the geometric figure generation loss L_prior is calculated with a mean-squared-error loss function to optimize and update the parameters of the BERT feature encoder and the geometric image generator, obtaining the geometric figures;
step S5, inputting the word-embedded feature vectors of the graphic geometric question stems obtained by the BERT feature encoder, together with the geometric figures generated by the geometric image generator for the figure-free geometric questions, into the graphical problem solver; the graphic encoder contained in the solver then encodes the geometric figures and extracts their features, and an alignment operation is performed between the geometric figure features and the word-embedded feature vectors of the BERT feature encoder to obtain the final multimodal feature vectors;
step S6, the program decoder in the graphical problem solver sequentially generates a solving program under the guidance of the multimodal feature vectors, and the generation loss L_g of solving errors is calculated with a negative log-likelihood loss function, yielding highly accurate answers;
step S7, the geometric image generator and the graphical problem solver are tested together, forming a unified large model that can solve both figure-free geometric questions requiring a figure to be drawn and geometric questions that already carry a figure.
2. The automatic middle school geometric problem solving method based on contrastive learning according to claim 1, characterized in that: in step S1, the data set is constructed by collecting a plurality of middle school geometric questions and answers and performing the following tasks:
step S11, removing duplicate middle school geometric questions and answers;
step S12, classifying the middle school geometric questions and answers into two types: questions carrying figures are classified as graphic middle school geometric questions, and questions without figures as figure-free middle school geometric questions;
step S13, checking the classification, i.e. manually reviewing the classification result of each middle school geometric question;
step S14, after manual inspection and verification, dividing the middle school geometric questions and answers according to a training set : verification set : test set ratio of 8 : 1 : 1;
step S15, after the division, manually annotating the middle school geometric questions of the training set and the verification set: the solving steps are extracted from the answers and manually annotated into a program language that a computer can recognize.
3. The automatic middle school geometric problem solving method based on contrastive learning according to claim 2, characterized in that: the BERT feature encoder in step S3 is composed of multi-layer bidirectional encoders built from the encoder module of the Transformer model architecture; the calculation process is shown in formula (1):

e_i^w = BERT(w_i)    (1)

where e_i^w is the word-embedded feature vector obtained through the BERT feature encoder for the i-th word token w_i.
4. The automatic middle school geometric problem solving method based on contrastive learning according to claim 3, characterized in that: the geometric image generator in step S4 comprises the following specific contents:
step S41, inputting data into the geometric image generator, the input data being the word-embedded feature vectors corresponding to the figure-free middle school geometric question stems;
step S42, the geometric image generator is an adapted contrastive learning model; the contrastive learning model is a classification model over text-image pairs, and it is trained and fine-tuned to achieve the downstream task of generating geometric images;
step S43, training the contrastive learning model:
collecting and organizing the graphic geometric question data set to obtain the middle school geometric question stems and the corresponding geometric figures;
extracting features from the middle school geometric question stems and the corresponding geometric figures respectively to obtain word-embedded feature vectors e_i^w and graphic features h_CNN, forming text-image pairs;
inputting the features of the text-image pairs into the text-image classification model for contrastive learning, where, under manual supervision, matched text-image pairs are labelled as positive samples and unmatched pairs as negative samples;
through the positive and negative samples, the text-image classification model learns the correspondence between middle school geometric question stems and geometric figures, i.e. given a word-embedded feature vector e_i^w it can find the corresponding graphic feature h_CNN;
step S44, fine-tuning the contrastive learning model:
defining the text as x and the geometric figure as y, a prior P(h_CNN | x) that generates the graphic feature is introduced; the calculation process is shown in formula (2), where the graphic feature produced by the prior is trained against the graphic feature of the contrastive learning model as the ground truth:

P(y | x) = P(y, h_CNN | x) = P(y | h_CNN, x) · P(h_CNN | x)    (2)

where P(y | x) denotes generating the geometric figure y from the text x; P(h_CNN | x) denotes the prior that, after contrastive training, produces the graphic feature h_CNN from the word-embedded feature vector e_i^w; P(y | h_CNN, x) denotes decoding the graphic feature h_CNN found from the text x into the geometric figure y; and P(y, h_CNN | x) denotes jointly finding, from the text x, the graphic feature h_CNN and its decoded corresponding geometric figure y;
step S45, the geometric figure generation loss L_prior: according to the prior in the fine-tuning of the contrastive learning model, the geometric figure is predicted with a mean-squared-error loss function; the calculation process is shown in formula (3):

L_prior = E_{i~[1,T]} || f_θ(h_CNN^(i), x) − h_CNN ||²    (3)

where L_prior denotes the geometric figure generation loss, E_{i~[1,T]} denotes averaging the generation losses over the T generation steps, h_CNN^(i) denotes the graphic feature generated at the i-th step, f_θ(h_CNN^(i), x) denotes the graphic feature predicted at the i-th step from the text x, f_θ(h_CNN^(i), x) − h_CNN denotes the difference between the i-th predicted graphic feature and the true graphic feature h_CNN, and θ is an adjustable parameter.
5. The automatic middle school geometric problem solving method based on contrastive learning according to claim 4, characterized in that: the graphical problem solver in step S5 comprises a bidirectional LSTM layer, a graphic encoder, a joint reasoning module and a program decoder; the specific contents include:
step S51, inputting data into the graphical problem solver, the input data comprising the word-embedded feature vectors of the graphic geometric question stems obtained by the BERT feature encoder and the corresponding geometric figures generated by the geometric image generator for the figure-free geometric questions;
step S52, the bidirectional LSTM layer: the word-embedded feature vectors e_i^w of the graphic geometric question stems obtained by the BERT feature encoder are input into the bidirectional LSTM layer, which produces the context semantic feature vector corresponding to the i-th word of the mathematical geometry text, i.e. the word-embedded feature vector e_i^w is fed into a forward LSTM layer and a backward LSTM layer respectively, as shown in formula (4):

h_i^LSTM = [LSTM_f(e_i^w) ; LSTM_b(e_i^w)]    (4)

where h_i^LSTM is the context semantic feature vector corresponding to the i-th word of the mathematical geometry text, LSTM_f and LSTM_b denote the output vectors of the forward and backward LSTM layers respectively, and [· ; ·] denotes the concatenation operation;
step S53, the graphic encoder: the geometric image features h_CNN are extracted with a CNN (convolutional neural network) comprising convolution layers, nonlinear activation functions and pooling layers; in the convolution layers a sliding convolution kernel performs the convolution operation on the geometric image and captures local features; a nonlinear activation function is introduced to increase the expressive capacity of the CNN; the pooling layers reduce the dimensionality of the feature maps while retaining the key features of the geometric figure; stacking multiple convolution layers enables the CNN to progressively extract higher-level geometric features; finally the graphic feature h_CNN is obtained through a fully connected layer;
step S54, the joint reasoning module: through an attention mechanism, the context semantic feature vector h_i^LSTM corresponding to the i-th word is fused with the graphic feature h_CNN to realize cross-modal semantic fusion and alignment, yielding the multimodal feature vector M_i corresponding to the i-th word, which combines the attended context semantic feature vector h_i^LSTM with the information of the graphic feature h_CNN; the calculation process is given by formulas (5) and (6):

Attention(Q, K, V) = softmax(Q K^T / √d) V    (5)

M_i = Attention(h_i^LSTM W_i^Q, h_CNN W_i^K, h_CNN W_i^V)    (6)

where Attention denotes the attention mechanism; Q, K and V denote the query, key and value vectors respectively; softmax is the normalized exponential function; d is the second dimension of the query vector Q and the key vector K; W_i^Q, W_i^K and W_i^V denote the projection parameter matrices of the query, key and value vectors corresponding to the i-th word in the self-attention mechanism, each a parameter matrix learned by a linear layer; and the superscript T denotes the transpose;
step S55, the program decoder: the multimodal feature vector M_i is fed into a linear layer to obtain the initial state s_0; the hidden state s_t of the bidirectional LSTM layer at time step t is concatenated with the attention result and fed into a linear layer with a Softmax function to predict the distribution of the next program token P_t.
6. The automatic middle school geometric problem solving method based on contrastive learning according to claim 4, characterized in that: the generation loss L_g in step S6 adopts the negative log-likelihood of the target program; the calculation formula is shown in (7):

L_g = − Σ_t log P(y_t | y_{t−1}, M_i; θ)    (7)

where θ is a parameter of the loss function, P_t is the program token distribution, y_t is the target program token generated at time t, y_{t−1} is the target program token generated at time t−1, and M_i is the multimodal feature vector.
7. The automatic middle school geometric problem solving method based on contrastive learning according to claim 6, characterized in that: the automatic middle school geometric problem solving model is divided into four modules: a BERT feature encoder, a geometric image generator, a graphical problem solver and a unified large model; the BERT feature encoder is connected in series to the geometric image generator and to the graphical problem solver respectively, the geometric image generator and the graphical problem solver form a parallel structure, and they are then connected in series into the unified large model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410109877.8A CN117633643B (en) | 2024-01-26 | 2024-01-26 | Automatic middle school geometric problem solving method based on contrast learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117633643A true CN117633643A (en) | 2024-03-01 |
CN117633643B CN117633643B (en) | 2024-05-14 |
Family
ID=90037971
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410109877.8A Active CN117633643B (en) | 2024-01-26 | 2024-01-26 | Automatic middle school geometric problem solving method based on contrast learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117633643B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107423287A (en) * | 2017-07-05 | 2017-12-01 | 华中师范大学 | The automatic answer method and system of Proving Plane Geometry topic |
CN107967318A (en) * | 2017-11-23 | 2018-04-27 | 北京师范大学 | A kind of Chinese short text subjective item automatic scoring method and system using LSTM neutral nets |
CN113672716A (en) * | 2021-08-25 | 2021-11-19 | 中山大学·深圳 | Geometric question answering method and model based on deep learning and multi-mode numerical reasoning |
KR20220075489A (en) * | 2020-11-30 | 2022-06-08 | 정재훈 | Training system for auto generating and providing question |
CN115841156A (en) * | 2022-11-16 | 2023-03-24 | 科大讯飞股份有限公司 | Method, device, storage medium and equipment for solving plane geometry problem |
CN116028888A (en) * | 2023-01-09 | 2023-04-28 | 西交利物浦大学 | Automatic problem solving method for plane geometry mathematics problem |
CN116778518A (en) * | 2022-03-10 | 2023-09-19 | 暗物智能科技(广州)有限公司 | Intelligent solving method and device for geometric topics, electronic equipment and storage medium |
CN116955419A (en) * | 2022-11-18 | 2023-10-27 | 暗物智能科技(广州)有限公司 | Geometric question answering method, system and electronic equipment |
Non-Patent Citations (2)
Title |
---|
WENBIN GAN ET AL: "Automatic understanding and formalization of natural language geometry problems using syntax-semantics models", International Journal of Innovative, vol. 14, no. 1, 28 February 2018 (2018-02-28), pages 83-98 *
WANG YIRAN (王奕然): "Design and Implementation of an Automatic Problem-Solving System Based on Self-Learning", China Master's Theses Full-text Database (Electronic Journal), vol. 2023, no. 01, 15 January 2023 (2023-01-15) *
Also Published As
Publication number | Publication date |
---|---|
CN117633643B (en) | 2024-05-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||