CN116645680A

CN116645680A - Mathematical formula recognition system based on convolutional neural network

Info

Publication number: CN116645680A
Application number: CN202310646942.6A
Authority: CN
Inventors: 王娜; 高庆忠; 刘永楠
Original assignee: Shenyang Institute of Engineering
Current assignee: Shenyang Institute of Engineering
Priority date: 2023-06-02
Filing date: 2023-06-02
Publication date: 2023-08-25

Abstract

The invention discloses a mathematical formula recognition system based on a convolutional neural network and relates to the technical field of convolutional neural networks. A convolutional neural network model is constructed and used to extract feature data from a preset mathematical formula data set. The same model is called to extract feature data from an image to be identified; a traversal algorithm then compares this feature data one by one with the feature data in the preset mathematical formula data set, and a comparison difference value is calculated to measure their similarity. If the comparison difference value falls within a preset range, the output end directly outputs the identification result; otherwise, the result processing unit receives the comparison result and processes it accordingly. The result processing unit also temporarily stores the identification results and, when the stored quantity reaches a preset value, classifies the image data and uploads it to the preset mathematical formula data set, so that the data set is updated and expanded in real time and the capacity of the recognition system to grow is greatly improved.

Description

Mathematical formula recognition system based on convolutional neural network

Technical Field

The invention relates to the technical field of convolutional neural networks, in particular to a mathematical formula identification system based on a convolutional neural network.

Background

The convolutional neural network (CNN) is a deep feed-forward neural network characterized by local connectivity and weight sharing, and is one of the representative algorithms of deep learning. It is well suited to image-related machine learning problems, particularly image recognition, has brought marked improvements in visual tasks such as image classification, object detection and image segmentation, and is currently one of the most widely applied models.

In mathematics, a convolutional neural network can be used to recognize mathematical formulas: an image containing a handwritten mathematical formula is preprocessed and then passed to a convolutional neural network model for recognition, so that a machine can quickly read the formula information. However, existing recognition technology focuses on recognizing and reading a single mathematical formula, and lacks intelligence and the capacity to grow when it comes to classifying mathematical formulas.

In view of the above technical drawbacks, a solution is now proposed.

Disclosure of Invention

The object of the invention is as follows: a convolutional neural network model is constructed to extract feature data from a preset mathematical formula data set; the model is called to extract feature data from an image to be identified, a traversal algorithm compares this feature data one by one with the feature data in the preset mathematical formula data set, and a comparison difference value is calculated to measure their similarity; if the comparison difference value falls within a preset range, the output end directly outputs the identification result; if it does not, the result processing unit receives the comparison result and processes it accordingly; the result processing unit also temporarily stores the identification results and, when the stored data quantity reaches a preset value, classifies the image data and uploads it to the preset mathematical formula data set, so that the data set is updated and expanded in real time and the capacity of the recognition system to grow is greatly improved.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

a mathematical formula recognition system based on a convolutional neural network comprises a model construction unit, a model storage unit, an input end, a feature extraction unit, a similarity comparison unit, an output end and a result processing unit;

the model construction unit is used for establishing an initial data set and preprocessing it to generate processed data, training and testing the convolutional neural network with the processed data, adjusting and finalizing the convolutional neural network model according to the test results, and transmitting the convolutional neural network model to the model storage unit once it has been generated; the model storage unit is used for storing the constructed convolutional neural network model and a preset mathematical formula data set;

the input end is used for inputting the image of the formula to be identified and sending the image to the feature extraction unit, and the feature extraction unit extracts the feature data of the image of the formula to be identified by calling the convolutional neural network model of the model storage unit and sends the feature data to the similarity comparison unit; the similarity comparison unit compares the characteristic data of the formula image to be identified with the image characteristic data in the preset mathematical formula data set to generate corresponding signals and sends the corresponding signals to the output end for output; the result processing unit receives different signals from the output end and then respectively performs corresponding processing.

Further, the process of establishing an initial data set and preprocessing is as follows:

collecting a preset number of mathematical formula images, dividing the images into training images, verification images and test images, storing them in the training set, the verification set and the test set respectively, and preprocessing the image data; wherein the preprocessing is as follows: adjusting the image sizes to a preset value by scaling or cropping so that all images have a uniform size, performing data enhancement on the training set images, and normalizing the enhanced training set images together with the verification set images and test set images, which are not subjected to data enhancement; the data enhancement consists of randomly transforming the training images, including rotation, flipping, translation, scaling, shearing, and brightness and contrast adjustment, so as to increase the diversity of the data.

Further, the normalization process is as follows:

the pixel values of the input image are extracted as original data through a computer vision library, and the original data are mapped onto the interval [0,1] by a linear transformation, the transformation function being as follows:

$$output = \frac{input - \min(input)}{\max(input) - \min(input)}$$

where output represents an output image pixel value, input represents an input image pixel value, and max(input) and min(input) represent the maximum value and the minimum value of the input pixels, respectively.
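The normalization step can be illustrated with a minimal sketch, assuming the usual min-max form (input - min) / (max - min) and a single-channel image held as a nested list. The function name `min_max_normalize` and the handling of a constant image are assumptions made for illustration; the source names neither a library nor an edge-case policy.

```python
def min_max_normalize(pixels):
    """Map a 2-D list of pixel values linearly onto the interval [0, 1]."""
    flat = [value for row in pixels for value in row]
    lowest, highest = min(flat), max(flat)
    if highest == lowest:
        # Assumption: a constant image would divide by zero, so return zeros.
        return [[0.0 for _ in row] for row in pixels]
    span = highest - lowest
    return [[(value - lowest) / span for value in row] for row in pixels]


image = [[0, 64], [128, 255]]
normalized = min_max_normalize(image)
```

The smallest pixel maps to 0 and the largest to 1, so images with different exposure ranges end up on a common scale before being fed to the network.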

Further, the process of constructing the convolutional neural network model is as follows:

defining the architecture of the convolutional neural network, which mainly comprises a convolution layer, a pooling layer and a fully connected layer, and setting the corresponding parameters according to preset values, wherein the parameters comprise the weights and bias of the convolution kernel, the pooling type, the size and stride of the pooling kernel, and the weight matrix and bias vector of the fully connected layer; defining a cross-entropy loss function for measuring the difference between the predicted value and the true value, the expression of which is as follows:

$$CE = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{C} y_{ij}\log\left(\hat{y}_{ij}\right)$$

wherein CE is the cross entropy, N is the total number of samples, C is the total number of categories, $y_{ij}$ is the true probability (typically 0 or 1) that the i-th sample belongs to the j-th class, and $\hat{y}_{ij}$ is the predicted probability that the i-th sample belongs to the j-th class, the predicted probability being obtained through a softmax function.
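A minimal sketch of the loss computation may help here. It assumes one-hot true labels and takes the cross entropy as the mean over the N samples of the negative log of the softmax probability. The function names and the small `eps` guard against log(0) are illustrative assumptions and do not come from the source.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, logits_batch, eps=1e-12):
    """Mean cross entropy over N samples; y_true holds one-hot rows of length C."""
    total = 0.0
    for targets, logits in zip(y_true, logits_batch):
        probs = softmax(logits)
        total -= sum(t * math.log(p + eps) for t, p in zip(targets, probs))
    return total / len(y_true)


# Two classes with equal scores: the predicted probability is 0.5 for each class.
loss = cross_entropy([[1, 0]], [[0.0, 0.0]])
```

With equal scores the loss equals log 2, and it approaches 0 as the score of the true class grows relative to the others.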

Further, the process of training the convolutional neural network is as follows:

initializing the weights of the convolutional neural network, namely assigning an initial value to each weight parameter by a normal-distribution initialization method before training; propagating the input training set images forward through the convolution layer, the pooling layer and the fully connected layer to obtain an output value; calculating the error between the output value of the network and the target value with the cross-entropy loss function; when the error is greater than a preset value, propagating the error back through the network, obtaining in turn the errors of the fully connected layer, the pooling layer and the convolution layer, updating the weights according to these errors, and feeding the training set images in again to recompute the error; ending training when the error is less than or equal to the preset value; and, each time a preset number of training iterations has been completed, verifying the convolutional neural network with the verification set images and adjusting the hyperparameters of the convolutional neural network.

Further, the procedure for testing convolutional neural networks is as follows:

randomly selecting a preset number of test set images, inputting them into the convolutional neural network, and calculating the identification accuracy of the network; when the accuracy is greater than or equal to a preset value, the network passes the test and is designated as the convolutional neural network model; otherwise, training set images are re-selected and the network is retrained until the test is passed; wherein the identification accuracy is calculated as $P = \frac{n}{m} \times 100\%$, where P is the identification accuracy, m is the number of test set images input during the test, and n is the number of correct identification results.
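The acceptance test can be sketched as follows, assuming the accuracy is the share of correct identification results, n out of m. The function names and the 0.95 default threshold are hypothetical; the source only speaks of "a preset value". The sketch returns a fraction rather than a percentage.

```python
def recognition_accuracy(predicted, expected):
    """Return n / m: correct identification results over test images used."""
    m = len(expected)
    if m == 0:
        raise ValueError("the test set must not be empty")
    n = sum(1 for p, e in zip(predicted, expected) if p == e)
    return n / m


def passes_test(predicted, expected, threshold=0.95):
    """Hypothetical gate: accept the network once accuracy reaches the threshold."""
    return recognition_accuracy(predicted, expected) >= threshold


accuracy = recognition_accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"])
```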

Further, the construction process of the preset mathematical formula data set is as follows:

collecting a preset number of mathematical formula images, classifying according to mathematical concepts represented by the images, preprocessing and analyzing the classified mathematical formulas according to a preprocessing mode of training set images, inputting the preprocessed preset mathematical formula images into a convolutional neural network model, extracting image characteristic data through convolutional calculation in the model, and storing the extracted preset mathematical formula image characteristic data into a model storage unit.

Further, the working process of the feature extraction unit is as follows:

the input end is connected with a feature extraction unit, after the feature extraction unit receives the formula image data to be identified sent by the input end, preprocessing analysis is carried out on the input image according to a training set image preprocessing mode, and then a convolutional neural network model of a model storage unit is called to further analyze the input image, specifically, the feature data of the formula image to be identified is extracted through a convolutional layer, a pooling layer and a full connection layer.

Further, the working process of the similarity comparison unit is as follows:

the similarity comparison unit receives the feature data of the formula image to be identified sent by the feature extraction unit, uses a traversal algorithm to compare it one by one with the image feature data in the preset mathematical formula data set, calculates the comparison difference value between the two, and generates a corresponding signal, wherein the smaller the comparison difference value, the higher the similarity; the comparison difference value is calculated as $\Delta = \max\{|a_{ij} - b_{ij}|\}$, where $a_{ij}$ and $b_{ij}$ are the values in the i-th row and j-th column of the feature data of the formula image to be identified and of the image feature data in the preset mathematical formula data set, respectively, and $|\cdot|$ is the absolute value function;

the signal generation process is specifically as follows: the minimum comparison difference value is recorded as $\Delta_{\min}$ and compared with the preset critical values $\delta_0$ and $\delta_1$; if $\Delta_{\min} \le \delta_0$, a first signal is generated, and the output end directly outputs the identification result; if $\delta_0 < \Delta_{\min} \le \delta_1$, a second signal is generated and sent to the output end; if $\Delta_{\min} > \delta_1$, a third signal is generated and sent to the output end;

the recognition result comprises a mathematical formula contained in the formula image to be recognized and a category corresponding to the mathematical formula.
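The comparison and signal generation can be sketched as below. The sketch takes the comparison difference as the largest element-wise absolute difference between two feature matrices and traverses every entry of the data set. Holding the data set as a dictionary from label to feature matrix, the function names, and the threshold values in the example are all assumptions made for illustration.

```python
def comparison_difference(a, b):
    """Largest absolute element-wise difference between two feature matrices."""
    return max(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def compare(query, dataset, delta0, delta1):
    """Traverse the data set and return (signal, best_label, delta_min)."""
    diffs = {label: comparison_difference(query, feats)
             for label, feats in dataset.items()}
    best_label = min(diffs, key=diffs.get)
    delta_min = diffs[best_label]
    if delta_min <= delta0:
        signal = 1  # close match: output the identification result directly
    elif delta_min <= delta1:
        signal = 2  # ambiguous: hand over to the result processing unit
    else:
        signal = 3  # no match: candidate for extending the data set
    return signal, best_label, delta_min


dataset = {
    "pythagoras": [[1.0, 2.0], [3.0, 4.0]],
    "quadratic": [[5.0, 5.0], [5.0, 5.0]],
}
query = [[1.1, 2.0], [3.0, 4.2]]
signal, label, delta_min = compare(query, dataset, delta0=0.5, delta1=1.0)
```

Because the traversal is linear in the size of the data set, the cost of each query grows as the data set is extended.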

Further, the working process of the result processing unit is as follows:

the result processing unit is connected with an output end, and when the output end receives the second signal or the third signal, the second signal or the third signal is sent to the result processing unit;

when the result processing unit receives the second signal, the n (n being an odd number) comparison results in the preset mathematical formula data set having the smallest comparison difference values with respect to the feature data of the formula image to be identified are retrieved; these comparison results and the feature data of the formula image to be identified are sent together to the feature extraction unit for feature data re-extraction; the re-extracted feature data of the comparison results is compared again with the feature data of the formula image to be identified, and the n-2 comparison results with the smallest comparison difference values are output; this operation is repeated on the n-2 comparison results until only the single comparison result with the smallest comparison difference value remains, which is sent to the output end to generate the identification result;

and when the result processing unit receives a third signal, temporarily storing the extracted image characteristic data of the formula to be identified, and when the stored data quantity reaches a preset value, uploading the data to a preset mathematical formula data set according to mathematical concept classification represented by the formula so as to update and expand the preset mathematical formula data set in real time.
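The handling of the second and third signals can be sketched as follows. For the second signal, an odd-sized shortlist is re-ranked and shrunk by two per round until one candidate remains. For the third signal, unmatched feature data is buffered and moved into the data set once a preset count is reached. The names `narrow_candidates` and `PendingBuffer`, the `re_extract` and `classify` callables, and the flat-list feature format are hypothetical stand-ins; the source does not say how re-extraction or classification is carried out.

```python
def max_abs_difference(a, b):
    """Comparison difference for flat feature vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def narrow_candidates(query, candidates, n, re_extract=lambda feats: feats):
    """Second signal: keep the n closest entries, then n-2, ... until one is left."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    if not candidates:
        raise ValueError("candidates must not be empty")

    def rank(q, items, keep):
        return sorted(items, key=lambda item: max_abs_difference(q, item[1]))[:keep]

    shortlist = rank(query, candidates, n)
    while len(shortlist) > 1:
        # Assumption: re_extract stands in for the feature extraction unit.
        query = re_extract(query)
        shortlist = [(label, re_extract(feats)) for label, feats in shortlist]
        shortlist = rank(query, shortlist, max(len(shortlist) - 2, 1))
    return shortlist[0][0]


class PendingBuffer:
    """Third signal: hold unmatched features and flush them in batches."""

    def __init__(self, flush_size, classify):
        self.flush_size = flush_size
        self.classify = classify  # hypothetical: maps features to a category
        self._items = []

    def add(self, features, dataset):
        """Store features; return True if the buffer was flushed into dataset."""
        self._items.append(features)
        if len(self._items) < self.flush_size:
            return False
        for item in self._items:
            dataset.setdefault(self.classify(item), []).append(item)
        self._items.clear()
        return True


candidates = [("a", [1.0, 1.1]), ("b", [1.0, 3.0]), ("c", [0.0, 1.0]), ("d", [9.0, 9.0])]
winner = narrow_candidates([1.0, 1.0], candidates, n=3)
```

With the identity default for `re_extract` each round reproduces the previous ranking, so the narrowing only changes the outcome when re-extraction actually yields different feature data.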

In summary, due to the adoption of the technical scheme, the beneficial effects of the invention are as follows:

a convolutional neural network model is constructed to extract feature data from a preset mathematical formula data set; the model is called to extract feature data from an image to be identified, a traversal algorithm compares this feature data one by one with the feature data in the preset mathematical formula data set, and a comparison difference value is calculated to measure their similarity; if the comparison difference value falls within a preset range, the output end directly outputs the identification result; if it does not, the result processing unit receives the comparison result and processes it accordingly; the result processing unit also temporarily stores the identification results and, when the stored data quantity reaches a preset value, classifies the image data and uploads it to the preset mathematical formula data set, so that the data set is updated and expanded in real time and the capacity of the recognition system to grow is greatly improved.

Drawings

Fig. 1 shows a flow chart of the operation of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Examples:

the specific working principle is as follows:

step one, establishing an initial data set and preprocessing initial image data to generate processing data, wherein the specific establishment process is as follows:

collecting a preset number of mathematical formula images, wherein the preset number is not less than 50000; dividing the images into training images, verification images and test images in a ratio of 6:2:2, and storing them in the pre-established training set, verification set and test set folders respectively; then adjusting the image sizes to a preset value by scaling or cropping, so that all images have a uniform size that is convenient for a computer to read; performing data enhancement on the training set images; and normalizing the enhanced training set images together with the verification set images and test set images, which are not subjected to data enhancement;

the training set is a sample for model fitting and is used in the convolutional neural network training process;

the verification set is a sample for adjusting parameters, and can be used for checking the fitting degree of the model in the convolutional neural network training process and carrying out preliminary evaluation on the model capacity;

the test set is a sample for testing the accuracy of the model, and after training is finished, the image of the test set is used for testing the identification accuracy of the model;

the data enhancement mode is to randomly transform the training image, including rotation, turnover, translation, scaling, shearing, brightness and contrast, so as to increase the diversity of data, improve the generalization capability of the model and prevent overfitting;

the parameters of the random transformations can be set in advance, for example: each rotation is 90 degrees clockwise, and each picture is rotated 3 times; the flipping angle is 180 degrees; translation is performed to the right and then upward; scaling is 10% outward and 20% inward; the picture is cut 3 times, dividing it into four parts, namely upper, lower, left and right; brightness is increased by 20% and decreased by 20% relative to the original image, and contrast is increased by 30% and decreased by 50% relative to the original image.
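Step one can be illustrated with a small sketch covering the 6:2:2 split and two of the listed transformations: three successive 90-degree clockwise rotations and brightness changes of plus and minus 20%. Shuffling before the split, reading "enhanced by 20%" as multiplying pixel values by 1.2, and the 0 to 255 clamping range are all assumptions; images are single-channel nested lists to keep the example free of external libraries.

```python
import random


def split_dataset(items, ratios=(6, 2, 2), seed=0):
    """Split items into training, verification and test parts by ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # assumption: shuffle before splitting
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def rotate90_clockwise(img):
    """Rotate a 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotations(img, times=3):
    """Return the image after 1, 2, ..., `times` clockwise quarter turns."""
    results, current = [], img
    for _ in range(times):
        current = rotate90_clockwise(current)
        results.append(current)
    return results


def adjust_brightness(img, factor, max_value=255):
    """Scale every pixel by `factor`, clamped to [0, max_value]."""
    return [[min(max_value, max(0, round(v * factor))) for v in row] for row in img]


def augment(img):
    """Training-set only: rotated and brightness-shifted variants of one image."""
    return rotations(img) + [adjust_brightness(img, 1.2), adjust_brightness(img, 0.8)]


train, val, test = split_dataset(range(50000))
variants = augment([[1, 2], [3, 4]])
```

Only the training portion would be passed through `augment`; the verification and test portions are left unchanged, as the text specifies.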

The normalization process is as follows:

Step two, preliminarily constructing a convolutional neural network, wherein the specific construction process is as follows:

wherein CE is the cross entropy, N is the total number of samples, C is the total number of categories, $y_{ij}$ is the true probability (typically 0 or 1) that the i-th sample belongs to the j-th class, and $\hat{y}_{ij}$ is the predicted probability that the i-th sample belongs to the j-th class, the predicted probability being obtained through a softmax function.

The convolution layer is used for extracting information of an input picture, the extracted information is image characteristic data, and the specific extraction mode is as follows:

a 3×3 matrix is preset in the convolution layer to serve as the convolution kernel; the convolution kernel is slid over the matrix of each channel of the image from left to right and from top to bottom, performing a convolution operation at each position, and finally the values obtained for the three channels are added together to give a single value; wherein the image has three channels, R, G and B, and each channel is equivalent to a single-channel picture and is stored in the computer as a numerical matrix;

the formula of the convolution operation is as follows: $Output = \sum_{i} w_i x_i + b$, wherein Output is the output value, $w_i$ and $x_i$ are the values at corresponding positions in the convolution kernel and the image numerical matrix, respectively, and b is a preset bias value;

the pooling layer screens the features extracted from the convolution layer through pooling operation, wherein the pooling type comprises maximum pooling and average pooling, and sliding calculation is performed on the feature data extracted from the convolution layer through an n multiplied by n matrix window; the maximum pooling mode is the maximum value of an n multiplied by n matrix, and the average pooling mode is the average value of the n multiplied by n matrix;

the fully connected layer performs a dimension-reduction operation on the data passed on by the convolution layer and the pooling layer, in which all feature matrices are flattened into a single one-dimensional feature vector for output.
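The three layer operations described above can be sketched in plain Python. The sketch assumes stride 1 and no padding for the convolution, one 3×3 kernel per channel with a single shared bias, and non-overlapping pooling windows; the source fixes none of these choices. As in most CNN frameworks, the kernel is applied without flipping.

```python
def conv2d_rgb(channels, kernels, bias=0.0):
    """Slide one kernel per channel over the image and sum the channel responses."""
    height, width = len(channels[0]), len(channels[0][0])
    k = len(kernels[0])
    output = []
    for top in range(height - k + 1):          # top to bottom
        row = []
        for left in range(width - k + 1):      # left to right
            total = bias
            for channel, kernel in zip(channels, kernels):
                for i in range(k):
                    for j in range(k):
                        total += kernel[i][j] * channel[top + i][left + j]
            row.append(total)
        output.append(row)
    return output


def pool2d(feature_map, size=2, mode="max"):
    """Max or average pooling over non-overlapping size x size windows."""
    reducer = max if mode == "max" else (lambda window: sum(window) / len(window))
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            reducer([feature_map[top + i][left + j]
                     for i in range(size) for j in range(size)])
            for left in range(0, cols - size + 1, size)
        ]
        for top in range(0, rows - size + 1, size)
    ]


def flatten(feature_maps):
    """Concatenate all feature matrices into one one-dimensional vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


ones = [[1] * 4 for _ in range(4)]
kernel = [[1] * 3 for _ in range(3)]
features = conv2d_rgb([ones, ones, ones], [kernel] * 3, bias=1.0)
pooled = pool2d(features, size=2, mode="max")
vector = flatten([pooled])
```

In the example each output position sums 27 products of 1 and adds the bias of 1, so the 4×4 input becomes a 2×2 map of 28.0, which pooling reduces to a single value.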

Step three, training a convolutional neural network, wherein the specific training process is as follows:

initializing the weights of the convolutional neural network, namely assigning an initial value to each weight parameter by a normal-distribution initialization method before training; propagating the input training set images forward through the convolution layer, the pooling layer and the fully connected layer to obtain an output value; calculating the error between the output value of the network and the target value with the cross-entropy loss function; when the error is greater than a preset value, propagating the error back through the network, obtaining in turn the errors of the fully connected layer, the pooling layer and the convolution layer, updating the weights according to these errors, and feeding the training set images in again to recompute the error; ending training when the error is less than or equal to the preset value; and, each time a preset number of training iterations has been completed, verifying the convolutional neural network with the verification set images and adjusting the parameters of the convolutional neural network.

The normal distribution initialization is specifically to sample normal distribution with a mean value of 0 and a variance of 1 as an initial weight;

the preset number of cycle computation may be set to 100, that is, each time 100 cycle computation is completed in the training process, verification is performed by using the verification set image.
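The control flow of step three can be sketched with a deliberately small stand-in model. A single softmax layer replaces the convolutional network so that the example stays self-contained; it is not the network described in this document. Weights are drawn from a normal distribution with mean 0 and variance 1, training stops once the cross-entropy error is at or below a preset value, and the verification set is evaluated every 100 iterations. The learning rate, the error target of 0.05 and the iteration cap are hypothetical values, and the sketch only records the verification loss rather than adjusting any parameter.

```python
import math
import random


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def mean_loss(weights, samples):
    return -sum(math.log(predict(weights, x)[label] + 1e-12)
                for x, label in samples) / len(samples)


def train(train_set, val_set, n_features, n_classes, lr=0.5,
          loss_target=0.05, validate_every=100, max_iters=5000, seed=0):
    rng = random.Random(seed)
    # Normal-distribution initialization: mean 0, variance 1.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
               for _ in range(n_classes)]
    val_history, loss = [], float("inf")
    for iteration in range(1, max_iters + 1):
        grad = [[0.0] * n_features for _ in range(n_classes)]
        loss = 0.0
        for x, label in train_set:              # forward pass
            probs = predict(weights, x)
            loss -= math.log(probs[label] + 1e-12)
            for c in range(n_classes):          # error fed back to the weights
                err = probs[c] - (1.0 if c == label else 0.0)
                for f in range(n_features):
                    grad[c][f] += err * x[f]
        loss /= len(train_set)
        if loss <= loss_target:                 # error small enough: stop
            break
        for c in range(n_classes):              # weight update
            for f in range(n_features):
                weights[c][f] -= lr * grad[c][f] / len(train_set)
        if iteration % validate_every == 0:     # periodic verification
            val_history.append((iteration, mean_loss(weights, val_set)))
    return weights, loss, val_history


data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights, final_loss, history = train(data, data, n_features=2, n_classes=2)
```

In a real system the verification results would feed a decision such as changing the learning rate or stopping early, which is the role the text assigns to the verification set.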

Step four, testing the convolutional neural network, wherein the specific testing process is as follows:

Step five, constructing a preset mathematical formula data set, wherein the specific process is as follows:

preprocessing and analyzing the pre-collected mathematical formula images according to a preprocessing mode of the training set images, classifying the preprocessed mathematical formula images according to mathematical concepts represented by the preprocessed mathematical formula images, inputting the classified mathematical formula images into a convolutional neural network model, extracting image characteristic data through convolutional calculation in the model, and storing the extracted characteristic data of the preset mathematical formula images into a model storage unit to generate a preset mathematical formula data set;

wherein, the mathematical formulas are classified according to the mathematical concepts represented by the formulas into the following categories: geometric formulas, algebraic formulas, trigonometric formulas, calculus formulas, probability theory and statistics formulas, linear algebraic formulas, discrete mathematical formulas, and numerical analysis formulas.

Step six, inputting an image to be identified and extracting feature data of the image to be identified through a feature extraction unit, wherein the specific process is as follows:

Step seven, carrying out similarity comparison on the characteristic data of the formula image to be identified and the characteristic data of the image in the preset mathematical formula data set through a similarity comparison unit, wherein the specific process is as follows:

Step eight, the result processing unit carries out corresponding processing on different signals from the output end, and the specific process is as follows:

and when the result processing unit receives a third signal, temporarily storing the extracted image characteristic data of the formula to be identified, and after the stored data quantity reaches a preset value, uploading the stored data quantity to a preset mathematical formula data set according to mathematical concept classification represented by the formula so as to update and expand the preset mathematical formula data set in real time.

In summary, a convolutional neural network model is constructed to extract feature data from a preset mathematical formula data set; the model is called to extract feature data from an image to be identified, a traversal algorithm compares this feature data one by one with the feature data in the preset mathematical formula data set, and a comparison difference value is calculated to measure their similarity; if the comparison difference value falls within a preset range, the output end directly outputs the identification result; if it does not, the result processing unit receives the comparison result and processes it accordingly; the result processing unit also temporarily stores the identification results and, when the stored data quantity reaches a preset value, classifies the image data and uploads it to the preset mathematical formula data set, so that the data set is updated and expanded in real time and the capacity of the recognition system to grow is greatly improved.

All of the above formulas are dimensionless and operate on numerical values. They were obtained by software simulation on a large amount of collected data so as to approximate the actual situation, and the preset parameters in the formulas are set by those skilled in the art according to the actual situation.

The thresholds are set for ease of comparison; their magnitudes depend on the amount of sample data and on the base value that a person skilled in the art sets for each group of sample data, provided that the proportional relationship between each parameter and its quantized value is not affected.

The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art within the technical scope disclosed by the present invention, according to its technical scheme and inventive concept, shall be covered by the scope of protection of the present invention.

Claims

1. The mathematical formula recognition system based on the convolutional neural network is characterized by comprising a model construction unit, a model storage unit, an input end, a feature extraction unit, a similarity comparison unit, an output end and a result processing unit;

the model construction unit is used for establishing an initial data set, preprocessing the initial data set to generate processing data, training and testing the convolutional neural network through a plurality of processing data, adjusting and determining a convolutional neural network model through a test result, and transmitting the convolutional neural network model to the model storage unit when the convolutional neural network model is generated; the model storage unit is used for storing the constructed convolutional neural network model and a preset mathematical formula data set;

2. The mathematical formula identification system based on convolutional neural network of claim 1, wherein the process of creating an initial data set and preprocessing is as follows:

collecting a preset number of mathematical formula images, dividing the images into training images, verification images and test images, storing them in the training set, the verification set and the test set respectively, and preprocessing the image data; wherein the preprocessing is as follows: adjusting the image sizes to a preset value by scaling or cropping so that all images have a uniform size, performing data enhancement on the training set images, and normalizing the enhanced training set images together with the verification set images and test set images, which are not subjected to data enhancement; wherein the data enhancement consists of randomly transforming the training images.

3. A mathematical formula identification system based on convolutional neural network as claimed in claim 2, wherein the normalization process is as follows:

4. The mathematical formula identification system based on convolutional neural network of claim 1, wherein the process of constructing the convolutional neural network model is as follows:

wherein CE is the cross entropy, N is the total number of samples, C is the total number of categories, $y_{ij}$ is the true probability (typically 0 or 1) that the i-th sample belongs to the j-th class, and $\hat{y}_{ij}$ is the predicted probability that the i-th sample belongs to the j-th class, the predicted probability being obtained through a softmax function.

5. The mathematical formula identification system based on convolutional neural network of claim 1, wherein the process of training the convolutional neural network is as follows:

initializing the weights of the convolutional neural network, namely assigning an initial value to each weight parameter by a normal-distribution initialization method before training; propagating the input training set images forward through the convolution layer, the pooling layer and the fully connected layer to obtain an output value; calculating the error between the output value of the network and the target value with the cross-entropy loss function; when the error is greater than a preset value, propagating the error back through the network, obtaining in turn the errors of the fully connected layer, the pooling layer and the convolution layer, updating the weights according to these errors, and feeding the training set images in again to recompute the error; ending training when the error is less than or equal to the preset value; and, each time a preset number of training iterations has been completed, verifying the convolutional neural network with the verification set images and adjusting the parameters of the convolutional neural network.

6. The mathematical formula identification system based on convolutional neural network of claim 1, wherein the process of testing the convolutional neural network is as follows:

7. The mathematical formula identification system based on convolutional neural network as claimed in claim 1, wherein the construction process of the preset mathematical formula data set is as follows:

8. The mathematical formula recognition system based on convolutional neural network according to claim 1, wherein the feature extraction unit operates as follows:

9. The mathematical formula identification system based on convolutional neural network as claimed in claim 1, wherein the similarity comparison unit operates as follows:

10. The mathematical formula identification system based on convolutional neural network of claim 8, wherein the result processing unit operates as follows: