CN107886082A - Mathematical formulae detection method, device, computer equipment and storage medium in image - Google Patents
Method, device, computer equipment and storage medium for detecting mathematical formulae in an image
- Publication number
- CN107886082A (application CN201711190154.1A)
- Authority
- CN
- China
- Prior art keywords
- text region
- convolutional neural
- mathematical formulae
- neural networks
- text
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The present invention relates to a method, device, computer equipment and storage medium for detecting mathematical formulae in an image. The method comprises: obtaining an image to be detected; performing text segmentation on the image to be detected to obtain multiple text regions; inputting the multiple text regions in turn into the input variable of a trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulae; and inputting the coarsely selected text regions containing mathematical formulae into the input variable of a trained second convolutional neural network, which outputs the final text regions containing mathematical formulae. Because the image to be detected is divided into multiple text regions before being input into the convolutional neural networks, and the text regions are first coarsely selected and the coarsely selected regions are then screened again, the text regions containing mathematical formulae can be obtained more accurately, which improves the accuracy of detecting whether an image contains mathematical formulae.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method, device, computer equipment and storage medium for detecting mathematical formulae in an image.
Background art
OCR (optical character recognition) refers to the process in which an electronic device (such as a scanner or digital camera) examines characters printed on paper and then translates their shapes into computer text using a character recognition method; that is, text material is scanned, and the resulting image file is analyzed and processed to obtain text and layout information. The most important issues for OCR are how to remove errors or use auxiliary information to improve recognition accuracy, together with the stability, user-friendliness, ease of use and feasibility of the product.
In traditional OCR technology, mathematical formulae in an image are detected by applying horizontal projection and vertical projection to the characters segmented from the image to obtain the projection features of the characters. The structural features between characters are obtained from the positions of the segmented characters. The projection features of the segmented characters are compared with those of given characters, and the structural features between the segmented characters are compared with given structures, to determine whether the image is a mathematical formula image. However, this detection approach presupposes accurately segmented characters, and no current segmentation algorithm can guarantee segmentation accuracy, so detection accuracy cannot be guaranteed either.
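The projection features used by the traditional approach can be illustrated with a minimal sketch. It assumes each segmented character is available as a small binary pixel grid; the function names, the toy glyphs and the distance measure are illustrative assumptions, not details taken from the prior art described above.

```python
def projection_features(glyph):
    """Return (horizontal, vertical) projection profiles of a binary glyph.

    glyph is a list of rows; each row is a list of 0/1 pixels (1 = ink).
    """
    horizontal = [sum(row) for row in glyph]        # ink count per row
    vertical = [sum(col) for col in zip(*glyph)]    # ink count per column
    return horizontal, vertical


def profile_distance(a, b):
    """Sum of absolute differences between two equal-length profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy 3x3 glyphs (assumed data): a plus sign and a minus sign.
PLUS = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
MINUS = [[0, 0, 0],
         [1, 1, 1],
         [0, 0, 0]]

plus_h, plus_v = projection_features(PLUS)
minus_h, minus_v = projection_features(MINUS)
h_gap = profile_distance(plus_h, minus_h)
v_gap = profile_distance(plus_v, minus_v)
```

The comparison only makes sense when the glyphs are cropped correctly, which is the dependence on segmentation accuracy that the paragraph above criticizes.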
Summary of the invention
In view of the above technical problem, it is necessary to provide a method, device, computer equipment and storage medium for detecting mathematical formulae in an image that can improve the accuracy of mathematical formula detection.
A method for detecting mathematical formulae in an image, the method comprising:
obtaining an image to be detected;
performing text segmentation on the image to be detected to obtain multiple text regions;
inputting the multiple text regions in turn into the input variable of a trained first convolutional neural network, and outputting multiple coarsely selected text regions containing mathematical formulae;
inputting the coarsely selected text regions containing mathematical formulae into the input variable of a trained second convolutional neural network, and outputting the final text regions containing mathematical formulae.
A device for detecting mathematical formulae in an image, the device comprising:
an acquisition module, for obtaining an image to be detected;
a segmentation module, for performing text segmentation on the image to be detected to obtain multiple text regions;
a first convolutional neural network module, for inputting the multiple text regions in turn into the input variable of a trained first convolutional neural network and outputting multiple coarsely selected text regions containing mathematical formulae;
a second convolutional neural network module, for inputting the coarsely selected text regions containing mathematical formulae into the input variable of a trained second convolutional neural network and outputting the final text regions containing mathematical formulae.
Computer equipment, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
obtaining an image to be detected;
performing text segmentation on the image to be detected to obtain multiple text regions;
inputting the multiple text regions in turn into the input variable of a trained first convolutional neural network, and outputting multiple coarsely selected text regions containing mathematical formulae;
inputting the coarsely selected text regions containing mathematical formulae into the input variable of a trained second convolutional neural network, and outputting the final text regions containing mathematical formulae.
A computer-readable storage medium on which a computer program is stored, wherein the following steps are implemented when the program is executed by a processor:
obtaining an image to be detected;
performing text segmentation on the image to be detected to obtain multiple text regions;
inputting the multiple text regions in turn into the input variable of a trained first convolutional neural network, and outputting multiple coarsely selected text regions containing mathematical formulae;
inputting the coarsely selected text regions containing mathematical formulae into the input variable of a trained second convolutional neural network, and outputting the final text regions containing mathematical formulae.
With the above method, device, computer equipment and storage medium for detecting mathematical formulae in an image, an image to be detected is obtained and text segmentation is performed on it to obtain multiple text regions; the multiple text regions are input in turn into the input variable of the trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulae; and the coarsely selected text regions containing mathematical formulae are input into the input variable of the trained second convolutional neural network, which outputs the final text regions containing mathematical formulae. Because the image to be detected is divided into multiple text regions before being input into the convolutional neural networks, and the text regions are first coarsely selected and the coarsely selected regions are then screened again, the final text regions containing mathematical formulae can be obtained more accurately, which improves the accuracy of detecting whether an image contains mathematical formulae.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the method for detecting mathematical formulae in an image in one embodiment;
Fig. 2 is a schematic diagram of the internal structure of the computer equipment in one embodiment;
Fig. 3 is a schematic flowchart of the method for detecting mathematical formulae in an image in one embodiment;
Fig. 4 is a schematic flowchart of training the second convolutional neural network in one embodiment;
Fig. 5 is a schematic flowchart of training the deep learning neural network in one embodiment;
Fig. 6 is a schematic diagram of segmenting an image sample in one embodiment;
Fig. 7 is a schematic diagram of the text regions output by the deep learning neural network in one embodiment;
Fig. 8 is a structural block diagram of the device for detecting mathematical formulae in an image in one embodiment;
Fig. 9 is a structural block diagram of the training module in one embodiment.
Detailed description of the embodiments
To make the purpose, technical solution and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it.
Fig. 1 shows the application environment of the method for detecting mathematical formulae in an image in one embodiment. Referring to Fig. 1, the method can be applied in a system for detecting mathematical formulae in images. The system includes multiple terminals 110 and a server 120. The first convolutional neural network and the second convolutional neural network can run on the server 120, and the terminals 110 are connected to the server 120 through a network. A terminal 110 can be, but is not limited to, any personal computer, notebook computer, personal digital assistant, smartphone, tablet computer or portable wearable device capable of running the method. The server 120 can be a server that implements a single function or a server that implements multiple functions, and can specifically be an independent physical server or a cluster of physical servers. A terminal 110 can display a data input interface through a specific application. The server 120 can receive large numbers of images to be detected uploaded by the terminals 110, and inputs them into the input variable of the first convolutional neural network. Specifically, after obtaining an image to be detected, the server 120 can perform text segmentation on it to obtain multiple text regions, input the multiple text regions in turn into the input variable of the trained first convolutional neural network, and output multiple coarsely selected text regions containing mathematical formulae; it then inputs the coarsely selected text regions containing mathematical formulae into the input variable of the trained second convolutional neural network and outputs the final text regions containing mathematical formulae.
Fig. 2 is a schematic diagram of the internal structure of the computer equipment in one embodiment. The computer equipment can specifically be the server 120 in Fig. 1. As shown in Fig. 2, the computer equipment includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus. The processor provides computing and control capability and supports the operation of the entire equipment. The memory of the computer equipment includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program; when the computer program is executed by the processor, the processor implements a method for detecting mathematical formulae in an image. The internal memory of the computer equipment can also store a computer program which, when executed by the processor, causes the processor to perform a method for detecting mathematical formulae in an image. The network interface of the computer equipment is used to communicate with the terminals 110. The input device of the computer equipment can be a touch layer covering the display screen, or an external keyboard, touchpad or mouse. The input device can obtain the instructions generated when a user operates, with a finger, the interface shown on the display screen, for example an image to be detected that the user inputs by clicking a particular option on the terminal. The display screen can be used to display the input interface or the output text regions.
Those skilled in the art will understand that the structure shown in Fig. 2 is only a block diagram of the part of the structure relevant to the solution of the present invention, and does not limit the terminal to which the solution is applied. A specific terminal may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
As shown in Fig. 3, in one embodiment, a method for detecting mathematical formulae in an image is provided. The method is described as applied to the server 120 shown in Fig. 1, and includes:
Step 302, obtain an image to be detected.
Step 304, perform text segmentation on the image to be detected to obtain multiple text regions.
The image to be detected is generally a complete picture, and can come from any of the multiple terminals shown in Fig. 1. When a terminal needs to detect the mathematical formulae in an image, it can send the image to be detected to the server. After obtaining the image, the server first performs text segmentation on it. Text segmentation generally means cutting the image according to the text lines on it. For example, a general-purpose text detection model can be used to detect the text lines on the image, and the image is then divided into multiple text regions according to the detected text lines. If the general-purpose text detection model detects 10 text lines on the image, the image is divided into 10 text regions. Since an image usually contains more than one text line, segmentation usually yields multiple text regions.
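The line-based segmentation described above can be sketched as follows. The sketch assumes that a text detection model (not shown) has already returned one (top, bottom) row range per detected text line; the function name and the box format are illustrative assumptions rather than part of the described method.

```python
def split_into_text_regions(image, line_boxes):
    """Crop one text region per detected text line.

    image      : list of pixel rows
    line_boxes : list of (top, bottom) row indices, bottom exclusive,
                 assumed to come from a separate text detection model
    """
    regions = []
    for top, bottom in sorted(line_boxes):   # keep reading order
        regions.append(image[top:bottom])
    return regions


# Toy page with 30 rows of 4 pixels; every row stores its own index.
page = [[i] * 4 for i in range(30)]
# Assumed detector output: 10 text lines, each 2 rows tall.
boxes = [(3 * i, 3 * i + 2) for i in range(10)]
regions = split_into_text_regions(page, boxes)
```

With 10 detected text lines the page yields 10 text regions, matching the example in the text.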
Step 306, input the multiple text regions in turn into the input variable of the trained first convolutional neural network, and output multiple coarsely selected text regions containing mathematical formulae.
The image to be detected is not input directly into the first convolutional neural network; instead it is first divided into multiple text regions, which are then input in turn into the first convolutional neural network. The main role of the first convolutional neural network is preliminary screening: after a text region is input into its input variable, the network screens the region and outputs the regions it "considers" to contain mathematical formulae, yielding multiple coarsely selected text regions containing mathematical formulae. The first convolutional neural network that performs this screening is trained in advance using mathematical formula training samples; only a network that has acquired feature learning ability through training produces reliable output.
Step 308, input the coarsely selected text regions containing mathematical formulae into the input variable of the trained second convolutional neural network, and output the final text regions containing mathematical formulae.
The coarsely selected text regions containing mathematical formulae are the regions obtained after the first convolutional neural network performs preliminary screening on the input text regions; that is, the first convolutional neural network filters the input text regions and yields multiple regions predicted to contain mathematical formulae. These coarsely selected regions are then input into the second convolutional neural network as the input data for its input variable. The second convolutional neural network screens the input text regions again: it outputs the regions predicted to contain mathematical formulae and does not output those predicted not to contain them. The second convolutional neural network, which makes predictions on the coarsely selected regions, is also trained in advance. Specifically, the output of the first convolutional neural network during its training can be used as the input when training the second convolutional neural network. In this way the trained neural network fits the actual requirement and can accurately predict the output data corresponding to the input data.
The image to be detected is divided into multiple text regions, which are then input into the trained convolutional neural networks: the trained first convolutional neural network first coarsely selects text regions, and the coarsely selected regions containing mathematical formulae are then input into the trained second convolutional neural network for a second screening. In this way the final text regions containing mathematical formulae can be obtained more accurately, the use of neural networks improves the accuracy of detecting whether an image contains mathematical formulae, and time consumption is reduced, which improves detection efficiency.
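The two-stage screening of steps 306 and 308 can be sketched as a simple cascade. This is only a control-flow illustration: the two "models" below are hypothetical stand-in functions rather than trained convolutional neural networks, and short strings stand in for cropped text-region images.

```python
def detect_formula_regions(regions, coarse_model, fine_model):
    """Two-stage screening: coarse selection, then re-screening.

    coarse_model and fine_model are callables that return True when they
    predict that a region contains a mathematical formula.
    """
    coarse = [r for r in regions if coarse_model(r)]   # coarse selection
    final = [r for r in coarse if fine_model(r)]       # second screening
    return coarse, final


# Hypothetical stand-ins for the two trained networks.
def toy_coarse_model(region):
    return any(ch in region for ch in "+-=<>")


def toy_fine_model(region):
    return "=" in region


regions = ["The area is given by", "S = a + b", "see pages 3-4", "x < y"]
coarse, final = detect_formula_regions(regions, toy_coarse_model, toy_fine_model)
```

The second stage only ever sees the regions that survived the first, so a permissive first stage can discard obvious non-formula regions cheaply while the second stage removes the remaining false candidates.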
In one embodiment, the first convolutional neural network is trained as follows: obtain mathematical formula training samples; divide each mathematical formula training sample into multiple text regions; divide the text regions into a predetermined number of training sample sets; select multiple text regions from each training sample set in turn, and input the selected text regions in turn into the input variable of the first convolutional neural network for training, until all training sample sets have been selected from and trained on, yielding the trained first convolutional neural network.
Before images to be detected are screened, the first convolutional neural network is trained. Training requires a large number of training samples, which are extracted from a mathematical formula sample library; the mathematical formula training samples are mathematical formula image samples. Generally, an obtained mathematical formula image sample is a text picture that contains mathematical formulae. After multiple mathematical formula image samples have been extracted, each image sample is first segmented into multiple text regions. This segmentation can be done manually. After each image sample has been divided into text regions, the text regions can be normalized. Normalization means scaling and rotating each text region according to the font size in the region and the direction of its text line, so that all text regions have consistent font size and image size and a uniform text line direction. When the font size of some text regions is small, the font can be enlarged. When the size of some text regions is small, blank space can be filled in around the region so that its size matches that of the other regions. After normalization, the whole collection of text regions is divided into a predetermined number of parts. The predetermined number can be decided by the researchers according to the actual needs of the project. After all text regions have been divided into the predetermined number of parts, the parts can be input in turn into the first convolutional neural network.
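Two of the preparation steps above, filling blank space around a small region and dividing the collection into a predetermined number of parts, can be sketched as follows. Scaling and rotation are omitted; the function names, the centred placement and the use of 0 as the blank value are illustrative assumptions.

```python
def pad_to_size(region, height, width, blank=0):
    """Centre a smaller region on a blank canvas of the target size."""
    h, w = len(region), len(region[0])
    if h > height or w > width:
        raise ValueError("region is larger than the target size")
    top = (height - h) // 2
    left = (width - w) // 2
    canvas = [[blank] * width for _ in range(height)]
    for y, row in enumerate(region):
        canvas[top + y][left:left + w] = row   # copy the row into place
    return canvas


def split_into_parts(items, n_parts):
    """Divide items into n_parts consecutive, nearly equal training sets."""
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


padded = pad_to_size([[1, 1]], height=3, width=4)
parts = split_into_parts(list(range(100)), 10)
```

Here 100 regions and a predetermined number of 10 give 10 parts of 10 regions each; when the count does not divide evenly, the first few parts in this sketch receive one extra region.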
However, not all text regions in each part are input; instead, a portion is selected from each part, and the selected text regions are input in turn into the first convolutional neural network. For example, if segmenting the mathematical formula image samples yields 100 text regions and the predetermined number of parts is 10, each part contains 10 text regions. When a part is input into the first convolutional neural network, a subset of its 10 text regions can be selected for input. If the number to select is set to 5, then as the 10 parts are input in turn into the first convolutional neural network, 5 text regions are selected from each part for input. Although multiple text regions are selected from a part each time, they are still input into the first convolutional neural network one at a time, and the first convolutional neural network likewise screens each text region in turn.
Once a portion of every one of the predetermined number of parts has been selected and input into the first convolutional neural network, training of the first convolutional neural network is complete. For example, if the predetermined number of parts is 10, the trained first convolutional neural network is obtained once the text regions selected from the 10th part have also been input into it.
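The selection schedule described above can be sketched as a loop over the parts. The source does not say how the subset of each part is chosen, so random sampling with a fixed seed is an assumption here, and train_step is a hypothetical callback standing in for one training update of the first convolutional neural network.

```python
import random


def train_over_parts(parts, n_select, train_step, seed=0):
    """Select n_select regions from each part and train on them one by one."""
    rng = random.Random(seed)                 # fixed seed for repeatability
    for part in parts:
        chosen = rng.sample(part, min(n_select, len(part)))
        for region in chosen:                 # regions are fed one at a time
            train_step(region)


# 100 regions in 10 parts of 10; each region is tagged (part, index).
parts = [[(p, i) for i in range(10)] for p in range(10)]
seen = []
train_over_parts(parts, n_select=5, train_step=seen.append)
```

With 10 parts and 5 regions selected per part, the callback is invoked 50 times, and training ends after the 10th part has been processed.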
Dividing the training samples into multiple text regions before inputting them into the first convolutional neural network for training makes the network's feature extraction for each region more precise, improves the training result of the first convolutional neural network, and improves prediction accuracy.
In one embodiment, inputting the selected text regions in turn into the input variable of the first convolutional neural network for training includes: for each selected text region, calculating the weights of the text region; each time a text region is input into the input variable of the first convolutional neural network, adjusting the weights in the first convolutional neural network according to the weights of the text region; and training the first convolutional neural network according to the weights after each adjustment.
Training a neural network actually means adjusting the structure of the network through some algorithm; generally this means adjusting the weights so that the output of the network matches the expected value. This process is neural network training. For each selected text region, the corresponding weights can be calculated; when the text region is input into the first convolutional neural network, the corresponding weights of the first convolutional neural network are adjusted to the weights corresponding to the text region. Thus, each time a text region is input into the first convolutional neural network, the weights of the network may change; that is, the weights are adjusted continually. Generally, the weights corresponding to each text region are calculated after the text region has been converted into vector form. Training the first convolutional neural network according to the weights of each text region improves the effectiveness of training.
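The text does not specify the algorithm that adjusts the weights. As one common way to adjust weights each time a single sample is input, the sketch below applies a stochastic gradient descent step to a single logistic unit. The update rule, the learning rate and the two-element feature vector are all assumptions made for illustration; this is not the patent's convolutional network.

```python
import math


def sgd_step(weights, bias, features, label, lr=0.1):
    """One gradient step on the logistic loss for a single sample.

    features is the region converted to vector form; label is 1 or 0.
    Returns the updated weights, the updated bias and the prediction
    made before the update.
    """
    z = sum(w * x for w, x in zip(weights, features)) + bias
    pred = 1.0 / (1.0 + math.exp(-z))          # sigmoid output in (0, 1)
    error = pred - label                        # gradient of the loss w.r.t. z
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    new_bias = bias - lr * error
    return new_weights, new_bias, pred


# One labelled region (assumed toy vector) presented twice in a row.
w, b = [0.0, 0.0], 0.0
w, b, first_pred = sgd_step(w, b, features=[1.0, 2.0], label=1)
_, _, second_pred = sgd_step(w, b, features=[1.0, 2.0], label=1)
```

Each input changes the weights a little, and the second prediction for the same region moves closer to its label, which is the sense in which the output is brought into line with the expected value.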
In one embodiment, after each mathematical formula training sample is divided into multiple text regions, the method further includes: adding a numeric label to each text region obtained by the segmentation.
After multiple mathematical formula image samples (i.e. mathematical formula training samples) are obtained from the mathematical formula sample library, each image sample is segmented into multiple text regions. Before the segmented text regions are input into the first convolutional neural network for training, a numeric label is added to each text region according to whether it contains a mathematical formula. For example, a text region containing a mathematical formula is given the numeric label 1, and a text region not containing one is given the numeric label 0. The neural network can then use the numeric label added to each text region to determine whether that region actually contains a mathematical formula.
The criterion for deciding whether something is a mathematical formula can be that it must contain an equation, or merely that it contains an operator; the criterion can be set by the researchers. If the researchers want a stricter criterion and consider only expressions containing an equation to be mathematical formulae, then a text region whose expression contains only addition, subtraction, multiplication or division, or whose symbol is an inequality sign, is not considered to contain a mathematical formula. In that case the numeric label of the text region is set to 0, indicating that the region does not contain a mathematical formula. After the mathematical formula image samples have been processed according to the criterion into text regions with numeric labels, the final training samples are in fact separate small pictures carrying numeric labels; these small pictures may be pure text or pure formulae.
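The effect of a strict versus a lenient labelling criterion can be sketched as below. In the source the labels are attached to image regions; here a plain-text transcription of each region is assumed purely so that the rule can be written down, and the symbol lists are illustrative assumptions.

```python
OPERATOR_SIGNS = ("+", "-", "*", "/", "<", ">", "=")


def contains_equation(text):
    """True if text has an equals sign that is not part of <=, >= or !=."""
    for token in ("<=", ">=", "!="):
        text = text.replace(token, " ")
    return "=" in text


def numeric_label(region_text, strict=True):
    """Return 1 if the region counts as containing a formula, else 0.

    strict=True  : only regions containing an equation are labelled 1
    strict=False : any region containing an operator is labelled 1
    """
    if strict:
        return 1 if contains_equation(region_text) else 0
    return 1 if any(s in region_text for s in OPERATOR_SIGNS) else 0


samples = ["a + b = c", "a + b", "x <= y", "plain text only"]
strict_labels = [numeric_label(s, strict=True) for s in samples]
lenient_labels = [numeric_label(s, strict=False) for s in samples]
```

Under the strict criterion the regions with only arithmetic or an inequality receive label 0, as described above; under the lenient criterion they receive label 1.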
As shown in Fig. 4, inputting the coarsely selected text regions containing mathematical formulae in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network, includes:
Step 402, each time the coarsely selected text regions containing mathematical formulae in multiple samples are obtained, input them in turn into the input variable of the second convolutional neural network for training, and output the text regions in the samples that contain mathematical formulae.
Each time the text regions in the samples are input in turn into the first convolutional neural network, the first convolutional neural network outputs the text regions it predicts to contain mathematical formulae, i.e. the coarsely selected text regions containing mathematical formulae in the samples. These are then input into the input variable of the second convolutional neural network to train it. From the coarsely selected regions that are input, the second convolutional neural network outputs those it predicts to contain mathematical formulae, i.e. the final text regions containing mathematical formulae in the samples.
Step 404, when the numeric label of an output text region shows that the text region does not contain a mathematical formula, process the text region.
Step 406, add the processed text region to the next training sample set to be input into the input variable of the first convolutional neural network.
The second convolutional neural network outputs the text regions predicted to finally contain mathematical formulae, so a satisfactory result is one in which the numeric labels of all output text regions show that the regions contain mathematical formulae. When the numeric label of an output text region shows that the region does not contain a mathematical formula, the second convolutional neural network's prediction for that region is wrong, and training on that region needs to continue.
Therefore, when the numeric label of an output text region shows that the region does not contain a mathematical formula, the region is processed and then added to the next training sample set to be input into the input variable of the first convolutional neural network.
Step 408: select multiple text regions from the training sample set to which the processed text regions have been added, and input the selected text regions into the input variable of the first convolutional neural network for further training, until the trained second convolutional neural network is obtained.
After the processed text regions are added to the next training sample set to be input, a new training sample set is formed. Multiple text regions are then selected from the new training sample set and input into the input variable of the first convolutional neural network for further training. For example, suppose there are 10 training sample sets in total, each containing 50 text regions. From the first training sample set, 40 text regions are selected and input into the input variable of the first convolutional neural network for training, yielding 20 roughly selected text regions containing mathematical formulas; that is, the first convolutional neural network predicts that 20 of the 40 text regions contain mathematical formulas. These 20 roughly selected text regions are then input into the second convolutional neural network as its input data, and the second convolutional neural network outputs 5 text regions finally predicted to contain mathematical formulas. However, 2 of these 5 text regions have a numeric label of 0, i.e. these two text regions do not actually contain mathematical formulas. These 2 text regions are then processed and added to the second training sample set.
The second training sample set then contains 50 + 2 = 52 text regions. 40 text regions are again selected from the second training sample set and input into the first convolutional neural network, and the cycle repeats until text regions from the 10th training sample set have also been selected and input. In this example the number selected from every training sample set is 40, i.e. the number of selected text regions is fixed. In practice, however, researchers can set a ratio instead, for example 3/4. The number of text regions actually selected from the 50 in the first training sample set and input into the first convolutional neural network is then 37 (3/4 × 50 = 37.5, rounded down), and when the second training sample set contains 52 text regions the number selected becomes 39, rather than a fixed value. How many to select can be decided according to project requirements or the researchers' judgment. Selecting in this way preserves the randomness of the samples and improves the accuracy of the trained convolutional neural network.
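The ratio-based selection in the example above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `choose_regions`, the string placeholders standing in for text regions, and the choice to round down are assumptions made here to reproduce the 37 and 39 quoted in the text.

```python
import random


def choose_regions(portion, ratio=3 / 4, rng=None):
    """Randomly choose floor(len(portion) * ratio) regions from one training sample set."""
    rng = rng or random.Random()
    count = int(len(portion) * ratio)  # rounds down: 50 -> 37, 52 -> 39
    return rng.sample(portion, count)


# Placeholder regions; real regions would be row vectors with numeric labels.
first_portion = [f"region_{i}" for i in range(50)]
second_portion = [f"region_{i}" for i in range(50, 100)]

chosen = choose_regions(first_portion, rng=random.Random(0))

# Two wrongly predicted regions from the first round, after processing,
# join the second training sample set before it is sampled.
second_portion.extend(["processed_a", "processed_b"])
chosen_next = choose_regions(second_portion, rng=random.Random(0))
```

With a fixed count instead of a ratio, `count` would simply be the constant 40 used in the worked example.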
In one embodiment, processing the text region includes: rotating the text region output as containing a mathematical formula in the sample counterclockwise by a preset angle; and adding Gaussian noise of a preset value to the rotated text region.
During training, when the numeric label of a text region that the second convolutional neural network outputs as containing a mathematical formula shows that the region does not contain one, the second convolutional neural network's prediction for that text region is wrong. A wrongly predicted text region needs to be processed before it is input into the input variable of the first convolutional neural network again. The processing consists of two steps: first, rotate the text region counterclockwise by a preset angle; second, add Gaussian noise of a preset value to the text region. For example, the preset rotation angle can be set to 2 degrees and the preset value of the Gaussian noise to 0.2; that is, after the text region has been rotated counterclockwise by 2 degrees, Gaussian noise with σ = 0.2 is added to it, and the processed text region is then added to the next training sample set to be input into the input variable of the first convolutional neural network.
Training samples are always limited, whereas all kinds of complicated situations may be encountered in practical applications. Processing text regions during training and adding them to the next training sample set therefore enriches the diversity of the samples, makes the training process more rigorous, and in turn improves the accuracy of the trained neural network's predictions.
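The two processing steps (counterclockwise rotation by a preset angle, then Gaussian noise of a preset σ) might look like the following standard-library sketch. Several details are assumptions made here and not stated in the text: the region is a 2-D list of pixel intensities in [0, 1], rotation uses nearest-neighbour sampling about the centre, pixels rotated in from outside are filled with 0, and noisy values are clipped back to [0, 1]. The function names are hypothetical.

```python
import math
import random


def rotate_ccw(image, degrees, fill=0.0):
    """Rotate a 2-D pixel grid counterclockwise about its centre (nearest neighbour)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel that lands on (x, y).
            dx, dy = x - cx, y - cy
            sx = round(cos_t * dx - sin_t * dy + cx)
            sy = round(sin_t * dx + cos_t * dy + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel, clipping to [0, 1] (assumed range)."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]


def process_region(image, angle=2.0, sigma=0.2, rng=None):
    """Rotate by `angle` degrees counterclockwise, then add noise with the given sigma."""
    rng = rng or random.Random()
    return add_gaussian_noise(rotate_ccw(image, angle), sigma, rng)
```

The defaults mirror the 2 degrees and σ = 0.2 given in the example; an image library with interpolation would normally replace the hand-written rotation.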
In one embodiment, the weight corresponding to each text region is calculated by the following steps:
1. Each sample text region is processed into a row vector. The row vector of the t-th sample text region is denoted x_t and its numeric label a_t, with a_t ∈ {0, 1}. Numeric labels are marked manually according to whether each sample text region contains a mathematical formula: label 1 is added when the region contains a mathematical formula and label 0 when it does not.
According to Q-learning (reinforcement learning) theory, the future return of the t-th sample text region can be calculated:
where T is the total number of sample text regions and γ^(t'-t) is the reward discount coefficient of the t-th sample with respect to the t'-th sample. The purpose of Q-learning is to construct a control policy that maximizes the behavioral performance of an Agent (in the IT field, an Agent can refer to an autonomously acting software or hardware entity). The Agent perceives information from a complex environment and processes it. The Agent improves its own performance and selects behaviors through learning, which gives rise to individual and group behavior selection; these selections lead the Agent to decide on a particular action, which in turn affects the environment. The reward discount coefficient represents the discount applied to rewards obtained at later moments: for the same reward, the earlier it is obtained, the higher the reward experienced in Q-learning.
2. The expected maximum return of the t-th sample text region is calculated with the optimal action-value function Q(x, a):
Q_t(x_t, a_t) = max_π E[r_t | x_t = x, a_t = a, π], where x_t is the row vector of the sample text region, a_t is the numeric label, and π is the mapping function from sample text regions to numeric labels.
3. The target output y of the t-th sample text region is calculated according to the following formula:
where θ_(t-1) denotes the weight parameters corresponding to the (t-1)-th sample text region and γ is the reward discount coefficient.
4. The weight parameters of the deep learning neural network are updated by minimizing the loss function L_t(x_t, a_t):
where ρ(x, a) is the probability distribution of the row vector x_t of the sample text region and the label a_t, and E[·] denotes taking the expectation.
5. Differentiating the loss function L_t(x_t, a_t) with respect to the weights θ_t gives the weight update equation of the deep learning neural network:
Calculating the weight of each text region step by step with the above algorithm allows the weights of the first convolutional network to be adjusted effectively each time a text region is input, which improves the effectiveness of training, so that the trained first convolutional network detects text regions containing mathematical formulas in an image more accurately.
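The formulas for steps 1, 3, 4 and 5 above are not reproduced in this text. A form consistent with the symbols that are defined (T, γ, θ_(t-1), ρ(x, a), E[·]) would be the standard deep Q-learning formulation, sketched below as an assumption and not as the patent's exact notation. The symbols r (the reward of a single sample), x' and a' (the next region and label), and R_t (the discounted return, taken to be the quantity written r_t inside the expectation of step 2) are introduced here for illustration; the loss is written as a function of θ_t where the text writes L_t(x_t, a_t).

```latex
% Step 1: discounted future return of the t-th sample text region
R_t = \sum_{t'=t}^{T} \gamma^{\,t'-t}\, r_{t'}

% Step 3: target output, computed with the previous weights \theta_{t-1}
y = \mathbb{E}\!\left[\, r + \gamma \max_{a'} Q(x', a'; \theta_{t-1}) \;\middle|\; x, a \right]

% Step 4: loss minimised over the current weights \theta_t
L_t(\theta_t) = \mathbb{E}_{x,a \sim \rho(\cdot)}\!\left[ \bigl( y - Q(x, a; \theta_t) \bigr)^{2} \right]

% Step 5: gradient used for the weight update (up to a constant factor and sign convention)
\nabla_{\theta_t} L_t(\theta_t) = \mathbb{E}_{x,a \sim \rho(\cdot)}\!\left[ \Bigl( r + \gamma \max_{a'} Q(x', a'; \theta_{t-1}) - Q(x, a; \theta_t) \Bigr) \nabla_{\theta_t} Q(x, a; \theta_t) \right]
```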
In one embodiment, the second convolutional neural network is trained as follows: obtain the roughly selected text regions containing mathematical formulas in the sample that are output each time the selected text regions are input into the input variable of the first convolutional neural network; input the roughly selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network.
The input of the second convolutional neural network can be understood as the output of the first convolutional neural network, i.e. the roughly selected text regions containing mathematical formulas. After each text region is input into the input variable of the first convolutional neural network, the first convolutional neural network outputs the text regions it predicts to contain mathematical formulas, i.e. the roughly selected text regions. After these are input into the input variable of the second convolutional neural network, the second convolutional neural network screens them again and outputs the final, more accurately predicted text regions containing mathematical formulas.
Although both the first and the second convolutional neural network output text regions predicted to contain mathematical formulas, they differ in practice. The first convolutional neural network performs a screening: from the many text regions it excludes those that contain no mathematical formula at all and filters out those predicted to contain one. Compared with the first, the requirement on the second convolutional neural network is stricter: the first only has to pick out text regions that may contain mathematical formulas, whereas the second has to pick out, from a set of text regions that may contain mathematical formulas, the ones that really do. For this reason the data used to train the second convolutional neural network are the roughly selected text regions first filtered out by the first convolutional neural network. This makes the second convolutional neural network more precise and more sensitive when extracting and learning features, so that the trained second convolutional neural network predicts its output faster and more accurately.
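The division of labour between the two networks, a permissive screening stage followed by a strict extraction stage, can be illustrated with a small cascade. This is only a sketch: `coarse_score` and `fine_score` are hypothetical stand-ins for the two trained networks, and the thresholds are illustrative values that do not come from the text.

```python
from typing import Callable, List, Sequence

Region = Sequence[float]  # a text region flattened to a row vector


def detect_formula_regions(
    regions: List[Region],
    coarse_score: Callable[[Region], float],
    fine_score: Callable[[Region], float],
    coarse_threshold: float = 0.3,
    fine_threshold: float = 0.7,
) -> List[Region]:
    """Two-stage detection: rough selection first, accurate extraction second."""
    # Stage 1: keep every region that might contain a formula (permissive).
    candidates = [r for r in regions if coarse_score(r) >= coarse_threshold]
    # Stage 2: keep only the candidates that really score as formulas (strict).
    return [r for r in candidates if fine_score(r) >= fine_threshold]


def mean_intensity(region: Region) -> float:
    """Toy scorer used only to make the example runnable."""
    return sum(region) / len(region)


regions = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]]
final = detect_formula_regions(regions, mean_intensity, mean_intensity)
```

Because the second stage only ever sees what the first stage lets through, training it on the first stage's output matches the data it will meet at detection time.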
In one embodiment, inputting the roughly selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training includes: obtaining the weight of each text region calculated for each selected text region; when inputting the roughly selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network, adjusting the weights in the second convolutional neural network according to the weights of the text regions; and training the second convolutional neural network according to the weights after each adjustment.
Training a neural network is in effect constantly adjusting its weights so that its predictions keep approaching the set reference values. When the first convolutional neural network is trained, the weight of each text region has already been calculated. The input of the second convolutional neural network is likewise text regions; they are merely the roughly selected text regions picked out by the first convolutional neural network, and nothing essential has changed: the input is text regions and the output is still text regions. The second convolutional neural network can therefore skip the calculation of each text region's weight during training and directly use the weights already calculated for the first convolutional neural network.
Therefore, when the roughly selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, the weights in the second convolutional neural network are adjusted according to the weights of the text regions; this weight adjustment is the training process of the second convolutional neural network. Continually changing the weights also improves the training efficiency of the neural network.
In one embodiment, the first convolutional neural network and the second convolutional neural network are contained in one deep learning neural network.
Because the first and second convolutional neural networks are contained in the same deep learning neural network, in actual operation the image to be predicted only needs to be divided into multiple text regions and input into the deep learning neural network; there is no need to apply the first and then the second convolutional neural network separately to screen and extract the text regions obtained by segmenting the image. It should be understood, however, that inside the deep learning neural network the detection of text regions is divided into two steps: the first is screening and the second is extraction, completed by the first and the second convolutional neural network in the deep learning neural network respectively. This saves operating steps in practice, saves detection time, and improves detection efficiency.
In one embodiment, as shown in Fig. 5, the deep learning neural network is trained as follows:
Step 502: obtain image samples and divide each image sample into multiple text regions.
Image samples can be obtained from a mathematical formula sample library. After multiple image samples have been obtained from the library, each image sample is first divided into multiple sample text regions. As shown in Fig. 6, the image is first divided manually into multiple sample text regions; manual segmentation follows a predetermined segmentation standard, which can be set according to actual requirements.
Step 504: normalize the text regions and add a numeric label to each text region.
After the image is divided into multiple sample text regions, the sample text regions can be normalized. Normalization includes scaling and rotating the sample text regions according to their character size and text line direction, so that font size and image size are consistent across all sample text regions and the text line direction is uniform. When the font in some sample text regions is small, it can be enlarged. When some sample text regions are small, blank space can be padded around them so that their size matches that of the other text regions.
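The padding part of normalization, filling blank space around a small region so that it matches the common size, can be sketched as below. The function name, the centring of the region, and the blank value 0.0 are assumptions made for illustration; scaling and rotation are left out.

```python
def pad_to_size(region, height, width, blank=0.0):
    """Centre a 2-D pixel grid inside a blank canvas of the target size."""
    h, w = len(region), len(region[0])
    if h > height or w > width:
        raise ValueError("region is larger than the target size")
    top = (height - h) // 2
    left = (width - w) // 2
    canvas = [[blank] * width for _ in range(height)]
    for y, row in enumerate(region):
        # Overwrite the middle of the canvas row with the region's pixels.
        canvas[top + y][left:left + w] = row
    return canvas


small = [[1.0, 1.0], [1.0, 1.0]]
padded = pad_to_size(small, 4, 4)
```

Regions that are larger than the target would need to be scaled down first, which is why the sketch raises an error instead of cropping.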
After normalization, a numeric label is added to each sample text region. Numeric labels follow a rule set in advance, for example numeric label 1 for a sample text region containing a mathematical formula and numeric label 0 for one that does not. The standard for deciding whether a sample text region contains a mathematical formula can depend on the researchers' judgment. For example, under a strict standard only formulas containing an equals sign count as mathematical formulas; a formula containing some other operator, such as y ≥ x in a sample text region, contains no equals sign, so the region is given numeric label 0, meaning it does not contain a mathematical formula. Once the corresponding numeric label has been added to each sample text region, whether a sample text region contains a mathematical formula can be known exactly from its numeric label.
Step 506: process the sample text regions into row vectors, and divide the sample text regions in row vector form into a predetermined number of training sample sets.
During training, sample text regions are not used directly as input data of the deep learning neural network; they are first processed into row vectors and then input into its input variable. Specifically, all pixels of each sample text region are arranged in coordinate order, each pixel being one value of the row vector, which amounts to turning a two-dimensional array into a one-dimensional one. Each row vector corresponds to one sample text region and therefore has a corresponding numeric label, 0 or 1. The sample text regions processed into row vectors are then divided into a predetermined number of training sample sets. The predetermined number is generally set by researchers according to the actual project or experience. After all sample text regions have been processed into row vector form, they can be divided at random into the predetermined number of training sample sets. For example, if the total number of sample text regions is 100 and the predetermined number is 10, the 100 sample text regions are divided into 10 training sample sets.
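Flattening a region into a row vector and splitting the labelled vectors at random into a predetermined number of training sample sets could be written as follows. It is a sketch under assumptions: row-major pixel order is taken as the "coordinate order", and the helper names and the shuffle-then-stride split are choices made here.

```python
import random


def to_row_vector(region):
    """Flatten a 2-D pixel grid into a 1-D list in row-major order."""
    return [pixel for row in region for pixel in row]


def split_into_portions(samples, portions, rng=None):
    """Shuffle the samples and deal them into `portions` training sample sets."""
    rng = rng or random.Random()
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i::portions] for i in range(portions)]


# 100 tiny labelled regions: (row vector, numeric label) pairs.
samples = [(to_row_vector([[i, i], [i, i]]), i % 2) for i in range(100)]
training_sets = split_into_portions(samples, 10, rng=random.Random(0))
```

Each of the 10 sets here holds 10 samples, matching the 100-into-10 example in the text.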
Step 508: randomly select multiple sample text regions from each training sample set in turn and input them into the deep learning neural network in sequence.
After the sample text regions have been divided into the predetermined number of training sample sets, it is not the case that one training sample set is chosen each time and all its sample text regions are input into the deep learning neural network. Instead, multiple sample text regions are randomly selected from each training sample set and input into the deep learning neural network to train it. Inputting a sample text region into the input variable of the deep learning neural network actually means inputting it into the input variable of the first convolutional neural network contained in the deep learning neural network. The input sample text regions are first screened by the first convolutional neural network, which outputs the roughly selected text regions containing mathematical formulas; the second convolutional neural network contained in the deep learning neural network then performs accurate extraction on these roughly selected text regions.
Step 510: train the deep learning neural network according to the weight corresponding to each sample text region.
After each sample text region has been processed into a row vector, the row vector of the t-th sample text region is denoted x_t and its numeric label a_t, with a_t ∈ {0, 1}. Calculating the weight corresponding to a sample text region takes 5 steps, as follows:
1. According to Q-learning theory, the future return of the t-th sample text region can be calculated:
where T is the total number of sample text regions and γ^(t'-t) is the reward discount coefficient of the t-th sample with respect to the t'-th sample.
2. The expected maximum return of the t-th sample text region is calculated with the optimal action-value function Q(x, a):
Q_t(x_t, a_t) = max_π E[r_t | x_t = x, a_t = a, π], where x_t is the row vector of the sample text region, a_t is the numeric label, and π is the mapping function from sample text regions to numeric labels.
3. The target output y of the t-th sample text region is calculated according to the following formula:
where θ_(t-1) denotes the weight parameters corresponding to the (t-1)-th sample text region and γ is the reward discount coefficient.
4. The weight parameters of the deep learning neural network are updated by minimizing the loss function L_t(x_t, a_t):
where ρ(x, a) is the probability distribution of the row vector x_t of the sample text region and the label a_t, and E[·] denotes taking the expectation.
5. Differentiating the loss function L_t(x_t, a_t) with respect to the weights θ_t gives the weight update equation of the deep learning neural network:
Once the weight of each sample text region has been calculated, the deep learning neural network is adjusted according to these weights each time a sample text region is input, and is thereby trained. No distinction is made here as to how exactly the first and second convolutional neural networks contained in the deep learning neural network are trained, because the two networks share weights, i.e. they share part of their parameters.
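Weight sharing between the two networks can be pictured with a toy example in which both hold a reference to the same parameter object, so a single update changes the output of both. The classes below are hypothetical stand-ins (a weighted sum in place of shared convolutional layers) and say nothing about which layers the patent actually shares.

```python
class SharedLayer:
    """Parameters used by both networks (stand-in for shared convolutional layers)."""

    def __init__(self, weights):
        self.weights = list(weights)

    def forward(self, row_vector):
        return sum(w * x for w, x in zip(self.weights, row_vector))


class Network:
    """A network made of the shared layer plus one parameter of its own."""

    def __init__(self, shared, bias):
        self.shared = shared  # the same object in both networks
        self.bias = bias      # network-specific parameter

    def score(self, row_vector):
        return self.shared.forward(row_vector) + self.bias


shared = SharedLayer([0.5, 0.5])
first_network = Network(shared, bias=0.0)
second_network = Network(shared, bias=-0.25)

x = [1.0, 1.0]
before = (first_network.score(x), second_network.score(x))
shared.weights[0] = 1.0  # one update to the shared parameters
after = (first_network.score(x), second_network.score(x))
```

Only the shared part moves together; each network's own parameters (here `bias`) still have to be trained separately.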
Step 512: detect whether the numeric label of a sample text region output by the deep learning neural network shows that the sample text region contains a mathematical formula; if not, perform step 514; if so, perform step 518.
When sample text regions are input into the deep learning network, it finally outputs the text regions that it predicts to contain mathematical formulas. As shown in Fig. 7, the deep learning neural network outputs this part of the text regions. The prediction of the deep learning neural network may, however, be wrong. The correctness of a prediction can be judged directly from the numeric label of the output text region. For example, when the numeric label of an output text region is 0, the region does not actually contain a mathematical formula, yet the deep learning neural network has output it; the output of the network therefore differs from the real data, and the wrongly predicted sample text regions need to be processed.
Step 514: process the sample text regions whose numeric labels show that they do not contain mathematical formulas, and add them to the next training sample set.
Specifically, all sample text regions for which the most recent output of the deep learning neural network differs from the real data are first rotated counterclockwise by 2° and then have Gaussian noise with σ = 0.2 added; after this processing they are added to the next training sample set.
Step 516: repeat until all of the predetermined number of training sample sets have been selected from.
The training sample sets have all been selected from when multiple sample text regions have been randomly selected from every training sample set and input into the deep learning neural network for training. When the last of the predetermined number of training sample sets is input into the deep learning neural network, there may still be output sample text regions whose numeric labels show that they do not contain mathematical formulas, but this is acceptable: with many training samples, a small residual error is tolerable. Moreover, once training has reached a certain point, further training no longer improves the result; when the error of the deep learning neural network on the test set stops decreasing, training can also end.
Step 518: no processing is performed.
When the numeric label of a sample text region output by the deep learning neural network shows that the sample text region contains a mathematical formula, the prediction of the deep learning neural network is correct and the sample text region does not need to be processed.
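The branch formed by steps 512, 514 and 518 can be sketched as a small routing function. The name `route_outputs`, the `(region, label)` pair representation, and the `process` callback (standing in for the rotation and noise step) are assumptions introduced for illustration.

```python
def route_outputs(output_regions, process, next_training_set):
    """Apply steps 512 to 518 to the regions the network predicted as formulas.

    output_regions: list of (region, label) pairs output by the network.
    process: callable applied to wrongly predicted regions (step 514).
    next_training_set: list that receives the processed regions.
    """
    for region, label in output_regions:
        if label == 0:
            # Step 512 answered "no": the prediction was wrong, so the region
            # is processed and added to the next training sample set (step 514).
            next_training_set.append((process(region), label))
        # label == 1: the prediction was correct, nothing to do (step 518).
    return next_training_set


outputs = [("a", 1), ("b", 0), ("c", 0)]
next_set = route_outputs(outputs, lambda r: r + "_processed", [])
```

The label stays 0 after processing, since rotating the region and adding noise does not change whether it contains a formula.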
Before use, the deep learning neural network must be trained effectively in advance according to the actual project. The trained deep learning neural network can then better be used to detect the image to be detected and accurately extract the text regions containing mathematical formulas from it.
In one embodiment, the first convolutional neural network is an RPN convolutional neural network and the second convolutional neural network is a fast-rcnn convolutional neural network.
RPN stands for Region Proposal Network. When a sample text region is input into the deep learning neural network, it is in fact input into the input variable of the RPN. On receiving sample text regions, the RPN convolutional neural network detects and screens them: it filters out the text regions judged to contain mathematical formulas and passes these roughly selected sample text regions to the fast-rcnn (Regions with CNN features, fast region detection) convolutional neural network. Put simply, the RPN convolutional neural network is a formula detection model that can distinguish ordinary text from mathematical formulas.
The fast-rcnn convolutional neural network then performs accurate detection on the sample text regions screened out by the RPN convolutional neural network and outputs the text regions predicted to contain mathematical formulas. Combining the RPN and fast-rcnn convolutional neural networks greatly improves detection accuracy and the reliability of the detection results.
In one embodiment, as shown in Fig. 8, a device for detecting mathematical formulas in an image is provided. The device includes:
an acquisition module 802, configured to obtain an image to be detected;
a segmentation module 804, configured to perform text segmentation on the image to be detected to obtain multiple text regions;
a first convolutional neural network module 806, configured to input the multiple text regions in sequence into the input variable of the trained first convolutional neural network and output multiple roughly selected text regions containing mathematical formulas;
a second convolutional neural network module 808, configured to input the roughly selected text regions containing mathematical formulas into the input variable of the trained second convolutional neural network and output the final text regions containing mathematical formulas.
In one embodiment, the above device for detecting mathematical formulas in an image further includes a training module 900, configured to train the first convolutional neural network. Specifically, as shown in Fig. 9, the training module 900 includes:
a training sample acquisition module 902, configured to obtain mathematical formula training samples;
a sample segmentation module 904, configured to divide each mathematical formula training sample into multiple text regions;
a sample distribution module 906, configured to divide the multiple text regions into a predetermined number of training sample sets;
a first training module 908, configured to select multiple text regions from each training sample set in turn and input the selected text regions in sequence into the input variable of the first convolutional neural network for training, until all training sample sets have been selected from and trained on, to obtain the trained first convolutional neural network.
In one embodiment, the training module 900 further includes a calculation module (not shown), configured to calculate the weight of each selected text region. The first training module 908 is further configured to adjust the weights in the first convolutional neural network according to the weight of the text region each time a text region is input into the input variable of the first convolutional neural network, and to train the first convolutional neural network according to the weights after each adjustment.
In one embodiment, the first training module 908 is further configured to obtain the roughly selected text regions containing mathematical formulas in the sample that are output each time the selected text regions are input into the input variable of the first convolutional neural network, and to input the roughly selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network.
In one embodiment, the sample segmentation module 904 is further configured to add a numeric label to each text region obtained by segmentation. The training module 900 further includes:
a second training module (not shown), configured to: each time the roughly selected text regions containing mathematical formulas in multiple samples are obtained, input them in sequence into the input variable of the second convolutional neural network for training and output the text regions in the sample that contain mathematical formulas; when the numeric label of a text region output as containing a mathematical formula shows that the text region does not contain a mathematical formula, process the text region; add the processed text region to the next training sample set to be input into the input variable of the first convolutional neural network; and select multiple text regions from the training sample set to which the processed text regions have been added and input the selected text regions into the input variable of the first convolutional neural network for further training, until the trained second convolutional neural network is obtained.
In one embodiment, the second training module is further configured to rotate the text region output as containing a mathematical formula in the sample counterclockwise by a preset angle, and to add Gaussian noise of a preset value to the rotated text region.
In one embodiment, the second training module is further configured to: obtain the weight of each text region calculated for each selected text region; when the roughly selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjust the weights in the second convolutional neural network according to the weights of the text regions; and train the second convolutional neural network according to the weights after each adjustment.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps: obtaining an image to be detected; performing text segmentation on the image to be detected to obtain multiple text regions; inputting the multiple text regions in sequence into the input variable of the trained first convolutional neural network and outputting multiple roughly selected text regions containing mathematical formulas; inputting the roughly selected text regions containing mathematical formulas into the input variable of the trained second convolutional neural network and outputting the final text regions containing mathematical formulas.
In one embodiment, when the computer program is executed by the processor, the first convolutional neural network is trained through the following steps: obtaining mathematical formula training samples; dividing each mathematical formula training sample into multiple text regions; dividing the multiple text regions into a predetermined number of training sample sets; and selecting multiple text regions from each training sample set in turn and inputting the selected text regions into the input variable of the first convolutional neural network for training, until all training sample sets have been selected and trained on, to obtain the trained first convolutional neural network.
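The partitioning and set-by-set training loop described above might look like the following minimal sketch. The patent does not say how the regions are distributed among the sets or how many are chosen at a time, so the round-robin split, the `batch_size` parameter and the `train_step` callback are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_sets(regions: Sequence[T], num_sets: int) -> List[List[T]]:
    """Divide regions into num_sets training sample sets of near-equal size."""
    if num_sets <= 0:
        raise ValueError("num_sets must be positive")
    # Round-robin assignment (an assumed policy, not stated in the patent).
    return [list(regions[i::num_sets]) for i in range(num_sets)]


def iter_batches(sample_set: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive groups of regions chosen from one training sample set."""
    for start in range(0, len(sample_set), batch_size):
        yield list(sample_set[start:start + batch_size])


def train_first_network(
    regions: Sequence[T],
    num_sets: int,
    batch_size: int,
    train_step: Callable[[List[T]], None],
) -> int:
    """Call train_step on every batch of every set, one set after another."""
    steps = 0
    for sample_set in split_into_sets(regions, num_sets):
        for batch in iter_batches(sample_set, batch_size):
            train_step(batch)  # hypothetical: one update of the first network
            steps += 1
    return steps


seen: List[int] = []
steps = train_first_network(list(range(10)), num_sets=3, batch_size=2,
                            train_step=seen.extend)
```

Here `seen.extend` merely records which regions were visited; a real `train_step` would run one training update of the network.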
In one embodiment, when the computer program is executed by the processor, the step of inputting the selected text regions in turn into the input variable of the first convolutional neural network for training includes: calculating a weight for each selected text region; each time a text region is input into the input variable of the first convolutional neural network, adjusting the weights in the first convolutional neural network according to the weight of that text region; and training the first convolutional neural network according to the weights after each adjustment.
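The patent does not specify how a text region's weight is calculated or exactly how it adjusts the network weights. One possible reading, shown below purely as an assumption, is that the region weight scales the size of the gradient step taken for that region. A single-layer logistic model stands in for the convolutional network, and `weighted_update`, `region_weight` and `lr` are hypothetical names.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def weighted_update(
    params: List[float],
    features: Sequence[float],
    label: int,
    region_weight: float,
    lr: float = 0.1,
) -> List[float]:
    """One gradient step on the log loss, scaled by the region's weight."""
    z = sum(p * x for p, x in zip(params, features))
    error = sigmoid(z) - label  # derivative of the log loss with respect to z
    # Assumption: a larger region weight produces a larger parameter change.
    return [p - lr * region_weight * error * x for p, x in zip(params, features)]


params = [0.0, 0.0]
light = weighted_update(params, [1.0, 2.0], label=1, region_weight=0.5)
heavy = weighted_update(params, [1.0, 2.0], label=1, region_weight=2.0)
```

With identical inputs, the region given weight 2.0 moves the parameters four times as far as the region given weight 0.5.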
In one embodiment, when the computer program is executed by the processor, the second convolutional neural network is trained through the following steps: obtaining the coarsely selected text regions containing mathematical formulas in the samples that are output each time the selected text regions are input into the input variable of the first convolutional neural network; and inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network.
In one embodiment, when the computer program is executed by the processor, after the step of dividing each mathematical formula training sample into multiple text regions, the steps further include: adding a numeric label to each text region obtained by the segmentation.
When the computer program is executed by the processor, the step of inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network, includes: each time the coarsely selected text regions containing mathematical formulas in multiple samples are obtained, sequentially inputting them into the input variable of the second convolutional neural network for training, and outputting the text regions containing mathematical formulas in the samples; when the numeric label of a text region output as containing a mathematical formula indicates that the text region does not contain a mathematical formula, processing the text region; adding the processed text region to the next training sample set that is to be input into the input variable of the first convolutional neural network; and selecting multiple text regions from the training sample set to which the processed text regions have been added, and inputting the selected text regions into the input variable of the first convolutional neural network for retraining, until the trained second convolutional neural network is obtained.
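The feedback step described above, in which regions wrongly accepted by the second network are processed and fed back into the next training sample set, can be sketched as follows. The data layout (`Sample` as an identifier plus a numeric label, with 1 meaning "contains a formula" and 0 meaning "does not") and the function names are assumptions for illustration; the patent does not define the label values.

```python
from typing import Callable, List, Tuple

# Assumed layout: (region identifier, numeric label), 1 = formula, 0 = no formula.
Sample = Tuple[str, int]


def collect_false_positives(
    coarse_candidates: List[Sample],
    second_net: Callable[[str], bool],
) -> List[Sample]:
    """Regions the second network accepts although the label says 'no formula'."""
    return [(rid, label) for rid, label in coarse_candidates
            if second_net(rid) and label == 0]


def feed_back(
    next_training_set: List[Sample],
    false_positives: List[Sample],
    process: Callable[[str], str],
) -> List[Sample]:
    """Add processed copies of the misclassified regions to the next training set."""
    processed = [(process(rid), label) for rid, label in false_positives]
    return next_training_set + processed


candidates = [("r1", 1), ("r2", 0), ("r3", 0)]
accepts = {"r1": True, "r2": True, "r3": False}  # stand-in for network outputs
fps = collect_false_positives(candidates, lambda rid: accepts[rid])
next_set = feed_back([("r9", 1)], fps, process=lambda rid: rid + "_aug")
```

The `process` callback is a placeholder for the rotation and noise step; here it only tags the identifier so the effect is visible.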
In one embodiment, when the computer program is executed by the processor, the step of processing the text region includes: rotating the text region containing a mathematical formula in the sample counterclockwise by a preset angle; and adding Gaussian noise of a preset value to the text region containing a mathematical formula in the rotated sample.
In one embodiment, when the computer program is executed by the processor, the step of inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training includes: obtaining the weight calculated for each selected text region; when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjusting the weights in the second convolutional neural network according to the weights of the text regions; and training the second convolutional neural network according to the weights after each adjustment.
A person of ordinary skill in the art will understand that all or part of the flows in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the flows of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (15)
1. A method for detecting mathematical formulas in an image, the method comprising:
obtaining an image to be detected;
performing text segmentation on the image to be detected to obtain multiple text regions;
sequentially inputting the multiple text regions into the input variable of a trained first convolutional neural network, and outputting multiple coarsely selected text regions containing mathematical formulas;
inputting the coarsely selected text regions containing mathematical formulas into the input variable of a trained second convolutional neural network, and outputting the final text regions containing mathematical formulas.
2. The method according to claim 1, characterized in that the first convolutional neural network is trained in the following manner:
obtaining mathematical formula training samples;
dividing each mathematical formula training sample into multiple text regions;
dividing the multiple text regions into a predetermined number of training sample sets;
selecting multiple text regions from each training sample set in turn, and inputting the selected text regions in turn into the input variable of the first convolutional neural network for training, until all training sample sets have been selected and trained on, to obtain the trained first convolutional neural network.
3. The method according to claim 2, characterized in that inputting the selected text regions in turn into the input variable of the first convolutional neural network for training comprises:
for each selected text region, calculating a weight of the text region;
each time the text region is input into the input variable of the first convolutional neural network, adjusting the weights in the first convolutional neural network according to the weight of the text region;
training the first convolutional neural network according to the weights after each adjustment.
4. The method according to claim 2, characterized in that the second convolutional neural network is trained in the following manner:
obtaining the coarsely selected text regions containing mathematical formulas in the samples that are output each time the selected text regions are input into the input variable of the first convolutional neural network;
inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network.
5. The method according to claim 4, characterized in that, after dividing each mathematical formula training sample into multiple text regions, the method further comprises: adding a numeric label to each text region obtained by the segmentation;
wherein inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network, comprises:
each time the coarsely selected text regions containing mathematical formulas in the multiple samples are obtained, sequentially inputting the coarsely selected text regions containing mathematical formulas in the multiple samples into the input variable of the second convolutional neural network for training, and outputting the final text regions containing mathematical formulas in the samples;
when the numeric label of a text region output as finally containing a mathematical formula in the sample indicates that the text region does not contain a mathematical formula, processing the text region;
adding the processed text region to the next training sample set that is to be input into the input variable of the first convolutional neural network;
selecting multiple text regions from the training sample set to which the processed text regions have been added, and inputting the selected text regions into the input variable of the first convolutional neural network for retraining, until the trained second convolutional neural network is obtained.
6. The method according to claim 5, characterized in that processing the text region comprises:
rotating the text region output as finally containing a mathematical formula in the sample counterclockwise by a preset angle;
adding Gaussian noise of a preset value to the text region containing a mathematical formula in the sample after rotation by the preset angle.
7. The method according to claim 4, characterized in that inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training comprises:
obtaining the weight calculated for each selected text region;
when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjusting the weights in the second convolutional neural network according to the weights of the text regions;
training the second convolutional neural network according to the weights after each adjustment.
8. A device for detecting mathematical formulas in an image, characterized in that the device comprises:
an acquisition module, configured to obtain an image to be detected;
a segmentation module, configured to perform text segmentation on the image to be detected to obtain multiple text regions;
a first convolutional neural network module, configured to sequentially input the multiple text regions into the input variable of a trained first convolutional neural network and output multiple coarsely selected text regions containing mathematical formulas;
a second convolutional neural network module, configured to input the coarsely selected text regions containing mathematical formulas into the input variable of a trained second convolutional neural network and output the final text regions containing mathematical formulas.
9. The device according to claim 8, characterized in that the device further comprises a training module configured to train the first convolutional neural network, the training module comprising:
a training sample acquisition module, configured to obtain mathematical formula training samples;
a sample decomposition module, configured to divide each mathematical formula training sample into multiple text regions;
a sample distribution module, configured to divide the multiple text regions into a predetermined number of training sample sets;
a first training module, configured to select multiple text regions from each training sample set in turn and input the selected text regions in turn into the input variable of the first convolutional neural network for training, until all training sample sets have been selected and trained on, to obtain the trained first convolutional neural network.
10. The device according to claim 9, characterized in that the training module further comprises:
a computing module, configured to calculate, for each selected text region, a weight of the text region;
wherein the first training module is further configured to, each time a text region is input into the input variable of the first convolutional neural network, adjust the weights in the first convolutional neural network according to the weight of the text region, and train the first convolutional neural network according to the weights after each adjustment.
11. The device according to claim 9, characterized in that the first training module 908 is further configured to: obtain the coarsely selected text regions containing mathematical formulas in the samples that are output each time the selected text regions are input into the input variable of the first convolutional neural network; and input the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network.
12. The device according to claim 11, characterized in that the sample decomposition module 904 is further configured to add a numeric label to each text region obtained by the segmentation;
the training module 900 further comprises:
a second training module, configured to: each time the coarsely selected text regions containing mathematical formulas in multiple samples are obtained, sequentially input the coarsely selected text regions containing mathematical formulas in the multiple samples into the input variable of the second convolutional neural network for training, and output the text regions containing mathematical formulas in the samples; when it is detected that the numeric label of a text region output as containing a mathematical formula in the sample indicates that the text region does not contain a mathematical formula, process the text region; add the processed text region to the next training sample set that is to be input into the input variable of the first convolutional neural network; and select multiple text regions from the training sample set to which the processed text regions have been added, and input the selected text regions into the input variable of the first convolutional neural network for retraining, until the trained second convolutional neural network is obtained.
13. The device according to claim 11, characterized in that the second training module is further configured to: obtain the weight calculated for each selected text region; when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjust the weights in the second convolutional neural network according to the weights of the text regions; and train the second convolutional neural network according to the weights after each adjustment.
14. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method for detecting mathematical formulas in an image according to any one of claims 1-7.
15. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method for detecting mathematical formulas in an image according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711190154.1A CN107886082B (en) | 2017-11-24 | 2017-11-24 | Method and device for detecting mathematical formulas in images, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711190154.1A CN107886082B (en) | 2017-11-24 | 2017-11-24 | Method and device for detecting mathematical formulas in images, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107886082A true CN107886082A (en) | 2018-04-06 |
CN107886082B CN107886082B (en) | 2023-07-04 |
Family
ID=61774866
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711190154.1A Active CN107886082B (en) | 2017-11-24 | 2017-11-24 | Method and device for detecting mathematical formulas in images, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107886082B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109145956A (en) * | 2018-07-26 | 2019-01-04 | 上海慧子视听科技有限公司 | Methods of marking, device, computer equipment and storage medium |
CN110796137A (en) * | 2019-10-10 | 2020-02-14 | 中国建设银行股份有限公司 | Method and device for identifying image |
CN110942067A (en) * | 2019-11-29 | 2020-03-31 | 上海眼控科技股份有限公司 | Text recognition method and device, computer equipment and storage medium |
CN111652145A (en) * | 2020-06-03 | 2020-09-11 | 广东小天才科技有限公司 | Formula detection method and device, electronic equipment and storage medium |
CN111695377A (en) * | 2019-03-13 | 2020-09-22 | 杭州海康威视数字技术股份有限公司 | Text detection method and device and computer equipment |
CN111814798A (en) * | 2020-07-14 | 2020-10-23 | 深圳中兴网信科技有限公司 | Method for digitizing titles and readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101329731A (en) * | 2008-06-06 | 2008-12-24 | 南开大学 | Automatic recognition method of mathematical formula in image |
CN104298976A (en) * | 2014-10-16 | 2015-01-21 | 电子科技大学 | License plate detection method based on convolutional neural network |
CN106355573A (en) * | 2016-08-24 | 2017-01-25 | 北京小米移动软件有限公司 | Target object positioning method and device in pictures |
CN106845406A (en) * | 2017-01-20 | 2017-06-13 | 深圳英飞拓科技股份有限公司 | Head and shoulder detection method and device based on multitask cascaded convolutional neural network |
US20170177965A1 (en) * | 2015-12-17 | 2017-06-22 | Xerox Corporation | Coarse-to-fine cascade adaptations for license plate recognition with convolutional neural networks |
-
2017
- 2017-11-24 CN CN201711190154.1A patent/CN107886082B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101329731A (en) * | 2008-06-06 | 2008-12-24 | 南开大学 | Automatic recognition method of mathematical formula in image |
CN104298976A (en) * | 2014-10-16 | 2015-01-21 | 电子科技大学 | License plate detection method based on convolutional neural network |
US20170177965A1 (en) * | 2015-12-17 | 2017-06-22 | Xerox Corporation | Coarse-to-fine cascade adaptations for license plate recognition with convolutional neural networks |
CN106355573A (en) * | 2016-08-24 | 2017-01-25 | 北京小米移动软件有限公司 | Target object positioning method and device in pictures |
CN106845406A (en) * | 2017-01-20 | 2017-06-13 | 深圳英飞拓科技股份有限公司 | Head and shoulder detection method and device based on multitask cascaded convolutional neural network |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109145956A (en) * | 2018-07-26 | 2019-01-04 | 上海慧子视听科技有限公司 | Methods of marking, device, computer equipment and storage medium |
CN109145956B (en) * | 2018-07-26 | 2021-12-14 | 上海慧子视听科技有限公司 | Scoring method, scoring device, computer equipment and storage medium |
CN111695377A (en) * | 2019-03-13 | 2020-09-22 | 杭州海康威视数字技术股份有限公司 | Text detection method and device and computer equipment |
CN111695377B (en) * | 2019-03-13 | 2023-09-29 | 杭州海康威视数字技术股份有限公司 | Text detection method and device and computer equipment |
CN110796137A (en) * | 2019-10-10 | 2020-02-14 | 中国建设银行股份有限公司 | Method and device for identifying image |
CN110942067A (en) * | 2019-11-29 | 2020-03-31 | 上海眼控科技股份有限公司 | Text recognition method and device, computer equipment and storage medium |
CN111652145A (en) * | 2020-06-03 | 2020-09-11 | 广东小天才科技有限公司 | Formula detection method and device, electronic equipment and storage medium |
CN111652145B (en) * | 2020-06-03 | 2023-09-26 | 广东小天才科技有限公司 | Formula detection method and device, electronic equipment and storage medium |
CN111814798A (en) * | 2020-07-14 | 2020-10-23 | 深圳中兴网信科技有限公司 | Method for digitizing titles and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN107886082B (en) | 2023-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107886082A (en) | Mathematical formulae detection method, device, computer equipment and storage medium in image | |
US9129190B1 (en) | Identifying objects in images | |
US9613297B1 (en) | Identifying objects in images | |
EP3712812A1 (en) | Recognizing typewritten and handwritten characters using end-to-end deep learning | |
CN107169485B (en) | Mathematical formula identification method and device | |
CN106897363B (en) | Text recommendation method based on eye movement tracking | |
CN107798299A (en) | Billing information recognition methods, electronic installation and readable storage medium storing program for executing | |
US20210034905A1 (en) | Apparatus, method and computer program for analyzing image | |
CN108446621A (en) | Bank slip recognition method, server and computer readable storage medium | |
CN108985232A (en) | Facial image comparison method, device, computer equipment and storage medium | |
CN108710866A (en) | Chinese mold training method, Chinese characters recognition method, device, equipment and medium | |
CN109101469A (en) | The information that can search for is extracted from digitized document | |
CN110178139A (en) | Use the system and method for the character recognition of the full convolutional neural networks with attention mechanism | |
CN112990180B (en) | Question judging method, device, equipment and storage medium | |
CN109614973A (en) | Rice seedling and Weeds at seedling image, semantic dividing method, system, equipment and medium | |
CN109840524B (en) | Text type recognition method, device, equipment and storage medium | |
CN107609575A (en) | Calligraphy evaluation method, calligraphy evaluating apparatus and electronic equipment | |
CN108875769A (en) | Data mask method, device and system and storage medium | |
CN110413961A (en) | The method, apparatus and computer equipment of text scoring are carried out based on disaggregated model | |
CN110929746A (en) | Electronic file title positioning, extracting and classifying method based on deep neural network | |
CN108897750A (en) | Merge the personalized location recommendation method and equipment of polynary contextual information | |
US10489427B2 (en) | Document classification system, document classification method, and document classification program | |
CN105354845B (en) | A kind of semi-supervised change detecting method of remote sensing image | |
CN109101984A (en) | A kind of image-recognizing method and device based on convolutional neural networks | |
CN113420116B (en) | Medical document analysis method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||