CN107886082A - Mathematical formulae detection method, device, computer equipment and storage medium in image - Google Patents

Info

Publication number
CN107886082A
Authority
CN
China
Prior art keywords
text filed
convolutional neural
mathematical formulae
neural networks
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711190154.1A
Other languages
Chinese (zh)
Other versions
CN107886082B (en)
Inventor
黄鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd, Tencent Cloud Computing Beijing Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201711190154.1A priority Critical patent/CN107886082B/en
Publication of CN107886082A publication Critical patent/CN107886082A/en
Application granted granted Critical
Publication of CN107886082B publication Critical patent/CN107886082B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/413Classification of content, e.g. text, photographs or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The present invention relates to a method, an apparatus, a computer device and a storage medium for detecting mathematical formulas in an image. The method includes: obtaining an image to be detected; performing text segmentation on the image to be detected to obtain multiple text regions; feeding the text regions one by one as input to a trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas; and feeding the coarsely selected text regions containing mathematical formulas as input to a trained second convolutional neural network, which outputs the final text regions containing mathematical formulas. Because the image to be detected is divided into multiple text regions before being fed into the convolutional neural networks, and the text regions are first coarsely selected and then screened again, the text regions containing mathematical formulas are obtained more accurately, which improves the accuracy of detecting whether an image contains a mathematical formula.

Description

Method, apparatus, computer device and storage medium for detecting mathematical formulas in an image
Technical field
The present invention relates to the field of computer technology, and in particular to a method, an apparatus, a computer device and a storage medium for detecting mathematical formulas in an image.
Background
OCR (optical character recognition) refers to the process in which an electronic device (such as a scanner or digital camera) examines characters printed on paper and translates their shapes into computer text using a character recognition method; that is, the process of scanning text material and then analyzing and processing the image file to obtain the text and layout information. The most important problems in OCR include how to correct errors or use auxiliary information to improve recognition accuracy, along with user-friendliness, product stability, ease of use and feasibility.
In traditional OCR, to detect mathematical formulas in an image, the characters segmented from the image are projected horizontally and vertically to obtain their projection features. Structural features between characters are obtained from the positions of the segmented characters. The projection features of the segmented characters are compared with those of given characters, and the structural features between the segmented characters are compared with given structures, to determine whether the image is a mathematical formula image. However, this approach presupposes accurately segmented characters, and no current segmentation algorithm can guarantee segmentation accuracy, so detection accuracy cannot be guaranteed either.
Summary of the invention
In view of the above technical problem, it is necessary to provide a method, an apparatus, a computer device and a storage medium for detecting mathematical formulas in an image that can improve the accuracy of detecting mathematical formulas.
A method for detecting mathematical formulas in an image, the method comprising:
obtaining an image to be detected;
performing text segmentation on the image to be detected to obtain multiple text regions;
feeding the multiple text regions one by one as input to a trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas;
feeding the coarsely selected text regions containing mathematical formulas as input to a trained second convolutional neural network, which outputs the final text regions containing mathematical formulas.
An apparatus for detecting mathematical formulas in an image, the apparatus comprising:
an acquisition module, configured to obtain an image to be detected;
a segmentation module, configured to perform text segmentation on the image to be detected to obtain multiple text regions;
a first convolutional neural network module, configured to feed the multiple text regions one by one as input to a trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas;
a second convolutional neural network module, configured to feed the coarsely selected text regions containing mathematical formulas as input to a trained second convolutional neural network, which outputs the final text regions containing mathematical formulas.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
obtaining an image to be detected;
performing text segmentation on the image to be detected to obtain multiple text regions;
feeding the multiple text regions one by one as input to a trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas;
feeding the coarsely selected text regions containing mathematical formulas as input to a trained second convolutional neural network, which outputs the final text regions containing mathematical formulas.
A computer-readable storage medium storing a computer program, wherein the following steps are implemented when the program is executed by a processor:
obtaining an image to be detected;
performing text segmentation on the image to be detected to obtain multiple text regions;
feeding the multiple text regions one by one as input to a trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas;
feeding the coarsely selected text regions containing mathematical formulas as input to a trained second convolutional neural network, which outputs the final text regions containing mathematical formulas.
With the above method, apparatus, computer device and storage medium for detecting mathematical formulas in an image, an image to be detected is obtained and text segmentation is performed on it to obtain multiple text regions; the text regions are fed one by one as input to the trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas; and the coarsely selected text regions are fed as input to the trained second convolutional neural network, which outputs the final text regions containing mathematical formulas. Because the image to be detected is divided into multiple text regions before being fed into the convolutional neural networks, and the text regions are first coarsely selected and the coarsely selected regions are then screened again, the final text regions containing mathematical formulas are obtained more accurately, which improves the accuracy of detecting whether an image contains a mathematical formula.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the method for detecting mathematical formulas in an image in one embodiment;
Fig. 2 is a schematic diagram of the internal structure of the computer device in one embodiment;
Fig. 3 is a flowchart of the method for detecting mathematical formulas in an image in one embodiment;
Fig. 4 is a flowchart of training the second convolutional neural network in one embodiment;
Fig. 5 is a flowchart of training the deep learning neural network in one embodiment;
Fig. 6 is a schematic diagram of segmenting an image sample in one embodiment;
Fig. 7 is a schematic diagram of the text regions output by the deep learning neural network in one embodiment;
Fig. 8 is a structural block diagram of the apparatus for detecting mathematical formulas in an image in one embodiment;
Fig. 9 is a structural block diagram of the training module in one embodiment.
Detailed description
To make the purpose, technical solution and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.
Fig. 1 shows the application environment of the method for detecting mathematical formulas in an image in one embodiment. Referring to Fig. 1, the method can be applied in a system for detecting mathematical formulas in images. The system includes multiple terminals 110 and a server 120; the first convolutional neural network and the second convolutional neural network can run on the server 120, and the terminals 110 are connected to the server 120 through a network. A terminal 110 can be, but is not limited to, any personal computer, notebook computer, personal digital assistant, smartphone, tablet computer or portable wearable device capable of running the method. The server 120 can be a server that implements a single function or a server that implements multiple functions, and can specifically be an independent physical server or a cluster of physical servers. A terminal 110 can display a data input interface through a specific application. The server 120 can receive a large number of images to be detected uploaded by the terminals 110, and feeds the uploaded images as input to the first convolutional neural network. Specifically, after obtaining an image to be detected, the server 120 performs text segmentation on it to obtain multiple text regions, then feeds the text regions one by one as input to the trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas; it then feeds the coarsely selected text regions as input to the trained second convolutional neural network, which outputs the final text regions containing mathematical formulas.
Fig. 2 is a schematic diagram of the internal structure of the computer device in one embodiment. The computer device can specifically be the server 120 in Fig. 1. As shown in Fig. 2, the computer device includes a processor, a memory, a network interface, a display screen and an input device connected by a system bus. The processor provides computing and control capability and supports the operation of the entire terminal. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program which, when executed by the processor, causes the processor to implement a method for detecting mathematical formulas in an image. The internal memory can also store a computer program which, when executed by the processor, causes the processor to perform a method for detecting mathematical formulas in an image. The network interface of the computer device is used to communicate with the terminals 110. The input device of the computer device can be a touch layer covering the display screen, or an external keyboard, touchpad or mouse; it captures the instructions produced when the user operates the interface shown on the display screen with a finger, such as inputting an image to be detected by tapping a particular option on the terminal. The display screen can be used to show the input interface or the output text regions.
Those skilled in the art will understand that the structure shown in Fig. 2 is only a block diagram of the part of the structure relevant to the present solution and does not limit the terminal to which the solution is applied; a specific terminal may include more or fewer components than shown in the figure, combine certain components, or arrange the components differently.
As shown in Fig. 3, in one embodiment, a method for detecting mathematical formulas in an image is provided. The method is described using its application to the server 120 shown in Fig. 1 as an example, and includes:
Step 302: obtain an image to be detected.
Step 304: perform text segmentation on the image to be detected to obtain multiple text regions.
The image to be detected is generally a complete picture, and can come from any of the multiple terminals shown in Fig. 1. When a terminal needs to detect the mathematical formulas in an image, it sends the image to be detected to the server. After receiving the image, the server first performs text segmentation on it. Text segmentation generally means cutting the image according to the text lines on it. For example, a general-purpose text detection model can be used to detect the text lines in the image, and the image is then divided into multiple text regions according to the detected text lines. For instance, when the general-purpose text detection model detects 10 text lines in the image, the image is divided into 10 text regions. An image ordinarily contains more than one text line, so segmentation usually yields multiple text regions.
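The segmentation step can be illustrated with a minimal Python sketch. The patent only says that a general-purpose text detection model finds the text lines; it does not name one. As a stand-in, the sketch below uses a simple row-projection detector (a row belongs to a text line if it contains any dark pixel). The names `detect_lines_by_projection`, `split_into_text_regions` and the `ink_threshold` parameter are assumptions made for illustration, not part of the patent.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]   # grayscale pixels, row-major; 0 = black, 255 = white
Box = Tuple[int, int]     # (top_row, bottom_row_exclusive) of one text line


def detect_lines_by_projection(image: Image, ink_threshold: int = 128) -> List[Box]:
    """Stand-in line detector (an assumption, not the patent's model):
    consecutive rows that contain at least one dark pixel form one text line."""
    boxes: List[Box] = []
    start = None
    for y, row in enumerate(image):
        has_ink = any(pixel < ink_threshold for pixel in row)
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            boxes.append((start, y))       # the current text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        boxes.append((start, len(image)))
    return boxes


def split_into_text_regions(
    image: Image,
    detector: Callable[[Image], List[Box]] = detect_lines_by_projection,
) -> List[Image]:
    """Cut the image into one text region per detected text line."""
    return [image[top:bottom] for top, bottom in detector(image)]
```

Any real text-line detector could be passed in through the `detector` argument; the only property the rest of the pipeline relies on is that one region is produced per detected line.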
Step 306: feed the multiple text regions one by one as input to the trained first convolutional neural network, which outputs multiple coarsely selected text regions containing mathematical formulas.
The image to be detected is not fed directly into the first convolutional neural network. It is first divided into multiple text regions, which are then fed into the first convolutional neural network one by one. The main role of the first convolutional neural network is preliminary screening: after a text region is fed in as input, the network screens it and outputs the regions that it "considers" to contain a mathematical formula, which yields multiple coarsely selected text regions containing mathematical formulas. The first convolutional neural network used for this screening is trained in advance on mathematical formula training samples; only a trained network with feature-learning capability produces reliable output.
Step 308: feed the coarsely selected text regions containing mathematical formulas as input to the trained second convolutional neural network, which outputs the final text regions containing mathematical formulas.
The coarsely selected text regions containing mathematical formulas are the regions obtained after the first convolutional neural network performs preliminary screening on the input text regions; that is, the first convolutional neural network filters the input text regions and yields multiple regions predicted to contain mathematical formulas. These coarsely selected regions are then fed into the second convolutional neural network as its input data. The second convolutional neural network screens the input text regions again: regions predicted to contain a mathematical formula are output, and regions predicted not to contain one are not. The second convolutional neural network, which makes predictions on the coarsely selected regions, is also trained in advance. Specifically, the output of the first convolutional neural network during its training can be used as the input for training the second convolutional neural network. In this way, the trained network matches the actual demand and accurately predicts the output corresponding to the input data.
Dividing the image to be detected into multiple text regions before feeding them into the trained convolutional neural networks, coarsely selecting text regions with the trained first convolutional neural network, and then feeding the coarsely selected regions into the trained second convolutional neural network for a second screening yields the final text regions containing mathematical formulas more accurately. Using neural networks improves the accuracy of detecting whether an image contains a mathematical formula, and also reduces time consumption and improves detection efficiency.
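Steps 306 and 308 form a two-stage cascade, which the following Python sketch illustrates. The two networks are represented by plain callables that return True when a region is predicted to contain a formula; `coarse_net`, `fine_net` and `detect_formula_regions` are hypothetical names, and the real classifiers would be the trained convolutional neural networks.

```python
from typing import Callable, List, Sequence, TypeVar

Region = TypeVar("Region")


def detect_formula_regions(
    regions: Sequence[Region],
    coarse_net: Callable[[Region], bool],
    fine_net: Callable[[Region], bool],
) -> List[Region]:
    """Two-stage screening: the first classifier coarsely selects candidate
    regions, and the second classifier screens only those candidates again."""
    # Stage 1 (step 306): regions are fed one by one to the first network.
    coarse = [region for region in regions if coarse_net(region)]
    # Stage 2 (step 308): only coarsely selected regions reach the second network.
    return [region for region in coarse if fine_net(region)]
```

Because the second classifier sees only the regions that survive the first stage, it is evaluated on far fewer inputs than the full set of text regions.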
In one embodiment, the first convolutional neural network is trained as follows: obtain mathematical formula training samples; divide each mathematical formula training sample into multiple text regions; divide the text regions into a predetermined number of training sample sets; select multiple text regions from each training sample set in turn and feed the selected text regions one by one as input to the first convolutional neural network for training, until all training sample sets have been sampled and trained on, yielding the trained first convolutional neural network.
Before images to be detected are screened, the first convolutional neural network must first be trained. Training requires a large number of training samples, which are drawn from a mathematical formula sample library; a mathematical formula training sample is a mathematical formula image sample. Generally, such an image sample is a text image that contains mathematical formulas. After multiple mathematical formula image samples are extracted, each one is first segmented into multiple text regions; this segmentation can be done manually. After segmentation, the text regions can be normalized. Normalization scales and rotates each text region according to its character size and text line direction, so that all text regions have the same font size and image size and a uniform text line direction. When the font in some text regions is small, the font can be enlarged. When a text region is small, blank space can be filled in around it so that its size matches that of the other text regions. After normalization, the whole collection of text regions is divided into a predetermined number of parts, where the predetermined number can be decided by the researchers according to the actual needs of the project. Once all text regions have been divided into the predetermined number of parts, each part can be fed into the first convolutional neural network in turn.
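One part of the normalization described above, filling blank space around a small text region so that all regions share one size, can be sketched in Python as follows. Scaling and rotation are omitted. The function name `pad_to_size`, the centered placement and the default blank value 255 (white) are assumptions for illustration; the patent does not specify them.

```python
from typing import List

Image = List[List[int]]   # grayscale pixels, row-major


def pad_to_size(region: Image, height: int, width: int, blank: int = 255) -> Image:
    """Surround a small text region with blank pixels so that it becomes
    exactly height x width, keeping the original content centered."""
    h = len(region)
    w = len(region[0]) if region else 0
    if h > height or w > width:
        raise ValueError("region is larger than the target size")
    top = (height - h) // 2          # blank rows above the content
    left = (width - w) // 2          # blank columns left of the content
    right = width - w - left         # blank columns right of the content
    padded: Image = [[blank] * width for _ in range(top)]
    for row in region:
        padded.append([blank] * left + list(row) + [blank] * right)
    padded.extend([blank] * width for _ in range(height - h - top))
    return padded
```

Regions larger than the target are rejected here rather than cropped; in the patent's description such regions would be scaled down first.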
However, not all text regions in a part are fed in; only some of the regions in each part are selected, and the selected regions are fed into the first convolutional neural network one by one. For example, if segmenting the mathematical formula image samples yields 100 text regions and they are divided into a predetermined number of 10 parts, each part contains 10 text regions. When a part is fed into the first convolutional neural network, a subset of its 10 text regions is selected as input. Assuming the selection size is set to 5, then as the 10 parts are fed in turn, 5 text regions from each part are selected and input. Although multiple text regions are selected from a part each time, they are actually fed into the first convolutional neural network one region at a time, and the network likewise screens each text region in turn.
Once a subset of every one of the predetermined number of parts has been selected and fed into the first convolutional neural network, training of the first convolutional neural network is complete. For example, if the predetermined number is 10, the trained first convolutional neural network is obtained after a subset of the 10th part has been selected and fed in.
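The partition-and-select schedule can be sketched in Python as follows. The sketch assumes that the subset of each part is chosen at random and that a callable `train_step` performs one training update per region; the patent specifies neither, and the names `split_into_parts`, `train_first_network`, `sample_size` and `seed` are hypothetical.

```python
import random
from typing import Callable, List, Sequence, TypeVar

Region = TypeVar("Region")


def split_into_parts(items: Sequence[Region], num_parts: int) -> List[List[Region]]:
    """Divide the text regions into a predetermined number of parts."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    return [list(items[i::num_parts]) for i in range(num_parts)]


def train_first_network(
    regions: Sequence[Region],
    num_parts: int,
    sample_size: int,
    train_step: Callable[[Region], None],
    seed: int = 0,
) -> int:
    """Visit every part in turn, select a subset of its regions, and feed the
    selected regions to the network one at a time. Returns how many regions
    were fed in total."""
    rng = random.Random(seed)        # random selection is an assumption
    fed = 0
    for part in split_into_parts(regions, num_parts):
        chosen = rng.sample(part, min(sample_size, len(part)))
        for region in chosen:        # regions are fed one by one
            train_step(region)
            fed += 1
    return fed
```

With the numbers from the example above (100 regions, 10 parts, 5 regions selected per part), `train_step` is called 50 times in total.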
Dividing the training samples into multiple text regions before feeding them into the first convolutional neural network for training makes the network's feature extraction for each region more precise, improves the training result of the first convolutional neural network, and improves prediction accuracy.
In one embodiment, feeding the selected text regions one by one as input to the first convolutional neural network for training includes: for each selected text region, computing the weights of that text region; each time a text region is fed as input to the first convolutional neural network, adjusting the weights in the first convolutional neural network according to the weights of the text region; and training the first convolutional neural network according to the weights after each adjustment.
Training a neural network means adjusting its structure by some algorithm; generally, this means adjusting the weights so that the network's output matches the expected values. For each selected text region, the corresponding weights can be computed, and when the region is fed into the first convolutional neural network, the corresponding weights of the network are adjusted to the weights corresponding to that text region. Thus, each time a text region is fed into the first convolutional neural network, the network's weights may change; that is, the weights are adjusted continually. Generally, the weights corresponding to a text region are computed after the region is converted into vector form. Training the first convolutional neural network according to the weights of each text region improves the effectiveness of training.
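The patent does not state which update rule adjusts the weights. As a loose illustration of weights changing every time one vectorized text region is fed in, the Python sketch below swaps in a standard stochastic gradient descent step on a single logistic unit; it is a stand-in for, not a description of, the convolutional neural network's training. The names `sgd_step`, `features` and `learning_rate` are assumptions.

```python
import math
from typing import List, Tuple


def sgd_step(
    weights: List[float],
    bias: float,
    features: List[float],
    label: int,
    learning_rate: float = 0.1,
) -> Tuple[List[float], float, float]:
    """One per-sample update of a single logistic unit (a stand-in model).
    Returns the adjusted weights, the adjusted bias, and the prediction that
    was made before the adjustment."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    prediction = 1.0 / (1.0 + math.exp(-z))     # probability of "formula"
    error = prediction - label                  # gradient of log loss w.r.t. z
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias, prediction
```

Calling `sgd_step` repeatedly with the same region labeled 1 moves the prediction for that region toward 1, which is the sense in which the output is brought into line with the expected value.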
In one embodiment, after each mathematical formula training sample is divided into multiple text regions, the method further includes: adding a numeric label to each text region obtained by segmentation.
After multiple mathematical formula image samples (that is, mathematical formula training samples) are obtained from the mathematical formula sample library, each image sample is segmented into multiple text regions. Before the segmented text regions are fed into the first convolutional neural network for training, a numeric label is added to each text region according to whether it contains a mathematical formula. For example, text regions that contain a mathematical formula are given the numeric label 1, and text regions that do not are given the numeric label 0. The neural network can then use the numeric label of each text region to determine whether the region actually contains a mathematical formula.
In addition, the criterion for deciding whether something is a mathematical formula can be that it must contain an equation, or merely that it contains an operator symbol; the criterion depends on the researchers' considerations. When the researchers want a stricter criterion and hold that only expressions containing an equation count as mathematical formulas, a text region whose expressions contain only addition, subtraction, multiplication or division, or whose symbol is an inequality sign, is not regarded as containing a mathematical formula. The numeric label of such a region is then set to 0, indicating that it does not contain a mathematical formula. After the mathematical formula image samples have been processed into multiple labeled text regions according to the criterion, the final training samples are in fact separate small pictures, each carrying a numeric label; these small pictures can be pure text or pure formulas.
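The two labeling criteria can be made concrete with a small Python sketch. It assumes, purely for illustration, that a text transcription of each region is available to the annotator; in the patent the labels are attached to image regions and the labeling procedure is not automated. The names `label_region`, `contains_equals_sign` and `OPERATOR_CHARS` are hypothetical, and the character test is deliberately crude (a hyphen in ordinary text would count as an operator under the loose criterion).

```python
OPERATOR_CHARS = set("+-*/×÷<>≤≥")


def contains_equals_sign(text: str) -> bool:
    """True if the text has an equals sign that is not part of <=, >= or !=."""
    for token in ("<=", ">=", "!="):
        text = text.replace(token, "")
    return "=" in text


def label_region(text: str, strict: bool = True) -> int:
    """Return the numeric label: 1 = contains a formula, 0 = does not.

    strict=True : only regions containing an equation count as formulas.
    strict=False: regions containing any operator symbol also count.
    """
    if contains_equals_sign(text):
        return 1
    if not strict and any(ch in OPERATOR_CHARS for ch in text):
        return 1
    return 0
```

Under the strict criterion, `label_region("3 + 4")` and `label_region("a < b")` both give 0, matching the case described above in which arithmetic-only or inequality-only regions are labeled as not containing a formula.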
As shown in figure 4, roughing in each sample comprising the text filed of mathematical formulae is input to the second convolutional Neural It is trained in the input variable of network, the second convolutional neural networks trained, including:
Step 402, after text filed comprising mathematical formulae for obtaining roughing in multiple samples each time, by multiple samples Middle roughing is sequentially inputted to be trained in the input variable of the second convolutional neural networks comprising the text filed of mathematical formulae, Export text filed comprising mathematical formulae in sample.
After text filed in sample is sequentially inputted in the first convolutional neural networks each time, the first convolution nerve net Network can include prediction in sample in the text filed output of mathematical formulae, i.e., the text for including mathematical formulae of roughing in sample Region.Again by the text filed input variable for being input to the second convolutional neural networks for including mathematical formulae of roughing in sample In, the second convolutional neural networks are trained, the second convolutional neural networks including for roughing can then count in the sample of input Text filed middle will predict for learning formula includes the text filed output of mathematical formulae, that is, exports in sample and finally include mathematics Formula it is text filed.
Step 404, when the numeric label of an output text region indicates that the text region does not contain a mathematical formula, process the text region.
Step 406, add the processed text region to the next training sample set that is to be input into the input variable of the first convolutional neural network.
The output of the second convolutional neural network is the set of text regions predicted to finally contain mathematical formulas, so the ideal result is that the numeric label of every output text region indicates that the region contains a mathematical formula. When the numeric label of an output text region indicates that the region does not contain a mathematical formula, the second convolutional neural network's prediction for that region is off, and training on that region needs to continue.
Therefore, when the numeric label of an output text region indicates that the region does not contain a mathematical formula, the region is processed and then added to the next training sample set that is to be input into the input variable of the first convolutional neural network.
Step 408, select multiple text regions from the training sample set to which the processed text regions have been added, and input the selected text regions into the input variable of the first convolutional neural network for further training, until the trained second convolutional neural network is obtained.
After the processed text regions are added to the next training sample set to be input, a new training sample set is formed. Multiple text regions are then selected from the new training sample set and input into the input variable of the first convolutional neural network for further training. For example, suppose there are 10 training sample sets, each containing 50 text regions. From the first training sample set, 40 text regions are drawn and input into the input variable of the first convolutional neural network for training, yielding 20 coarsely selected text regions containing mathematical formulas; that is, the first convolutional neural network predicts that 20 of the 40 text regions contain mathematical formulas. These 20 coarsely selected text regions are then input into the second convolutional neural network as its input data, and the second convolutional neural network outputs 5 text regions predicted to finally contain mathematical formulas. However, 2 of these 5 text regions have a numeric label of 0, meaning that these two text regions do not actually contain mathematical formulas. These 2 text regions are then processed and added to the second training sample set.
The second training sample set then contains 50+2=52 text regions, from which 40 text regions are selected and input into the first convolutional neural network, and so on, until the text regions in the 10th training sample set have also been selected and input. In this example, the number selected from each training sample set is always 40, that is, the number of selected text regions is fixed. In practice, however, the researcher can set a ratio instead, such as 3/4: of the 50 text regions in the first training sample set, 37 are then actually selected for input into the first convolutional neural network, and when the second training sample set contains 52 text regions the number selected becomes 39 rather than a fixed value. The selection number can be set according to project requirements or the researcher's judgment. Selecting in this way ensures the randomness of the samples and improves the accuracy of the trained convolutional neural networks.
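The ratio-based selection in this example can be checked with a few lines of Python. This is a sketch under one assumption: the count is rounded down, which is consistent with 37 being selected from 50 at a ratio of 3/4 (37.5 rounded down). The function names are illustrative only.

```python
import math
import random


def selection_count(set_size: int, ratio: float) -> int:
    """Number of text regions to draw from one training sample set (rounded down)."""
    return math.floor(set_size * ratio)


def select_regions(training_set: list, ratio: float, rng: random.Random) -> list:
    """Randomly draw a ratio-sized subset of the set, without replacement."""
    return rng.sample(training_set, selection_count(len(training_set), ratio))


print(selection_count(50, 0.75))  # first set: 50 regions
print(selection_count(52, 0.75))  # second set after 2 regions were added
```

With a ratio of 3/4 the two calls give 37 and 39, the numbers quoted in the example.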
In one embodiment, processing the text region includes: rotating the text region containing a mathematical formula in the sample counterclockwise by a preset angle; and adding Gaussian noise of a preset value to the text region after it has been rotated by the preset angle.
During training, when the numeric label of a text region output by the second convolutional neural network indicates that the region does not contain a mathematical formula, the second convolutional neural network's prediction for that region is wrong. Such a wrongly predicted text region must be processed before it is input again into the input variable of the first convolutional neural network. The processing has two steps: first, the text region is rotated counterclockwise by a preset angle; second, Gaussian noise of a preset value is added to the text region. For example, the preset rotation angle can be set to 2 degrees and the preset Gaussian noise value to 0.2; that is, the text region is rotated 2 degrees counterclockwise, Gaussian noise with σ=0.2 is added to it, and the processed text region is then added to the next training sample set that is to be input into the input variable of the first convolutional neural network.
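The two processing steps can be sketched in plain Python. This is a minimal sketch, not the method's actual implementation: it assumes a text region is a 2-D list of pixel values scaled to the range 0 to 1 (so that σ=0.2 is meaningful), uses nearest-neighbour sampling, and fills pixels rotated in from outside the region with a hypothetical `fill` value. All function names are illustrative.

```python
import math
import random


def rotate_ccw(image, degrees, fill=0.0):
    """Rotate a 2-D list of pixel values counterclockwise about its centre."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    cos_t = math.cos(math.radians(degrees))
    sin_t = math.sin(math.radians(degrees))
    out = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            dx, dy = c - cx, cy - r  # dy points upwards, as in mathematics
            # Inverse rotation: find the source pixel for this output pixel.
            src_x = dx * cos_t + dy * sin_t
            src_y = -dx * sin_t + dy * cos_t
            sc, sr = round(cx + src_x), round(cy - src_y)
            if 0 <= sr < height and 0 <= sc < width:
                out[r][c] = image[sr][sc]
    return out


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]


def process_region(image, degrees=2.0, sigma=0.2, rng=None):
    """Rotate counterclockwise by `degrees`, then add Gaussian noise."""
    rng = rng or random.Random()
    return add_gaussian_noise(rotate_ccw(image, degrees), sigma, rng)
```

The defaults of 2 degrees and σ=0.2 are the example values given above; a real pipeline would more likely use an image library with interpolation instead of nearest-neighbour sampling.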
Training samples are always limited, but all kinds of complicated situations may be encountered in actual use. Processing text regions during training and then adding them to the next training sample set enriches the diversity of the samples and makes the training process more rigorous, which in turn improves the accuracy of the trained neural network's predictions.
In one embodiment, the weight corresponding to each text region is calculated by the following steps:
1. Each sample text region is processed into a row vector. The row vector of the t-th sample text region is denoted $x_t$, and the corresponding numeric label is denoted $a_t$, with $a_t \in \{0, 1\}$. Numeric labels are added manually according to whether each sample text region contains a mathematical formula: label 1 is added when it does, and label 0 when it does not.
According to Q-learning (reinforcement learning) theory, the future return of the t-th sample text region can be calculated:
where $T$ is the total number of sample text regions, and $\gamma^{t'-t}$ denotes the reward discount coefficient of the t-th sample with respect to the t'-th sample. The aim of Q-learning is to construct a control policy that maximizes the behavioral performance of an Agent (in the IT field, an Agent can refer to a software or hardware entity that acts autonomously). The Agent perceives information from a complex environment and processes it. Through learning, the Agent improves its own performance and selects behaviors, giving rise to group behavior selection; individual and group behavior selection lead the Agent to decide on a particular action, which in turn affects the environment. The reward discount coefficient expresses the discount applied to rewards obtained at later moments: for the same reward, the earlier it is obtained, the higher the reward experienced in Q-learning.
2. Using the optimal action-value function $Q(x, a)$, calculate the maximum expected return of the t-th sample text region:
$Q_t(x_t, a_t) = \max_{\pi} \mathbb{E}[\, r_t \mid x_t = x,\ a_t = a,\ \pi \,]$, where $x_t$ is the row vector of the sample text region, $a_t$ is the numeric label, and $\pi$ is the mapping function between sample text regions and numeric labels.
3. Calculate the target output $y$ of the t-th sample text region according to the following formula:
where $\theta_{t-1}$ denotes the weight parameters at the (t-1)-th sample text region, that is, the weights corresponding to the (t-1)-th sample text region, and $\gamma$ is the reward discount coefficient.
4. Update the weight parameters of the deep learning neural network by minimizing the loss function $L_t(x_t, a_t)$.
where $\rho(x, a)$ is the probability distribution over the row vector $x_t$ of the sample text region and the label $a_t$, and $\mathbb{E}[\cdot]$ denotes taking the expectation.
5. Differentiating the loss function $L_t(x_t, a_t)$ with respect to the weights $\theta_t$ yields the weight update equation of the deep learning neural network:
Computing the weight of each text region step by step with this algorithm allows the weights of the first convolutional neural network to be adjusted effectively each time a text region is input. This makes training more effective and lets the trained first convolutional neural network detect text regions containing mathematical formulas in an image more accurately.
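The formulas that steps 1 to 5 refer to are not reproduced in this text. As a hedged reconstruction only, and assuming they follow the standard deep Q-learning formulation that the surrounding definitions ($T$, $\gamma^{t'-t}$, $\theta_{t-1}$, $\rho(x, a)$) point to, they could be written as below. The per-sample reward $r_{t'}$, the symbol $R_t$ for the future return (which the expectation in step 2 writes as $r_t$), the primed next-sample quantities $x'$ and $a'$, and the step size $\alpha$ are assumptions introduced here, not taken from the source.

```latex
\begin{aligned}
R_t &= \sum_{t'=t}^{T} \gamma^{\,t'-t}\, r_{t'} \\
y_t &= \mathbb{E}\big[\, r + \gamma \max_{a'} Q(x', a'; \theta_{t-1}) \,\big|\, x_t, a_t \big] \\
L_t &= \mathbb{E}_{(x,a)\sim\rho}\big[ \big(y_t - Q(x, a; \theta_t)\big)^2 \big] \\
\nabla_{\theta_t} L_t &= -2\, \mathbb{E}_{(x,a)\sim\rho}\big[ \big(y_t - Q(x, a; \theta_t)\big)\, \nabla_{\theta_t} Q(x, a; \theta_t) \big] \\
\theta_t &\leftarrow \theta_t + \alpha\, \big(y_t - Q(x, a; \theta_t)\big)\, \nabla_{\theta_t} Q(x, a; \theta_t)
\end{aligned}
```

The last line is the usual stochastic gradient step on $L_t$ for a single sample, with the constant factor 2 absorbed into $\alpha$ and $y_t$ treated as fixed because it depends only on $\theta_{t-1}$.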
In one embodiment, the second convolutional neural network is trained as follows: each time the selected text regions are input into the input variable of the first convolutional neural network, the coarsely selected text regions containing mathematical formulas that it outputs for the samples are obtained; the coarsely selected text regions containing mathematical formulas in each sample are then input into the input variable of the second convolutional neural network for training, yielding the trained second convolutional neural network.
The input of the second convolutional neural network can be regarded as the output of the first convolutional neural network, that is, the coarsely selected text regions containing mathematical formulas. After each text region is input into the input variable of the first convolutional neural network, the first convolutional neural network outputs the text regions it predicts to contain mathematical formulas, that is, the coarsely selected text regions. After these coarsely selected text regions are input into the input variable of the second convolutional neural network, the second convolutional neural network screens them again and outputs the final, more accurately predicted text regions containing mathematical formulas.
Although both the first and the second convolutional neural network output text regions predicted to contain mathematical formulas, they differ in practice. The first convolutional neural network performs a screening: from the many text regions it excludes those that contain no mathematical formula at all and picks out those predicted to contain one. The second convolutional neural network faces a stricter requirement: the first network only has to pick out every text region that might contain a mathematical formula, whereas the second must accurately pick out, from that pile of candidates, the text regions that really do contain one. Therefore, when the second convolutional neural network is trained, its input data are the coarsely selected text regions already screened by the first convolutional neural network. This makes the second network more precise and sensitive during feature extraction and learning, so that the trained second convolutional neural network predicts its output faster and more accurately.
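The division of labour between the two networks can be sketched as a two-stage cascade. In this sketch each network is replaced by a stand-in scoring function that returns a formula probability; the thresholds, the names, and the idea of thresholding a score at all are assumptions made for illustration, not details given in the text.

```python
from typing import Callable, List, Sequence

Region = Sequence[float]            # a text region as a row vector of pixels
Scorer = Callable[[Region], float]  # stand-in for a network: formula probability


def detect_formula_regions(
    regions: List[Region],
    coarse: Scorer,
    fine: Scorer,
    coarse_threshold: float = 0.3,  # lenient: keep anything that might be a formula
    fine_threshold: float = 0.7,    # strict: keep only confident detections
) -> List[Region]:
    """Screen with the first (coarse) scorer, then refine with the second."""
    coarse_hits = [r for r in regions if coarse(r) >= coarse_threshold]
    return [r for r in coarse_hits if fine(r) >= fine_threshold]


# Usage with toy scorers standing in for the two trained networks.
regions = [[0.0, 0.1], [0.4, 0.5], [0.9, 0.8]]
result = detect_formula_regions(regions, coarse=max, fine=min)
```

The second stage only ever sees regions that passed the first, which mirrors how the second network is trained and run on the coarsely selected regions.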
In one embodiment, inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training includes: obtaining the weight of each text region as calculated from each selected text region; when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjusting the weights in the second convolutional neural network according to the weights of the text regions; and training the second convolutional neural network according to the weights after each adjustment.
Training a neural network is in effect a matter of continually adjusting its weights so that its prediction results keep approaching the set standard values. When the first convolutional neural network was trained, the weight of each text region was already calculated from that text region. The input of the second convolutional neural network is likewise text regions; they are merely the coarsely selected text regions picked out by the first convolutional neural network and are otherwise unchanged in nature: the input is text regions and the output is still text regions. The second convolutional neural network can therefore skip the calculation of each text region's weight during training and directly use the weights already calculated for the first convolutional neural network.
Therefore, when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, the weights in the second convolutional neural network are adjusted according to the weights of the text regions, and this process of weight adjustment is the training process of the second convolutional neural network. Continually updating the weights also improves the training efficiency of the neural network.
In one embodiment, the first convolutional neural network and the second convolutional neural network are contained in one deep learning neural network.
Because the first and second convolutional neural networks are contained in the same deep learning neural network, in actual operation the image to be predicted only needs to be divided into multiple text regions and input into the deep learning neural network. There is no need to apply the first and second convolutional neural networks one after the other to screen and then extract the text regions obtained from the image. It should be understood, however, that inside the deep learning neural network the detection of text regions is still divided into two steps, first screening and then extraction, which are carried out by the first and second convolutional neural networks in the deep learning neural network respectively. This arrangement reduces the number of operating steps in practice, saves detection time, and improves detection efficiency.
In one embodiment, as shown in Figure 5, the deep learning neural network is trained as follows:
Step 502, obtain image samples, and divide the image samples into multiple text regions.
Image samples can be obtained from a mathematical formula sample library. After multiple image samples have been obtained from the library, each image sample is first divided into multiple sample text regions. As shown in Figure 6, the image is first divided manually into multiple sample text regions. Manual segmentation follows a predetermined segmentation standard, which can be set according to actual requirements.
Step 504, normalize the text regions, and add a numeric label to each text region.
After the image has been divided into multiple sample text regions, the sample text regions can be normalized. Normalization includes scaling and rotating each sample text region according to its character size and the direction of its text lines, so that all sample text regions have the same font size and image size and a uniform text line direction. When the font in some sample text regions is small, the font can be enlarged. When some sample text regions are small in size, blank space can be filled in around them so that their size matches that of the other text regions.
After normalization, a numeric label is added to each sample text region. Numeric labels follow a rule set in advance, for example adding numeric label 1 to sample text regions that contain a mathematical formula and numeric label 0 to those that do not. The criterion for deciding whether a sample text region contains a mathematical formula can be set at the researcher's discretion. For example, the criterion can be made strict, so that only expressions containing an equals sign count as mathematical formulas. Expressions containing other operators are then excluded: if a sample text region contains the expression y ≥ x, which has no equals sign, the numeric label added to that region is 0, indicating that the region contains no mathematical formula. Once the corresponding numeric label has been added to each sample text region, the numeric label alone shows whether the region contains a mathematical formula.
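The blank-filling part of normalization can be sketched as follows. This is an illustration under stated assumptions: a region is a 2-D list of grayscale pixels, blank space is white (`blank=255`), and the region is centred on the canvas; none of these details is specified in the text, and scaling and rotation are left out.

```python
def pad_to_size(image, target_height, target_width, blank=255):
    """Centre a 2-D list of pixels on a blank canvas of the target size."""
    height, width = len(image), len(image[0])
    if height > target_height or width > target_width:
        raise ValueError("region is larger than the target size; scale it down first")
    top = (target_height - height) // 2
    left = (target_width - width) // 2
    canvas = [[blank] * target_width for _ in range(target_height)]
    for r, row in enumerate(image):
        # Copy one source row into the matching slice of the canvas row.
        canvas[top + r][left:left + width] = row
    return canvas
```

Padding rather than stretching keeps the font size of a small region unchanged while bringing its image size in line with the other regions.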
Step 506, process the sample text regions into row vectors, and divide the sample text regions in row-vector form into a predetermined number of training sample sets.
During training, a sample text region is not used directly as input data for the deep learning neural network. Instead, it is first processed into a row vector and then input into the input variable of the deep learning neural network. Specifically, all pixels in each sample text region are arranged in coordinate order, and each pixel becomes one value in the row vector, which amounts to turning a two-dimensional array into a one-dimensional one. Each row vector corresponds to one sample text region and therefore has a corresponding numeric label, which is 0 or 1. The sample text regions processed into row vectors are then divided into a predetermined number of training sample sets. The predetermined number is generally set by the researcher according to the actual project or experience. After all sample text regions have been processed into row-vector form, they can be divided at random into the predetermined number of training sample sets. For example, if there are 100 sample text regions in total and the predetermined number is 10, the 100 sample text regions are divided into 10 training sample sets.
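Flattening and splitting can be sketched in a few lines. The function names are illustrative, and dealing the shuffled samples out round-robin is just one assumed way of producing a random split into equal-sized sets.

```python
import random


def to_row_vector(image):
    """Flatten a 2-D list of pixels into a 1-D row vector, row by row."""
    return [pixel for row in image for pixel in row]


def split_into_sets(samples, num_sets, rng):
    """Shuffle the samples and deal them into num_sets training sample sets."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i::num_sets] for i in range(num_sets)]


# 100 sample text regions divided into 10 training sample sets of 10 each.
sets = split_into_sets(range(100), 10, random.Random(0))
```

In a fuller version each sample would be a pair of row vector and numeric label, so that the label travels with its region through the split.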
Step 508, randomly select multiple sample text regions from each training sample set in turn, and input them in turn into the deep learning neural network.
After the sample text regions have been divided into the predetermined number of training sample sets, a whole training sample set is not input into the deep learning neural network at once. Instead, multiple sample text regions are randomly selected from each training sample set, and the selected sample text regions are input into the deep learning neural network to train it. Inputting a sample text region into the input variable of the deep learning neural network in fact means inputting it into the input variable of the first convolutional neural network contained in the deep learning neural network. The input sample text regions first undergo preliminary screening by the first convolutional neural network, which outputs the coarsely selected text regions containing mathematical formulas; the second convolutional neural network contained in the deep learning neural network then extracts precisely from these coarsely selected text regions.
Step 510, train the deep learning neural network according to the weight corresponding to each sample text region.
After each sample text region has been processed into a row vector, the row vector of the t-th sample text region is denoted $x_t$ and the corresponding numeric label is denoted $a_t$, with $a_t \in \{0, 1\}$. The weight corresponding to a sample text region is calculated in 5 steps, as follows:
1. According to Q-learning theory, the future return of the t-th sample text region can be calculated:
where $T$ is the total number of sample text regions, and $\gamma^{t'-t}$ denotes the reward discount coefficient of the t-th sample with respect to the t'-th sample.
2. Using the optimal action-value function $Q(x, a)$, calculate the maximum expected return of the t-th sample text region:
$Q_t(x_t, a_t) = \max_{\pi} \mathbb{E}[\, r_t \mid x_t = x,\ a_t = a,\ \pi \,]$, where $x_t$ is the row vector of the sample text region, $a_t$ is the numeric label, and $\pi$ is the mapping function between sample text regions and numeric labels.
3. Calculate the target output $y$ of the t-th sample text region according to the following formula:
where $\theta_{t-1}$ denotes the weight parameters at the (t-1)-th sample text region, that is, the weights corresponding to the (t-1)-th sample text region, and $\gamma$ is the reward discount coefficient.
4. Update the weight parameters of the deep learning neural network by minimizing the loss function $L_t(x_t, a_t)$.
where $\rho(x, a)$ is the probability distribution over the row vector $x_t$ of the sample text region and the label $a_t$, and $\mathbb{E}[\cdot]$ denotes taking the expectation.
5. Differentiating the loss function $L_t(x_t, a_t)$ with respect to the weights $\theta_t$ yields the weight update equation of the deep learning neural network:
After the weight of each sample text region has been calculated, each time a sample text region is input into the deep learning neural network, the network is adjusted according to that region's weight, and the deep learning neural network is trained in this way. It does not matter here how exactly the first and second convolutional neural networks contained in the deep learning neural network are each trained during this process, because the two networks share weights, that is, they share part of their parameters.
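A weight update of the kind steps 3 to 5 describe can be sketched with a linear stand-in for the network. Everything specific here is an assumption: the linear form $Q(x, a; \theta) = \theta_a \cdot x$, the reward value, the choice of the next sample as the successor state, and the values of `gamma` and `lr`. The sketch only shows the standard Q-learning pattern of forming a target from the previous weights and moving the current weights towards it.

```python
def q_value(theta, x, a):
    """Linear stand-in for the network: Q(x, a; theta) = theta[a] . x"""
    return sum(w * xi for w, xi in zip(theta[a], x))


def td_update(theta, theta_prev, x, a, reward, x_next, gamma=0.9, lr=0.1):
    """One gradient step on (y - Q(x, a; theta))^2 with the target y held fixed."""
    # Target output y, computed from the previous weights theta_prev.
    target = reward + gamma * max(
        q_value(theta_prev, x_next, b) for b in range(len(theta_prev))
    )
    error = target - q_value(theta, x, a)
    new_theta = [row[:] for row in theta]
    # For a linear Q the gradient with respect to theta[a] is simply x.
    new_theta[a] = [w + lr * error * xi for w, xi in zip(theta[a], x)]
    return new_theta


theta = [[0.0, 0.0], [0.0, 0.0]]  # one weight row per label (0 and 1)
theta = td_update(theta, theta, x=[1.0, 0.0], a=1, reward=1.0, x_next=[0.0, 1.0])
```

In the described method the Q function would be the deep learning neural network itself and the gradient would come from backpropagation rather than from this closed form.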
Step 512, detect whether the numeric label of a sample text region output by the deep learning neural network indicates that the sample text region contains a mathematical formula; if not, perform step 514; if so, perform step 518.
When sample text regions are input into the deep learning neural network, the network finally outputs the text regions it predicts to contain mathematical formulas. As shown in Figure 7, the deep learning neural network outputs this subset of text regions containing mathematical formulas. The prediction of the deep learning neural network may, however, be wrong. Whether a prediction is correct can be judged directly from the numeric label of the output text region. For example, when the numeric label of an output text region is 0, the text region does not actually contain a mathematical formula, yet the deep learning neural network has output it. This shows a discrepancy between the output of the deep learning neural network and the actual data, so these wrongly predicted sample text regions need to be processed.
Step 514, process the sample text regions whose numeric labels indicate that they contain no mathematical formula, and then add them to the next training sample set.
Specifically, every sample text region for which the most recent output of the deep learning neural network differs from the actual data is first rotated 2° counterclockwise, and Gaussian noise with σ=0.2 is then added. After this processing, these sample text regions are added to the next training sample set.
Step 516, repeat until all of the predetermined number of training sample sets have been selected.
The predetermined number of training sample sets have all been selected when multiple sample text regions have been randomly selected from every training sample set and input into the deep learning neural network for training. When the last of the training sample sets is input into the deep learning neural network, some output sample text regions may still have numeric labels indicating that they contain no mathematical formula, but this is acceptable: when there are many training samples, a small residual error at the end is tolerable. Moreover, once training has reached a certain point, further training no longer improves convergence; when the error of the deep learning neural network measured on the test set stops decreasing, training can also end.
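Steps 508 to 516 can be sketched as a loop over the training sample sets. This is a minimal sketch with stand-ins: `predict`, `train_step`, and `process` are hypothetical callables representing the network's output, one training pass, and the rotate-plus-noise processing; each sample is assumed to be a pair of region and numeric label.

```python
import random


def train_with_feedback(training_sets, select_count, predict, train_step, process, rng):
    """Train set by set, feeding wrongly output regions into the next set.

    training_sets is a list of lists of (region, label) pairs, mutated in place.
    """
    for index, current in enumerate(training_sets):
        batch = rng.sample(current, min(select_count, len(current)))
        train_step(batch)
        # Regions output as containing a formula although their label is 0.
        wrong = [(region, label) for region, label in batch
                 if predict(region) == 1 and label == 0]
        if index + 1 < len(training_sets):
            training_sets[index + 1].extend(
                (process(region), label) for region, label in wrong
            )
    return training_sets
```

Wrong outputs from the last set have no next set to join and are simply dropped here, in line with the remark above that a small residual error at the end is acceptable.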
Step 518, perform no processing.
When the numeric label of a sample text region output by the deep learning neural network indicates that the sample text region contains a mathematical formula, the prediction output by the deep learning neural network is correct, and these sample text regions do not need to be processed.
Before use, the deep learning neural network must first be trained effectively according to the actual project. Only a fully trained deep learning neural network can be put to good use on an image to be detected and accurately extract the text regions containing mathematical formulas from it.
In one embodiment, the first convolutional neural network described above is an RPN convolutional neural network, and the second convolutional neural network is a fast-rcnn convolutional neural network.
RPN (Region Proposal Network) is a region proposal convolutional neural network. When a sample text region is input into the deep learning neural network, it is in fact input into the input variable of the RPN. On receiving sample text regions, the RPN convolutional neural network detects and screens them. It picks out the text regions judged to contain mathematical formulas and passes these coarsely selected sample text regions to the fast-rcnn (Regions with CNN features, fast region detection) convolutional neural network. Put simply, the RPN convolutional neural network is a formula detection model that can distinguish ordinary text from mathematical formulas.
The fast-rcnn convolutional neural network then performs a further, precise detection on the sample text regions screened out by the RPN convolutional neural network, and outputs the text regions predicted to contain mathematical formulas. Combining the RPN convolutional neural network with the fast-rcnn convolutional neural network greatly improves detection accuracy and the reliability of the detection results.
In one embodiment, as shown in Figure 8, an apparatus for detecting mathematical formulas in an image is provided, the apparatus including:
Acquisition module 802, for obtaining an image to be detected;
Segmentation module 804, for performing text segmentation on the image to be detected to obtain multiple text regions;
First convolutional neural network module 806, for inputting the multiple text regions in turn into the input variable of the trained first convolutional neural network, and outputting multiple coarsely selected text regions containing mathematical formulas;
Second convolutional neural network module 808, for inputting the coarsely selected text regions containing mathematical formulas into the input variable of the trained second convolutional neural network, and outputting the text regions that finally contain mathematical formulas.
In one embodiment, the above apparatus for detecting mathematical formulas in an image further includes a training module 900 for training the first convolutional neural network. Specifically, as shown in Figure 9, the training module 900 includes:
Training sample acquisition module 902, for obtaining mathematical formula training samples;
Sample decomposition module 904, for dividing each mathematical formula training sample into multiple text regions;
Sample distribution module 906, for dividing the multiple text regions into a predetermined number of training sample sets;
First training module 908, for selecting multiple text regions from each training sample set in turn and inputting the selected text regions in turn into the input variable of the first convolutional neural network for training, until all training sample sets have been selected and trained on, to obtain the trained first convolutional neural network.
In one embodiment, the above training module 900 further includes: a computing module (not shown), for calculating the weight of each selected text region; the first training module 908 is further used for adjusting the weights in the first convolutional neural network according to the weights of the text regions each time text regions are input into the input variable of the first convolutional neural network, and for training the first convolutional neural network according to the weights after each adjustment.
In one embodiment, the above first training module 908 is further used for obtaining, each time the selected text regions are input into the input variable of the first convolutional neural network, the coarsely selected text regions containing mathematical formulas output for the samples; and for inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, to obtain the trained second convolutional neural network.
In one embodiment, the above sample decomposition module 904 is further used for adding a numeric label to each text region obtained by segmentation; the training module 900 further includes:
Second training module (not shown), for, each time the coarsely selected text regions containing mathematical formulas in multiple samples are obtained, inputting them in turn into the input variable of the second convolutional neural network for training and outputting the text regions in the samples that contain mathematical formulas; for processing a text region when the numeric label of that output text region indicates that it does not contain a mathematical formula; for adding the processed text region to the next training sample set that is to be input into the input variable of the first convolutional neural network; and for selecting multiple text regions from the training sample set to which the processed text regions have been added and inputting the selected text regions into the input variable of the first convolutional neural network for further training, until the trained second convolutional neural network is obtained.
In one embodiment, the second training module is additionally operable to the text filed carry out that mathematical formulae is included in sample is inverse Hour hands rotate predetermined angle;To the height of the text filed addition preset value comprising mathematical formulae in the sample after rotation predetermined angle This noise.
In one embodiment, the second training module is further configured to: obtain the weight of each text region, calculated for each selected text region; when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjust the weights in the second convolutional neural network according to the weights of the text regions; and train the second convolutional neural network according to the weights after each adjustment.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps: obtaining an image to be detected; performing text segmentation on the image to be detected to obtain multiple text regions; inputting the multiple text regions in sequence into the input variable of the trained first convolutional neural network, and outputting multiple coarsely selected text regions containing mathematical formulas; and inputting the coarsely selected text regions containing mathematical formulas into the input variable of the trained second convolutional neural network, and outputting the final text regions containing mathematical formulas.
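The two-stage flow in this embodiment can be sketched roughly as follows. This is only an illustration under stated assumptions: the two networks are represented by arbitrary scoring callables, and the names `detect_formula_regions`, `coarse_score`, `fine_score` and the threshold parameters are hypothetical and do not appear in the patent.

```python
from typing import Callable, List, Sequence

# Hypothetical representation: a text region is a 2-D grid of pixel values.
Region = Sequence[Sequence[float]]


def detect_formula_regions(
    regions: List[Region],
    coarse_score: Callable[[Region], float],
    fine_score: Callable[[Region], float],
    coarse_threshold: float = 0.5,
    fine_threshold: float = 0.5,
) -> List[int]:
    """Return the indices of regions accepted by both stages.

    coarse_score stands in for the trained first network and fine_score
    for the trained second network; the thresholds are assumptions.
    """
    # Stage 1: coarse selection over every segmented text region.
    candidates = [
        i for i, region in enumerate(regions)
        if coarse_score(region) >= coarse_threshold
    ]
    # Stage 2: only the coarse candidates are passed to the second scorer.
    return [i for i in candidates if fine_score(regions[i]) >= fine_threshold]
```

In this sketch the second scorer only ever sees regions that survived the first stage, which mirrors the coarse-then-final ordering of the steps above.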
In one embodiment, when the computer program is executed by the processor, the first convolutional neural network is trained by the following steps: obtaining mathematical formula training samples; dividing each mathematical formula training sample into multiple text regions; dividing the multiple text regions into a predetermined number of training sample sets; and selecting multiple text regions from each training sample set in turn and inputting the selected text regions in sequence into the input variable of the first convolutional neural network for training, until all training sample sets have been selected and trained on, thereby obtaining the trained first convolutional neural network.
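A minimal sketch of the partition-and-iterate schedule described above, assuming a round-robin split and fixed-size selections. The patent does not specify how regions are assigned to sets or how many are selected at a time, so `split_into_sets`, `train_over_sets`, `batch_size` and the `train_step` callback are all hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_sets(regions: Sequence[T], num_sets: int) -> List[List[T]]:
    """Divide regions into num_sets training sample sets (round-robin, an assumption)."""
    if num_sets < 1:
        raise ValueError("num_sets must be at least 1")
    return [list(regions[i::num_sets]) for i in range(num_sets)]


def train_over_sets(
    sample_sets: List[List[T]],
    train_step: Callable[[List[T]], None],
    batch_size: int,
) -> int:
    """Visit each set in turn, passing successive selections to train_step.

    Returns the number of training steps performed.
    """
    steps = 0
    for sample_set in sample_sets:
        for start in range(0, len(sample_set), batch_size):
            train_step(sample_set[start:start + batch_size])
            steps += 1
    return steps
```

Here `train_step` is a placeholder for whatever update the first network performs on one selection of text regions.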
In one embodiment, when the computer program is executed by the processor, the step of inputting the selected text regions in sequence into the input variable of the first convolutional neural network for training includes: calculating, for each selected text region, the weight of that text region; each time a text region is input into the input variable of the first convolutional neural network, adjusting the weights in the first convolutional neural network according to the weight of the text region; and training the first convolutional neural network according to the weights after each adjustment.
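The passage does not say how a region's weight is computed or how it enters the update. One possible reading, sketched here purely as an assumption, is that the region weight scales that region's gradient contribution. A single-layer logistic model stands in for the convolutional network, and `weighted_update`, `region_weight` and `lr` are hypothetical names.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def weighted_update(
    params: List[float],
    features: Sequence[float],
    label: int,
    region_weight: float,
    lr: float = 0.1,
) -> List[float]:
    """One gradient step on the log loss, scaled by the region's weight.

    params stands in for the network weights; region_weight is assumed
    to have been computed beforehand for this text region.
    """
    z = sum(p * x for p, x in zip(params, features))
    error = sigmoid(z) - label  # derivative of the log loss with respect to z
    return [p - lr * region_weight * error * x for p, x in zip(params, features)]
```

Under this reading, a region with weight 0 leaves the parameters unchanged and a region with a larger weight moves them further in the same direction.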
In one embodiment, when the computer program is executed by the processor, the second convolutional neural network is trained by the following steps: obtaining the coarsely selected text regions containing mathematical formulas in the samples that are output each time the selected text regions are input into the input variable of the first convolutional neural network; and inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, thereby obtaining the trained second convolutional neural network.
In one embodiment, after the computer program is executed by the processor to perform the step of dividing each mathematical formula training sample into multiple text regions, the steps further include: adding a numeric label to each text region obtained by segmentation;
When the computer program is executed by the processor, the step of inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, thereby obtaining the trained second convolutional neural network, includes: each time the coarsely selected text regions containing mathematical formulas in multiple samples are obtained, inputting those text regions in sequence into the input variable of the second convolutional neural network for training, and outputting the text regions containing mathematical formulas in the samples; when the numeric label of an output text region shows that the text region does not contain a mathematical formula, processing that text region; adding the processed text region to the next training sample set to be input into the input variable of the first convolutional neural network; and selecting multiple text regions from the training sample set to which the processed text region has been added, and inputting the selected text regions into the input variable of the first convolutional neural network for retraining, until the trained second convolutional neural network is obtained.
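The feedback step above, in which regions output as formula regions but labelled as containing no formula are processed and returned to the next training sample set, can be sketched as follows. The label convention (1 for a formula, 0 for none) and the names `feed_back_false_positives` and `process` are assumptions made for illustration.

```python
from typing import Callable, List, Tuple, TypeVar

R = TypeVar("R")


def feed_back_false_positives(
    predicted_positive: List[Tuple[R, int]],
    next_training_set: List[Tuple[R, int]],
    process: Callable[[R], R],
) -> int:
    """Append processed false positives to the next training sample set.

    predicted_positive holds (region, numeric_label) pairs that the second
    network output as containing a formula. A label of 0 is assumed to mean
    that the region does not actually contain one.
    Returns the number of regions added.
    """
    added = 0
    for region, label in predicted_positive:
        if label == 0:
            next_training_set.append((process(region), label))
            added += 1
    return added
```

The `process` callable is where the rotation and noise step described in the patent would plug in.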
In one embodiment, when the computer program is executed by the processor, the step of processing the text region includes: rotating the text region containing a mathematical formula in the sample counterclockwise by a preset angle; and adding Gaussian noise of a preset value to the text region containing a mathematical formula in the sample after rotation by the preset angle.
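A minimal sketch of this processing step using only the Python standard library. It assumes a region is a 2-D list of pixel intensities, uses nearest-neighbour sampling, and interprets the "preset value" of the Gaussian noise as its standard deviation. All three are assumptions, as are the names `rotate_ccw`, `add_gaussian_noise` and `process_region` and the default parameter values.

```python
import math
import random
from typing import List

Image = List[List[float]]


def rotate_ccw(image: Image, angle_deg: float, fill: float = 0.0) -> Image:
    """Rotate counterclockwise about the centre, keeping the original size."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_t = math.cos(math.radians(angle_deg))
    sin_t = math.sin(math.radians(angle_deg))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Inverse mapping: find the source pixel that lands on (x, y).
            # Image rows grow downward, hence the sign pattern.
            sx = int(round(cx + dx * cos_t - dy * sin_t))
            sy = int(round(cy + dx * sin_t + dy * cos_t))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def add_gaussian_noise(image: Image, sigma: float, seed: int = 0) -> Image:
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    rng = random.Random(seed)
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]


def process_region(image: Image, angle_deg: float = 10.0,
                   sigma: float = 0.05, seed: int = 0) -> Image:
    """Rotate first, then add noise, following the order given in the text."""
    return add_gaussian_noise(rotate_ccw(image, angle_deg), sigma, seed)
```

Pixels that rotate in from outside the original region take the `fill` value; a real implementation might instead pad or crop, which the patent leaves open.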
In one embodiment, when the computer program is executed by the processor, the step of inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training includes: obtaining the weight of each text region, calculated for each selected text region; when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjusting the weights in the second convolutional neural network according to the weights of the text regions; and training the second convolutional neural network according to the weights after each adjustment.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features has been described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.

Claims (15)

1. A method for detecting mathematical formulas in an image, the method comprising:
Obtaining an image to be detected;
Performing text segmentation on the image to be detected to obtain multiple text regions;
Inputting the multiple text regions in sequence into the input variable of a trained first convolutional neural network, and outputting multiple coarsely selected text regions containing mathematical formulas;
Inputting the coarsely selected text regions containing mathematical formulas into the input variable of a trained second convolutional neural network, and outputting the final text regions containing mathematical formulas.
2. The method according to claim 1, characterized in that the first convolutional neural network is trained as follows:
Obtaining mathematical formula training samples;
Dividing each mathematical formula training sample into multiple text regions;
Dividing the multiple text regions into a predetermined number of training sample sets;
Selecting multiple text regions from each training sample set in turn, and inputting the selected text regions in sequence into the input variable of the first convolutional neural network for training, until all training sample sets have been selected and trained on, thereby obtaining the trained first convolutional neural network.
3. The method according to claim 2, characterized in that inputting the selected text regions in sequence into the input variable of the first convolutional neural network for training comprises:
Calculating, for each selected text region, the weight of that text region;
Each time a text region is input into the input variable of the first convolutional neural network, adjusting the weights in the first convolutional neural network according to the weight of the text region;
Training the first convolutional neural network according to the weights after each adjustment.
4. The method according to claim 2, characterized in that the second convolutional neural network is trained as follows:
Obtaining the coarsely selected text regions containing mathematical formulas in the samples that are output each time the selected text regions are input into the input variable of the first convolutional neural network;
Inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, thereby obtaining the trained second convolutional neural network.
5. The method according to claim 4, characterized in that, after dividing each mathematical formula training sample into multiple text regions, the method further comprises: adding a numeric label to each text region obtained by segmentation;
Inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, thereby obtaining the trained second convolutional neural network, comprises:
Each time the coarsely selected text regions containing mathematical formulas in the multiple samples are obtained, inputting those text regions in sequence into the input variable of the second convolutional neural network for training, and outputting the final text regions containing mathematical formulas in the samples;
When the numeric label of a final text region containing a mathematical formula in the output samples shows that the text region does not contain a mathematical formula, processing the text region;
Adding the processed text region to the next training sample set to be input into the input variable of the first convolutional neural network;
Selecting multiple text regions from the training sample set to which the processed text region has been added, and inputting the selected text regions into the input variable of the first convolutional neural network for retraining, until the trained second convolutional neural network is obtained.
6. The method according to claim 5, characterized in that processing the text region comprises:
Rotating the final text region containing a mathematical formula in the output sample counterclockwise by a preset angle;
Adding Gaussian noise of a preset value to the text region containing a mathematical formula in the sample after rotation by the preset angle.
7. The method according to claim 4, characterized in that inputting the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training comprises:
Obtaining the weight of each text region, calculated for each selected text region;
When the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjusting the weights in the second convolutional neural network according to the weights of the text regions;
Training the second convolutional neural network according to the weights after each adjustment.
8. A device for detecting mathematical formulas in an image, characterized in that the device comprises:
An acquisition module, configured to obtain an image to be detected;
A segmentation module, configured to perform text segmentation on the image to be detected to obtain multiple text regions;
A first convolutional neural network module, configured to input the multiple text regions in sequence into the input variable of a trained first convolutional neural network and output multiple coarsely selected text regions containing mathematical formulas;
A second convolutional neural network module, configured to input the coarsely selected text regions containing mathematical formulas into the input variable of a trained second convolutional neural network and output the final text regions containing mathematical formulas.
9. The device according to claim 8, characterized in that the device further comprises a training module configured to train the first convolutional neural network, the training module comprising:
A training sample acquisition module, configured to obtain mathematical formula training samples;
A sample decomposition module, configured to divide each mathematical formula training sample into multiple text regions;
A sample distribution module, configured to divide the multiple text regions into a predetermined number of training sample sets;
A first training module, configured to select multiple text regions from each training sample set in turn and input the selected text regions in sequence into the input variable of the first convolutional neural network for training, until all training sample sets have been selected and trained on, thereby obtaining the trained first convolutional neural network.
10. The device according to claim 9, characterized in that the training module further comprises:
A computing module, configured to calculate, for each selected text region, the weight of that text region;
The first training module is further configured to, each time a text region is input into the input variable of the first convolutional neural network, adjust the weights in the first convolutional neural network according to the weight of the text region, and to train the first convolutional neural network according to the weights after each adjustment.
11. The device according to claim 9, characterized in that the first training module 908 is further configured to obtain the coarsely selected text regions containing mathematical formulas in the samples that are output each time the selected text regions are input into the input variable of the first convolutional neural network, and to input the coarsely selected text regions containing mathematical formulas in each sample into the input variable of the second convolutional neural network for training, thereby obtaining the trained second convolutional neural network.
12. The device according to claim 11, characterized in that the sample decomposition module 904 is further configured to add a numeric label to each text region obtained by segmentation;
The training module 900 further comprises:
A second training module, configured to: each time the coarsely selected text regions containing mathematical formulas in multiple samples are obtained, input those text regions in sequence into the input variable of the second convolutional neural network for training, and output the text regions containing mathematical formulas in the samples; when it is detected that the numeric label of an output text region shows that the text region does not contain a mathematical formula, process that text region; add the processed text region to the next training sample set to be input into the input variable of the first convolutional neural network; and select multiple text regions from the training sample set to which the processed text region has been added, and input the selected text regions into the input variable of the first convolutional neural network for retraining, until the trained second convolutional neural network is obtained.
13. The device according to claim 11, characterized in that the second training module is further configured to: obtain the weight of each text region, calculated for each selected text region; when the coarsely selected text regions containing mathematical formulas in each sample are input into the input variable of the second convolutional neural network, adjust the weights in the second convolutional neural network according to the weights of the text regions; and train the second convolutional neural network according to the weights after each adjustment.
14. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method for detecting mathematical formulas in an image according to any one of claims 1-7.
15. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method for detecting mathematical formulas in an image according to any one of claims 1-7.
CN201711190154.1A 2017-11-24 2017-11-24 Method and device for detecting mathematical formulas in images, computer equipment and storage medium Active CN107886082B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711190154.1A CN107886082B (en) 2017-11-24 2017-11-24 Method and device for detecting mathematical formulas in images, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711190154.1A CN107886082B (en) 2017-11-24 2017-11-24 Method and device for detecting mathematical formulas in images, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN107886082A true CN107886082A (en) 2018-04-06
CN107886082B CN107886082B (en) 2023-07-04

Family

ID=61774866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711190154.1A Active CN107886082B (en) 2017-11-24 2017-11-24 Method and device for detecting mathematical formulas in images, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN107886082B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101329731A (en) * 2008-06-06 2008-12-24 南开大学 Automatic recognition method of mathematical formula in image
CN104298976A (en) * 2014-10-16 2015-01-21 电子科技大学 License plate detection method based on convolutional neural network
CN106355573A (en) * 2016-08-24 2017-01-25 北京小米移动软件有限公司 Target object positioning method and device in pictures
CN106845406A (en) * 2017-01-20 2017-06-13 深圳英飞拓科技股份有限公司 Head and shoulder detection method and device based on multitask cascaded convolutional neural network
US20170177965A1 (en) * 2015-12-17 2017-06-22 Xerox Corporation Coarse-to-fine cascade adaptations for license plate recognition with convolutional neural networks

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145956A (en) * 2018-07-26 2019-01-04 上海慧子视听科技有限公司 Methods of marking, device, computer equipment and storage medium
CN109145956B (en) * 2018-07-26 2021-12-14 上海慧子视听科技有限公司 Scoring method, scoring device, computer equipment and storage medium
CN111695377A (en) * 2019-03-13 2020-09-22 杭州海康威视数字技术股份有限公司 Text detection method and device and computer equipment
CN111695377B (en) * 2019-03-13 2023-09-29 杭州海康威视数字技术股份有限公司 Text detection method and device and computer equipment
CN110796137A (en) * 2019-10-10 2020-02-14 中国建设银行股份有限公司 Method and device for identifying image
CN110942067A (en) * 2019-11-29 2020-03-31 上海眼控科技股份有限公司 Text recognition method and device, computer equipment and storage medium
CN111652145A (en) * 2020-06-03 2020-09-11 广东小天才科技有限公司 Formula detection method and device, electronic equipment and storage medium
CN111652145B (en) * 2020-06-03 2023-09-26 广东小天才科技有限公司 Formula detection method and device, electronic equipment and storage medium
CN111814798A (en) * 2020-07-14 2020-10-23 深圳中兴网信科技有限公司 Method for digitizing titles and readable storage medium

Also Published As

Publication number Publication date
CN107886082B (en) 2023-07-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant