CN105184292B - Method for structural analysis and recognition of handwritten mathematical formulas in natural scene images - Google Patents


Info

Publication number
CN105184292B
Authority
CN
China
Prior art keywords
connected domain
character
value
line
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510531070.4A
Other languages
Chinese (zh)
Other versions
CN105184292A (en
Inventor
陈李江
刘宁
刘辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan Cloud River Technology Co Ltd
Original Assignee
Hainan Cloud River Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan Cloud River Technology Co Ltd filed Critical Hainan Cloud River Technology Co Ltd
Priority to CN201510531070.4A priority Critical patent/CN105184292B/en
Publication of CN105184292A publication Critical patent/CN105184292A/en
Application granted granted Critical
Publication of CN105184292B publication Critical patent/CN105184292B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Character Input (AREA)

Abstract

A method for structural analysis and recognition of handwritten mathematical formulas in natural scene images, comprising: S1, converting the gray matrix of the natural scene image into a local contrast matrix, and binarizing the local contrast matrix with the Otsu method to obtain a binary matrix; S2, performing connected domain analysis on the binary matrix from step S1 and rejecting non-character connected domains to obtain character connected domains; S3, detecting special formula structure elements among the character connected domains from S2 using the correlation coefficient method, and marking each detected special structure element individually; S4, segmenting the binary matrix from S1 into rows using the horizontal projection method; S5, recognizing each character connected domain with a convolutional neural network; S6, defining an output order and outputting the recognition results in that order in LaTeX typesetting format. The method effectively solves the problem of representing elementary mathematical formulas in OCR.

Description

Method for structural analysis and recognition of handwritten mathematical formulas in natural scene images
Technical field
The present invention relates to image processing and pattern recognition, and in particular to a method for structural analysis and recognition of handwritten mathematical formulas in natural scene images.
Background art
OCR (Optical Character Recognition) technology is widely used, and OCR for Chinese and English text is relatively mature. Mathematical formulas, however, have a complicated structure that current OCR technology does not support well. The present invention focuses on solving this problem, for which there is strong application demand.
Summary of the invention
The method for structural analysis and recognition of handwritten mathematical formulas in natural scene images provided by the present invention can effectively solve the problem of representing elementary mathematical formulas in OCR.
The method for structural analysis and recognition of handwritten mathematical formulas in natural scene images of the present invention comprises:
Step S1: convert the gray matrix of the natural scene image into a local contrast matrix, and binarize the local contrast matrix with the Otsu (Otsu thresholding) method to obtain a binary matrix;
Step S2: perform connected domain analysis on the binary matrix from step S1 and reject non-character connected domains to obtain character connected domains;
Step S3: detect special formula structure elements among the character connected domains from step S2 using the correlation coefficient method, and mark each detected special structure element individually;
Step S4: segment the binary matrix from step S1 into rows using the horizontal projection method;
Step S5: recognize each character connected domain with a convolutional neural network;
Step S6: define an output order and output the recognition results in that order in LaTeX (a TeX-based typesetting system) format.
Preferably, the local contrast Con(i, j) of the point with coordinates (i, j) in the local contrast matrix is calculated as:
Con(i, j) = α·C(i, j) + (1 − α)·(I_max(i, j) − I_min(i, j))
where
I_max(i, j) and I_min(i, j) are respectively the maximum and minimum gray values in the neighborhood centered on the point with coordinates (i, j) in the image gray matrix; the neighborhood radius is set to 5 here;
Std denotes the standard deviation of the gray matrix, and γ = 1.
Preferably, the local contrast matrix is binarized with the Otsu method as follows: take the maximum and minimum values of the local contrast matrix, divide the range between them into n small intervals, and assign each element to its corresponding interval to form a histogram; Otsu thresholding is then applied to this histogram. Points below the selected threshold are background points, and points above it are character points.
Preferably, the connected domain analysis of the binary matrix from step S1, in which non-character connected domains are rejected to obtain character connected domains, proceeds as follows:
Step S201: obtain the minimum bounding rectangle of each connected domain, record the coordinates of its four vertices, and compute its length and height;
Step S202: compute the average length and average height over all connected domains;
Step S203: reject non-character connected domains:
if both the length and the height of a connected domain are less than 1/4 of the average length and height, it is regarded as a noise point and rejected;
if both the length and the height of a connected domain are greater than 4 times the average length and height, it is regarded as a non-character part of the image and rejected;
Step S204: keep the remaining connected domains as character connected domains.
Preferably, the special formula structure elements in step S3 include braces, radical signs, and fraction lines;
Fraction line connected domains are detected by rule matching: a connected domain whose length-to-width ratio is greater than 5 and which has adjacent connected domains both above and below it is identified as a fraction line connected domain;
Brace connected domains and radical sign connected domains are detected by template matching:
Step S301: select standard binary templates for the brace connected domain and the radical sign connected domain;
Step S302: normalize the size of the current connected domain so that it has the same size as the standard template;
Step S303: match each standard binary template against the current connected domain.
The matching formula is the correlation coefficient, expressed as:
r = Σ_i (x_i − x̄)(y_i − ȳ) / sqrt( Σ_i (x_i − x̄)² · Σ_i (y_i − ȳ)² )
where x_i and y_i denote the value of the i-th element of the current template and of the standard template respectively, and x̄ and ȳ denote the mean of the current template and of the standard template respectively; r ∈ (0, 1), and the match succeeds when r is greater than 0.5.
Preferably, the binary matrix is segmented into rows with the horizontal projection method in step S4 as follows:
A waveform is obtained by horizontally projecting the binary matrix from step S1; the abscissa of the waveform is the row number in the original image, and the ordinate is the number of character points contained in that row;
Starting from each peak of the waveform, extend to its left and right, and stop extending when the value falls below 0.1 times the peak value; if the extensions of two adjacent peaks overlap, the two corresponding rows are merged into one row;
Record the start and end positions of each row: the abscissa at the left end of the peak is the starting row coordinate of the current row, and the abscissa at the right end of the peak is the ending row coordinate of the current row.
Preferably, after the start and end positions of each row have been obtained, each character connected domain is assigned to a row as follows:
Compute the distance between the abscissa of the center of each character connected domain and the abscissa of the center of each text row, and assign the character connected domain to the row with the smallest distance.
Preferably, the convolutional neural network in step S5 has the LeNet-5 structure and consists of one input layer, two convolution and down-sampling layers, one fully connected hidden layer, and one output layer;
The training data of the convolutional neural network are samples of normalized character connected domains;
The character connected domains from step S2 are normalized and fed into the convolutional neural network to obtain the character corresponding to each character connected domain.
Preferably, the output order defined in step S6 has three levels:
The first level is the row order: according to the correspondence between character connected domains and rows, the character connected domains are output row by row;
The second level is the column order: within each row, all character connected domains are sorted in ascending order of their left-end column coordinate;
The third level is the order within special formula structures: the elements of an equation group are output equation by equation; the elements of a fraction are output numerator first, then denominator.
Preferably, for the order within special formula structures, the character blocks contained in each special formula structure element must be determined;
A brace represents the special structure of an equation group, so the column coordinate at which the equation group ends must be determined in order to determine all character blocks it contains. According to its position within the current row, each character block is classified into one of three parts: "upper, middle, lower". Every character block located in the upper or lower part is regarded as an element of the equation group; all such character blocks are found, and the end column of the rightmost one is taken as the end column of the whole equation group. All character blocks located between the brace and the end column of the equation group are assigned to the current equation group structure. The inside of the equation group structure is then segmented into rows again to determine how many equations it contains, and the character blocks inside the equation group are output equation by equation;
For a fraction line, all numerator and denominator elements of the current fraction must be determined. Every character block whose starting ordinate is greater than the starting ordinate of the fraction line and whose ending ordinate is less than the ending ordinate of the fraction line is assigned to the current fraction structure. Each character block in the fraction structure must then be classified as numerator or denominator according to its abscissa: if the abscissa of the bottom of the character block is less than the abscissa of the center of the fraction line, the block belongs to the numerator; if the abscissa of the top of the character block is greater than the abscissa of the center of the fraction line, the block belongs to the denominator;
For a radical sign, the character blocks located inside the radical sign must be determined. Every character block whose starting ordinate is greater than the starting ordinate of the radical sign and whose ending ordinate is less than the ending ordinate of the radical sign is assigned to the current radical sign structure;
According to the row order, the column order, and the order within special formula structures described above, the final formula structure is determined and output in LaTeX (a TeX-based typesetting system) format.
The present invention effectively solves the problem of representing elementary mathematical formulas in OCR and achieves accurate recognition of formulas.
Description of the drawings
Fig. 1 is a flowchart of the method for structural analysis and recognition of handwritten mathematical formulas in natural scenes provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the convolutional neural network used for character recognition in an embodiment of the present invention.
Detailed description of the embodiments
The method for structural analysis and recognition of handwritten mathematical formulas in natural scenes provided by an embodiment of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the method for structural analysis and recognition of handwritten mathematical formulas in natural scenes provided by an embodiment of the present invention comprises the following steps:
Step S1: convert the gray matrix of the natural scene image into a local contrast matrix, and binarize the local contrast matrix with the Otsu method to obtain a binary matrix;
In this embodiment, the local contrast Con(i, j) of the point with coordinates (i, j) in the local contrast matrix is calculated as:
Con(i, j) = α·C(i, j) + (1 − α)·(I_max(i, j) − I_min(i, j))
where
I_max(i, j) and I_min(i, j) are respectively the maximum and minimum gray values in the neighborhood centered on the point with coordinates (i, j) in the image gray matrix; the neighborhood radius is set to 5 here;
Std denotes the standard deviation of the gray matrix, and γ = 1.
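The local contrast computation can be sketched in plain Python as follows. The expressions for C(i, j) and α are not reproduced in this text, so the sketch assumes C(i, j) = (I_max − I_min) / (I_max + I_min + ε) and α = (Std / 128)^γ. Both forms are assumptions, chosen only because they fit the symbols Std, γ and ε named in the description and claims. The function name `local_contrast` and the square, border-clipped neighborhood are likewise illustrative.

```python
from statistics import pstdev

def local_contrast(gray, radius=5, gamma=1.0, eps=1e-8):
    """Return the local contrast matrix Con for a 2-D list of gray values."""
    rows, cols = len(gray), len(gray[0])
    # Std: standard deviation of the whole gray matrix.
    std = pstdev([v for row in gray for v in row])
    # ASSUMPTION: alpha = (Std / 128) ** gamma (not given in the text).
    alpha = (std / 128.0) ** gamma
    con = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Square neighbourhood of the given radius, clipped at the borders.
            window = [gray[r][c]
                      for r in range(max(0, i - radius), min(rows, i + radius + 1))
                      for c in range(max(0, j - radius), min(cols, j + radius + 1))]
            i_max, i_min = max(window), min(window)
            # ASSUMPTION: C(i, j) is a normalised contrast; eps avoids division by 0.
            c_ij = (i_max - i_min) / (i_max + i_min + eps)
            con[i][j] = alpha * c_ij + (1 - alpha) * (i_max - i_min)
    return con
```

A uniform image yields a contrast of 0 everywhere, since I_max equals I_min in every neighborhood.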
In this embodiment, the local contrast matrix is binarized with the Otsu method as follows: take the maximum and minimum values of the local contrast matrix, divide the range between them into 1000 small intervals, and assign each element to its corresponding interval to form a statistical histogram of length 1000; the Otsu method is then applied to this histogram for binarization. Points below the selected threshold are background points, and points above it are character points.
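A minimal sketch of this histogram-based Otsu binarization is given below. The 1000-bin histogram and the background/character rule follow the text. The function names `otsu_threshold` and `binarize`, and the choice to place the threshold at the upper edge of the selected bin, are assumptions made for illustration.

```python
def otsu_threshold(values, n_bins=1000):
    """Return an Otsu threshold for a flat list of local contrast values."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # degenerate case: no contrast variation at all
    width = (hi - lo) / n_bins
    hist = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        hist[min(int((v - lo) / width), n_bins - 1)] += 1
    total = len(values)
    sum_all = sum(k * h for k, h in enumerate(hist))
    w0 = 0       # number of elements in the lower (background) class
    sum0 = 0.0   # sum of bin indices in the lower class
    best_k, best_var = 0, -1.0
    for k in range(n_bins - 1):
        w0 += hist[k]
        sum0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalised)
        if var_between > best_var:
            best_var, best_k = var_between, k
    # ASSUMPTION: the threshold is the upper edge of the selected bin.
    return lo + (best_k + 1) * width

def binarize(con, n_bins=1000):
    """Return a binary matrix: 1 for character points, 0 for background points."""
    t = otsu_threshold([v for row in con for v in row], n_bins)
    return [[1 if v > t else 0 for v in row] for row in con]
```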
Step S2: perform connected domain analysis on the binary matrix from step S1 and reject non-character connected domains to obtain character connected domains. The specific method is:
Step S201: obtain the minimum bounding rectangle of each connected domain, record the coordinates of its four vertices, and compute its length and height;
Step S202: compute the average length and average height over all connected domains;
Step S203: reject non-character connected domains:
if both the length and the height of a connected domain are less than 1/4 of the average length and height, it is regarded as a noise point and rejected;
if both the length and the height of a connected domain are greater than 4 times the average length and height, it is regarded as a non-character part of the image and rejected;
Step S204: keep the remaining connected domains as character connected domains; the character block of each character connected domain is obtained from its minimum bounding rectangle.
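Steps S201 to S204 can be sketched as follows. The text does not say which connectivity is used, so 8-connectivity is an assumption here. The box layout `(top, left, bottom, right)` and the function names are illustrative as well.

```python
from collections import deque

def bounding_boxes(binary):
    """Return (top, left, bottom, right) boxes of 8-connected character pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if binary[r0][c0] != 1 or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            top, left, bottom, right = r0, c0, r0, c0
            while queue:  # breadth-first flood fill of one connected domain
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes

def keep_character_boxes(boxes):
    """Drop boxes that are noise or non-character parts (step S203)."""
    lengths = [b[3] - b[1] + 1 for b in boxes]
    heights = [b[2] - b[0] + 1 for b in boxes]
    mean_len = sum(lengths) / len(boxes)
    mean_h = sum(heights) / len(boxes)
    kept = []
    for box, length, height in zip(boxes, lengths, heights):
        is_noise = length < mean_len / 4 and height < mean_h / 4
        is_non_char = length > 4 * mean_len and height > 4 * mean_h
        if not (is_noise or is_non_char):
            kept.append(box)
    return kept
```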
Step S3: detect special formula structure elements among the character connected domains from step S2 using the correlation coefficient method, and mark each detected special structure element individually;
The special formula structure elements of this embodiment include braces, radical signs, and fraction lines;
Fraction line connected domains are detected by rule matching: a connected domain whose length-to-width ratio is greater than 5 and which has adjacent connected domains both above and below it is identified as a fraction line connected domain;
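The fraction line rule can be sketched as below. The text does not define "adjacent", so the sketch assumes that another block counts as adjacent when its column span overlaps that of the candidate and it lies entirely above or entirely below it. The function name and the box layout `(top, left, bottom, right)` are also illustrative.

```python
def is_fraction_line(box, others, min_ratio=5):
    """Rule-based test: elongated block with blocks both above and below it."""
    top, left, bottom, right = box
    length = right - left + 1
    height = bottom - top + 1
    if length / height <= min_ratio:
        return False

    def overlaps_columns(o):
        # ASSUMPTION: "adjacent" means sharing at least one column with the line.
        return o[1] <= right and o[3] >= left

    has_above = any(overlaps_columns(o) and o[2] < top for o in others)
    has_below = any(overlaps_columns(o) and o[0] > bottom for o in others)
    return has_above and has_below
```

Under this interpretation, a minus sign with symbols only to its left and right is not reported as a fraction line, because nothing overlaps it from above or below.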
Brace connected domains and radical sign connected domains are detected by template matching. The standard template is a 32*32 matrix, and the character block of the character connected domain to be tested is likewise normalized to a 32*32 matrix. The correlation coefficient of the two matrices is computed; a value greater than 0.5 indicates a successful match. The specific steps are:
Step S301: select standard binary templates for the brace connected domain and the radical sign connected domain;
Step S302: normalize the size of the current connected domain so that it has the same size as the standard template;
Step S303: match each standard binary template against the current connected domain.
The matching formula is the correlation coefficient, expressed as:
r = Σ_i (x_i − x̄)(y_i − ȳ) / sqrt( Σ_i (x_i − x̄)² · Σ_i (y_i − ȳ)² )
where x_i and y_i denote the value of the i-th element of the current template and of the standard template respectively, and x̄ and ȳ denote the mean of the current template and of the standard template respectively; r ∈ (0, 1), and the match succeeds when r is greater than 0.5.
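The template matching of steps S301 to S303 can be sketched as follows, using the standard correlation coefficient over the flattened matrices. The text does not specify how a block is normalized to 32*32, so nearest-neighbor resampling is an assumption. The function names, and returning 0 for a constant block, are illustrative choices.

```python
from math import sqrt

def resize_nearest(block, size=32):
    """Resample a binary block (list of rows) to size x size (nearest neighbour)."""
    rows, cols = len(block), len(block[0])
    return [[block[r * rows // size][c * cols // size] for c in range(size)]
            for r in range(size)]

def correlation(a, b):
    """Correlation coefficient between two equally sized matrices."""
    x = [v for row in a for v in row]
    y = [v for row in b for v in row]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sqrt(sum((xi - mx) ** 2 for xi in x) * sum((yi - my) ** 2 for yi in y))
    # ASSUMPTION: a constant block (zero variance) is treated as no match.
    return num / den if den else 0.0

def matches_template(block, template, threshold=0.5):
    """True when the normalised block correlates with the template above 0.5."""
    return correlation(resize_nearest(block, len(template)), template) > threshold
```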
Each detected special structure element is marked individually for the subsequent structural analysis.
Step S4: segment the binary matrix from step S1 into rows using the horizontal projection method;
A waveform is obtained by horizontally projecting the binary matrix from step S1; the abscissa of the waveform is the row number in the original image, and the ordinate is the number of character points contained in that row. Row information is obtained from the peaks;
It is stipulated that adjacent peaks must be at least 10 apart: of two peaks less than 10 apart, only the higher one is retained. It is further stipulated that the height of a peak must exceed one twentieth of the image length;
For each peak that satisfies these conditions, extend outward from both sides of the peak simultaneously, and stop extending when the value falls below 0.01 times the peak height;
Starting from each peak of the waveform, extend to its left and right, and stop extending when the value falls below 0.1 times the peak value; if the extensions of two adjacent peaks overlap, the two corresponding rows are merged into one row;
Record the start and end positions of each row: the abscissa at the left end of the peak is the starting row coordinate of the current row, and the abscissa at the right end of the peak is the ending row coordinate of the current row.
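The row segmentation can be sketched as follows. The stopping ratio is exposed as a parameter and defaults to 0.1; the text also mentions 0.01, and the sketch does not take a position on which value is intended. Treating "image length" as the image width, detecting peaks as local maxima, and the function name `segment_rows` are assumptions.

```python
def segment_rows(binary, ratio=0.1, min_gap=10, min_height_frac=1 / 20):
    """Return (start_row, end_row) pairs of text rows from a binary matrix."""
    proj = [sum(row) for row in binary]            # character points per image row
    n = len(proj)
    min_height = len(binary[0]) * min_height_frac  # ASSUMPTION: "image length" = width
    # Candidate peaks: local maxima of the projection that are high enough.
    peaks = [i for i in range(n)
             if proj[i] > min_height
             and (i == 0 or proj[i] >= proj[i - 1])
             and (i == n - 1 or proj[i] >= proj[i + 1])]
    # Of two peaks closer than min_gap rows, keep only the higher one.
    kept = []
    for p in peaks:
        if kept and p - kept[-1] < min_gap:
            if proj[p] > proj[kept[-1]]:
                kept[-1] = p
        else:
            kept.append(p)
    rows = []
    for p in kept:
        start = end = p
        # Extend in both directions until the value drops below ratio * peak.
        while start > 0 and proj[start - 1] >= ratio * proj[p]:
            start -= 1
        while end < n - 1 and proj[end + 1] >= ratio * proj[p]:
            end += 1
        if rows and start <= rows[-1][1]:
            # Overlapping extensions: merge the two rows into one.
            rows[-1] = (rows[-1][0], max(rows[-1][1], end))
        else:
            rows.append((start, end))
    return rows
```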
After the start and end positions of each row have been obtained, each character connected domain is assigned to a row as follows: compute the distance between the abscissa of the center of each character connected domain and the abscissa of the center of each text row, and assign the character connected domain to the row with the smallest distance.
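The assignment of character blocks to rows can be sketched as below. The sketch assumes that the center coordinate being compared is the image row index, in line with the projection waveform whose abscissa is the row number. The function name and the box layout `(top, left, bottom, right)` are illustrative.

```python
def assign_to_rows(boxes, rows):
    """Map each box (top, left, bottom, right) to the index of the nearest text row."""
    # Centre of each text row, measured in image rows.
    centers = [(start + end) / 2 for start, end in rows]
    assignment = []
    for top, _left, bottom, _right in boxes:
        box_center = (top + bottom) / 2
        nearest = min(range(len(rows)), key=lambda k: abs(box_center - centers[k]))
        assignment.append(nearest)
    return assignment
```

A block lying between two rows, such as a superscript or a fraction element, is attached to whichever row center is closer.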
Because an equation group may sometimes be mistakenly split into several rows, it is stipulated that the rows spanned by a brace must not be split into multiple rows.
Step S5: recognize each character connected domain with a convolutional neural network;
As shown in Fig. 2, the convolutional neural network has the LeNet-5 structure and consists of one input layer, two convolution and down-sampling layers, one fully connected hidden layer, and one output layer;
The input sample size is 32*32; the first convolutional layer has 6 feature maps and the second has 16; the down-sampling layers use max pooling, which halves both the number of rows and the number of columns; the hidden layer has 120 nodes and the output layer has 84 nodes;
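The layer sizes implied by this description can be checked with a short calculation. The kernel size is not stated in the text; the sketch assumes 5*5 kernels with stride 1 and no padding, as in the classic LeNet-5 design. The function name `lenet5_shapes` is illustrative.

```python
def lenet5_shapes(input_size=32, kernel=5, maps=(6, 16), hidden=120, outputs=84):
    """Return per-layer shapes and the size of the flattened feature vector."""
    shapes = [("input", 1, input_size, input_size)]
    size = input_size
    for idx, n_maps in enumerate(maps, start=1):
        # ASSUMPTION: valid convolution with stride 1 and a 5x5 kernel.
        size = size - kernel + 1
        shapes.append((f"conv{idx}", n_maps, size, size))
        size //= 2  # max pooling halves rows and columns
        shapes.append((f"pool{idx}", n_maps, size, size))
    flat = maps[-1] * size * size  # inputs to the fully connected hidden layer
    shapes.append(("hidden", hidden, 1, 1))
    shapes.append(("output", outputs, 1, 1))
    return shapes, flat
```

Under these assumptions the feature maps shrink from 32*32 to 28*28, 14*14, 10*10, and 5*5, so the hidden layer receives 16 * 5 * 5 = 400 values.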
The training samples are samples of normalized character connected domains obtained with the binarization method described above; that is, training samples and prediction samples are obtained and normalized in the same way, which is done to improve recognition accuracy.
The character connected domains from step S2 are normalized and fed into the convolutional neural network to obtain the character corresponding to each character connected domain.
Step S6: define an output order and output the recognition results in that order in LaTeX typesetting format;
The defined output order has three levels:
The first level is the row order: according to the correspondence between character connected domains and rows, the character connected domains are output row by row;
The second level is the column order: within each row, all character connected domains are sorted in ascending order of their left-end column coordinate;
The third level is the order within special formula structures: the elements of an equation group are output equation by equation; the elements of a fraction are output numerator first, then denominator.
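The first two levels of this ordering amount to a sort on the pair (row index, left column), as in the sketch below. The data layout, a list of `(box, symbol)` pairs with `box = (top, left, bottom, right)` plus a parallel list of row indices, is an assumption. The third level is not covered here.

```python
def reading_order(blocks, row_of):
    """Return recognised symbols sorted by row, then by left-end column.

    blocks: list of (box, symbol) with box = (top, left, bottom, right)
    row_of: row index assigned to each block (same order as blocks)
    """
    order = sorted(range(len(blocks)), key=lambda k: (row_of[k], blocks[k][0][1]))
    return [blocks[k][1] for k in order]
```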
For the order within special formula structures, the character blocks contained in each special formula structure element must be determined;
A brace represents the special structure of an equation group, so the column coordinate at which the equation group ends must be determined in order to determine all character blocks it contains. According to its position within the current row, each character block is classified into one of three parts: "upper, middle, lower". Every character block located in the upper or lower part is regarded as an element of the equation group; all such character blocks are found, and the end column of the rightmost one is taken as the end column of the whole equation group. All character blocks located between the brace and the end column of the equation group are assigned to the current equation group structure. The inside of the equation group structure is then segmented into rows again to determine how many equations it contains, and the character blocks inside the equation group are output equation by equation;
For a fraction line, all numerator and denominator elements of the current fraction must be determined. Every character block whose starting ordinate is greater than the starting ordinate of the fraction line and whose ending ordinate is less than the ending ordinate of the fraction line is assigned to the current fraction structure. Each character block in the fraction structure must then be classified as numerator or denominator according to its abscissa: if the abscissa of the bottom of the character block is less than the abscissa of the center of the fraction line, the block belongs to the numerator; if the abscissa of the top of the character block is greater than the abscissa of the center of the fraction line, the block belongs to the denominator;
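The fraction rule can be sketched as follows. The sketch reads the "ordinate" as the column index and the "abscissa" as the image row index, which is an interpretation based on the projection waveform, where the abscissa is the row number. Emitting the result as `\frac{...}{...}` is also an assumption, since the text only says that the output is in LaTeX format. The function name and data layout are illustrative.

```python
def fraction_to_latex(line_box, blocks):
    """Build a LaTeX fraction from a fraction line and recognised blocks.

    line_box: (top, left, bottom, right) of the fraction line
    blocks:   list of (box, symbol) with box = (top, left, bottom, right)
    """
    l_top, l_left, l_bottom, l_right = line_box
    center_row = (l_top + l_bottom) / 2
    numerator, denominator = [], []
    for (top, left, bottom, right), symbol in blocks:
        # The block must lie strictly within the column span of the fraction line.
        if not (left > l_left and right < l_right):
            continue
        if bottom < center_row:      # block ends above the line centre
            numerator.append((left, symbol))
        elif top > center_row:       # block starts below the line centre
            denominator.append((left, symbol))
    # Within numerator and denominator, output in ascending column order.
    num = "".join(s for _, s in sorted(numerator))
    den = "".join(s for _, s in sorted(denominator))
    return "\\frac{" + num + "}{" + den + "}"
```

Blocks outside the column span of the line, such as an equals sign to its right, are left for the ordinary row and column ordering.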
For a radical sign, the character blocks located inside the radical sign must be determined. Every character block whose starting ordinate is greater than the starting ordinate of the radical sign and whose ending ordinate is less than the ending ordinate of the radical sign is assigned to the current radical sign structure;
According to the row order, the column order, and the order within special formula structures described above, the final formula structure is determined and output in LaTeX typesetting format.
The above embodiment can effectively solve the problem of representing elementary mathematical formulas in OCR and achieves accurate recognition of formulas.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention. For those skilled in the art, the invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (7)

1. A method for structural analysis and recognition of handwritten mathematical formulas in natural scene images, characterized in that the method comprises:
Step S1: converting the gray matrix of the natural scene image into a local contrast matrix according to the following formula, and binarizing the local contrast matrix with the Otsu method to obtain a binary matrix:
Con(i, j) = α·C(i, j) + (1 − α)·(I_max(i, j) − I_min(i, j))
where Con(i, j) is the local contrast of the point with coordinates (i, j) in the image gray matrix; I_max(i, j) and I_min(i, j) are respectively the maximum and minimum gray values in the neighborhood centered on the point with coordinates (i, j) in the image gray matrix, the neighborhood radius being set to 5; Std denotes the standard deviation of the gray matrix, and γ = 1; ε is an infinitesimal quantity that prevents the denominator from being 0;
Step S2: performing connected domain analysis on the binary matrix from step S1 and rejecting non-character connected domains to obtain character connected domains;
Step S3: detecting special formula structure elements among the character connected domains from step S2 using the correlation coefficient method, and marking each detected special structure element individually; wherein the special formula structure elements include braces, radical signs, and fraction lines;
fraction line connected domains are detected by rule matching: a connected domain whose length-to-width ratio is greater than 5 and which has adjacent connected domains both above and below it is identified as a fraction line connected domain;
brace connected domains and radical sign connected domains are detected by template matching:
Step S301: selecting standard binary templates for the brace connected domain and the radical sign connected domain;
Step S302: normalizing the size of the current connected domain so that it has the same size as the standard template;
Step S303: matching each standard binary template against the current connected domain,
the matching formula being the correlation coefficient, expressed as:
r = Σ_i (x_i − x̄)(y_i − ȳ) / sqrt( Σ_i (x_i − x̄)² · Σ_i (y_i − ȳ)² )
where x_i and y_i denote the value of the i-th element of the current template and of the standard template respectively, and x̄ and ȳ denote the mean of the current template and of the standard template respectively; r ∈ (0, 1), and the match succeeds when r is greater than 0.5;
Step S4: segmenting the binary matrix from step S1 into rows using the horizontal projection method, specifically by performing the following steps:
obtaining a waveform by horizontally projecting the binary matrix from step S1, the abscissa of the waveform being the row number in the original image and the ordinate being the number of character points contained in that row;
starting from each peak of the waveform, extending to its left and right, and stopping when the value falls below 0.1 times the peak value; if the extensions of two adjacent peaks overlap, merging the two corresponding rows into one row;
recording the start and end positions of each row, the abscissa at the left end of the peak being the starting row coordinate of the current row and the abscissa at the right end of the peak being the ending row coordinate of the current row;
Step S5: recognizing each character connected domain with a convolutional neural network;
Step S6: defining an output order and outputting the recognition results in that order in LaTeX typesetting format.
2. The method according to claim 1, characterized in that the binary segmentation of the obtained local contrast matrix using the Otsu method is performed as follows: take the maximum and minimum values in the local contrast matrix, divide the range between the maximum and the minimum into n sub-intervals, and assign each element to its corresponding sub-interval to form a histogram; perform Otsu segmentation on this histogram, where points below the selected threshold are background points and points above the selected threshold are character points.
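The histogram-based Otsu segmentation of claim 2 can be sketched as below. The function names and the default `n_bins=256` are assumptions (the claim leaves n unspecified); the threshold is chosen by maximising the between-class variance, which is the standard Otsu criterion.

```python
def otsu_threshold(values, n_bins=256):
    """Otsu threshold over a histogram of n_bins equal sub-intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # constant input: nothing to separate
    width = (hi - lo) / n_bins
    hist = [0] * n_bins
    for v in values:
        # The maximum value falls into the last sub-interval.
        hist[min(int((v - lo) / width), n_bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(n_bins - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2  # between-class variance (unnormalised)
        if between > best_var:
            best_var, best_bin = between, t
    # Threshold is the upper edge of the best sub-interval.
    return lo + (best_bin + 1) * width


def binarize(matrix, n_bins=256):
    """1 = character point (above threshold), 0 = background point."""
    flat = [v for row in matrix for v in row]
    thr = otsu_threshold(flat, n_bins)
    return [[1 if v > thr else 0 for v in row] for row in matrix]
```

Working on a histogram of n sub-intervals rather than on raw values keeps the search over candidate thresholds bounded by n, regardless of the range of the local contrast values.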
3. The method according to claim 2, characterized in that the connected domain analysis performed on the binary matrix from step S1, which rejects non-character connected domains to obtain the character connected domains, is carried out as follows:
Step S201: Obtain the minimum bounding rectangle of each connected domain, record the coordinates of the four vertices of the minimum bounding rectangle, and calculate its length and height;
Step S202: Compute the average length and average height of all connected domains;
Step S203: Reject non-character connected domains:
If the length and height of a connected domain are both less than 1/4 of the average length and average height, it is considered a noise point and the connected domain is rejected;
If the length and height of a connected domain are both greater than 4 times the average length and average height, it is considered a non-character part of the image and the connected domain is rejected;
Step S204: Keep the remaining connected domains as character connected domains.
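Steps S202 to S204 can be sketched as follows, assuming the minimum bounding rectangles from step S201 are already available. The `Box` type and the function name are hypothetical; connected-domain labelling itself is not shown.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Minimum bounding rectangle of a connected domain (hypothetical type)."""
    top: int
    left: int
    bottom: int
    right: int


def filter_character_boxes(boxes: List[Box]) -> List[Box]:
    if not boxes:
        return []
    lengths = [b.right - b.left + 1 for b in boxes]
    heights = [b.bottom - b.top + 1 for b in boxes]
    # Step S202: averages over all connected domains.
    avg_len = sum(lengths) / len(boxes)
    avg_hgt = sum(heights) / len(boxes)

    kept = []
    for box, length, height in zip(boxes, lengths, heights):
        # Step S203: both dimensions must violate the bound for rejection.
        is_noise = length < avg_len / 4 and height < avg_hgt / 4
        is_non_character = length > 4 * avg_len and height > 4 * avg_hgt
        if not (is_noise or is_non_character):
            kept.append(box)  # Step S204
    return kept
```

Because both dimensions must be out of range, a long thin stroke such as a fraction line is kept even though its height is far below the average.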
4. The method according to claim 1, characterized in that, after the start and end position of each line is obtained, each character connected domain is associated with a line as follows:
Compute the distance between the horizontal coordinate of the center of each character connected domain and the horizontal coordinate of the center of each text line, and assign the character connected domain to the line for which this distance is smallest.
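The assignment rule of claim 4 can be sketched as below. One assumption is made explicit: the "horizontal coordinate" is read here as the row index, because step S4 defines the abscissa of the projection waveform as the row number of the original image. The function name and the tuple layout are hypothetical.

```python
def assign_to_lines(char_rows, line_spans):
    """Assign each character connected domain to its nearest text line.

    char_rows:  list of (top_row, bottom_row) for each character domain.
    line_spans: list of (start_row, end_row) for each text line.
    Returns the index of the chosen line for every character domain.
    """
    line_centers = [(start + end) / 2 for start, end in line_spans]
    assignment = []
    for top, bottom in char_rows:
        center = (top + bottom) / 2
        # Pick the line whose center is closest to the character center.
        nearest = min(range(len(line_centers)),
                      key=lambda i: abs(center - line_centers[i]))
        assignment.append(nearest)
    return assignment
```

Using centers rather than containment means that a character extending slightly beyond the recorded line span, such as a tall integral sign, is still attached to a line.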
5. The method according to claim 4, characterized in that the convolutional neural network in step S5 has a LeNet-5 structure, consisting of one input layer, two convolution and down-sampling layers, one fully connected hidden layer, and one output layer;
The training data of the convolutional neural network are samples of normalized character connected domains;
The character connected domains from step S2 are normalized and then input into the convolutional neural network to obtain the character corresponding to each character connected domain.
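To make the layer structure concrete, the sketch below traces feature-map sizes through two convolution and down-sampling stages. The hyperparameters (32×32 input, 5×5 kernels without padding, 2×2 down-sampling) are those of the classic LeNet-5 and are assumptions here; the claim does not state the sizes actually used.

```python
def lenet5_feature_sizes(input_size=32, kernel=5, pool=2, stages=2):
    """Side length of the feature maps after each layer (assumed settings)."""
    sizes = [input_size]
    size = input_size
    for _ in range(stages):
        size = size - kernel + 1   # convolution without padding
        sizes.append(size)
        size = size // pool        # non-overlapping down-sampling
        sizes.append(size)
    return sizes
```

Under these assumptions, each normalized character connected domain would be resized to the input side length before being fed to the network, and the final feature maps are flattened into the fully connected hidden layer.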
6. The method according to claim 5, characterized in that the output order defined in step S6 comprises three levels:
The first-level order is the row order: according to the correspondence between character connected domains and lines, the corresponding character connected domains are output line by line;
The second-level order is the column order: within each line, all character connected domains are sorted in ascending order of their left-end column coordinate;
The third-level order is the order within special formula structures: the elements of an equation group are output equation by equation; the elements of a fraction are output numerator first, then denominator.
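The first two ordering levels amount to a sort on a composite key, as sketched below. The dictionary keys `line`, `left`, and `label` are hypothetical names; the third level (special structures) needs the grouping described in claim 7 and is not covered here.

```python
def reading_order(chars):
    """Sort recognized characters by line index, then by left-end column."""
    return sorted(chars, key=lambda c: (c["line"], c["left"]))


chars = [
    {"label": "=", "line": 0, "left": 30},
    {"label": "y", "line": 1, "left": 5},
    {"label": "x", "line": 0, "left": 10},
    {"label": "2", "line": 0, "left": 50},
]
ordered_labels = [c["label"] for c in reading_order(chars)]
```

Here `ordered_labels` lists the first line from left to right, followed by the second line.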
7. The method according to claim 6, characterized in that, for the order within special formula structures, the character blocks contained in each special formula structure element need to be determined;
A brace represents the special structure of an equation group, and the column coordinate at which the equation group ends needs to be determined in order to determine all the character blocks it contains. Each character block is classified as "upper", "middle", or "lower" according to its position within the current line; every character block located in the upper or lower part is regarded as an element of the equation group. All such character blocks are found, and the end column of the rightmost of them is taken as the end column of the entire equation group. All character blocks located between the brace and the end column of the equation group are assigned to the current equation group structure. The interior of the equation group structure is then divided into lines again to determine how many equations it contains, and the character blocks inside the equation group are output in equation order;
For a fraction line, all numerator and denominator elements of the current fraction need to be determined. Every character block whose starting ordinate is greater than the starting ordinate of the fraction line and whose ending ordinate is less than the ending ordinate of the fraction line is assigned to the current fraction structure. For each character block in the fraction structure, it must further be determined whether it belongs to the numerator or the denominator, based on the abscissa of the character block: if the abscissa of the bottom of the character block is less than the abscissa of the center of the fraction line, it belongs to the numerator; if the abscissa of the top of the character block is greater than the abscissa of the center of the fraction line, it belongs to the denominator;
For a radical sign, the character blocks located inside the radical sign need to be determined. Every character block whose starting ordinate is greater than the starting ordinate of the radical sign and whose ending ordinate is less than the ending ordinate of the radical sign is assigned to the current radical structure;
According to the row order, the column order, and the order within special formula structures described above, the final formula structure output is determined and output in LaTeX typesetting format.
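The fraction rule of claim 7 can be sketched as follows. Two readings are assumed rather than stated in the claim: "ordinate" is taken as the column index and "abscissa" as the row index (consistent with step S4), and each block's `label` is taken to be an already recognized LaTeX token. The `Block` type and the function name are hypothetical.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """Recognized character block with its bounding box (hypothetical type)."""
    label: str
    top: int      # starting row (abscissa in the claim's terms, assumed)
    bottom: int   # ending row
    left: int     # starting column (ordinate in the claim's terms, assumed)
    right: int    # ending column


def fraction_to_latex(line: Block, blocks: List[Block]) -> str:
    center_row = (line.top + line.bottom) / 2
    # Blocks whose column extent lies strictly inside the fraction line.
    inside = [b for b in blocks
              if b.left > line.left and b.right < line.right]
    # Bottom above the line center: numerator; top below it: denominator.
    numerator = sorted((b for b in inside if b.bottom < center_row),
                       key=lambda b: b.left)
    denominator = sorted((b for b in inside if b.top > center_row),
                         key=lambda b: b.left)
    num = "".join(b.label for b in numerator)
    den = "".join(b.label for b in denominator)
    # Numerator first, then denominator, as required by the third-level order.
    return "\\frac{" + num + "}{" + den + "}"
```

The radical-sign rule uses the same column-containment test, with the contained blocks emitted inside `\sqrt{...}` instead of being split into numerator and denominator.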
CN201510531070.4A 2015-08-26 2015-08-26 The structural analysis of handwritten form mathematical formulae and recognition methods in natural scene image Active CN105184292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510531070.4A CN105184292B (en) 2015-08-26 2015-08-26 The structural analysis of handwritten form mathematical formulae and recognition methods in natural scene image

Publications (2)

Publication Number Publication Date
CN105184292A CN105184292A (en) 2015-12-23
CN105184292B true CN105184292B (en) 2018-08-03

Family

ID=54906358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510531070.4A Active CN105184292B (en) 2015-08-26 2015-08-26 The structural analysis of handwritten form mathematical formulae and recognition methods in natural scene image

Country Status (1)

Country Link
CN (1) CN105184292B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017031716A1 (en) * 2015-08-26 2017-03-02 北京云江科技有限公司 Method for analyzing and recognizing handwritten mathematical formula structure in natural scene image
CN105844275B (en) * 2016-03-25 2019-08-23 北京云江科技有限公司 The localization method of line of text in text image
CN106709394B (en) * 2016-12-12 2019-07-05 北京慧眼智行科技有限公司 A kind of image processing method and device
CN107169485B (en) * 2017-03-28 2020-10-09 北京捷通华声科技股份有限公司 Mathematical formula identification method and device
CN107886065A (en) * 2017-11-06 2018-04-06 哈尔滨工程大学 A kind of Serial No. recognition methods of mixing script
CN109993040B (en) * 2018-01-03 2021-07-30 北京世纪好未来教育科技有限公司 Text recognition method and device
CN110135407B (en) * 2018-02-09 2021-01-29 北京世纪好未来教育科技有限公司 Sample labeling method and computer storage medium
CN110135425B (en) * 2018-02-09 2021-02-26 北京世纪好未来教育科技有限公司 Sample labeling method and computer storage medium
CN110135426B (en) * 2018-02-09 2021-04-30 北京世纪好未来教育科技有限公司 Sample labeling method and computer storage medium
CN109002756A (en) * 2018-06-04 2018-12-14 平安科技(深圳)有限公司 Handwritten Chinese character image recognition methods, device, computer equipment and storage medium
CN108898142B (en) * 2018-06-15 2022-03-18 宁波云江互联网科技有限公司 Recognition method of handwritten formula and computing device
CN109239073B (en) * 2018-07-28 2020-11-10 西安交通大学 Surface defect detection method for automobile body
CN109117848B (en) * 2018-09-07 2022-11-18 泰康保险集团股份有限公司 Text line character recognition method, device, medium and electronic equipment
CN109886093A (en) * 2019-01-08 2019-06-14 深圳禾思众成科技有限公司 A kind of formula detection method, equipment and computer readable storage medium
CN109977861B (en) * 2019-03-25 2023-06-20 中国科学技术大学 Off-line handwriting mathematical formula recognition method
CN110020655B (en) * 2019-04-19 2021-08-20 厦门商集网络科技有限责任公司 Character denoising method and terminal based on binarization
CN110533035B (en) * 2019-08-28 2022-02-15 海南阿凡题科技有限公司 Student homework page number identification method based on text matching
CN110569853B (en) * 2019-09-12 2022-11-29 南京红松信息技术有限公司 Target positioning-based independent formula segmentation method
CN111027561B (en) * 2019-11-22 2020-08-18 广州寄锦教育科技有限公司 Mathematical formula positioning method, system, readable storage medium and computer equipment
CN111079745A (en) * 2019-12-11 2020-04-28 中国建设银行股份有限公司 Formula identification method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102184399A (en) * 2011-03-31 2011-09-14 上海名图信息技术有限公司 Character segmenting method based on horizontal projection and connected domain analysis
CN102542273A (en) * 2011-12-02 2012-07-04 方正国际软件有限公司 Detection method and system for complex formula areas in document image
CN103810493A (en) * 2012-11-06 2014-05-21 夏普株式会社 Method and apparatus for identifying mathematical formula
CN104050471A (en) * 2014-05-27 2014-09-17 华中科技大学 Natural scene character detection method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060062470A1 (en) * 2004-09-22 2006-03-23 Microsoft Corporation Graphical user interface for expression recognition

Also Published As

Publication number Publication date
CN105184292A (en) 2015-12-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 571924 Hainan Old City high-tech industrial demonstration area Hainan Ecological Software Park Walker Park 8811

Applicant after: Hainan Cloud River Technology Co., Ltd.

Address before: 100083 Haidian District Zhongguancun Road East 16 Longhu Downing 8 8 2801

Applicant before: Beijing Yun Jiang Science and Technology Ltd.

GR01 Patent grant