CN105184292A - Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image - Google Patents

Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image

Info

Publication number
CN105184292A
Authority
CN
China
Prior art keywords
connected domain
character
line
value
ordinate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510531070.4A
Other languages
Chinese (zh)
Other versions
CN105184292B (en)
Inventor
陈李江
刘宁
刘辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yun Jiang Science And Technology Ltd
Original Assignee
Beijing Yun Jiang Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yun Jiang Science And Technology Ltd
Priority to CN201510531070.4A
Publication of CN105184292A
Application granted
Publication of CN105184292B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63 Scene text, e.g. street names
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Character Input (AREA)

Abstract

The invention provides a method for analyzing and recognizing the structure of a handwritten mathematical formula in a natural scene image. The method comprises the steps of: S1, converting the gray matrix of a natural scene image into a local contrast matrix, and binarizing the local contrast matrix with the Otsu method to obtain a binary matrix; S2, analyzing the connected domains of the binary matrix obtained in step S1, and removing non-character connected domains to obtain character connected domains; S3, detecting formula special structure elements among the character connected domains using the correlation coefficient method, and separately marking all detected special structure elements; S4, dividing the binary matrix obtained in step S1 into lines using the horizontal projection method; S5, recognizing each character connected domain with a convolutional neural network; S6, defining an output order and outputting the recognition results in that order in LaTeX format. The method effectively solves the problem of representing elementary mathematical formulas during OCR.

Description

Method for analyzing and recognizing the structure of handwritten mathematical formulas in natural scene images
Technical field
The present invention relates to image processing and pattern recognition, and in particular to a method for analyzing and recognizing the structure of handwritten mathematical formulas in natural scene images.
Background technology
OCR (Optical Character Recognition) technology is widely used, and OCR for Chinese and English text is relatively mature. Mathematical formulas, however, have a complex structure that current OCR technology does not support well. The present invention addresses this problem, for which there is strong application demand.
Summary of the invention
The method provided by the present invention for analyzing and recognizing the structure of handwritten mathematical formulas in natural scene images can effectively solve the problem of representing elementary mathematical formulas in OCR.
The method of the present invention for analyzing and recognizing the structure of handwritten mathematical formulas in natural scene images comprises:
Step S1: converting the gray matrix of the natural scene image into a local contrast matrix, and binarizing the resulting local contrast matrix with the Otsu (Otsu threshold) method to obtain a binary matrix;
Step S2: performing connected domain analysis on the binary matrix of step S1 and rejecting non-character connected domains to obtain character connected domains;
Step S3: detecting formula special structure elements among the character connected domains of step S2 using the correlation coefficient method, and marking each detected special structure element separately;
Step S4: dividing the binary matrix of step S1 into lines using the horizontal projection method;
Step S5: recognizing each character connected domain using a convolutional neural network;
Step S6: defining an output order and outputting the recognition results in that order in LaTeX (a typesetting system based on TeX) format.
Preferably, the local contrast Con(i, j) of the point with coordinates (i, j) in the local contrast matrix is computed as:
$$\mathrm{Con}(i,j) = \alpha\, C(i,j) + (1-\alpha)\,\bigl(I_{\max}(i,j) - I_{\min}(i,j)\bigr)$$
where
$$C(i,j) = \frac{I_{\max}(i,j) - I_{\min}(i,j)}{I_{\max}(i,j) + I_{\min}(i,j) + \epsilon}$$
$I_{\max}(i,j)$ and $I_{\min}(i,j)$ are, respectively, the maximum and minimum gray values in the neighborhood centered on the point (i, j) of the image's gray matrix; the neighborhood radius is set to 5 here;
α is a weight that depends on std and γ, where std denotes the standard deviation of the gray matrix and γ = 1;
ε is a very small positive number that prevents the denominator from being 0.
Preferably, the local contrast matrix is binarized with the Otsu method as follows: take the maximum and minimum values of the local contrast matrix, divide the range between them into n small intervals, and assign each element to its corresponding interval to form a histogram; Otsu thresholding is then applied to this histogram. Points below the selected threshold are background points, and points above it are character points.
Preferably, the method of performing connected domain analysis on the binary matrix of step S1 and rejecting non-character connected domains to obtain character connected domains is:
Step S201: obtaining the minimum bounding rectangle of each connected domain, recording the coordinates of its four vertices, and calculating its length and height;
Step S202: computing the average length and average height of all connected domains;
Step S203: rejecting non-character connected domains:
If both the length and the height of a connected domain are less than 1/4 of the average length and average height, respectively, it is regarded as a noise point and is removed;
If both the length and the height of a connected domain are greater than 4 times the average length and average height, respectively, it is regarded as a non-character part of the image and is removed;
Step S204: saving the remaining connected domains as character connected domains.
Preferably, the formula special structure elements in step S3 comprise the brace, the radical sign, and the fraction line;
Fraction line connected domains are detected by rule matching: a connected domain whose length-to-width ratio is greater than 5 and which has adjacent connected domains both above and below it is labeled as a fraction line connected domain;
Brace connected domains and radical sign connected domains are detected by template matching:
Step S301: selecting standard binary templates for the brace connected domain and the radical sign connected domain;
Step S302: normalizing the size of the current connected domain so that it is the same size as the standard template;
Step S303: matching each standard binary template against the current connected domain.
The matching formula is the correlation coefficient, expressed as:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ denote the value of the i-th element of the current template and of the standard template, respectively, and $\bar{x}$ and $\bar{y}$ denote the means of the current template and the standard template, respectively; r ∈ (0, 1), and the match succeeds when r is greater than 0.5.
Preferably, the method for horizontal projection method to the capable division of two values matrix is adopted to be in step S4:
Obtain oscillogram after carrying out horizontal projection to the two values matrix in step S1, the value of oscillogram horizontal ordinate is the line number of original image, the number of the character point that the value of ordinate comprises for current line;
From each crest of oscillogram to expanding about it, until when numerical value is less than 0.1 times of its crest value, stop expansion; If there occurs overlap during adjacent two crests expansion, then two row of its correspondence merge into a line;
Record the starting and ending position of every a line, the horizontal ordinate that crest left end is corresponding is the initial row coordinate of current line, and the horizontal ordinate that crest right-hand member is corresponding is the end line coordinate of current line.
Preferably, after obtaining the starting and ending positional information of every a line, each character connected domain is corresponding with row, and concrete grammar is:
Calculate the distance of the horizontal coordinate at each character connected domain center and the horizontal coordinate at each line of text center, character connected domain is divided into apart from that minimum a line.
Preferably, the convolutional neural network in step S5 has the LeNet-5 structure and consists of one input layer, two convolution and down-sampling layers, one fully connected hidden layer, and one output layer;
The training data of the convolutional neural network are samples of normalized character connected domains;
The character connected domains of step S2 are normalized and fed into the convolutional neural network to obtain the character corresponding to each character connected domain.
Preferably, the output order defined in step S6 has three levels:
The first level is the line order: according to the correspondence between character connected domains and lines, the character connected domains are output line by line;
The second level is the column order: within each line, all character connected domains are sorted in ascending order of their left-end column coordinate;
The third level is the order within formula special structures: the elements of a system of equations are output equation by equation; the elements of a fraction are output numerator first, then denominator.
Preferably, for the order within formula special structures, the character blocks contained in each formula special structure element must be determined;
A brace represents the special structure of a system of equations, so the column coordinate at which the system of equations ends must be determined in order to determine all the character blocks it contains. According to the position of each character block within the current line, the blocks are divided into three parts: top, middle, and bottom. Every character block located in the top or bottom part is regarded as an element of the system of equations; all such character blocks are found, and the end column of the rightmost one is taken as the end column of the whole system of equations. All character blocks located between the brace and the end column of the system of equations are assigned to the current system of equations structure. The interior of the system of equations structure is then divided into lines again to determine how many equations it contains, and the character blocks inside the system of equations are output in equation order;
For a fraction line, all numerator and denominator elements of the current fraction must be determined: every character block whose starting column coordinate is greater than the starting column coordinate of the fraction line and whose ending column coordinate is less than the ending column coordinate of the fraction line is assigned to the current fraction structure. For each character block in the fraction structure, it must further be determined whether it belongs to the numerator or the denominator, based on its row coordinates: if the row coordinate of the bottom of the character block is less than the row coordinate of the center of the fraction line, it belongs to the numerator; if the row coordinate of the top of the character block is greater than the row coordinate of the center of the fraction line, it belongs to the denominator;
For a radical sign, the character blocks located inside it must be determined: every character block whose starting column coordinate is greater than the starting column coordinate of the radical sign and whose ending column coordinate is less than the ending column coordinate of the radical sign is assigned to the current radical sign structure;
The final formula structure is determined according to the line order, the column order, and the order within formula special structures described above, and is output in LaTeX (a typesetting system based on TeX) format.
The present invention effectively solves the problem of representing elementary mathematical formulas in OCR and achieves accurate recognition of formulas.
Brief description of the drawings
Fig. 1 is a flowchart of the method for analyzing and recognizing the structure of handwritten mathematical formulas in natural scenes provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the structure of the convolutional neural network used for character recognition in an embodiment of the present invention.
Embodiment
The method for analyzing and recognizing the structure of handwritten mathematical formulas in natural scenes provided by the embodiment of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the method for analyzing and recognizing the structure of handwritten mathematical formulas in natural scenes provided by the embodiment of the present invention comprises the following steps:
Step S1: converting the gray matrix of the natural scene image into a local contrast matrix, and binarizing the resulting local contrast matrix with the Otsu method to obtain a binary matrix;
In the present embodiment, the local contrast Con(i, j) of the point with coordinates (i, j) in the local contrast matrix is computed as:
$$\mathrm{Con}(i,j) = \alpha\, C(i,j) + (1-\alpha)\,\bigl(I_{\max}(i,j) - I_{\min}(i,j)\bigr)$$
where
$$C(i,j) = \frac{I_{\max}(i,j) - I_{\min}(i,j)}{I_{\max}(i,j) + I_{\min}(i,j) + \epsilon}$$
$I_{\max}(i,j)$ and $I_{\min}(i,j)$ are, respectively, the maximum and minimum gray values in the neighborhood centered on the point (i, j) of the image's gray matrix; the neighborhood radius is set to 5 here;
α is a weight that depends on std and γ, where std denotes the standard deviation of the gray matrix and γ = 1;
ε is a very small positive number that prevents the denominator from being 0.
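The local contrast computation of step S1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the implementation of the embodiment: the function name `local_contrast`, the square window clipped at the image border, and the default value of `eps` are hypothetical, and `alpha` is supplied by the caller because its defining expression is not reproduced in this text.

```python
def local_contrast(gray, alpha, radius=5, eps=1e-6):
    """Con(i,j) = alpha*C(i,j) + (1-alpha)*(Imax(i,j) - Imin(i,j)).

    gray   : gray matrix as a list of rows
    alpha  : weight in the formula (assumed to be precomputed by the caller)
    radius : neighborhood radius (5 in the description)
    eps    : small positive number that keeps the denominator non-zero
    """
    rows, cols = len(gray), len(gray[0])
    con = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Square neighborhood centered on (i, j), clipped at the border
            # (the border handling is an assumption of this sketch).
            window = [
                gray[r][c]
                for r in range(max(0, i - radius), min(rows, i + radius + 1))
                for c in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            i_max, i_min = max(window), min(window)
            c_ij = (i_max - i_min) / (i_max + i_min + eps)
            con[i][j] = alpha * c_ij + (1 - alpha) * (i_max - i_min)
    return con
```

A uniform region yields a local contrast of 0 everywhere, while pixels near a stroke edge receive large values, which is what makes the subsequent thresholding step separate strokes from background.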
In the present embodiment, the local contrast matrix is binarized with the Otsu method as follows: take the maximum and minimum values of the local contrast matrix, divide the range between them into 1000 small intervals, and assign each element to its corresponding interval, forming a statistical histogram of length 1000; the Otsu method is then applied to this histogram to perform the binary division. Points below the selected threshold are background points, and points above it are character points.
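The histogram-based Otsu division can be sketched as below. It is an illustrative version only: the function name `otsu_threshold`, the choice of returning the upper edge of the selected bin, and the handling of a constant input are assumptions that the description does not specify.

```python
def otsu_threshold(values, n_bins=1000):
    """Return a threshold chosen by Otsu's method on an n_bins histogram.

    values : flat list of local contrast values
    n_bins : number of equal-width intervals between min and max (1000 here)
    """
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant input: nothing to separate
        return lo
    width = (hi - lo) / n_bins
    hist = [0] * n_bins
    for v in values:
        k = min(int((v - lo) / width), n_bins - 1)   # clamp the maximum value
        hist[k] += 1

    total = len(values)
    sum_all = sum(k * h for k, h in enumerate(hist))
    best_k, best_var = 0, -1.0
    w0 = 0       # number of elements in bins 0..k
    sum0 = 0     # sum of bin indices over those elements
    for k in range(n_bins - 1):
        w0 += hist[k]
        sum0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2     # between-class variance
        if var_between > best_var:
            best_var, best_k = var_between, k
    # Upper edge of the selected bin (a convention chosen for this sketch).
    return lo + (best_k + 1) * width
```

With the returned threshold `t`, an element `v` of the local contrast matrix would be treated as a character point when `v > t` and as a background point otherwise.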
Step S2: performing connected domain analysis on the binary matrix of step S1 and rejecting non-character connected domains to obtain character connected domains. The specific method is:
Step S201: obtaining the minimum bounding rectangle of each connected domain, recording the coordinates of its four vertices, and calculating its length and height;
Step S202: computing the average length and average height of all connected domains;
Step S203: rejecting non-character connected domains:
If both the length and the height of a connected domain are less than 1/4 of the average length and average height, respectively, it is regarded as a noise point and is removed;
If both the length and the height of a connected domain are greater than 4 times the average length and average height, respectively, it is regarded as a non-character part of the image and is removed;
Step S204: saving the remaining connected domains as character connected domains, and obtaining the character block of each character connected domain from its minimum bounding rectangle.
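Steps S201 to S204 can be illustrated with the following sketch. The representation of a minimum bounding rectangle as a tuple `(row_start, col_start, row_end, col_end)` with inclusive bounds and the function name `filter_character_boxes` are assumptions made for illustration; extraction of the connected domains themselves is not shown.

```python
def filter_character_boxes(boxes):
    """Keep only boxes that are neither noise nor oversized non-character regions.

    boxes : list of (row_start, col_start, row_end, col_end), inclusive bounds
    """
    if not boxes:
        return []
    lengths = [c1 - c0 + 1 for (_, c0, _, c1) in boxes]   # horizontal extent
    heights = [r1 - r0 + 1 for (r0, _, r1, _) in boxes]   # vertical extent
    mean_len = sum(lengths) / len(boxes)
    mean_h = sum(heights) / len(boxes)

    kept = []
    for box, length, height in zip(boxes, lengths, heights):
        # S203: both dimensions below 1/4 of the averages -> noise point
        is_noise = length < mean_len / 4 and height < mean_h / 4
        # S203: both dimensions above 4 times the averages -> non-character part
        is_non_char = length > 4 * mean_len and height > 4 * mean_h
        if not (is_noise or is_non_char):
            kept.append(box)          # S204: remaining boxes are character boxes
    return kept
```

Note that both conditions require the length and the height to be out of range at the same time, so a long thin element such as a fraction line is not rejected by this filter.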
Step S3: detecting formula special structure elements among the character connected domains of step S2 using the correlation coefficient method, and marking each detected special structure element separately;
The formula special structure elements of the present embodiment comprise the brace, the radical sign, and the fraction line;
Fraction line connected domains are detected by rule matching: a connected domain whose length-to-width ratio is greater than 5 and which has adjacent connected domains both above and below it is labeled as a fraction line connected domain;
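The fraction line rule can be sketched as follows. The description does not define what counts as an "adjacent" connected domain, so this sketch assumes, purely for illustration, that a neighbor is any other box that overlaps the candidate in column range and lies entirely above or entirely below it. The box layout `(row_start, col_start, row_end, col_end)` and the name `is_fraction_line` are likewise hypothetical.

```python
def is_fraction_line(box, others):
    """Rule matching for a fraction line candidate.

    box    : (row_start, col_start, row_end, col_end), inclusive bounds
    others : the remaining character boxes in the same layout
    """
    r0, c0, r1, c1 = box
    length = c1 - c0 + 1      # horizontal extent
    width = r1 - r0 + 1       # vertical extent (stroke thickness)
    if length / width <= 5:   # the ratio must be greater than 5
        return False

    def overlaps_columns(other):
        # Assumed adjacency test: column ranges intersect.
        return other[1] <= c1 and other[3] >= c0

    has_above = any(overlaps_columns(o) and o[2] < r0 for o in others)
    has_below = any(overlaps_columns(o) and o[0] > r1 for o in others)
    return has_above and has_below
```

Requiring neighbors on both sides is what distinguishes a fraction line from a minus sign, which has the same elongated shape but nothing directly above or below it.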
Brace connected domains and radical sign connected domains are detected by template matching. The standard template is a 32*32 matrix, and the character block of the character connected domain to be examined must also be normalized to a 32*32 matrix. The correlation coefficient of the two matrices is then computed; if it is greater than 0.5, the match succeeds. The specific steps are:
Step S301: selecting standard binary templates for the brace connected domain and the radical sign connected domain;
Step S302: normalizing the size of the current connected domain so that it is the same size as the standard template;
Step S303: matching each standard binary template against the current connected domain.
The matching formula is the correlation coefficient, expressed as:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ denote the value of the i-th element of the current template and of the standard template, respectively, and $\bar{x}$ and $\bar{y}$ denote the means of the current template and the standard template, respectively; r ∈ (0, 1), and the match succeeds when r is greater than 0.5.
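Steps S302 and S303 can be sketched as below. The correlation function follows the formula above; the nearest-neighbor resizing in `resize_nearest` is an assumption, since the description only states that the block is normalized to the template size without naming an interpolation method. All function names are hypothetical.

```python
import math


def resize_nearest(block, size=32):
    """Normalize a binary block to size x size by nearest-neighbor sampling
    (assumed interpolation; the description does not specify one)."""
    rows, cols = len(block), len(block[0])
    return [
        [block[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def correlation(current, template):
    """Correlation coefficient r between two equally sized matrices."""
    x = [v for row in current for v in row]
    y = [v for row in template for v in row]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    return num / den if den else 0.0   # constant matrices carry no shape information


def matches_template(block, template, threshold=0.5):
    """S302 + S303: normalize the block, then compare r with the threshold."""
    resized = resize_nearest(block, len(template))
    return correlation(resized, template) > threshold
```

In use, `matches_template` would be called once with the brace template and once with the radical sign template for each character block, and the block would be marked with whichever special structure element matches.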
All detected special structure elements are marked separately to facilitate the subsequent structure analysis.
Step S4: dividing the binary matrix of step S1 into lines using the horizontal projection method;
Horizontally projecting the binary matrix of step S1 yields a waveform whose abscissa is the row number in the original image and whose ordinate is the number of character points in that row; line information is obtained from the peaks of this waveform;
The distance between adjacent peaks is required to be greater than 10: of two peaks less than 10 apart, only the one with the higher peak value is kept. The height of a peak is required to be greater than 1/20 of the image length;
For peaks satisfying the above conditions, expansion proceeds outward from the left and right sides of the peak simultaneously and stops when the value falls below 0.01 times the peak height;
Each peak of the waveform is expanded to both sides until the value falls below 0.1 times the peak value, at which point expansion stops; if the expansions of two adjacent peaks overlap, the two corresponding lines are merged into one;
The start and end positions of each line are recorded: the abscissa at the left end of the expanded peak is the starting row coordinate of the line, and the abscissa at the right end is the ending row coordinate of the line.
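The line division of step S4 can be sketched as follows. The way peaks are located (strict local maxima, with the last index of a plateau taken as the peak) is an assumption of this sketch, as are the function names. Because the text quotes both 0.1 and 0.01 as the stopping ratio, it is exposed as the parameter `stop_ratio` rather than fixed; `min_height` is expected to be set by the caller to 1/20 of the image length.

```python
def horizontal_projection(binary):
    """Number of character points (value 1) in each row of the binary matrix."""
    return [sum(row) for row in binary]


def split_lines(profile, min_gap=10, min_height=0.0, stop_ratio=0.1):
    """Return a list of (start_row, end_row) text lines from a projection profile."""
    last = len(profile) - 1
    # Candidate peaks: local maxima higher than min_height.
    peaks = [
        i for i, v in enumerate(profile)
        if v > min_height
        and (i == 0 or profile[i - 1] <= v)
        and (i == last or profile[i + 1] < v)
    ]
    # Of two peaks closer than min_gap, keep only the higher one.
    kept = []
    for p in peaks:
        if kept and p - kept[-1] < min_gap:
            if profile[p] > profile[kept[-1]]:
                kept[-1] = p
        else:
            kept.append(p)

    lines = []
    for p in kept:
        limit = stop_ratio * profile[p]
        start = p
        while start > 0 and profile[start - 1] >= limit:
            start -= 1                       # expand towards smaller row numbers
        end = p
        while end < last and profile[end + 1] >= limit:
            end += 1                         # expand towards larger row numbers
        if lines and start <= lines[-1][1]:
            # Overlapping expansions: merge the two lines into one.
            lines[-1] = (lines[-1][0], max(end, lines[-1][1]))
        else:
            lines.append((start, end))
    return lines
```

For example, a profile with two well-separated humps produces two `(start_row, end_row)` pairs, one per text line, and these pairs are the line positions used when character connected domains are later assigned to lines.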
After the start and end positions of each line are obtained, each character connected domain is assigned to a line as follows: calculate the distance between the row coordinate of the center of each character connected domain and the row coordinate of the center of each text line, and assign the character connected domain to the line at the smallest distance.
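The assignment of character connected domains to lines can be sketched as below, assuming (for illustration only) that boxes are tuples `(row_start, col_start, row_end, col_end)` and that lines are `(start_row, end_row)` pairs; the function name `assign_to_lines` is hypothetical.

```python
def assign_to_lines(boxes, lines):
    """For each box, return the index of the line whose center row is nearest.

    boxes : list of (row_start, col_start, row_end, col_end)
    lines : list of (start_row, end_row)
    """
    line_centers = [(start + end) / 2 for start, end in lines]
    assignment = []
    for row_start, _, row_end, _ in boxes:
        center = (row_start + row_end) / 2
        nearest = min(
            range(len(lines)), key=lambda k: abs(center - line_centers[k])
        )
        assignment.append(nearest)
    return assignment
```

Using center distance rather than containment means that a block which protrudes slightly beyond the recorded line boundaries, such as a tall radical sign, is still assigned to a line.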
Because a system of equations may sometimes be mistakenly divided into several lines, it is stipulated that the line containing a brace must not be divided into multiple lines.
Step S5: recognizing each character connected domain using a convolutional neural network;
As shown in Fig. 2, the convolutional neural network has the LeNet-5 structure and consists of one input layer, two convolution and down-sampling layers, one fully connected hidden layer, and one output layer;
The input layer sample size is 32*32; the first convolutional layer has 6 feature maps and the second has 16; the down-sampling layers use max pooling, which halves both the number of rows and the number of columns; the hidden layer has 120 nodes, and the output layer has 84 nodes;
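The feature map sizes implied by these parameters can be checked with a short calculation. The description gives the input size, the feature map counts, and the halving by down-sampling, but not the convolution kernel size; the 5*5 kernel with stride 1 and no padding used below is an assumption taken from the classic LeNet-5 design, and the function name `lenet5_shapes` is hypothetical.

```python
def lenet5_shapes(input_size=32, kernel=5, maps=(6, 16)):
    """Trace (layer, feature_maps, rows, cols) through the two
    convolution + down-sampling stages.

    kernel : assumed 5x5 convolution, stride 1, no padding (not stated in the text)
    maps   : feature map counts of the two convolutional layers (6 and 16)
    """
    shapes = []
    size = input_size
    for n_maps in maps:
        size = size - kernel + 1          # valid convolution shrinks each side
        shapes.append(("conv", n_maps, size, size))
        size //= 2                        # down-sampling halves rows and columns
        shapes.append(("pool", n_maps, size, size))
    return shapes
```

Under these assumptions the second down-sampling stage outputs 16 maps of 5*5, i.e. 400 values, which would feed the fully connected hidden layer of 120 nodes.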
The training samples are samples of normalized character connected domains obtained with the binarization method described above; that is, training samples and prediction samples are obtained and normalized in the same way, which improves recognition accuracy.
The character connected domains of step S2 are normalized and fed into the convolutional neural network to obtain the character corresponding to each character connected domain.
Step S6: defining an output order and outputting the recognition results in that order in LaTeX format;
The defined output order has three levels:
The first level is the line order: according to the correspondence between character connected domains and lines, the character connected domains are output line by line;
The second level is the column order: within each line, all character connected domains are sorted in ascending order of their left-end column coordinate;
The third level is the order within formula special structures: the elements of a system of equations are output equation by equation; the elements of a fraction are output numerator first, then denominator.
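The first two levels of the output order amount to a sort with a two-part key, as the sketch below shows. The box layout `(row_start, col_start, row_end, col_end)`, the list `line_of` holding the line index of each box, and the function name `output_order` are assumptions for illustration; the third level (order within special structures) is not handled here.

```python
def output_order(boxes, line_of):
    """Return box indices sorted by line first, then by left-end column.

    boxes   : list of (row_start, col_start, row_end, col_end)
    line_of : line_of[k] is the index of the line that boxes[k] belongs to
    """
    return sorted(range(len(boxes)), key=lambda k: (line_of[k], boxes[k][1]))
```

The returned indices can then be used to read the recognized characters in order before the special structures are rearranged.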
For the order within formula special structures, the character blocks contained in each formula special structure element must be determined;
A brace represents the special structure of a system of equations, so the column coordinate at which the system of equations ends must be determined in order to determine all the character blocks it contains. According to the position of each character block within the current line, the blocks are divided into three parts: top, middle, and bottom. Every character block located in the top or bottom part is regarded as an element of the system of equations; all such character blocks are found, and the end column of the rightmost one is taken as the end column of the whole system of equations. All character blocks located between the brace and the end column of the system of equations are assigned to the current system of equations structure. The interior of the system of equations structure is then divided into lines again to determine how many equations it contains, and the character blocks inside the system of equations are output in equation order;
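Determining the end column of a system of equations can be sketched as follows. The description does not say how the top, middle, and bottom parts are delimited, so this sketch assumes three equal bands of the line's row range and classifies each block by the row of its center; both choices, the box layout `(row_start, col_start, row_end, col_end)`, and the function name are hypothetical.

```python
def system_end_column(line, blocks):
    """Return the end column of the system of equations in a line, or None.

    line   : (start_row, end_row) of the line containing the brace
    blocks : character blocks to the right of the brace,
             each (row_start, col_start, row_end, col_end)
    """
    start, end = line
    third = (end - start + 1) / 3          # assumed: three equal bands
    end_columns = []
    for row_start, _, row_end, col_end in blocks:
        center = (row_start + row_end) / 2
        in_top = center < start + third
        in_bottom = center > end - third
        if in_top or in_bottom:
            # Blocks in the top or bottom part are elements of the system.
            end_columns.append(col_end)
    return max(end_columns) if end_columns else None
```

Blocks in the middle band are deliberately ignored when locating the end column, since ordinary text that follows the system of equations on the same line would typically sit there.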
For a fraction line, all numerator and denominator elements of the current fraction must be determined: every character block whose starting column coordinate is greater than the starting column coordinate of the fraction line and whose ending column coordinate is less than the ending column coordinate of the fraction line is assigned to the current fraction structure. For each character block in the fraction structure, it must further be determined whether it belongs to the numerator or the denominator, based on its row coordinates: if the row coordinate of the bottom of the character block is less than the row coordinate of the center of the fraction line, it belongs to the numerator; if the row coordinate of the top of the character block is greater than the row coordinate of the center of the fraction line, it belongs to the denominator;
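The fraction rule can be sketched as below, with row numbers increasing downward so that the numerator has smaller row coordinates than the fraction line. The block layout `(row_start, col_start, row_end, col_end, symbol)`, the function names, and the use of the standard LaTeX command `\frac` for the output are assumptions of this sketch; the description only states that the result is output in LaTeX format, numerator first.

```python
def split_fraction(bar, blocks):
    """Split blocks into (numerator, denominator) for one fraction line.

    bar    : (row_start, col_start, row_end, col_end) of the fraction line
    blocks : list of (row_start, col_start, row_end, col_end, symbol)
    """
    bar_r0, bar_c0, bar_r1, bar_c1 = bar
    center_row = (bar_r0 + bar_r1) / 2
    numerator, denominator = [], []
    for block in blocks:
        r0, c0, r1, c1, _ = block
        # The block must lie strictly within the column span of the line.
        if not (c0 > bar_c0 and c1 < bar_c1):
            continue
        if r1 < center_row:            # bottom of block above the line center
            numerator.append(block)
        elif r0 > center_row:          # top of block below the line center
            denominator.append(block)

    def left_column(block):
        return block[1]

    return sorted(numerator, key=left_column), sorted(denominator, key=left_column)


def fraction_to_latex(bar, blocks):
    """Emit numerator first, then denominator, as a LaTeX \\frac expression."""
    numerator, denominator = split_fraction(bar, blocks)
    top = "".join(block[4] for block in numerator)
    bottom = "".join(block[4] for block in denominator)
    return "\\frac{%s}{%s}" % (top, bottom)
```

Blocks outside the column span of the fraction line are left untouched, so they remain available for output in the ordinary line and column order.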
For a radical sign, the character blocks located inside it must be determined: every character block whose starting column coordinate is greater than the starting column coordinate of the radical sign and whose ending column coordinate is less than the ending column coordinate of the radical sign is assigned to the current radical sign structure;
The final formula structure is determined according to the line order, the column order, and the order within formula special structures described above, and is output in LaTeX format.
The above embodiment effectively solves the problem of representing elementary mathematical formulas in OCR and achieves accurate recognition of formulas.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (10)

1. A method for analyzing and recognizing the structure of handwritten mathematical formulas in natural scene images, characterized in that the method comprises:
Step S1: converting the gray matrix of the natural scene image into a local contrast matrix, and binarizing the resulting local contrast matrix with the Otsu method to obtain a binary matrix;
Step S2: performing connected domain analysis on the binary matrix of step S1 and rejecting non-character connected domains to obtain character connected domains;
Step S3: detecting formula special structure elements among the character connected domains of step S2 using the correlation coefficient method, and marking each detected special structure element separately;
Step S4: dividing the binary matrix of step S1 into lines using the horizontal projection method;
Step S5: recognizing each character connected domain using a convolutional neural network;
Step S6: defining an output order and outputting the recognition results in that order in LaTeX format.
2. The method according to claim 1, characterized in that the local contrast Con(i, j) of the point with coordinates (i, j) in the local contrast matrix is computed as:
$$\mathrm{Con}(i,j) = \alpha\, C(i,j) + (1-\alpha)\,\bigl(I_{\max}(i,j) - I_{\min}(i,j)\bigr)$$
where
$$C(i,j) = \frac{I_{\max}(i,j) - I_{\min}(i,j)}{I_{\max}(i,j) + I_{\min}(i,j) + \epsilon}$$
$I_{\max}(i,j)$ and $I_{\min}(i,j)$ are, respectively, the maximum and minimum gray values in the neighborhood centered on the point (i, j) of the image's gray matrix; the neighborhood radius is set to 5 here;
α is a weight that depends on std and γ, where std denotes the standard deviation of the gray matrix and γ = 1;
ε is a very small positive number that prevents the denominator from being 0.
3. The method according to claim 2, characterized in that the local contrast matrix is binarized with the Otsu method as follows: take the maximum and minimum values of the local contrast matrix, divide the range between them into n small intervals, and assign each element to its corresponding interval to form a histogram; Otsu thresholding is then applied to this histogram; points below the selected threshold are background points, and points above it are character points.
4. The method according to claim 3, characterized in that the method in step S1 of performing connected domain analysis on the binary matrix and rejecting non-character connected domains to obtain the character connected domains is:
Step S201: obtain the minimum bounding rectangle of each connected domain, record the coordinates of the four vertices of this rectangle, and calculate its length and height;
Step S202: compute the average length and average height of all connected domains;
Step S203: reject the non-character connected domains:
If the length and height of a connected domain are both less than 1/4 of the average length and height, it is regarded as a noise point and is rejected;
If the length and height of a connected domain are both greater than 4 times the average length and height, it is regarded as a non-character part of the image and is rejected;
Step S204: keep the remaining connected domains as character connected domains.
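Steps S201 to S204 can be sketched as a filter over bounding boxes. The box layout `(top, left, bottom, right)` with inclusive pixel indices and the function name are assumptions made for illustration; extracting the connected domains themselves is not shown.

```python
def filter_character_boxes(boxes):
    """Keep boxes that are neither noise (too small) nor non-character regions (too large).

    boxes: list of (top, left, bottom, right) bounding rectangles, inclusive indices.
    """
    if not boxes:
        return []
    # S201: length and height of every bounding rectangle.
    sizes = [(right - left + 1, bottom - top + 1) for top, left, bottom, right in boxes]
    # S202: averages over all connected domains.
    avg_len = sum(length for length, _ in sizes) / len(sizes)
    avg_hgt = sum(height for _, height in sizes) / len(sizes)

    kept = []
    for box, (length, height) in zip(boxes, sizes):
        is_noise = length < avg_len / 4 and height < avg_hgt / 4
        is_non_character = length > 4 * avg_len and height > 4 * avg_hgt
        # S203/S204: drop both kinds, keep the rest.
        if not (is_noise or is_non_character):
            kept.append(box)
    return kept
```

Note that a box is rejected only when both dimensions cross the limit, so a long, thin fraction line survives this filter.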
5. The method according to claim 4, characterized in that the formula special-structure elements described in step S3 comprise braces, radical signs, and fraction lines;
Fraction line connected domains are detected by rule matching: a connected domain whose length-to-width ratio is greater than 5 and which has adjacent connected domains both above and below it is identified as a fraction line connected domain;
Brace connected domains and radical sign connected domains are detected by template matching:
Step S301: select the standard binary templates of the brace connected domain and the radical sign connected domain;
Step S302: normalize the size of the current connected domain so that it has the same size as the standard template;
Step S303: match each standard binary template against the current connected domain;
The matching formula is the correlation coefficient, expressed as:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where x_i and y_i denote the value of the i-th element in the current template and the standard template respectively, and x̄ and ȳ denote the means of the current template and the standard template respectively; r ∈ (0, 1), and the match succeeds when r is greater than 0.5.
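The matching step can be illustrated with a small correlation-coefficient routine. This is a sketch under assumptions: templates are given as flattened lists of 0/1 values that have already been resized to the same length, the function names are illustrative, and a constant template (zero variance) is treated as a non-match.

```python
import math


def correlation(x, y):
    """Correlation coefficient between two equally sized, flattened templates."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("templates must be non-empty and of equal size")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    # Assumption: a constant template carries no shape information.
    return num / den if den else 0.0


def is_match(current, standard, threshold=0.5):
    """The match succeeds when r exceeds the threshold (0.5 in the claim)."""
    return correlation(current, standard) > threshold


standard = [0, 1, 1, 0, 1, 0, 0, 1]
inverted = [1 - v for v in standard]
# An identical pattern matches; an inverted one gives a negative
# coefficient and is rejected.
```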
6. The method according to claim 5, characterized in that the method in step S4 of dividing the binary matrix into rows by horizontal projection is:
Perform horizontal projection on the binary matrix from step S1 to obtain a waveform, in which the abscissa is the row number of the original image and the ordinate is the number of character points contained in that row;
Expand from each peak of the waveform to both sides, and stop expanding when the value falls below 0.1 times the peak value; if the expansions of two adjacent peaks overlap, their two corresponding rows are merged into one row;
Record the start and end position of each row: the abscissa corresponding to the left end of the peak is the starting row coordinate of the current row, and the abscissa corresponding to the right end of the peak is the ending row coordinate of the current row.
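The row division described above can be sketched as follows. The peak definition (a local maximum of the projection, with the first sample of a plateau taken as the peak) and the function name are assumptions; the 0.1 ratio comes from the claim.

```python
def split_rows(binary, ratio=0.1):
    """Return (start_row, end_row) spans of text rows in a 0/1 matrix."""
    # Horizontal projection: number of character points in each image row.
    profile = [sum(row) for row in binary]
    n = len(profile)
    spans = []
    for p, value in enumerate(profile):
        if value == 0:
            continue
        rising = p == 0 or profile[p - 1] < value
        not_rising_after = p == n - 1 or profile[p + 1] <= value
        if not (rising and not_rising_after):
            continue  # not a peak
        limit = ratio * value
        start = end = p
        # Expand to both sides until the projection drops below the limit.
        while start > 0 and profile[start - 1] >= limit:
            start -= 1
        while end < n - 1 and profile[end + 1] >= limit:
            end += 1
        spans.append((start, end))

    # Merge spans whose expansions overlap.
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Each returned pair gives the starting and ending row coordinate of one text row.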
7. The method according to claim 6, characterized in that, after the start and end position information of each row is obtained, each character connected domain is associated with a row, specifically:
Calculate the distance between the abscissa of the center of each character connected domain and the abscissa of the center of each text row, and assign the character connected domain to the row for which this distance is smallest.
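A minimal sketch of this assignment is shown below. It assumes that the coordinate compared here is the row index (consistent with the projection waveform, whose abscissa is the image row number), that boxes are `(top, left, bottom, right)` tuples, and that text rows are `(start_row, end_row)` pairs; these layouts and the function name are illustrative.

```python
def assign_to_rows(boxes, rows):
    """Return, for each box, the index of the text row with the nearest center."""
    assignment = []
    for top, _left, bottom, _right in boxes:
        center = (top + bottom) / 2
        nearest = min(
            range(len(rows)),
            key=lambda r: abs(center - (rows[r][0] + rows[r][1]) / 2),
        )
        assignment.append(nearest)
    return assignment


rows = [(0, 10), (20, 30)]
boxes = [(2, 0, 8, 5), (22, 3, 29, 9), (12, 0, 16, 4)]
# The third box lies between the rows and goes to the closer row center.
row_index = assign_to_rows(boxes, rows)
```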
8. The method according to claim 7, characterized in that the convolutional neural network in step S4 has the LeNet-5 structure and consists of one input layer, two convolution and down-sampling layers, one fully connected hidden layer, and one output layer;
The training data of the convolutional neural network are samples of normalized character connected domains;
The character connected domains from step S2 are normalized and input into the convolutional neural network to obtain the character corresponding to each character connected domain.
9. The method according to claim 8, characterized in that the output order defined in step S6 comprises three layers:
The first-layer order relation is the row order: according to the correspondence between character connected domains and rows, the character connected domains are output row by row;
The second-layer order relation is the column order: within each row, all character connected domains are sorted in ascending order of their left-end column coordinate;
The third-layer order relation is the order within formula special structures: in a system of equations, elements are output equation by equation; fraction elements are output numerator first, then denominator.
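The first two ordering layers amount to a two-key sort, as the sketch below shows. The dictionary keys `row`, `left`, and `char` and the function name are hypothetical; the third layer (special structures) needs the grouping logic of the following claim and is not covered here.

```python
def reading_order(components):
    """Sort recognized components by row first, then by left-end column."""
    return sorted(components, key=lambda c: (c["row"], c["left"]))


components = [
    {"char": "=", "row": 0, "left": 12},
    {"char": "y", "row": 1, "left": 0},
    {"char": "x", "row": 0, "left": 3},
]
text = "".join(c["char"] for c in reading_order(components))
```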
10. The method according to claim 9, characterized in that, for the order relation within formula special structures, the character blocks contained in each formula special-structure element need to be determined;
For a brace, which represents the special structure of a system of equations, the column coordinate at which the system of equations ends needs to be determined in order to determine all the character blocks it contains; according to its position within the current row, each character block is classified into one of three parts, "top, middle, bottom"; every character block located in the top or bottom part is regarded as an element of the system of equations; all such character blocks are found, and the end column of the rightmost one is taken as the end column of the whole system of equations; all character blocks located between the brace and the end column of the system of equations are assigned to the current system-of-equations structure; row division is then performed again inside the system-of-equations structure to determine how many equations it contains, and the character blocks inside the system of equations are output in equation order;
For a fraction line, all numerator and denominator elements of the current fraction need to be determined: every character block whose starting ordinate is greater than the starting ordinate of the fraction line and whose ending ordinate is less than the ending ordinate of the fraction line is assigned to the current fraction structure; each character block in the fraction structure must further be determined to be numerator or denominator according to its abscissa: if the abscissa of the bottom of the character block is less than the abscissa of the center of the fraction line, it belongs to the numerator; if the abscissa of the top of the character block is greater than the abscissa of the center of the fraction line, it belongs to the denominator;
For a radical sign, the character blocks located inside it need to be determined: every character block whose starting ordinate is greater than the starting ordinate of the radical sign and whose ending ordinate is less than the ending ordinate of the radical sign is assigned to the current radical sign structure;
According to the row order, the column order, and the order within formula special structures described above, the final formula structure output is determined and is output in LaTeX typesetting format.
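The fraction rule and the LaTeX output can be sketched together. The sketch assumes that the claim's abscissa is the row index (`top`, `bottom`) and its ordinate the column index (`left`, `right`); the dictionary layout and the name `fraction_to_latex` are hypothetical, and nested structures are not handled.

```python
def fraction_to_latex(line, blocks):
    """Group character blocks around a fraction line and emit \\frac{...}{...}.

    Returns the LaTeX string and the blocks that do not belong to the fraction.
    """
    line_center = (line["top"] + line["bottom"]) / 2
    numerator, denominator, outside = [], [], []
    for block in blocks:
        # Inside the fraction: strictly within the column extent of the line.
        inside = block["left"] > line["left"] and block["right"] < line["right"]
        if inside and block["bottom"] < line_center:
            numerator.append(block)      # entirely above the line center
        elif inside and block["top"] > line_center:
            denominator.append(block)    # entirely below the line center
        else:
            outside.append(block)

    def by_column(items):
        return "".join(b["char"] for b in sorted(items, key=lambda b: b["left"]))

    latex = "\\frac{" + by_column(numerator) + "}{" + by_column(denominator) + "}"
    return latex, outside


line = {"top": 10, "bottom": 11, "left": 0, "right": 30}
blocks = [
    {"char": "b", "top": 0, "bottom": 8, "left": 18, "right": 24},
    {"char": "a", "top": 0, "bottom": 8, "left": 2, "right": 8},
    {"char": "+", "top": 2, "bottom": 6, "left": 10, "right": 16},
    {"char": "c", "top": 14, "bottom": 22, "left": 10, "right": 16},
    {"char": "x", "top": 5, "bottom": 15, "left": 40, "right": 46},
]
latex, rest = fraction_to_latex(line, blocks)
```

The same column-extent test would apply to the radical sign, with `\sqrt{...}` as the emitted structure.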
CN201510531070.4A 2015-08-26 2015-08-26 Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image Active CN105184292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510531070.4A CN105184292B (en) 2015-08-26 2015-08-26 Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510531070.4A CN105184292B (en) 2015-08-26 2015-08-26 Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image

Publications (2)

Publication Number Publication Date
CN105184292A (en) 2015-12-23
CN105184292B (en) 2018-08-03

Family

ID=54906358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510531070.4A Active CN105184292B (en) 2015-08-26 2015-08-26 Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image

Country Status (1)

Country Link
CN (1) CN105184292B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060062470A1 (en) * 2004-09-22 2006-03-23 Microsoft Corporation Graphical user interface for expression recognition
CN102184399A (en) * 2011-03-31 2011-09-14 上海名图信息技术有限公司 Character segmenting method based on horizontal projection and connected domain analysis
CN102542273A (en) * 2011-12-02 2012-07-04 方正国际软件有限公司 Detection method and system for complex formula areas in document image
CN103810493A (en) * 2012-11-06 2014-05-21 夏普株式会社 Method and apparatus for identifying mathematical formula
CN104050471A (en) * 2014-05-27 2014-09-17 华中科技大学 Natural scene character detection method and system

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017031716A1 (en) * 2015-08-26 2017-03-02 北京云江科技有限公司 Method for analyzing and recognizing handwritten mathematical formula structure in natural scene image
US10354133B2 (en) 2015-08-26 2019-07-16 Beijing Lejent Technology Co., Ltd. Method for structural analysis and recognition of handwritten mathematical formula in natural scene image
CN105844275B (en) * 2016-03-25 2019-08-23 北京云江科技有限公司 The localization method of line of text in text image
CN105844275A (en) * 2016-03-25 2016-08-10 北京云江科技有限公司 Method for positioning text lines in text image
CN106709394A (en) * 2016-12-12 2017-05-24 北京慧眼智行科技有限公司 Image processing method and device
CN106709394B (en) * 2016-12-12 2019-07-05 北京慧眼智行科技有限公司 A kind of image processing method and device
CN107169485A (en) * 2017-03-28 2017-09-15 北京捷通华声科技股份有限公司 A kind of method for identifying mathematical formula and device
CN107886065A (en) * 2017-11-06 2018-04-06 哈尔滨工程大学 A kind of Serial No. recognition methods of mixing script
CN109993040B (en) * 2018-01-03 2021-07-30 北京世纪好未来教育科技有限公司 Text recognition method and device
CN109993040A (en) * 2018-01-03 2019-07-09 北京世纪好未来教育科技有限公司 Text recognition method and device
CN110135425A (en) * 2018-02-09 2019-08-16 北京世纪好未来教育科技有限公司 Sample mask method and computer storage medium
CN110135426A (en) * 2018-02-09 2019-08-16 北京世纪好未来教育科技有限公司 Sample mask method and computer storage medium
CN110135407A (en) * 2018-02-09 2019-08-16 北京世纪好未来教育科技有限公司 Sample mask method and computer storage medium
WO2019232850A1 (en) * 2018-06-04 2019-12-12 平安科技(深圳)有限公司 Method and apparatus for recognizing handwritten chinese character image, computer device, and storage medium
CN108898142A (en) * 2018-06-15 2018-11-27 宁波云江互联网科技有限公司 A kind of recognition methods and calculating equipment of handwritten formula
CN108898142B (en) * 2018-06-15 2022-03-18 宁波云江互联网科技有限公司 Recognition method of handwritten formula and computing device
CN109239073A (en) * 2018-07-28 2019-01-18 西安交通大学 A kind of detection method of surface flaw for body of a motor car
CN109117848A (en) * 2018-09-07 2019-01-01 泰康保险集团股份有限公司 A kind of line of text character identifying method, device, medium and electronic equipment
CN109886093A (en) * 2019-01-08 2019-06-14 深圳禾思众成科技有限公司 A kind of formula detection method, equipment and computer readable storage medium
CN109977861A (en) * 2019-03-25 2019-07-05 中国科学技术大学 Offline handwritten form method for identifying mathematical formula
CN110020655A (en) * 2019-04-19 2019-07-16 厦门商集网络科技有限责任公司 A kind of character denoising method and terminal based on binaryzation
CN110533035B (en) * 2019-08-28 2022-02-15 海南阿凡题科技有限公司 Student homework page number identification method based on text matching
CN110533035A (en) * 2019-08-28 2019-12-03 海南阿凡题科技有限公司 Students' work page number recognition methods based on text matches
CN110569853A (en) * 2019-09-12 2019-12-13 南京红松信息技术有限公司 Target positioning-based independent formula segmentation method
CN110569853B (en) * 2019-09-12 2022-11-29 南京红松信息技术有限公司 Target positioning-based independent formula segmentation method
CN111027561A (en) * 2019-11-22 2020-04-17 广州寄锦教育科技有限公司 Mathematical formula positioning method, system, readable storage medium and computer equipment
CN111079745A (en) * 2019-12-11 2020-04-28 中国建设银行股份有限公司 Formula identification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN105184292B (en) 2018-08-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 571924 Hainan Old City high-tech industrial demonstration area Hainan Ecological Software Park Walker Park 8811

Applicant after: Hainan Cloud River Technology Co., Ltd.

Address before: 100083 Haidian District Zhongguancun Road East 16 Longhu Downing 8 8 2801

Applicant before: Beijing Yun Jiang Science and Technology Ltd.

GR01 Patent grant
GR01 Patent grant