CN110210467A - Formula positioning method for a text image, image processing apparatus, and storage medium - Google Patents
- Publication number: CN110210467A
- Application number: CN201910452711.5A
- Authority: CN (China)
- Prior art keywords: formula, value, text, target, abscissa value
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/225—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on a marking or identifier characterising the area
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Character Input (AREA)
Abstract
This application discloses a formula positioning method for a text image, an image processing apparatus, and a storage medium. The formula positioning method includes: obtaining text positioning information and attention information of a text line; calculating a formula coordinate set and a formula boundary set of the text line according to the text positioning information and the attention information; and calculating the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set. In this way, the formulas in a text image can be positioned accurately.
Description
Technical field
This application relates to the technical field of image processing, and in particular to a formula positioning method for a text image, an image processing apparatus, and a storage medium.
Background
With the development of mobile Internet technology, large numbers of handheld mobile terminals such as smartphones and tablet computers have entered our lives and become an indispensable part of them. These handheld terminals all have a camera function, which makes it very convenient to capture document information at any time.
Scientific formulas, as a special kind of information carrier, are also widespread in text documents. Practical applications frequently need to locate and extract scientific formulas, so how to position the formulas in a text image has become an urgent problem.
Summary of the invention
To solve the above problem, this application provides a formula positioning method for a text image, an image processing apparatus, and a storage medium, which can accurately position the formulas in a text image.
One technical solution adopted by this application is to provide a formula positioning method for a text image, the method comprising: obtaining text positioning information and attention information of a text line; calculating a formula coordinate set and a formula boundary set of the text line according to the text positioning information and the attention information; and calculating the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set.
The step of calculating the formula coordinate set and the formula boundary set of the text line according to the text positioning information and the attention information comprises: obtaining, according to the attention information, the attention information vector of a target special character in a target text line, where a special character is a character in a formula; judging whether the index value corresponding to the maximum value in the attention information vector is 0; and if so, adding the adjacent-formula information of the target special character to the formula boundary set.
The method further comprises: calculating the width value of the target text line according to the text positioning information of the target text line; calculating the abscissa value of the target special character according to the width value of the target text line and the index value corresponding to the maximum value in the attention information vector; determining a maximum abscissa value and a minimum abscissa value according to the abscissa value of the target special character; and calculating the formula coordinate set according to the text positioning information, the maximum abscissa value, and the minimum abscissa value.
The step of calculating the width value of the target text line according to the text positioning information of the target text line comprises: calculating the normalized width value w_m of the target text line using a formula (the formula image is not reproduced in this text), where w is the coordinate width of the target text line and h is the height of the target text line.
The step of calculating the abscissa value of the target special character according to the width value of the target text line and the index value corresponding to the maximum value in the attention information vector comprises: calculating the abscissa value C_w of the target special character using a formula (the formula image is not reproduced in this text), where w is the width of the target text line, aidx is the index value corresponding to the maximum value in the attention information vector, and w_m is the normalized width value of the target text line.
The step of determining the maximum abscissa value and the minimum abscissa value according to the abscissa value of the target special character comprises: determining an initial maximum abscissa value and an initial minimum abscissa value according to the initial abscissa value of the target special character; when a new abscissa value of the target special character is obtained, comparing the new abscissa value with the maximum abscissa value and the minimum abscissa value; if the new abscissa value is less than the minimum abscissa value, updating the minimum abscissa value; and if the new abscissa value is greater than the maximum abscissa value, updating the maximum abscissa value.
The step of calculating the formula coordinate set according to the text positioning information, the maximum abscissa value, and the minimum abscissa value comprises: calculating the formula coordinate set using a formula (the formula image is not reproduced in this text), where x_1 is the left abscissa of the formula, x_2 is the right abscissa of the formula, y_1 is the upper ordinate of the formula, y_2 is the lower ordinate of the formula, b_i0 is the left abscissa in the text positioning information, b_i2 is the upper ordinate in the text positioning information, b_i3 is the lower ordinate in the text positioning information, W_min is the minimum abscissa value, and W_max is the maximum abscissa value.
The step of calculating the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set comprises: judging whether the previous text line of the target text line is in the formula boundary set; if so, judging whether the target text line has adjacent-formula information; and if so, merging the first formula coordinate in the target text line with the last formula coordinate of the previous text line.
The method further comprises: obtaining a binarized image of the target formula region according to the formula coordinate set; performing an ordinate projection on the binarized image to obtain the ordinates of the target formula; and updating the formula coordinate set with the ordinates of the target formula.
Another technical solution adopted by this application is to provide an image processing apparatus, the apparatus comprising: an obtaining module for obtaining text positioning information and attention information of a text line; a first computing module for calculating a formula coordinate set and a formula boundary set of the text line according to the text positioning information and the attention information; and a second computing module for calculating the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set.
Another technical solution adopted by this application is to provide an image processing apparatus comprising a processor and a memory, where the memory stores program data and the processor executes the program data to implement the method described above.
Another technical solution adopted by this application is to provide a computer storage medium for storing program data which, when executed by a processor, implements the method described above.
The formula positioning method for a text image provided by this application includes: obtaining text positioning information and attention information of a text line; calculating a formula coordinate set and a formula boundary set of the text line according to the text positioning information and the attention information; and calculating the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set. In this way, the attention information can be used to position the formulas in a text image, which lays the foundation for subsequent formula recognition and allows the image of the formula to be obtained accurately.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 is a flow diagram of the formula positioning method for a text image provided by an embodiment of this application;
Fig. 2 is a first schematic diagram of text positioning coordinates provided by an embodiment of this application;
Fig. 3 is a flow diagram of obtaining the formula boundary set provided by an embodiment of this application;
Fig. 4 is a flow diagram of obtaining the formula set provided by an embodiment of this application;
Fig. 5 is a flow diagram of determining the maximum abscissa and the minimum abscissa in an embodiment of this application;
Fig. 6 is a second schematic diagram of text positioning coordinates provided by an embodiment of this application;
Fig. 7 is a flow diagram of calculating the formula positioning coordinates in an embodiment of this application;
Fig. 8 is a logical schematic diagram of the formula positioning method for a text image provided by an embodiment of this application;
Fig. 9 is a logical schematic diagram of obtaining the formula positioning coordinate set provided by an embodiment of this application;
Fig. 10 is a logical schematic diagram of formula coordinate merging provided by an embodiment of this application;
Fig. 11 is a first structural schematic diagram of the image processing apparatus provided by an embodiment of this application;
Fig. 12 is a second structural schematic diagram of the image processing apparatus provided by an embodiment of this application;
Fig. 13 is a structural schematic diagram of the computer storage medium provided by an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. It should be understood that the specific embodiments described here only explain this application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to this application rather than all structures. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and the like in this application distinguish different objects and do not describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units; it optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.
A reference to an "embodiment" here means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of this application. Occurrences of the phrase at various places in the description do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described here can be combined with other embodiments.
Referring to Fig. 1, which is a flow diagram of the formula positioning method for a text image provided by an embodiment of this application, the method includes:
Step 11: obtain the text positioning information and attention information of a text line.
A text image, also called a document image, is a document in picture format: a paper document converted into a picture by some means so that the user can read it electronically. Common text image formats include JPG (JPEG), BMP, PNG, GIF, FSP, TIFF, TGA, and EPS.
Optionally, the text positioning information may be the positioning coordinates of the text. Text is usually arranged line by line, and the positioning coordinates are usually the coordinates of the upper-left and lower-right points of the rectangular region that contains one line of text.
As shown in Fig. 2, which is a first schematic diagram of text positioning coordinates provided by an embodiment of this application, A (x1, y1) is the coordinate point of the upper-left corner of the text line and B (x2, y2) is the coordinate point of the lower-right corner of the text line.
In a specific implementation, the text image may first be converted to grayscale.
Gray level is the most direct visual feature describing the content of a grayscale image. It refers to the color depth of a point in a black-and-white image, generally ranging from 0 to 255, with white being 255 and black being 0; black-and-white images are therefore also called grayscale images. The elements of a grayscale image matrix usually take values in [0, 255], so the data type is generally an 8-bit unsigned integer, which is what is commonly called 256-level grayscale. When a color image is converted to a grayscale image, the effective luminance of each pixel is calculated as: Y = 0.3R + 0.59G + 0.11B.
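The luminance formula above can be sketched in a few lines. This is a minimal illustration, assuming the image is a nested list of (R, G, B) tuples; the function names `to_gray` and `image_to_gray` are hypothetical and do not come from the patent.

```python
def to_gray(pixel):
    """Return Y = 0.3R + 0.59G + 0.11B for one (R, G, B) pixel, as an int in 0..255."""
    r, g, b = pixel
    y = 0.3 * r + 0.59 * g + 0.11 * b
    # Round to the nearest integer and clamp to the 8-bit range.
    return min(255, max(0, int(round(y))))


def image_to_gray(rgb_image):
    """Convert a nested list of (R, G, B) pixels to a nested list of gray levels."""
    return [[to_gray(p) for p in row] for row in rgb_image]
```

A real pipeline would normally use an image library for this step; the pure-Python form is only meant to make the weighting explicit.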
Next, the text image is denoised.
Optionally, a Gaussian filter may be used to smooth the grayscale image. Gaussian filtering is a weighted averaging of the whole image: the value of each pixel is obtained as the weighted average of its own value and the values of the other pixels in its neighborhood. Concretely, a template (also called a convolution kernel or mask) scans every pixel in the image, and the value of the pixel at the template center is replaced by the weighted average gray value of the pixels in the neighborhood determined by the template.
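The template scan described above can be sketched as follows. The 3x3 kernel, the replicated border, and the name `gaussian_smooth` are assumptions chosen for illustration; the patent does not specify a kernel size or border handling.

```python
# A common 3x3 integer approximation of a Gaussian kernel (weights sum to 16).
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def gaussian_smooth(gray):
    """Replace every pixel with the weighted average of its 3x3 neighborhood."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp coordinates so border pixels reuse their nearest neighbor.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * gray[yy][xx]
            # Divide by the kernel sum, rounding to the nearest integer.
            out[y][x] = (acc + 8) // 16
    return out
```

A uniform image is left unchanged by this filter, while an isolated bright pixel is spread over its neighbors, which is the smoothing effect the denoising step relies on.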
Then, the text image is binarized and inverted.
Image binarization sets the gray value of every pixel in the image to either 0 or 255, so that the whole image shows a clear black-and-white effect.
The inverse (complementary) color of a color is the color that gives white when added to the original color; it is obtained by subtracting the original color from white (RGB: 255, 255, 255). For example, the inverse of red (RGB: 255, 0, 0) is cyan (RGB: 0, 255, 255). For the binarized image, inversion changes gray value 0 to gray value 255 and gray value 255 to gray value 0.
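Binarization followed by inversion can be sketched as below. The fixed threshold of 128 and the function names are assumptions for illustration only; the patent does not state how the threshold is chosen.

```python
def binarize(gray, threshold=128):
    """Map every gray level to 255 (at or above threshold) or 0 (below it)."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]


def invert(binary):
    """Swap black and white: 0 becomes 255 and 255 becomes 0."""
    return [[255 - v for v in row] for row in binary]
```

After inversion, dark ink on a light page becomes the nonzero foreground, which is convenient for later projection and edge steps.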
Finally, edge detection may be performed on the text image.
Optionally, the Canny edge algorithm may be used. The goal of the Canny algorithm is to find an optimal edge detection algorithm, where optimal edge detection means:
(1) Optimal detection: the algorithm identifies as many of the actual edges in the image as possible, and the probabilities of missing a true edge and of falsely detecting a non-edge are both as small as possible;
(2) Optimal localization: the detected edge points are as close as possible to the positions of the actual edge points, or the deviation of the detected edge from the true edge of the object caused by noise is minimal;
(3) One-to-one correspondence: the edge points detected by the operator correspond one to one with the actual edge points.
The Canny edge algorithm may include the following steps:
(1) find the intensity gradients of the image;
(2) apply non-maximum suppression to eliminate false edge detections (points detected as edges that are not actually edges);
(3) determine the potential edges using a double threshold;
(4) track the edges using hysteresis.
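Steps (3) and (4) of the list above can be sketched on their own. The following is a simplified illustration that assumes a gradient magnitude map has already been computed and thinned by steps (1) and (2); the name `hysteresis` and the threshold values in the usage note are hypothetical.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to them."""
    h, w = len(magnitude), len(magnitude[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque()
    # Double threshold: pixels at or above `high` are definite edges.
    for y in range(h):
        for x in range(w):
            if magnitude[y][x] >= high:
                edges[y][x] = True
                queue.append((y, x))
    # Hysteresis: grow from strong pixels into 8-connected weak pixels.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx]
                        and magnitude[ny][nx] >= low):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges
```

With `low=40` and `high=100`, a weak response next to a strong one is kept as part of the edge, while an isolated weak response is discarded as noise.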
After the text image to be corrected has been preprocessed in this way, the first inclination information is obtained.
It can be understood that, with the text image preprocessed as above, the upper-left and lower-right points of the region containing a text line can be identified and located during image recognition.
OCR (Optical Character Recognition) refers to the process in which an electronic device (such as a scanner or digital camera) examines the characters printed on paper, determines their shapes by detecting dark and bright patterns, and then translates the shapes into computer text using a character recognition method. In other words, for printed characters, the text in a paper document is converted optically into a black-and-white dot-matrix image file, and recognition software converts the text in the image into a text format that a word processor can further edit.
AOCR (Attention OCR) is an algorithm that uses an attention mechanism to recognize a single line of text. It usually takes CNN (Convolutional Neural Network) features as input; an attention model computes the attention weights of the new state from the state of an RNN (Recurrent Neural Network) and the attention weights of the previous state. The CNN features and the weights are then fed into the RNN, and the result is obtained through encoding and decoding.
Step 12: calculate the formula coordinate set and the formula boundary set of the text line according to the text positioning information and the attention information.
Step 12 has two parts: first, obtaining the formula boundary set; second, calculating the formula coordinate set.
Referring to Fig. 3, which is a flow diagram of obtaining the formula boundary set provided by an embodiment of this application, the method includes:
Step 31: obtain, according to the attention information, the attention information vector of a target special character in a target text line, where a special character is a character in a formula.
Optionally, the extracted special characters may be encoded to obtain encoded features; prediction probabilities are calculated from the encoded features; and the attention mechanism calculates the weights of the different encoded features to obtain the encoded attention information vector.
Step 32: judge whether the index value corresponding to the maximum value in the attention information vector is 0.
The attention information vector is indexed in the order 0, 1, 2, and so on. If the index value corresponding to the maximum value is 0, the maximum value is the first element of the vector, which in turn indicates that the special character is located at the head of the text line.
When the judgment result of step 32 is yes, step 33 is executed.
Step 33: add the adjacent-formula information of the target special character to the formula boundary set.
The adjacent-formula information indicates that the special character is located at the head of the text line, so the end of the previous text line and the head of this text line may together form one formula.
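Steps 31 to 33 can be sketched as follows. The patent does not define the exact contents of the adjacent-formula information, so this sketch assumes it can be represented by the line number alone; `argmax` and `record_boundary` are hypothetical names.

```python
def argmax(vector):
    """Return the index of the largest element (the first one on ties)."""
    return max(range(len(vector)), key=vector.__getitem__)


def record_boundary(line_no, attention_vector, formula_boundary_set):
    """Add line_no to the boundary set when the attention peak sits at index 0.

    Returns the peak index (aidx) so the caller can reuse it.
    """
    aidx = argmax(attention_vector)
    if aidx == 0:
        # Peak at the first position: the special character is at the line head,
        # so it may continue a formula from the previous line.
        formula_boundary_set.add(line_no)
    return aidx
```

For example, a vector such as `[0.7, 0.2, 0.1]` peaks at index 0 and marks its line as a boundary candidate, whereas `[0.1, 0.8, 0.1]` does not.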
Referring to Fig. 4, which is a flow diagram of obtaining the formula set provided by an embodiment of this application, the method includes:
Step 41: calculate the width value of the target text line according to the text positioning information of the target text line.
Optionally, the width value w_m of the target text line may be calculated using a formula (the formula image is not reproduced in this text), where w is the coordinate width of the target text line and h is the height of the target text line. The result for w_m is rounded up; for example, if the calculated w_m is 1.5, the value 2 is used.
Step 42: calculate the abscissa value of the target special character according to the width value of the target text line and the index value corresponding to the maximum value in the attention information vector.
Optionally, the abscissa value C_w of the target special character may be calculated using a formula (the formula image is not reproduced in this text), where w is the width of the target text line, aidx is the index value corresponding to the maximum value in the attention information vector, and w_m is the width value of the target text line. The result for C_w is rounded up.
Step 43: determine the maximum abscissa value and the minimum abscissa value according to the abscissa value of the target special character.
Optionally, referring to Fig. 5, which is a flow diagram of determining the maximum abscissa and the minimum abscissa in an embodiment of this application, the method includes:
Step 431: determine an initial maximum abscissa value and an initial minimum abscissa value according to the initial abscissa value of the target special character.
The abscissa value of the target special character here is the abscissa value C_w calculated in step 42. The maximum abscissa value W_max and the minimum abscissa value W_min are set in this step.
Optionally, the initial maximum abscissa value and the initial minimum abscissa value may be determined from the initial abscissa values of the target special characters obtained after traversing one text line.
Step 432: when a new abscissa value of the target special character is obtained, compare the new abscissa value with the maximum abscissa value and the minimum abscissa value.
Step 433: if the new abscissa value is less than the minimum abscissa value, update the minimum abscissa value.
If C_w is smaller than W_min, W_min is updated; optionally, the value of W_min is replaced with the value of C_w.
Step 434: if the new abscissa value is greater than the maximum abscissa value, update the maximum abscissa value.
If C_w is larger than W_max, W_max is updated; optionally, the value of W_max is replaced with the value of C_w.
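Steps 431 to 434 amount to tracking a running minimum and maximum. A minimal sketch, assuming the abscissa values C_w arrive as a non-empty sequence and that the first value initializes both bounds; `track_extent` is a hypothetical name.

```python
def track_extent(abscissas):
    """Return (W_min, W_max) over a non-empty sequence of abscissa values C_w."""
    values = iter(abscissas)
    # Step 431: the first value initializes both bounds.
    w_min = w_max = next(values)
    for c_w in values:
        # Step 433: a smaller value replaces the minimum.
        if c_w < w_min:
            w_min = c_w
        # Step 434: a larger value replaces the maximum.
        if c_w > w_max:
            w_max = c_w
    return w_min, w_max
```

The resulting pair bounds the horizontal extent of the formula characters within the text line.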
Step 44: calculate the formula coordinate set according to the text positioning information, the maximum abscissa value, and the minimum abscissa value.
Optionally, the formula coordinate set may be calculated using a formula (the formula image is not reproduced in this text), where x_1 is the left abscissa of the formula, x_2 is the right abscissa of the formula, y_1 is the upper ordinate of the formula, y_2 is the lower ordinate of the formula, b_i0 is the left abscissa in the text positioning information, b_i1 is the right abscissa in the text positioning information, b_i2 is the upper ordinate in the text positioning information, b_i3 is the lower ordinate in the text positioning information, W_min is the minimum abscissa value, and W_max is the maximum abscissa value.
Step 13: calculate the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set.
Referring to Fig. 6, which is a second schematic diagram of text positioning coordinates provided by an embodiment of this application, it can be understood that in some embodiments the formula to be positioned may not lie within a single text line; for example, the front part of the formula is at the end of one text line and the rest of the formula is at the start of the next text line. As shown in Fig. 6, the "example" part of "example text" is on the upper line and the "text" part is on the next line.
Optionally, as shown in Fig. 7, which is a flow diagram of calculating the formula positioning coordinates in an embodiment of this application, the method includes:
Step 71: judge whether the previous text line of the target text line is in the formula boundary set.
When the judgment result of step 71 is yes, step 72 is executed.
It can be understood that the judgment in step 71 reveals whether the previous text line has a formula located at the head or the end of the line, in which case that formula and the formula in the current line may be the same formula.
Step 72: judge whether the target text line has adjacent-formula information.
The adjacent-formula information is the adjacent-formula information added in step 33.
When the judgment result of step 72 is yes, step 73 is executed.
Step 73: merge the first formula coordinate in the target text line with the last formula coordinate of the previous text line.
As shown in Fig. 6, the upper-left coordinate of the "example" part of "example text" is C (x3, y3) and its lower-right coordinate is D (x4, y4); the upper-left coordinate of "text" is E (x5, y5) and its lower-right coordinate is F (x6, y6). The coordinates can then be merged to obtain the coordinates of the entire formula.
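The patent does not spell out the merging rule. The sketch below is one possible reading, labeled as an assumption: the two fragments C-D and E-F are kept together as the parts of a single formula, and an enclosing rectangle is also computed. The name `merge_fragments` and the record layout are hypothetical.

```python
def merge_fragments(prev_box, cur_box):
    """Combine two (x1, y1, x2, y2) boxes that belong to one formula.

    prev_box is the last formula box of the previous line (C, D);
    cur_box is the first formula box of the current line (E, F).
    """
    px1, py1, px2, py2 = prev_box
    cx1, cy1, cx2, cy2 = cur_box
    # Smallest rectangle that contains both fragments.
    enclosing = (min(px1, cx1), min(py1, cy1), max(px2, cx2), max(py2, cy2))
    return {"parts": [prev_box, cur_box], "enclosing": enclosing}
```

Keeping both parts preserves the exact fragment positions, since a formula that wraps from a line end to the next line head is not well described by one tight rectangle.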
In addition, the ordinates may be updated during the coordinate calculation. Specifically: obtain a binarized image of the target formula region according to the formula coordinate set; perform an ordinate projection on the binarized image to obtain the ordinates of the target formula; and update the formula coordinate set with the ordinates of the target formula.
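The ordinate projection can be sketched as a scan for rows that contain foreground pixels. This illustration assumes an inverted binary image (foreground pixels are nonzero) stored as a nested list, and boxes with inclusive (x1, y1, x2, y2) coordinates; `refine_ordinates` is a hypothetical name.

```python
def refine_ordinates(binary, box):
    """Shrink the top and bottom of a box to the rows that contain foreground pixels."""
    x1, y1, x2, y2 = box
    # Project onto the vertical axis: note each row that has any foreground pixel.
    rows_with_ink = [
        y for y in range(y1, y2 + 1)
        if any(binary[y][x] for x in range(x1, x2 + 1))
    ]
    if not rows_with_ink:
        # Nothing found in the region: leave the box unchanged.
        return box
    return (x1, rows_with_ink[0], x2, rows_with_ink[-1])
```

This is useful because the text line box gives only the line height, while a formula with superscripts, subscripts, or fractions may occupy a different vertical range.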
Unlike the prior art, the formula positioning method for a text image provided in this embodiment includes: obtaining text positioning information and attention information of a text line; calculating a formula coordinate set and a formula boundary set of the text line according to the text positioning information and the attention information; and calculating the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set. In this way, the attention information can be used to position the formulas in a text image, which lays the foundation for subsequent formula recognition and allows the image of the formula to be obtained accurately.
Above-described embodiment is introduced below by several detailed steps:
It is the logical schematic of the formula localization method of text image provided by the embodiments of the present application refering to Fig. 8, Fig. 8, it should
Method includes:
Step 81: Input a text image S containing formulas, the positioning coordinate information set B of each text line, the recognition information set T produced by AOCR for each text line, and the attention information set A of each text line.
Step 82a: Obtain the text information ti of the i-th line from T.
Step 82b: Binarize the image S to obtain the binarized image St.
Steps 82a and 82b may be performed simultaneously or one after the other.
Step 83: Determine whether the text characters of ti contain a mathematical keyword. If so, perform step 84; if not, return to step 82a.
Step 84: Obtain the corresponding attention information ai from A.
Step 85: According to ti, ai, and the corresponding text positioning coordinate information bi, obtain the formula coordinate set AB, the formula boundary set FB, and the corresponding number k of the i-th text line.
Step 86: Calculate the positioning coordinates of the formulas in this line using AB, FB, k, and St.
Step 87: Output the set of all formulas FBL.
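The control flow of steps 82a to 87 can be sketched as a simple loop. This is only an outline under stated assumptions: the keyword list `MATH_KEYWORDS` is hypothetical (the source does not enumerate its mathematical keywords), and `step85` and `step86` are caller-supplied stand-ins for the calculations of steps 85 and 86, not functions defined by the source.

```python
# Hypothetical keyword list; the source does not enumerate its mathematical keywords.
MATH_KEYWORDS = set("=+-×÷<>")


def has_math_keyword(ti):
    """Step 83: does the recognized text of a line contain a mathematical keyword?"""
    return any(ch in MATH_KEYWORDS for ch in ti)


def locate_formulas(T, A, B, St, step85, step86):
    """Steps 82a to 87; step85 and step86 are hypothetical caller-supplied callables."""
    FBL = []
    for i, ti in enumerate(T):             # step 82a: take the i-th line of text
        if not has_math_keyword(ti):       # step 83: skip lines without keywords
            continue
        ai, bi = A[i], B[i]                # step 84: matching attention and position
        AB, FB, k = step85(ti, ai, bi, i)  # step 85: per-line formula coordinates
        FBL = step86(FBL, AB, FB, k, St)   # step 86: final coordinates for this line
    return FBL                             # step 87: all formula coordinates
```

Passing the two calculation steps in as callables keeps the sketch independent of how a particular implementation represents AB and FB.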
Referring to Fig. 9, Fig. 9 is a logic diagram of obtaining the formula positioning coordinate set provided by an embodiment of the present application. The method includes:
Step 901: Find all mathematical key characters in ti.
Step 902: Taking each mathematical key character as the center, search to the left and right for all non-Chinese characters to obtain the corresponding number set FS.
Step 903: Calculate the width w and height h of the text line, normalize the width to wm, and set the minimum abscissa value Wmin and the maximum abscissa value Wmax.
Step 904: Traverse FS to obtain a number fs, and extract from ai the attention information vector a corresponding to fs.
Step 905: Obtain the index aidx of the maximum value in a, and calculate the abscissa value Cw.
Step 906: Check whether aidx is in the first position. If so, perform step 907; if not, perform step 908.
Step 907: Add the adjacent-formula information to the corresponding position in FB, and retain the number k corresponding to the i-th text line.
Step 908: Determine whether Cw is smaller than Wmin. If so, perform step 909; if not, perform step 910.
Step 909: Update Wmin with Cw.
Step 910: Determine whether Cw is larger than Wmax. If so, perform step 911; if not, perform step 912.
Step 911: Update Wmax with Cw.
Step 912: Return to step 904 until all of FS has been processed.
Step 913: Calculate the current formula coordinate ab and add it to the formula coordinate set AB.
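Steps 904 to 913 amount to tracking the smallest and largest abscissa reached by the attention peaks of the formula characters. The sketch below illustrates that idea only. Several details are assumptions because the corresponding formulas are not reproduced in this text: the rescaling `cw = aidx * w / wm`, the initial values of Wmin and Wmax, the field order of `bi`, and the way `ab` is composed from `bi`, Wmin, and Wmax. The function names are likewise hypothetical.

```python
def argmax(a):
    """Index of the first maximum of a non-empty sequence."""
    return max(range(len(a)), key=a.__getitem__)


def formula_box(vectors, bi, w, wm):
    """Steps 904 to 913 for one run of formula characters.

    vectors: one attention vector per character number in FS (assumed layout).
    bi: (left, right, top, bottom) of the text line; this field order is assumed.
    w, wm: pixel width and normalized width of the text line.
    Returns (ab, at_line_start).
    """
    w_min, w_max = w, 0        # assumed initial values for Wmin and Wmax
    at_line_start = False
    for a in vectors:          # step 904
        aidx = argmax(a)       # step 905
        cw = aidx * w / wm     # ASSUMED rescaling; the source formula is not shown
        if aidx == 0:          # steps 906 and 907: the formula touches the line start
            at_line_start = True
        if cw < w_min:         # steps 908 and 909
            w_min = cw
        if cw > w_max:         # steps 910 and 911
            w_max = cw
    # Step 913, ASSUMED composition: offset the span by the line's left abscissa
    # and reuse the line's top and bottom ordinates.
    ab = (bi[0] + w_min, bi[2], bi[0] + w_max, bi[3])
    return ab, at_line_start
```

The two comparisons are kept independent so that a single character can update both Wmin and Wmax on the first pass.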
Referring to Fig. 10, Fig. 10 is a logic diagram of formula coordinate merging provided by an embodiment of the present application. The method includes:
Step 101: Obtain the j-th formula coordinate from AB, and crop the temporary binarized image tt from St.
Step 102: Perform ordinate projection on tt using the projection method to obtain the actual ordinates, and update the ordinates of the j-th formula.
Step 103: Return to step 101 until all formula coordinates have been processed, then proceed to the next step.
Step 104: Determine whether the FB entry corresponding to number k-1 exists. If so, perform step 105; if not, perform step 107.
Step 105: Determine whether the k-th fb has adjacent-formula information. If so, perform step 106; if not, perform step 107.
Step 106: Merge the last formula coordinate in FBL with the first formula coordinate in the current AB into a new formula coordinate, replace the last formula coordinate in FBL with it, and add the remaining formulas in AB to FBL.
Step 107: Add the current formula coordinate set AB to the formula set FBL.
Step 108: Output the formula coordinate set FBL of the text line.
It should be understood that the above logic steps build on the foregoing embodiments; their principles and calculations are similar and are not repeated here.
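The cross-line merge of steps 104 to 107 can be illustrated with a short sketch. It rests on assumptions not stated in the source: FB is encoded as a dictionary from line number to a flag that is true when the line carries adjacent-formula information, boxes are `(x1, y1, x2, y2)` tuples, and "fusing" two coordinates is taken to mean their bounding union. The name `merge_line` is hypothetical.

```python
def merge_line(FBL, AB, FB, k):
    """Add line k's formula boxes AB to FBL, fusing across the line break if needed.

    FB: dict mapping line number -> True when that line has adjacent-formula
        information (assumed encoding of the formula boundary set).
    Boxes are (x1, y1, x2, y2) tuples (assumed convention).
    """
    # Steps 104 and 105: line k-1 must have an FB entry, and line k must carry
    # adjacent-formula information.
    if FBL and AB and (k - 1) in FB and FB.get(k):
        last, first = FBL[-1], AB[0]
        # Step 106, ASSUMED fusion rule: bounding union of the two boxes.
        fused = (min(last[0], first[0]), min(last[1], first[1]),
                 max(last[2], first[2]), max(last[3], first[3]))
        return FBL[:-1] + [fused] + AB[1:]
    # Step 107: no continuation, so append the boxes of this line unchanged.
    return FBL + AB
```

For example, with `FB = {0: False, 1: True}` and `k = 1`, the last box of line 0 and the first box of line 1 are replaced by a single box covering both.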
Referring to Fig. 11, Fig. 11 is a first structural diagram of the image processing apparatus provided by an embodiment of the present application. The image processing apparatus 110 includes an obtaining module 111, a first computing module 112, and a second computing module 113.
The obtaining module 111 is configured to obtain the text positioning information and attention information of a text line; the first computing module 112 is configured to calculate the formula coordinate set and formula boundary set of the text line according to the text positioning information and the attention information; and the second computing module 113 is configured to calculate the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set.
Referring to Fig. 12, Fig. 12 is a second structural diagram of the image processing apparatus provided by an embodiment of the present application. The image processing apparatus 120 includes a processor 121 and a memory 122. The memory 122 is configured to store program data, and the processor 121 is configured to execute the program data to implement the following method:
Obtain the text positioning information and attention information of a text line; calculate the formula coordinate set and formula boundary set of the text line according to the text positioning information and the attention information; and calculate the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: according to the attention information, obtain the attention information vector of a target special character in a target text line, where a special character is a character in a formula; determine whether the index value corresponding to the maximum value in the attention information vector is 0; and if so, add the adjacent-formula information of the target special character to the formula boundary set.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: calculate the width value of the target text line according to the text positioning information of the target text line; calculate the abscissa value of the target special character according to the width value of the target text line and the index value corresponding to the maximum value in the attention information vector; determine the maximum abscissa value and the minimum abscissa value according to the abscissa value of the target special character; and calculate the formula coordinate set according to the text positioning information, the maximum abscissa value, and the minimum abscissa value.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: calculate the normalized width value of the target text line using the following formula: [formula image not reproduced in this text], where w is the coordinate width of the target text line and h is the height of the target text line.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: calculate the abscissa value of the target special character using the following formula: [formula image not reproduced in this text], where w is the width of the target text line, aidx is the index value corresponding to the maximum value in the attention information vector, and wm is the normalized width value of the target text line.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: determine an initial maximum abscissa value and an initial minimum abscissa value according to the initial abscissa value of the target special character; when a new abscissa value of the target special character is obtained, compare the new abscissa value with the maximum abscissa value and the minimum abscissa value; if the new abscissa value is less than the minimum abscissa value, update the minimum abscissa value; and if the new abscissa value is greater than the maximum abscissa value, update the maximum abscissa value.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: calculate the formula coordinate set using the following formula: [formula image not reproduced in this text], where x1 is the left abscissa value of the formula, x2 is the right abscissa value of the formula, y1 is the upper ordinate value of the formula, y2 is the lower ordinate value of the formula, bi0 is the left abscissa value in the text positioning information, bi2 is the upper ordinate value in the text positioning information, bi3 is the lower ordinate value in the text positioning information, wmin is the minimum abscissa value, and wmax is the maximum abscissa value.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: determine whether the text line preceding the target text line is in the formula boundary set; if so, determine whether the target text line has adjacent-formula information; and if so, merge the first formula coordinate in the target text line with the last formula coordinate of the preceding text line.
Optionally, the processor 121 is further configured to execute the program data to implement the following method: according to the formula coordinate set, obtain the binarized image of the target formula region; perform ordinate projection on the binarized image to obtain the ordinates of the target formula; and update the formula coordinate set using the ordinates of the target formula.
Referring to Fig. 13, Fig. 13 is a structural diagram of the computer storage medium provided by an embodiment of the present application. Program data 131 is stored in the computer storage medium 130, and the program data 131, when executed by a processor, implements the following method:
Obtain the text positioning information and attention information of a text line; calculate the formula coordinate set and formula boundary set of the text line according to the text positioning information and the attention information; and calculate the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set.
In the several embodiments provided in this application, it should be understood that the disclosed methods and devices may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit in the other embodiments above is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are merely embodiments of the present application and do not limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.
Claims (12)
1. A formula positioning method for a text image, characterized by comprising:
obtaining text positioning information and attention information of a text line;
calculating a formula coordinate set and a formula boundary set of the text line according to the text positioning information and the attention information; and
calculating positioning coordinates of a formula in the text line according to the formula coordinate set and the formula boundary set.
2. The method according to claim 1, characterized in that
the step of calculating the formula coordinate set and the formula boundary set of the text line according to the text positioning information and the attention information comprises:
obtaining, according to the attention information, an attention information vector of a target special character in a target text line, wherein the special character is a character in a formula;
determining whether the index value corresponding to the maximum value in the attention information vector is 0; and
if so, adding the adjacent-formula information of the target special character to the formula boundary set.
3. The method according to claim 2, characterized in that
the method further comprises:
calculating a width value of the target text line according to the text positioning information of the target text line;
calculating an abscissa value of the target special character according to the width value of the target text line and the index value corresponding to the maximum value in the attention information vector;
determining a maximum abscissa value and a minimum abscissa value according to the abscissa value of the target special character; and
calculating the formula coordinate set according to the text positioning information, the maximum abscissa value, and the minimum abscissa value.
4. The method according to claim 3, characterized in that
the step of calculating the width value of the target text line according to the text positioning information of the target text line comprises:
calculating a normalized width value of the target text line using the following formula:
[formula image not reproduced in this text]
wherein w is the coordinate width of the target text line, and h is the height of the target text line.
5. The method according to claim 3, characterized in that
the step of calculating the abscissa value of the target special character according to the width value of the target text line and the index value corresponding to the maximum value in the attention information vector comprises:
calculating the abscissa value of the target special character using the following formula:
[formula image not reproduced in this text]
wherein w is the width of the target text line, aidx is the index value corresponding to the maximum value in the attention information vector, and wm is the normalized width value of the target text line.
6. The method according to claim 3, characterized in that
the step of determining the maximum abscissa value and the minimum abscissa value according to the abscissa value of the target special character comprises:
determining an initial maximum abscissa value and an initial minimum abscissa value according to the initial abscissa value of the target special character;
when a new abscissa value of the target special character is obtained, comparing the new abscissa value with the maximum abscissa value and the minimum abscissa value;
if the new abscissa value is less than the minimum abscissa value, updating the minimum abscissa value; and
if the new abscissa value is greater than the maximum abscissa value, updating the maximum abscissa value.
7. The method according to claim 3, characterized in that
the step of calculating the formula coordinate set according to the text positioning information, the maximum abscissa value, and the minimum abscissa value comprises:
calculating the formula coordinate set using the following formula:
[formula image not reproduced in this text]
wherein x1 is the left abscissa value of the formula, x2 is the right abscissa value of the formula, y1 is the upper ordinate value of the formula, y2 is the lower ordinate value of the formula, bi0 is the left abscissa value in the text positioning information, bi2 is the upper ordinate value in the text positioning information, bi3 is the lower ordinate value in the text positioning information, wmin is the minimum abscissa value, and wmax is the maximum abscissa value.
8. The method according to claim 1, characterized in that
the step of calculating the positioning coordinates of the formula in the text line according to the formula coordinate set and the formula boundary set comprises:
determining whether the text line preceding a target text line is in the formula boundary set;
if so, determining whether the target text line has adjacent-formula information; and
if so, merging the first formula coordinate in the target text line with the last formula coordinate of the preceding text line.
9. The method according to claim 1, characterized in that
the method further comprises:
obtaining a binarized image of a target formula region according to the formula coordinate set;
performing ordinate projection on the binarized image to obtain the ordinates of the target formula; and
updating the formula coordinate set using the ordinates of the target formula.
10. An image processing apparatus, characterized by comprising:
an obtaining module, configured to obtain text positioning information and attention information of a text line;
a first computing module, configured to calculate a formula coordinate set and a formula boundary set of the text line according to the text positioning information and the attention information; and
a second computing module, configured to calculate positioning coordinates of a formula in the text line according to the formula coordinate set and the formula boundary set.
11. An image processing apparatus, characterized in that the image processing apparatus comprises a processor and a memory, the memory being configured to store program data and the processor being configured to execute the program data to implement the method according to any one of claims 1-9.
12. A computer storage medium, characterized in that the computer storage medium is configured to store program data which, when executed by a processor, implements the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910452711.5A CN110210467B (en) | 2019-05-28 | 2019-05-28 | Formula positioning method of text image, image processing device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110210467A true CN110210467A (en) | 2019-09-06 |
CN110210467B CN110210467B (en) | 2021-07-30 |
Family
ID=67789041
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910452711.5A Active CN110210467B (en) | 2019-05-28 | 2019-05-28 | Formula positioning method of text image, image processing device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110210467B (en) |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20230831. Address after: No. 79 Wanbo Second Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province, 5114303802 (self declared). Patentee after: Guangzhou Huanju Mark Network Information Co.,Ltd. Address before: 511449 28th floor, block B1, Wanda Plaza, Nancun Town, Panyu District, Guangzhou City, Guangdong Province. Patentee before: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd. |