CN112712075A - Formula detection method, electronic equipment and storage device - Google Patents

Formula detection method, electronic equipment and storage device

Info

Publication number
CN112712075A
Authority
CN
China
Prior art keywords
characters
character
text region
text
operation result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011644726.0A
Other languages
Chinese (zh)
Other versions
CN112712075B (en)
Inventor
马皓
何春江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd
Priority to CN202011644726.0A
Publication of CN112712075A
Application granted
Publication of CN112712075B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/23 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on positionally close patterns or neighbourhood relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The application discloses a formula detection method, an electronic device, and a storage device. The method includes: detecting a text region of a formula in an image to be detected and recognizing the characters in the text region; using the characters to check whether multiple formulas exist in the text region; and, when multiple formulas exist in the text region, locating the formulas in the text region using the characters. In this way, multiple formulas within one text region can be distinguished, which improves the accuracy of formula recognition.

Description

Formula detection method, electronic equipment and storage device
Technical Field
The present application relates to the field of natural language processing, and in particular to a formula detection method, an electronic device, and a storage device.
Background
In recent years, computer vision technology has developed rapidly, and its image-text recognition techniques are widely used in education. Recognizing formulas automatically with image-text recognition makes automatic correction possible, which can greatly reduce the burden on teachers and parents. However, correcting a formula depends on detecting it accurately, so improving the accuracy of formula detection is an urgent problem.
Disclosure of Invention
The application mainly provides a formula detection method, an electronic device, and a storage device that can improve the accuracy of formula detection.
To solve the above problem, a first aspect of the present application provides a formula detection method, including: detecting a text region of a formula in an image to be detected and recognizing the characters in the text region; using the characters to check whether multiple formulas exist in the text region; and, when multiple formulas exist in the text region, locating the formulas in the text region using the characters.
To solve the above problem, a second aspect of the present application provides an electronic device that includes a memory and a processor coupled to each other, where the memory stores program instructions and the processor is configured to execute the program instructions to implement the formula detection method of the first aspect.
To solve the above problem, a third aspect of the present application provides a storage device that stores program instructions executable by a processor, the program instructions being used to implement the formula detection method of the first aspect.
According to this scheme, whether multiple formulas exist in the text region is checked, and if they do, the formulas in the text region are located using the recognized characters. Multiple formulas within one text region can thus be distinguished, which improves the accuracy of formula recognition.
Drawings
FIG. 1 is a first schematic flowchart of an embodiment of the formula detection method of the present application;
FIG. 2 is a second schematic flowchart of an embodiment of the formula detection method of the present application;
FIG. 3 is a schematic diagram of an embodiment of the attention model of the formula detection method of the present application;
FIG. 4 is a third schematic flowchart of an embodiment of the formula detection method of the present application;
FIG. 5 is a fourth schematic flowchart of an embodiment of the formula detection method of the present application;
FIG. 6 is a fifth schematic flowchart of an embodiment of the formula detection method of the present application;
FIG. 7 is a sixth schematic flowchart of an embodiment of the formula detection method of the present application;
FIG. 8 is a schematic block diagram of an embodiment of the electronic device of the present application;
FIG. 9 is a schematic block diagram of an embodiment of the storage device of the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship. Further, the term "plurality" herein means two or more.
Referring to FIG. 1, FIG. 1 is a first schematic flowchart of an embodiment of the formula detection method of the present application. Specifically, the embodiment includes the following steps:
Step S11: Detect a text region of a formula in the image to be detected, and recognize the characters in the text region.
A formula is the expression written out when calculating with numbers (or algebraic expressions). A formula may include numbers (or letters standing for numbers), operators (the four basic arithmetic operations, powers, squares, factorials, permutations, and combinations), and mathematical signs such as "=". The text region of a formula can be understood as the region of the image to be detected in which the formula is located; the region can include a certain number of pixels.
In one embodiment, the characters include relation characters, which include any of the following: equal to, approximately equal to, greater than, less than, not less than, and not greater than. The equal sign is "=", the approximately-equal sign is "≈", the greater-than sign is ">", the less-than sign is "<", the not-less-than sign is "≥", and the not-greater-than sign is "≤". In one embodiment, the relation characters also include the "-" symbol in a vertical formula.
If multiple formulas exist in the image to be detected, multiple text regions may be identified. A single text region may contain one formula or several.
In one implementation scenario, to improve the efficiency of text region detection, a text detection network may be trained in advance, so that the image to be detected can be input into the text detection network to obtain the text region of the formula in the image. The text detection network may include, but is not limited to, DBNet (Real-time Scene Text Detection with Differentiable Binarization) and PSENet (Progressive Scale Expansion Network).
After the text region of the formula in the image to be detected is obtained, a text recognition network is applied to the text region to recognize the characters in it. Text recognition networks may include, but are not limited to, CRNN (Convolutional Recurrent Neural Network) + CTC (Connectionist Temporal Classification) and CNN (Convolutional Neural Network) + Seq2Seq + Attention; the text recognition network may also be a network with an encoder-decoder structure, which is not limited herein. The recognized characters are, for example, numbers, letters, operators, and relation characters.
Step S12: Using the characters, check whether multiple formulas exist in the text region.
After the characters associated with formulas in the text region are obtained, they can be used to check whether multiple formulas exist in the text region. For example, when multiple "=" signs are detected in one text region, the text region can be considered to contain multiple formulas. Alternatively, multiple formulas can be considered to exist when a text region contains multiple "=" signs and the characters in the text region, taken together, do not satisfy the corresponding mathematical relationship.
When multiple formulas are detected in the text region, step S13 may be executed. In one implementation scenario, if only one formula is detected in the text region, no further action is required. In another implementation scenario, if only one formula is detected in the text region, the recognized formula may be corrected to judge whether it was calculated correctly, and a mark indicating a correct operation or a mark indicating an incorrect operation may be displayed.
Step S13: When multiple formulas exist in the text region, locate the formulas in the text region using the characters.
If multiple formulas exist in the text region, the recognized characters can be used to locate them, so that the formulas in the text region are separated into independent formulas. For example, based on the mathematical relationships among the characters in the text region, a group of characters that satisfies an operation result can be determined to be one formula. Alternatively, the formulas can be located by symbols that indicate separation, such as ";" or ".".
Thus, whether multiple formulas exist in the text region is checked, and if they do, the formulas in the text region are located using the recognized characters. Multiple formulas within one text region can therefore be distinguished, which improves the accuracy of formula recognition, facilitates subsequent correction, and improves correction efficiency.
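The separator-based variant can be sketched as follows. This is only an illustration under assumptions: the function name `split_by_separators` is hypothetical, and the separator set (";", "；", and the ideographic full stop "。", used here instead of "." so that decimal points are not split) is a guess at what a recognizer might emit.

```python
import re

# Assumed separator set: ";" / "；" / "。". Any run of other characters is
# treated as one candidate formula.
FORMULA_RUN = re.compile(r"[^;；。]+")


def split_by_separators(chars):
    """Return (start, end, text) for each run of characters between separators."""
    return [(m.start(), m.end(), m.group()) for m in FORMULA_RUN.finditer(chars)]


print(split_by_separators("2+2=4;3+3=6"))
```

The returned index pairs refer to positions in the recognized character sequence; mapping them back to pixel positions is a separate step.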
Referring to FIG. 2, FIG. 2 is a second schematic flowchart of an embodiment of the formula detection method of the present application. This embodiment further expands the step of "recognizing the characters in the text region" in step S11 of the above embodiment; specifically, the step includes the following steps S111 to S113.
Step S111: Extract a region feature map of the text region and, based on an attention mechanism, use the region feature map to obtain an attention value for each pixel in the region feature map, where a larger attention value indicates a higher likelihood that the pixel belongs to a character.
As described in the foregoing disclosure, in order to improve the recognition efficiency, when the characters in the text region are recognized, the characters in the text region may be recognized by using a text recognition network. When the characters in the text region are recognized by using the text recognition network, the feature map extracted by the text recognition network can be used as the region feature map. Taking the text recognition network as a network with an encoder-decoder structure as an example, the feature map extracted by the encoding layer in the text recognition network can be used as the region feature map of the text region. It can be understood that when multiple coding layers exist in the text recognition network, the feature map output by one of the coding layers can be extracted as the regional feature map.
After the region feature map of the text region is obtained, an attention model in the text recognition network may process the region feature map based on an attention mechanism (for example, a self-attention mechanism) to obtain an attention value for each pixel in the region feature map. The larger a pixel's attention value, the higher the likelihood that the pixel belongs to a character.
Step S112: Weight the corresponding pixels of the region feature map by the attention values of the pixels to obtain a weighted feature map.
After the attention values of the pixels are obtained, the attention model can use them to weight the corresponding pixels of the region feature map, yielding a weighted feature map. In the weighted feature map, the pixels that correspond to characters receive high weights, so the position of a character within the text region can be read from the weighted feature map.
Because the weighted feature map contains the position information of the character, it can be saved for later use in locating the formulas in the text region. Specifically, each recognized character may generate one weighted feature map; that is, as many weighted feature maps are generated as characters are recognized in the text region.
Step S113: Perform recognition using the weighted feature map to obtain the characters in the text region.
In one embodiment, the weighted feature map and the feature map output by the encoding layer of the text recognition algorithm may be input into the decoding layer of the text recognition algorithm, so as to complete the recognition of the characters in the text region, thereby obtaining the characters in the text region.
Referring to FIG. 3, FIG. 3 is a schematic diagram of an embodiment of the attention model of the formula detection method of the present application. The region feature map 301 is the region feature map output by an encoding layer of the text recognition network. Convolution kernels 302, 303, and 304 are all 1 × 1 convolution kernels used to transform the region feature map. After the transformation, the intermediate feature maps 305 and 306 (f(x) and g(x)) are matrix-multiplied to establish the association between every pair of pixels, and a softmax operation is then applied to obtain the attention value map 308. Finally, a matrix operation on the intermediate feature map 307 and the attention value map 308 yields the weighted feature map 309.
Unlike the foregoing embodiments, this embodiment introduces an attention mechanism, which enhances the character recognition capability of the text recognition network and provides position information for the recognized characters for use in subsequent formula positioning.
Referring to FIG. 4, FIG. 4 is a third schematic flowchart of an embodiment of the formula detection method of the present application. This embodiment further expands step S12 of the above embodiment; specifically, the step includes the following steps S121 and S122.
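The computation described for FIG. 3 resembles a standard self-attention block and can be written out as follows. This is a sketch under assumptions: the text names f(x) and g(x) but does not name the third transformed map, so h is a hypothetical label for the intermediate feature map 307; W_f, W_g, and W_h stand for the three 1 × 1 convolutions, x_1, ..., x_N are the N pixel feature vectors of the region feature map, and the softmax direction is also an assumption.

```latex
% 1x1 convolutions act as per-pixel linear maps (assumed labels f, g, h)
f(x_i) = W_f x_i, \qquad g(x_i) = W_g x_i, \qquad h(x_i) = W_h x_i

% pairwise association between pixel i and pixel j
s_{ij} = f(x_i)^{\top} g(x_j)

% attention values: softmax over i for each output pixel j
\beta_{j,i} = \frac{\exp(s_{ij})}{\sum_{k=1}^{N} \exp(s_{kj})}

% weighted feature at pixel j
o_j = \sum_{i=1}^{N} \beta_{j,i} \, h(x_i)
```

Under this reading, the values β correspond to the attention value map 308 and the vectors o_j to the weighted feature map 309.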
Step S121: the relation character in the text area is searched.
On the basis of identifying the characters in the text area, the relation characters in the text area can be further searched. For example, all identified characters may be traversed to search for relationship characters. The meaning of the relationship symbol can be referred to the detailed description in the step S11, and is not described herein again.
In one implementation scenario, if no relation symbol is searched out, it may be considered that no formula exists in the text region, and the text region may be considered as not belonging to the object to be processed in the present application and therefore may be ignored.
Step S122: whether a plurality of equations exist in the text region is determined based on characters on both sides of the relation.
The operation results of some equations can be obtained according to the relation symbol, so that whether a plurality of equations exist in the text region can be determined according to characters on two sides of the relation symbol.
In one embodiment, if the whole text region has only one relation character, the text region can be considered to have only one formula, and the condition of multiple formulas does not exist. If there are multiple relation symbols in the text area, it is necessary to further determine whether there are multiple equations in the text area.
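The search in step S121 and the three-way decision described above can be sketched as follows. The names `find_relation_chars` and `classify_region` are hypothetical, and the recognized characters are assumed to be available as a plain string.

```python
# Relation characters listed in step S11.
RELATION_CHARS = {"=", "≈", ">", "<", "≥", "≤"}


def find_relation_chars(chars):
    """Traverse the recognized characters and return the indices of relation characters."""
    return [i for i, c in enumerate(chars) if c in RELATION_CHARS]


def classify_region(chars):
    """Decide how a text region should be handled based on its relation characters."""
    count = len(find_relation_chars(chars))
    if count == 0:
        return "no formula"          # region can be ignored
    if count == 1:
        return "single formula"      # multiple-formula case cannot arise
    return "needs further check"     # continue with steps S1221 to S1223


print(classify_region("2+2=43+3=6"))
```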
Referring to fig. 5, fig. 5 is a fourth flowchart illustrating an embodiment of the mathematical expression detection method of the present application. Specifically, when a plurality of relationship symbols are searched, step S122 may specifically include the following steps S1221 to S1223.
Step S1221: and acquiring a first operation result of the characters on the left side of the first relation symbol and acquiring a second operation result on the right side of the last relation symbol.
It is to be noted that the first operation result in the present specification corresponds to a third operation result of the claims, the second operation result in the present specification corresponds to a fourth operation result of the claims, the third operation result in the present specification corresponds to the first operation result of the claims, and the fourth operation result in the present specification corresponds to the second operation result of the claims.
Because there are multiple relation characters, if they all belong to the same formula, the characters in the text region as a whole satisfy one mathematical relationship. The characters to the left of the first relation character can therefore be obtained and evaluated to give the first operation result, and the characters to the right of the last relation character can likewise be evaluated to give the second operation result. Whether multiple formulas exist in the text region is then judged from these two operation results.
For example, if the characters recognized in a text region are "2+2=43+3=6", the first relation character from the left is "=", the characters to its left are "2+2", and the first operation result is 4; the characters to the right of the last relation character are "6", and the second operation result is 6. As another example, if the recognized characters are "2+2=43+3=2+4", the first relation character from the left is "=", the characters to its left are "2+2", and the first operation result is 4; the characters to the right of the last relation character are "2+4", and the second operation result is 6.
Step S1222: If the first operation result and the second operation result satisfy the first preset condition, determine that one formula exists in the text region.
Step S1223: If the first operation result and the second operation result do not satisfy the first preset condition, determine that multiple formulas exist in the text region.
It is to be noted that the first preset condition in the present specification corresponds to the second preset condition in the claims, and the second preset condition in the present specification corresponds to the first preset condition in the claims.
In one embodiment, the first preset condition may include: the first operation result and the second operation result satisfy both the relationship of the first relation character and the relationship of the last relation character; that is, they meet the operation requirement of the first relation character and the operation requirement of the last relation character. For example, in the examples above, the first and last relation characters are both "=", the first operation result is 4, and the second operation result is 6; since 4 is not equal to 6, it can be determined that multiple formulas exist in the text region.
For example, if the characters recognized in a text region are "7+2=93+3<8", the first relation character is "=" and the first operation result is 9; the last relation character is "<" and the second operation result is 8. The two operation results satisfy neither the first relation character (9 = 8 does not hold) nor the last relation character (9 < 8 does not hold), so the first preset condition is not met and it can be determined that multiple formulas exist in the text region.
As another example, if the characters recognized in a text region are "5+2=7=3+4", the first relation character is "=" and the first operation result is 7; the last relation character is "=" and the second operation result is 7. The two operation results meet the operation requirements of both the first and the last relation character, so the first preset condition is met and it can be determined that one formula exists in the text region.
Therefore, by using the relationship between the characters on both sides of the relation characters, it can be determined whether multiple formulas exist in the text region, which makes it possible to subsequently locate the formulas in a text region that contains several.
Referring to FIG. 6, FIG. 6 is a fifth schematic flowchart of an embodiment of the formula detection method of the present application. This embodiment further expands step S13 of the above embodiment; specifically, the step includes the following steps S131 to S135.
Step S131: Search for the relation characters in the text region.
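Steps S1221 to S1223 can be sketched as follows. This is an illustration under assumptions rather than the claimed implementation: the helper names `evaluate` and `has_multiple_formulas` are hypothetical, only +, -, *, and / are evaluated, "≈" is left out because the text gives no tolerance for it, and treating an unparsable side as "multiple formulas" is a choice made here.

```python
import ast
import operator

# Relation characters and the comparison each one stands for.
RELATIONS = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
             "≥": operator.ge, "≤": operator.le}
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(expr):
    """Evaluate a +, -, *, / expression; return None if it is malformed."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported syntax")
    try:
        return walk(ast.parse(expr.strip(), mode="eval").body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


def has_multiple_formulas(chars):
    """Apply the first preset condition to the outermost operands."""
    rel_idx = [i for i, c in enumerate(chars) if c in RELATIONS]
    if len(rel_idx) < 2:
        return False                      # zero or one relation character
    first, last = rel_idx[0], rel_idx[-1]
    first_result = evaluate(chars[:first])        # left of first relation character
    second_result = evaluate(chars[last + 1:])    # right of last relation character
    if first_result is None or second_result is None:
        return True                       # assumption: unparsable means not one formula
    satisfied = (RELATIONS[chars[first]](first_result, second_result)
                 and RELATIONS[chars[last]](first_result, second_result))
    return not satisfied


for text in ("2+2=43+3=6", "7+2=93+3<8", "5+2=7=3+4"):
    print(text, has_multiple_formulas(text))
```

The three strings are the examples from the description; the first two are reported as containing multiple formulas and the third as a single formula.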
For a detailed description of this step, please refer to step S121, which is not described herein again.
Step S132: Obtain a third operation result of the characters to the left of the relation character, and select several characters to the right of the relation character, where the selected characters and the third operation result satisfy a second preset condition.
After the relation characters in the text region have been found, a formula judgment may be made for each relation character. Specifically, a third operation result of the characters to the left of the relation character may be obtained, and several characters may be selected to the right of the relation character such that the selected characters and the third operation result satisfy the second preset condition.
When choosing a relation character, the first relation character from the left may be chosen first, and the formula that contains it is then determined.
When selecting the characters to the left of the relation character, if an operator exists to the left of the relation character, the operator closest to the relation character on its left (for example "+", "-", and so on) may be determined first; the characters on both sides of that operator are then selected, and the third operation result is determined from the selected characters.
The characters may be selected as follows. First, the first character to the right of the relation character is preselected, and it is judged whether this character and the third operation result satisfy the second preset condition; if so, the first character is selected. If not, the first and second characters to the right of the relation character are preselected, and it is judged whether they and the third operation result satisfy the second preset condition. If not, the first, second, and third characters to the right of the relation character are preselected, and so on. The second preset condition is, for example, that the fourth operation result of the preselected characters and the third operation result make the relation character hold; that is, the fourth operation result and the third operation result meet the operation requirement of the relation character.
For example, if the characters recognized in the text region are "8+9=176+3=9", the third operation result of the characters to the left of the first relation character is 17. When selecting characters to the right of the relation character, the first character "1" may be preselected first, giving a fourth operation result of 1, and it is judged whether 1 and the third operation result 17 satisfy the second preset condition. They do not, so the first character "1" and the second character "7" to the right of the relation character are preselected, giving a fourth operation result of 17. The fourth operation result 17 and the third operation result 17 satisfy the second preset condition, so the selected characters are determined to be the first and second characters to the right of the relation character.
Step S133: Determine whether several characters to the right of the relation character have been selected.
If no characters to the right of the relation character have been selected, step S134 may be performed; if characters have been selected, step S135 may be performed.
Step S134: Display an operation-error mark at a preset position of the text region.
If no characters are selected, the formula can be considered to have been calculated incorrectly, so an operation-error mark can be displayed at a preset position of the text region. The preset position is, for example, beside the formula; the specific position may be set as needed and is not limited herein. The operation-error mark is, for example, "X".
After the formula is determined to be wrong, the second relation character from the left in the text region may be judged, and step S132 is executed again. After the second relation character has been judged, the third relation character from the left in the text region is judged, and so on.
Step S135: Based on the selected characters, determine the formula position, within the text region, of the formula that contains the selected characters.
Once several characters have been selected, a formula has been determined, so the position in the text region of the formula containing those characters can be determined. Specifically, the formula position includes the positions of the selected characters to the right of the relation character, the position of the relation character, and the positions of the characters (including operators) to the left of the relation character.
Thus, the formula position can be determined by judging the relationship between the characters to the left of the relation character and the characters to its right.
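The incremental preselection can be sketched as follows. The names `evaluate` and `select_right_chars` are hypothetical; stopping the search at the next relation character is an assumption made here, since the description only says "and so on".

```python
import ast
import operator

RELATIONS = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
             "≥": operator.ge, "≤": operator.le}
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(expr):
    """Evaluate a +, -, *, / expression; return None if it is malformed."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported syntax")
    try:
        return walk(ast.parse(expr.strip(), mode="eval").body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


def select_right_chars(chars, rel_pos, third_result):
    """Grow the right-hand side one character at a time.

    Returns the exclusive end index of the shortest run of characters after
    chars[rel_pos] whose value satisfies the relation, or None if none does.
    """
    holds = RELATIONS[chars[rel_pos]]
    for end in range(rel_pos + 2, len(chars) + 1):
        if chars[end - 1] in RELATIONS:
            break                               # assumption: stop at next relation character
        fourth_result = evaluate(chars[rel_pos + 1:end])
        if fourth_result is not None and holds(third_result, fourth_result):
            return end
    return None


chars = "8+9=176+3=9"
end = select_right_chars(chars, 3, evaluate(chars[:3]))
print(chars[4:end])
```

For the example string, "1" is tried and rejected, then "17" is accepted, so the selected characters are the first two characters after the relation character.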
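Steps S131 to S135 can be combined into one loop over the recognized character sequence, sketched below. All names are hypothetical. Two points are assumptions of this sketch rather than statements of the source: the left operand of each formula starts where the previous located formula ended, and after a failed relation character the left operand restarts directly after that relation character.

```python
import ast
import operator

RELATIONS = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
             "≥": operator.ge, "≤": operator.le}
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(expr):
    """Evaluate a +, -, *, / expression; return None if it is malformed."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported syntax")
    try:
        return walk(ast.parse(expr.strip(), mode="eval").body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


def locate_formulas(chars):
    """Return (spans, errors).

    spans: (start, end) index pairs of formulas judged correct (S135).
    errors: indices of relation characters whose formula failed (S134).
    """
    spans, errors = [], []
    start = 0
    while True:
        # S131: next relation character at or after the current start.
        rel = next((i for i in range(start, len(chars)) if chars[i] in RELATIONS), None)
        if rel is None:
            break
        third_result = evaluate(chars[start:rel])        # S132, left side
        end = None
        if third_result is not None:
            for stop in range(rel + 2, len(chars) + 1):  # S132, grow right side
                if chars[stop - 1] in RELATIONS:
                    break
                fourth_result = evaluate(chars[rel + 1:stop])
                if fourth_result is not None and RELATIONS[chars[rel]](third_result, fourth_result):
                    end = stop
                    break
        if end is None:                                  # S133 -> S134
            errors.append(rel)
            start = rel + 1
        else:                                            # S133 -> S135
            spans.append((start, end))
            start = end
    return spans, errors


print(locate_formulas("8+9=176+3=9"))
```

For the example string this yields two spans, covering "8+9=17" and "6+3=9". The spans are indices into the character sequence; converting them to pixel positions relies on the per-character position information described in the following steps.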
Referring to FIG. 7, FIG. 7 is a sixth schematic flowchart of an embodiment of the formula detection method of the present application. In this embodiment, step S135 specifically includes the following steps S1351 and S1352.
Step S1351: Use the attention values to obtain the character position, within the text region, of the last of the selected characters.
In the embodiment of the present disclosure, the attention value represents the possibility of the existence of the character at the position of the pixel point, and the greater the attention value is, the greater the possibility of the existence of the character is. The manner of obtaining the attention value can refer to the related description in the foregoing embodiments, and is not described herein again. After the attention value of each pixel point is obtained and the corresponding weighted feature map is obtained according to the attention value, the position of the pixel point where the character contained in the weighted feature map is located, that is, the position of the character can be determined.
Since the weighted feature map includes the position information of the character, the character position of the last character in the text region among the characters (the characters on the right side of the selected relation character) can be obtained by using the weighted feature map. Specifically, a weighted feature map corresponding to the last character of the characters is searched, and the character position of the last character in the text area is obtained according to the position information of the last character included in the weighted feature map.
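One way to read a character position out of a per-character map is sketched below. The approach is an assumption: the source does not say how the position is extracted, so this sketch keeps the columns whose peak value reaches a fraction of the global peak. The names `char_column_range` and `to_region_x`, the `threshold` fraction, and the `stride` between feature-map columns and text-region pixels are all hypothetical.

```python
def char_column_range(weight_map, threshold=0.5):
    """Return (first_col, last_col) of the columns where the map responds strongly.

    weight_map is a 2D list (rows x columns) of non-negative values for one
    recognized character. A column counts as part of the character when its
    peak value is at least `threshold` times the global peak (assumed rule).
    """
    peak = max(max(row) for row in weight_map)
    n_cols = len(weight_map[0])
    col_peaks = [max(row[c] for row in weight_map) for c in range(n_cols)]
    hits = [c for c, value in enumerate(col_peaks) if value >= threshold * peak]
    return hits[0], hits[-1]


def to_region_x(col_range, stride):
    """Convert feature-map columns to pixel x-coordinates in the text region."""
    first_col, last_col = col_range
    return first_col * stride, (last_col + 1) * stride


last_char_map = [
    [0.0, 0.1, 0.8, 0.9, 0.1],
    [0.0, 0.2, 0.7, 1.0, 0.0],
]
cols = char_column_range(last_char_map)
print(cols, to_region_x(cols, stride=4))
```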
Step S1352: Determine, by using the character position, the equation position within the text region of the equation containing the selected characters.
After the character position of the last character within the text region is obtained, the position within the text region of the equation containing the selected characters can be determined. The equation position includes, for example, the position where the equation starts and the position where it ends.
In one embodiment, if the relation symbol of the equation is the first relation symbol from the left in the text region, it may be determined that the starting position on the left side of the text region is the position where the equation starts, and the character position of the last character in the text region is the position where the equation ends.
In another embodiment, the position of the leftmost character among the characters on the left side of the selected relation symbol, that is, the position where the equation starts, may be obtained first, in the same way that the character position of the last character within the text region is determined. The region between the position of that leftmost character and the character position of the last character (the position where the equation ends) is then the equation position.
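The two ways of choosing the start of the equation can be sketched together. This is an illustrative sketch only: `equation_span`, `char_x` (a list of horizontal character positions), and `region_left` are hypothetical names introduced here, and positions are reduced to one horizontal coordinate for simplicity.

```python
def equation_span(char_x, left_indices, right_indices, is_first_relation, region_left=0):
    """Return (start, end) horizontal positions of one equation.

    char_x[i]          -- assumed x position of the i-th recognized character
    left_indices       -- indices of the characters left of the relation symbol
    right_indices      -- indices of the selected characters right of it
    is_first_relation  -- True if the relation symbol is the first from the left
    region_left        -- assumed x position of the left edge of the text region
    """
    # The equation ends at the last selected character on the right side.
    end = char_x[max(right_indices)]
    if is_first_relation:
        # First embodiment: the equation starts at the left edge of the region.
        start = region_left
    else:
        # Second embodiment: it starts at the leftmost left-side character.
        start = char_x[min(left_indices)]
    return start, end


# "3+4=7" with characters at x = 12, 30, 48, 66, 84; "=" is at index 3.
xs = [12, 30, 48, 66, 84]
span_from_edge = equation_span(xs, [0, 1, 2], [4], True, region_left=5)
span_from_char = equation_span(xs, [0, 1, 2], [4], False)
```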
In one embodiment, after the equation position is determined, an operation-correct mark, for example "√", may be displayed at the equation position of the text region, for example at the position where the equation ends.
By displaying the operation correct mark or the operation error mark, the automatic correction of the formula can be realized, the pressure of correction personnel is reduced, and the correction efficiency is improved.
In one embodiment, after the equation position is determined, step S131 (searching for a relation symbol within the text region) and the subsequent steps may be executed again, starting from the character that follows the selected characters. In this case, the region searched in step S131 may exclude the equation positions that have already been determined. In this way, the equations in the text region can be located one after another.
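The repeated search can be illustrated on a recognized character string. The sketch below is a simplification under several assumptions: the only relation symbol is "=", the operands are non-negative integers joined by "+" or "-", and the shortest run of right-side characters that satisfies the relation is the one selected. `evaluate` and `locate_equations` are hypothetical names.

```python
import re


def evaluate(expr):
    """Value of an integer sum/difference such as '5+6', or None if malformed."""
    if not re.fullmatch(r"\d+(?:[+-]\d+)*", expr):
        return None
    return sum(int(term) for term in re.findall(r"[+-]?\d+", expr))


def locate_equations(text):
    """Return (start, end) index pairs, end inclusive, one per located equation."""
    spans = []
    start = 0
    while start < len(text):
        # Search for a relation symbol, skipping positions already assigned.
        rel = text.find("=", start)
        if rel == -1:
            break
        left_value = evaluate(text[start:rel])
        end = None
        if left_value is not None:
            # Grow the right-hand side one character at a time until it matches.
            for stop in range(rel + 1, len(text)):
                if evaluate(text[rel + 1:stop + 1]) == left_value:
                    end = stop
                    break
        if end is None:
            break  # no characters could be selected for this relation symbol
        spans.append((start, end))
        start = end + 1  # restart from the character after the selected ones
    return spans


spans = locate_equations("3+4=75+6=11")  # two equations written without a gap
```

For the string above, the first pass selects "7" and the second pass selects "11", so the text splits into "3+4=7" and "5+6=11".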
Therefore, by using the weighted feature map, the positions of the characters can be determined, and based on the selected characters, the equations can be located, so that the individual equations in a text region containing a plurality of equations can be distinguished.
Referring to FIG. 8, FIG. 8 is a schematic diagram of an embodiment of an electronic device according to the present application. The electronic device 80 comprises a memory 81 and a processor 82 coupled to each other. The memory 81 stores program instructions, and the processor 82 is configured to execute the program instructions to implement the steps in any of the above embodiments of the equation detection method.
Specifically, the processor 82 is configured to control itself and the memory 81 to implement the steps in any of the above embodiments of the equation detection method. The processor 82 may be an integrated circuit chip having signal processing capabilities. The processor 82 may also be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor or any conventional processor. In addition, the processor 82 may be jointly implemented by integrated circuit chips.
In the embodiment of the present disclosure, the processor 82 is configured to: detect a text region of equations in an image to be detected, and identify the characters in the text region; check, by using the characters, whether a plurality of equations exist in the text region; and, when a plurality of equations exist in the text region, locate the equations in the text region by using the characters.
According to the scheme, whether a plurality of expressions exist in the text region is checked, and under the condition that the plurality of expressions exist, the expressions in the text region are located by using the recognized characters, so that the plurality of expressions in the text region can be distinguished, the accuracy of expression recognition is improved, subsequent correction is facilitated, and the correction efficiency is improved.
In some disclosed embodiments, the characters include a relation symbol. The processor 82 being configured to locate the equations in the text region by using the characters includes: searching for a relation symbol within the text region; obtaining a third operation result of the characters on the left side of the relation symbol, and selecting several characters on the right side of the relation symbol, where the selected characters and the third operation result satisfy a second preset condition; and determining, based on the selected characters, the position within the text region of the equation containing them.
Unlike the foregoing embodiment, by evaluating the relationship between the characters on the left side of the relation symbol and the characters on its right side, the equation position can be determined and the equation can be located.
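The selection step can be sketched for a single relation symbol as follows. This is an illustrative sketch, assuming the relation symbol is "=", that the left side is a plain arithmetic expression over "+", "-", "*" and "/", and that the shortest matching run of right-side characters is selected. `evaluate` and `select_right_characters` are hypothetical names; the small `ast`-based evaluator is used only to avoid calling `eval` on recognized text.

```python
import ast
import operator

_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr):
    """Evaluate a plain arithmetic string such as '3+4'; return None if invalid."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")

    try:
        return walk(ast.parse(expr, mode="eval"))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


def select_right_characters(text, rel_index):
    """Select the characters right of text[rel_index] whose value equals the left side.

    Returns the selected substring, or None if no run of characters qualifies.
    """
    left_value = evaluate(text[:rel_index])  # the "third operation result"
    if left_value is None:
        return None
    for end in range(rel_index + 2, len(text) + 1):
        candidate = text[rel_index + 1:end]
        if evaluate(candidate) == left_value:  # condition: left = right
            return candidate
    return None


selected = select_right_characters("3+4=75+6=11", 3)  # "=" is at index 3
```

Here the left side "3+4" evaluates to 7, and the first right-side run with the same value is "7", so the equation "3+4=7" occupies indices 0 to 4 of the recognized string.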
In some disclosed embodiments, the processor 82 being configured to identify the characters in the text region includes: extracting a regional feature map of the text region, and obtaining, based on an attention mechanism and by using the regional feature map, an attention value for each pixel point in the regional feature map, where a larger attention value indicates a higher possibility that the pixel point belongs to a character; weighting the corresponding pixel points of the regional feature map by their attention values to obtain a weighted feature map; and performing recognition on the weighted feature map to obtain the characters in the text region. The processor 82 being configured to determine, based on the selected characters, the position within the text region of the equation containing them includes: obtaining, by using the attention value, the character position within the text region of the last of the selected characters; and determining, by using the character position, the position within the text region of the equation containing the selected characters.
Unlike the foregoing embodiment, by using the weighted feature map, the positions of the characters can be determined, and based on the selected characters, the equations can be located, so that the individual equations in a text region containing a plurality of equations can be distinguished.
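The weighting step alone can be sketched in a few lines. This is a minimal sketch, assuming the regional feature map is a height × width grid of per-pixel feature vectors and the attention values form a grid of the same height and width; how the attention values themselves are computed is not shown. `weight_feature_map` is a hypothetical name.

```python
def weight_feature_map(feature_map, attention):
    """Scale every pixel's feature vector by that pixel's attention value.

    feature_map[r][c] is a list of channel values; attention[r][c] is a float.
    """
    weighted = []
    for feature_row, attention_row in zip(feature_map, attention):
        weighted_row = []
        for pixel, value in zip(feature_row, attention_row):
            weighted_row.append([value * channel for channel in pixel])
        weighted.append(weighted_row)
    return weighted


# A 1x2 feature map with two channels: the second pixel is suppressed entirely.
weighted = weight_feature_map([[[1.0, 2.0], [3.0, 4.0]]], [[0.5, 0.0]])
```

Pixels with a small attention value contribute little to the weighted feature map, which is why the map carries usable position information for the characters.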
In some disclosed embodiments, after determining, based on the selected characters, the position within the text region of the equation containing them, the processor 82 is further configured to re-execute the step of searching for a relation symbol within the text region and the subsequent steps, starting from the character that follows the selected characters.
Unlike the foregoing embodiment, the positioning of the expression in the text region can be continued by re-executing the step of retrieving the relation symbol in the text region and the subsequent steps.
In some disclosed embodiments, the second preset condition includes: a fourth operation result of the selected characters and the third operation result make the relation symbol hold; and/or the relation symbol includes any one of the following: equal to, approximately equal to, greater than, less than, not greater than.
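The condition can be written as a table from relation symbols to comparisons. This is an illustrative sketch: the glyphs used as keys, the name `relation_holds`, and in particular the 5% relative tolerance for "approximately equal to" are assumptions introduced here, not values given in the source.

```python
import math
import operator

# Assumed glyphs for the five relation symbols listed in the text.
RELATIONS = {
    "=": operator.eq,                                       # equal to
    "≈": lambda a, b: math.isclose(a, b, rel_tol=0.05),     # approximately equal to (tolerance assumed)
    ">": operator.gt,                                       # greater than
    "<": operator.lt,                                       # less than
    "≤": operator.le,                                       # not greater than
}


def relation_holds(third_result, symbol, fourth_result):
    """True if `third_result <symbol> fourth_result` holds.

    third_result  -- operation result of the characters left of the symbol
    fourth_result -- operation result of the selected characters right of it
    """
    return RELATIONS[symbol](third_result, fourth_result)


exact = relation_holds(7, "=", 7)
approximate = relation_holds(10, "≈", 10.4)
```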
In some disclosed embodiments, the processor 82 is further configured to display an operation-correct mark at the equation position of the text region, and to display an operation-error mark at a preset position of the text region when no characters can be selected.
Unlike the foregoing embodiment, displaying the operation-correct mark or the operation-error mark enables automatic correction of the equations, reduces the pressure on correction personnel, and improves correction efficiency.
In some disclosed embodiments, the characters include a relation symbol, and the processor 82 being configured to check whether a plurality of equations exist in the text region includes: searching for a relation symbol within the text region; and determining whether a plurality of equations exist in the text region based on the characters on both sides of the relation symbol.
Unlike the foregoing embodiment, whether or not a plurality of equations exist within the text region may be determined based on characters on both sides of the relation symbol.
In some disclosed embodiments, the processor 82 being configured to determine whether a plurality of equations exist in the text region based on the characters on the left side and the characters on the right side of the relation symbol includes: when a plurality of relation symbols are found, obtaining a first operation result of the characters on the left side of the first relation symbol and a second operation result of the characters on the right side of the last relation symbol; if the first operation result and the second operation result satisfy a first preset condition, determining that one equation exists in the text region; and if the first operation result and the second operation result do not satisfy the first preset condition, determining that a plurality of equations exist in the text region.
Unlike the foregoing embodiment, by using the relationship of characters on both sides of the relationship character, it is possible to determine whether or not there are a plurality of equations in the text region, and thus it is possible to subsequently locate an equation in the text region where there are a plurality of equations.
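This check can be sketched on a recognized character string. The sketch makes several assumptions: the only relation symbol is "=", the first preset condition is plain equality, operands are non-negative integers joined by "+" or "-", and a side that cannot be evaluated counts as failing the condition. `evaluate` and `has_multiple_equations` are hypothetical names.

```python
import re


def evaluate(expr):
    """Value of an integer sum/difference such as '2+5', or None if malformed."""
    if not re.fullmatch(r"\d+(?:[+-]\d+)*", expr):
        return None
    return sum(int(term) for term in re.findall(r"[+-]?\d+", expr))


def has_multiple_equations(text, relation="="):
    """Decide whether a text region with several relation symbols holds several equations."""
    first = text.find(relation)
    last = text.rfind(relation)
    if first == -1 or first == last:
        # Zero or one relation symbol: the check described above does not apply.
        return False
    first_result = evaluate(text[:first])       # left of the first relation symbol
    second_result = evaluate(text[last + 1:])   # right of the last relation symbol
    if first_result is None or second_result is None:
        return True  # assumption: an unreadable side fails the condition
    return first_result != second_result


chained = has_multiple_equations("3+4=2+5=7")   # one chained equation
merged = has_multiple_equations("3+4=75+6=11")  # two equations run together
```

In the chained case both ends evaluate to 7, so the region is treated as one equation; in the merged case the ends evaluate to 7 and 11, so the region is treated as containing several.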
Referring to FIG. 9, FIG. 9 is a block diagram of an embodiment of a computer-readable storage device according to the present application. The computer-readable storage device 90 stores program instructions 901 executable by a processor, the program instructions 901 being used to implement any of the above embodiments of the equation detection method.
In the above-described aspect, it is possible to distinguish each of the plurality of expressions in the text region by checking whether or not the plurality of expressions exist in the text region, and if the plurality of expressions exist, locating the expression in the text region by using the recognized character.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
The foregoing description of the various embodiments is intended to highlight various differences between the embodiments, and the same or similar parts may be referred to each other, and for brevity, will not be described again herein.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage device. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage device and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage device includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Claims (10)

1. An equation detection method, comprising:
detecting a text area of a formula in an image to be detected, and identifying characters in the text area;
checking whether a plurality of equations exist in the text area by using the characters;
based on the presence of the plurality of equations within the text region, locating the equations in the text region using the characters.
2. The method of claim 1, wherein the characters comprise a relation symbol; the locating the equations in the text region by using the characters comprises:
searching for a relation symbol within the text region;
obtaining a first operation result of the characters on the left side of the relation symbol, and selecting several characters on the right side of the relation symbol, wherein the several characters and the first operation result satisfy a first preset condition;
determining, based on the several characters, an equation position within the text region of the equation containing the several characters.
3. The method of claim 2, wherein the identifying the characters in the text region comprises:
extracting a regional feature map of the text region, and obtaining, based on an attention mechanism and by using the regional feature map, an attention value of each pixel point in the regional feature map, wherein a larger attention value indicates a higher possibility that the pixel point belongs to a character;
weighting the corresponding pixel points of the regional feature map by using the attention values of the pixel points to obtain a weighted feature map;
performing recognition by using the weighted feature map to obtain the characters in the text region;
the determining, based on the several characters, an equation position within the text region of the equation containing the several characters comprises:
obtaining, by using the attention value, the character position within the text region of the last character among the several characters;
determining, by using the character position, the equation position within the text region of the equation containing the several characters.
4. The method of claim 2, wherein after the determining, based on the several characters, an equation position within the text region of the equation containing the several characters, the method further comprises:
re-executing, starting from the character that follows the several characters, the step of searching for a relation symbol within the text region and the subsequent steps.
5. The method according to claim 2, wherein the first preset condition comprises: a second operation result of the several characters and the first operation result make the relation symbol hold;
and/or, the relation symbol comprises any one of: equal to, approximately equal to, greater than, less than, not greater than.
6. The method of claim 2, further comprising at least one of:
displaying an operation-correct mark at the equation position of the text region;
displaying an operation-error mark at a preset position of the text region when the several characters cannot be selected.
7. The method of claim 1, wherein the characters comprise a relation symbol, and wherein the checking whether a plurality of equations exist in the text region comprises:
searching for a relation symbol within the text region;
determining whether a plurality of equations exist in the text region based on the characters on both sides of the relation symbol.
8. The method of claim 7, wherein the determining whether a plurality of equations exist in the text region based on the characters on both sides of the relation symbol comprises:
when a plurality of relation symbols are found, obtaining a third operation result of the characters on the left side of the first relation symbol and a fourth operation result of the characters on the right side of the last relation symbol;
if the third operation result and the fourth operation result satisfy a second preset condition, determining that one equation exists in the text region;
if the third operation result and the fourth operation result do not satisfy the second preset condition, determining that a plurality of equations exist in the text region.
9. An electronic device comprising a memory and a processor coupled to each other, wherein the memory stores program instructions, and the processor is configured to execute the program instructions to implement the method according to any one of claims 1 to 8.
10. A storage device storing program instructions executable by a processor to implement the method of any one of claims 1 to 8.
CN202011644726.0A 2020-12-30 2020-12-30 Arithmetic detection method, electronic equipment and storage device Active CN112712075B (en)

Publications (2)

Publication Number Publication Date
CN112712075A true CN112712075A (en) 2021-04-27
CN112712075B CN112712075B (en) 2023-12-01

Family

ID=75548089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011644726.0A Active CN112712075B (en) 2020-12-30 2020-12-30 Arithmetic detection method, electronic equipment and storage device

Country Status (1)

Country Link
CN (1) CN112712075B (en)

Also Published As

Publication number Publication date
CN112712075B (en) 2023-12-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant