CN112712075B - Arithmetic detection method, electronic equipment and storage device - Google Patents

Publication number
CN112712075B
Authority
CN
China
Prior art keywords
character
text region
characters
relational
operation result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011644726.0A
Other languages
Chinese (zh)
Other versions
CN112712075A (en)
Inventor
马皓
何春江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd
Priority to CN202011644726.0A
Publication of CN112712075A
Application granted
Publication of CN112712075B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/23 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on positionally close patterns or neighbourhood relationships
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The application discloses a method, equipment and a storage device for detecting an arithmetic expression, wherein the method comprises the following steps: detecting a text region of an arithmetic expression in an image to be detected, and identifying characters in the text region; checking whether a plurality of formulas exist in the text region by using the characters; based on the presence of multiple formulas in the text region, the formulas in the text region are located using the characters. According to the scheme, the multiple formulas in the text region can be distinguished, and then the accuracy of the identification of the formulas is improved.

Description

Arithmetic detection method, electronic equipment and storage device
Technical Field
The present application relates to the field of natural language processing, and in particular to an arithmetic expression detection method, an electronic device, and a storage device.
Background
In recent years, computer vision technology has developed rapidly, and image-text recognition, one of its branches, has been widely used in education. Using image-text recognition to recognize arithmetic expressions automatically, so that they can then be corrected automatically, can greatly reduce the burden on teachers and parents. However, correcting an expression depends on detecting it accurately, so how to improve the accuracy of expression detection is a problem to be solved.
Disclosure of Invention
The application mainly provides a method, equipment and a storage device for detecting an arithmetic expression, which can improve the accuracy of the detection of the arithmetic expression.
The first aspect of the present application provides a method for detecting an expression, the method comprising: detecting a text region of an arithmetic expression in an image to be detected, and identifying characters in the text region; checking whether a plurality of formulas exist in the text region by using the characters; based on the presence of multiple formulas in the text region, the formulas in the text region are located using the characters.
In order to solve the above problem, a second aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, where the memory stores program instructions, and the processor is configured to execute the program instructions to implement the method for detecting an expression described in the first aspect.
In order to solve the above problem, a third aspect of the present application provides a storage device storing program instructions executable by a processor, the program instructions being used to implement the expression detection method described in the first aspect.
According to the scheme, whether the plurality of formulas exist in the text area or not is checked, and if the plurality of formulas exist, the formulas in the text area are positioned by utilizing the recognized characters, so that the plurality of formulas in the text area can be distinguished, and the accuracy of the identification of the formulas is improved.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of a method for detecting an expression of the present application;
FIG. 2 is a second schematic flow chart of an embodiment of the method for detecting an expression of the present application;
FIG. 3 is a schematic diagram of an embodiment of an attention model of the method for detecting an expression of the present application;
FIG. 4 is a third schematic flow chart of an embodiment of the method for detecting an expression of the present application;
FIG. 5 is a fourth schematic flow chart of an embodiment of the method for detecting an expression of the present application;
FIG. 6 is a fifth schematic flow chart of an embodiment of the method for detecting an expression of the present application;
FIG. 7 is a sixth schematic flow chart of an embodiment of the method for detecting an expression of the present application;
FIG. 8 is a flow chart of an embodiment of the electronic device of the present application;
FIG. 9 is a schematic diagram of a frame of an embodiment of a storage device of the present application.
Detailed Description
The following describes embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist; for example, "A and/or B" may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship. Further, "a plurality" herein means two or more.
Referring to fig. 1, fig. 1 is a schematic flow chart of an embodiment of the method for detecting an expression of the present application. Specifically, this embodiment includes the following steps:
Step S11: detect a text region of an arithmetic expression in the image to be detected, and identify the characters in the text region.
An arithmetic expression is the expression written down when a calculation on numbers (or algebraic terms) is performed. It may include mathematical symbols such as numbers (or letters standing for numbers), operation symbols (the four basic arithmetic operations, powers, roots, factorials, permutations and combinations, etc.), and the "=" symbol. The text region of an expression in the image to be detected can be understood as the region of the image in which the expression is located; the region contains a certain number of pixels.
In one embodiment, the characters include a relational character, and the relational character is any one of: equal to, approximately equal to, greater than, less than, and not greater than. The equal-to symbol is "=", the approximately-equal-to symbol is "≈", the greater-than symbol is ">", the less-than symbol is "<", and the not-greater-than symbol is "≤". In one embodiment, the relational characters further include the "-" symbol in a vertical expression.
If a plurality of expressions exist in the image to be detected, a plurality of text regions can be identified. A single text region may contain one expression or a plurality of expressions.
In one implementation scenario, to improve the efficiency of text region detection, a text detection network may be trained in advance, and the image to be detected is input into the text detection network to obtain the text region of the expression in the image. Specifically, the text detection network may include, but is not limited to, DBNet (Real-time Scene Text Detection with Differentiable Binarization Network) and PSENet (Progressive Scale Expansion Network).
After the text region of the expression in the image to be detected is obtained, a text recognition network is used to recognize the text region, thereby identifying the characters in it. The text recognition network may include, but is not limited to, CRNN (Convolutional Recurrent Neural Network) + CTC (Connectionist Temporal Classification) and CNN (Convolutional Neural Network) + seq2seq + attention. Furthermore, the text recognition network may be a network with an encoder-decoder structure, which is not limited herein. The characters are, for example, numbers, letters, operators, and relational characters.
Step S12: using the characters, it is checked whether there are a plurality of formulas in the text region.
After the characters of the expressions in the text region are obtained, they can be used to check whether a plurality of expressions exist in the text region. For example, when a plurality of "=" signs are detected in one text region, the text region can be considered to contain a plurality of expressions. Further, when a plurality of "=" signs exist in one text region and the characters in the text region do not satisfy the corresponding mathematical operation result, the text region can be considered to contain a plurality of expressions.
When a plurality of expressions are detected in the text region, step S13 may then be performed. In one implementation scenario, when only one expression is detected in the text region, no further operation is needed. In another implementation scenario, when only one expression is detected in the text region, the recognized expression may be further corrected to determine whether it is calculated correctly, and an operation-correct mark or an operation-error mark is displayed.
Step S13: based on the presence of multiple formulas in the text region, the formulas in the text region are located using the characters.
If multiple formulas exist in the text region, the recognized characters may be used to locate the formulas in the text region, so that the multiple formulas can be separated into independent formulas. For example, based on the mathematical operation relation of the characters in the text region, a group of characters that satisfies the mathematical operation result may be determined to be one formula. Alternatively, the formulas may be separated according to symbols that indicate separation, such as ";" and ".".
Therefore, by checking whether a plurality of formulas exist in the text region and positioning the formulas in the text region by using the recognized characters when the plurality of formulas exist, the plurality of formulas in the text region can be distinguished, so that the accuracy of the identification of the formulas is improved, the subsequent correction is facilitated, and the correction efficiency is improved.
Referring to fig. 2, fig. 2 is a schematic diagram illustrating a second process according to an embodiment of the method for detecting an expression of the present application. The present embodiment is a further extension of the step "recognize characters within text region" described in step S11 in the above embodiment, specifically, the step includes the following steps S111 to S113.
Step S111: extracting a regional feature map of a text region, and obtaining the attention value of each pixel point in the regional feature map by utilizing the regional feature map based on an attention mechanism; wherein the larger the attention value, the higher the likelihood that the pixel belongs to the character.
As described in the foregoing embodiments, to improve recognition efficiency, a text recognition network may be used to recognize the characters in the text region. In that case, the feature map extracted by the text recognition network can be used as the regional feature map. Taking a text recognition network with an encoder-decoder structure as an example, the feature map extracted by the coding layer of the network can be taken as the regional feature map of the text region. It will be appreciated that when the text recognition network has multiple coding layers, the feature map output by one of them may be extracted as the regional feature map.
After the regional feature map of the text region is obtained, the text recognition algorithm may use an attention model to process the regional feature map based on an attention mechanism (for example, a self-attention mechanism), so as to obtain the attention value of each pixel point in the regional feature map. The larger the attention value of a pixel point, the higher the probability that the pixel point belongs to a character.
Step S112: weight the corresponding pixel points of the regional feature map by using the attention values of the pixel points to obtain a weighted feature map.
After the attention values of the pixel points are obtained, the attention model can weight the corresponding pixel points of the regional feature map with these attention values to obtain a weighted feature map. In the weighted feature map, the pixel points corresponding to characters receive high weights, so the position information of the characters in the text region can be obtained from the weighted feature map.
Because the weighted feature map contains the position information of the characters, it can be saved for the subsequent localization of the formulas within the text region. Specifically, one weighted feature map may be generated for each recognized character; that is, as many weighted feature maps are generated as there are characters recognized in the text region.
Step S113: perform recognition by using the weighted feature map to obtain the characters in the text region.
In one embodiment, the weighted feature map and the feature map output by the coding layer of the text recognition algorithm may be input into the decoding layer of the text recognition algorithm to complete the recognition, thereby obtaining the characters in the text region.
Referring to fig. 3, fig. 3 is a schematic diagram of an embodiment of an attention model of the method for detecting an expression of the present application. The region feature map 301 is the regional feature map output by the coding layer of the text recognition algorithm. The convolution kernels 302, 303 and 304 are all 1×1 convolution kernels used to transform the regional feature map. After the transformation, the intermediate feature map 305 and the intermediate feature map 306 (f(x) and g(x)) are matrix-multiplied, which establishes the association between every pair of pixels, and a softmax operation is then applied to obtain the attention value map 308 of the pixel points. Finally, a matrix operation is performed on the intermediate feature map 307 and the attention value map 308 to obtain the weighted feature map 309.
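The data flow described for FIG. 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the feature map is flattened to one row per pixel, each 1×1 convolution is modelled as a per-pixel linear map (the hypothetical weight matrices `w_f`, `w_g`, `w_h`), and the softmax is taken over each row of the pixel-to-pixel score matrix. The patent does not state the softmax axis, any scaling, or the exact form of the final "matrix calculation", so those choices are assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weighting(features, w_f, w_g, w_h):
    """Return (attention map, weighted feature map).

    features: N x C matrix, one row per pixel of the regional feature map (301).
    w_f, w_g, w_h: C x D matrices standing in for the 1x1 kernels (302-304).
    """
    f = matmul(features, w_f)  # intermediate feature map 305, f(x)
    g = matmul(features, w_g)  # intermediate feature map 306, g(x)
    h = matmul(features, w_h)  # intermediate feature map 307
    g_transposed = [list(col) for col in zip(*g)]
    scores = matmul(f, g_transposed)  # association of every pixel with every pixel
    attention = [softmax(row) for row in scores]  # attention value map 308
    weighted = matmul(attention, h)  # weighted feature map 309
    return attention, weighted
```

Each row of the returned attention map sums to 1, so the weighted feature map is a convex combination of the transformed pixel features.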
Unlike the foregoing embodiments, this embodiment introduces an attention mechanism, which enhances the character recognition capability of the text recognition algorithm and provides the position information of the recognized characters for the subsequent localization of the expressions.
Referring to fig. 4, fig. 4 is a third flow chart of an embodiment of the method for detecting an expression of the present application. The present embodiment is a further extension of the steps described in step S12 in the above embodiment, specifically, the steps include the following step S121 and step S122.
Step S121: the relational characters within the text region are searched.
Once the characters in the text region have been recognized, the relational characters in the text region can be searched for. For example, all recognized characters may be traversed to find the relational characters. For the meaning of a relational character, refer to the description in step S11, which is not repeated here.
In one implementation scenario, if no relational character is searched, it may be considered that no formula exists in the text region, and the text region may be considered not to belong to the object to be processed by the present application, and thus may be ignored.
Step S122: based on the characters on both sides of the relational character, it is determined whether a plurality of formulas exist within the text region.
The operation results of the expressions can be obtained around the relational characters, so whether a plurality of expressions exist in the text region can be determined from the characters on both sides of the relational characters.
In one embodiment, if the entire text region contains only one relational character, the text region may be considered to contain only one expression rather than a plurality of expressions. If the text region contains multiple relational characters, it is necessary to determine further whether a plurality of expressions exist in the text region.
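As a minimal sketch of this search and pre-check, assuming the recognized characters are available as a string and that the relational characters are "=", "≈", ">", "<" and "≤" (the set `RELATIONAL` and the function names are illustrative, not taken from the patent):

```python
# Assumed set of relational characters; the vertical-expression "-" is left out
# because it cannot be told apart from a minus sign in a flat string.
RELATIONAL = {"=", "≈", ">", "<", "≤"}


def find_relational(chars):
    """Return the indices of all relational characters in the recognized text."""
    return [i for i, ch in enumerate(chars) if ch in RELATIONAL]


def precheck(chars):
    """Classify a text region by the number of relational characters found."""
    hits = find_relational(chars)
    if not hits:
        return "ignore"  # no expression in this text region
    if len(hits) == 1:
        return "single"  # exactly one expression
    return "needs_check"  # several relational characters: check further
```

For example, `find_relational("2+2=43+3=6")` returns `[3, 8]`, so that region would go on to the further check of step S122.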
Referring to fig. 5, fig. 5 is a fourth flowchart of an embodiment of the method for detecting an expression of the present application. Specifically, in the case where a plurality of relations are searched, step S122 may specifically include the following steps S1221 to S1223.
Step S1221: acquire a first operation result of the characters on the left side of the first relational character, and acquire a second operation result of the characters on the right side of the last relational character.
It should be noted that a first operation result in the present specification corresponds to a third operation result of the claims, a second operation result in the present specification corresponds to a fourth operation result of the claims, a third operation result in the present specification corresponds to a first operation result of the claims, and a fourth operation result in the present specification corresponds to a second operation result of the claims.
When there are multiple relational characters and they all belong to the same expression, all the characters in the text region satisfy one specific mathematical operation result. Therefore, the characters on the left side of the first relational character can be obtained and evaluated to give the first operation result, and the characters on the right side of the last relational character can be evaluated to give the second operation result. Whether a plurality of expressions exist in the text region can then be judged from these two operation results.
For example, if the characters recognized in a text region are "2+2=4+3+3=6", the first relational character is the left "=", the characters to its left are "2+2", and the first operation result is 4; the last relational character is the right "=", the character to its right is "6", and the second operation result is 6. As another example, if the characters recognized in a text region are "2+2=43+3=2+4", the first relational character is the left "=", the characters to its left are "2+2", and the first operation result is 4; the last relational character is the right "=", the characters to its right are "2+4", and the second operation result is 6.
Step S1222: if the first operation result and the second operation result meet the first preset condition, determining that an operation formula exists in the text region.
Step S1223: if the first operation result and the second operation result do not meet the first preset condition, determining that a plurality of formulas exist in the text region.
It is to be noted that the first preset condition in the present specification corresponds to the second preset condition of the claims, and the second preset condition in the present specification corresponds to the first preset condition of the claims.
In one embodiment, the first preset condition may include: the first operation result and the second operation result satisfy both the relation of the first relational character and the relation of the last relational character, that is, they meet the operation requirements of both relational characters. For example, in the examples above, the first relational character is "=", the last relational character is "=", the first operation result is 4, the second operation result is 6, and 4 is not equal to 6, so it may be determined that a plurality of expressions exist in the text region.
For example, if the characters recognized in a text region are "7+2=9+3 <8", the first relational character is "=", and the first operation result is 9; the last relational character is "<", and the second operation result is 8. The two operation results meet neither the operation requirement of the first relational character nor that of the last relational character, so they do not satisfy the first preset condition, and it can be determined that a plurality of expressions exist in the text region.
As another example, if the characters recognized in a text region are "5+2=7=3+4", the first relational character is "=", and the first operation result is 7; the last relational character is "=", and the second operation result is 7. The two operation results meet the operation requirements of both the first and the last relational character, so they satisfy the first preset condition, and it can be determined that a single expression exists in the text region.
Thus, by utilizing the relationship of the characters on both sides of the relational character, it is possible to determine whether or not a plurality of formulas are present in the text region, thereby making it possible to locate the formulas in the text region in which a plurality of formulas are present later.
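Steps S1221 to S1223 can be illustrated with a short sketch. It assumes the recognized characters form a plain string, supports only the four basic operations and the relational characters "=", "<", ">" and "≤", and reads the first preset condition as "both relational characters hold between the first and second operation results". The helper names `evaluate` and `has_multiple_expressions` are hypothetical.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_RELATIONS = {"=": operator.eq, "<": operator.lt,
              ">": operator.gt, "≤": operator.le}


def evaluate(text):
    """Evaluate a plain arithmetic string such as "2+2" without using eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {text!r}")
    source = text.strip().replace("×", "*").replace("÷", "/")
    return walk(ast.parse(source, mode="eval").body)


def has_multiple_expressions(text):
    """Return True when the first preset condition fails (step S1223)."""
    positions = [i for i, ch in enumerate(text) if ch in _RELATIONS]
    if len(positions) < 2:
        return False  # zero or one relational character: nothing to split
    first, last = positions[0], positions[-1]
    first_result = evaluate(text[:first])  # left of the first relational character
    second_result = evaluate(text[last + 1:])  # right of the last relational character
    holds = (_RELATIONS[text[first]](first_result, second_result)
             and _RELATIONS[text[last]](first_result, second_result))
    return not holds
```

With the examples from the text, `has_multiple_expressions("2+2=4+3+3=6")` and `has_multiple_expressions("7+2=9+3 <8")` return `True`, while `has_multiple_expressions("5+2=7=3+4")` returns `False`. Division uses floating-point arithmetic here, so exact comparison of quotients is a simplification.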
Referring to fig. 6, fig. 6 is a fifth flowchart of an embodiment of the method for detecting an expression of the present application. The present embodiment is a further extension of the steps described in step S13 in the above embodiment, specifically, the steps include the following steps S131 to S135.
Step S131: the relational characters within the text region are searched.
For a specific description of this step, please refer to the above step S121, and the description thereof is omitted here.
Step S132: acquiring a third operation result of the character positioned on the left side of the relational character, and selecting a plurality of characters on the right side of the relational character; wherein the plurality of characters and the third operation result meet a second preset condition.
After the relational characters within the text region have been found, the expression around each relational character can be determined. Specifically, a third operation result of the characters located on the left side of the relational character may be obtained, and a plurality of characters are selected on the right side of the relational character such that the selected characters and the third operation result satisfy a second preset condition.
When choosing a relational character, the first relational character from the left can be chosen first, and the expression in which it is located is determined.
When selecting the characters on the left side of the relational character, if there is an operator on the left side, the operator closest to the relational character may be determined first (operators are, for example, "+", "-", "×" and "÷"); the characters on the left and right sides of that operator are then selected, and the third operation result is determined from the selected characters.
The plurality of characters may be selected as follows. The first character on the right side of the relational character is preselected, and it is judged whether the first character and the third operation result satisfy the second preset condition; if so, the first character is selected. If not, the first and second characters on the right side of the relational character are preselected, and it is judged whether they and the third operation result satisfy the second preset condition. If not, the first, second and third characters on the right side of the relational character are preselected, and so on. The second preset condition is, for example, that the fourth operation result of the plurality of characters and the third operation result make the relational character hold, that is, the fourth operation result and the third operation result meet the operation requirement of the relational character.
For example, if the characters recognized in the text region are "8+9=176+3=9", the third operation result of the characters on the left side of the first relational character is 17. When selecting characters on the right side of the relational character, the first character "1" may be preselected, so the fourth operation result is 1, and it is judged whether 1 and the third operation result 17 satisfy the second preset condition. They do not, so the first character "1" and the second character "7" on the right side of the relational character are preselected, giving a fourth operation result of 17. The fourth operation result 17 and the third operation result 17 satisfy the second preset condition, so the selected plurality of characters is determined to be the first and second characters on the right side of the relational character.
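The incremental preselection on the right side of a relational character can be sketched as below. This is an illustrative reading, not the patent's implementation: the third operation result is passed in as a number, the search stops at the next relational character, and prefixes that do not form a valid arithmetic string (such as "176+") are simply skipped. The names `evaluate` and `select_right_characters` are hypothetical.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_RELATIONS = {"=": operator.eq, "<": operator.lt,
              ">": operator.gt, "≤": operator.le}


def evaluate(text):
    """Evaluate a plain arithmetic string; raise on anything else."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {text!r}")
    source = text.strip().replace("×", "*").replace("÷", "/")
    return walk(ast.parse(source, mode="eval").body)


def select_right_characters(third_result, right_chars, relation="="):
    """Return the shortest prefix of right_chars that makes the relation hold.

    Returns None when no prefix before the next relational character works,
    which corresponds to the "not selected" branch of step S133.
    """
    for length in range(1, len(right_chars) + 1):
        prefix = right_chars[:length]
        if prefix[-1] in _RELATIONS:
            break  # do not run into the next expression
        try:
            fourth_result = evaluate(prefix)
        except (SyntaxError, ValueError, ZeroDivisionError):
            continue  # incomplete prefix such as "176+"
        if _RELATIONS[relation](third_result, fourth_result):
            return prefix
    return None
```

For the example above, `select_right_characters(17, "176+3=9")` first tries "1", then returns "17". Because the shortest matching prefix is taken, an inequality such as "<" may be satisfied before all digits of the intended number have been read; the text does not say how that case is handled.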
Step S133: determine whether a plurality of characters on the right side of the relational character have been selected.
If no such characters have been selected, step S134 may be performed; if they have been selected, step S135 may be performed.
Step S134: display an operation error mark at a preset position of the text region.
Since no characters could be selected, the expression can be considered to be calculated incorrectly, and an operation error mark can therefore be displayed at a preset position of the text region. The preset position is, for example, beside the operator; the specific position can be set as needed and is not limited herein. The operation error mark is, for example, "X".
After the operation error is determined, the second relational character from the left in the text region may be judged, and step S132 is executed again. After the second relational character has been judged, the third relational character from the left in the text region is judged, and so on.
Step S135: based on the plurality of characters, determine the position in the text region of the expression containing the plurality of characters.
Once the plurality of characters has been selected, one expression has been determined, so the position in the text region of the expression containing those characters can be determined. Specifically, the expression position includes the positions of the selected characters on the right side of the relational character, the position of the relational character, and the positions of the selected characters (including operators) on the left side of the relational character.
Thus, the expression position can be determined by judging the relationship between the characters on the left side and the characters on the right side of the relational character.
Referring to fig. 7, fig. 7 is a sixth flowchart of an embodiment of the method for detecting an arithmetic expression according to the present application, in which step S135 specifically includes the following steps S1351 and S1352.
Step S1351: obtain, by using the attention values, the character position in the text region of the last character among the plurality of characters.
In the embodiment of the present disclosure, the attention value indicates the possibility that a character exists at the pixel point position, and the larger the attention value is, the greater the possibility that a character exists. The attention value obtaining manner may refer to the related description in the foregoing disclosed embodiments, and will not be repeated herein. After the attention value of each pixel point is obtained and the corresponding weighted feature map is obtained according to the attention value, the position of the pixel point where the character contained in the weighted feature map is located, namely the position of the character, can be determined.
Because the weighted feature map contains the character position information, it can be used to obtain the character position in the text region of the last character among the above plurality of characters (the selected characters on the right side of the relational character). Specifically, the weighted feature map corresponding to the last of the plurality of characters is looked up, and the character position of that character in the text region is obtained from the position information contained in the weighted feature map.
Step S1352: using the character position, determine the position within the text region of the expression containing the several characters.
Once the character position of the last character within the text region has been obtained, the position within the text region of the expression to which the several characters belong can be determined from it. The expression position includes, for example, the position at which the expression starts and the position at which it ends.
In one embodiment, if the relational symbol of the expression is the first one found from the left of the text region, the left starting position of the text region may be taken as the position where the expression starts, and the character position of the last character within the text region as the position where the expression ends.
In one embodiment, the position of the leftmost of the characters on the left side of the selected relational symbol, that is, the position at which the expression starts, may be obtained first, in the same way as the character position of the last character within the text region is determined. The position of that leftmost character and the position of the last character within the text region (the position where the expression ends) together give the expression position.
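The two ways of fixing the start of the expression described above can be sketched as follows, assuming that character positions are given as (row, column) pairs and that only the horizontal extent of the expression is of interest. The function `expression_span` and its parameters `region_left` and `is_first` are hypothetical names introduced for this sketch.

```python
def expression_span(left_positions, last_position, region_left=0, is_first=False):
    """Return (start_col, end_col) of one expression.

    left_positions: (row, col) positions of the characters left of the
        relational symbol.
    last_position: (row, col) position of the last selected character.
    is_first: if True, the expression starts at the left edge of the
        text region (region_left) instead of at its leftmost character.
    """
    if is_first:
        start_col = region_left
    else:
        start_col = min(col for _, col in left_positions)
    return start_col, last_position[1]

print(expression_span([(1, 4), (1, 6), (1, 8)], (1, 12)))  # (4, 12)
```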
In one embodiment, after the expression position is determined, an operation-correct mark, such as "v", may be displayed at the expression position in the text region, for example at the position where the expression ends.
Displaying the operation-correct mark or the operation-error mark enables automatic correction of expressions, which reduces the workload of correction personnel and improves correction efficiency.
In one embodiment, after the expression position is determined, step S131 (searching for a relational symbol in the text region) and the subsequent steps may be performed again, starting from the character following the several characters. In this case, the area searched in step S131 may exclude the positions of expressions already determined. In this way, locating expressions in the text region can continue.
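The repeated search can be sketched as a loop over the recognized characters. As before, this is only a sketch under stated assumptions: the characters form a plain string, the relational symbol is "=", and operands are non-negative integers joined by "+" or "-". The names `evaluate` and `locate_expressions` are hypothetical. The returned flag indicates whether every character was assigned to an expression; when it is false, an operation-error mark would be the natural outcome.

```python
import re

def evaluate(expr):
    """Value of integers joined by + or -, or None if expr is malformed."""
    if not re.fullmatch(r"\d+([+-]\d+)*", expr):
        return None
    return sum(int(tok) for tok in re.findall(r"[+-]?\d+", expr))

def locate_expressions(chars):
    """Split chars into expressions; return (spans, all_located).

    Each span is a (start, end) pair of string indices, end exclusive.
    """
    spans, start = [], 0
    while start < len(chars):
        # Search only the part not yet assigned to an expression.
        rel = chars.find("=", start)
        if rel == -1:
            break
        left_result = evaluate(chars[start:rel])
        end = None
        for stop in range(rel + 2, len(chars) + 1):
            if left_result is not None and evaluate(chars[rel + 1:stop]) == left_result:
                end = stop
                break
        if end is None:
            break  # no characters could be selected
        spans.append((start, end))
        start = end  # resume from the character after the selected ones
    return spans, start == len(chars)

print(locate_expressions("1+2=34+5=9"))  # ([(0, 5), (5, 10)], True)
```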
Thus, the weighted feature map makes it possible to determine character positions, and the selected several characters make it possible to locate each expression, so that the expressions in a text region containing multiple expressions can be told apart.
Referring to fig. 8, fig. 8 is a schematic framework diagram of an embodiment of an electronic device according to the present application. The electronic device 80 comprises a memory 81 and a processor 82 coupled to each other. The memory 81 stores program instructions, and the processor 82 is configured to execute the program instructions to implement the steps of any of the above-described method embodiments.
Specifically, the processor 82 is configured to control itself and the memory 81 to implement the steps of any of the above-described method embodiments. The processor 82 may be an integrated circuit chip having signal processing capabilities. The processor 82 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor or any conventional processor. In addition, the processor 82 may be jointly implemented by multiple integrated circuit chips.
In the disclosed embodiment, the processor 82 is configured to: detect the text region of an expression in an image to be detected and recognize the characters in the text region; check, using the characters, whether multiple expressions exist in the text region; and, when multiple expressions exist in the text region, locate the expressions in the text region using the characters.
In this scheme, whether multiple expressions exist in the text region is checked and, if so, the expressions in the text region are located using the recognized characters. The multiple expressions in the text region can thus be distinguished, which improves the accuracy of expression recognition, facilitates subsequent correction, and improves correction efficiency.
In some disclosed embodiments, the characters include a relational symbol. The processor 82 being configured to locate the expressions in the text region using the characters includes: searching for a relational symbol within the text region; obtaining a third operation result of the characters on the left side of the relational symbol, and selecting several characters on the right side of the relational symbol, where the several characters and the third operation result meet a second preset condition; and determining, based on the several characters, the position within the text region of the expression containing the several characters.
Unlike the foregoing embodiments, the expression position can be determined by judging the relationship between the characters on the left side and the characters on the right side of the relational symbol, thereby locating the expression.
In some disclosed embodiments, the processor 82 being configured to recognize the characters in the text region includes: extracting a region feature map of the text region and, based on an attention mechanism, using the region feature map to obtain an attention value for each pixel in the region feature map, where a larger attention value indicates a higher likelihood that the pixel belongs to a character; weighting the corresponding pixels of the region feature map by their attention values to obtain a weighted feature map; and performing recognition on the weighted feature map to obtain the characters in the text region. The processor 82 being configured to determine, based on the several characters, the position within the text region of the expression containing the several characters includes: using the attention value to obtain the character position within the text region of the last of the several characters; and using the character position to determine the position within the text region of the expression containing the several characters.
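The weighting step can be illustrated with a minimal sketch. It assumes a single-channel feature map stored as a list of rows and an attention map of the same shape, and it takes "weighting" to mean element-wise multiplication, which is an assumption since the application does not spell out the operation. The function name `weight_feature_map` is hypothetical.

```python
def weight_feature_map(feature_map, attention):
    """Multiply each pixel's feature value by that pixel's attention value."""
    return [
        [f * a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(feature_map, attention)
    ]

features = [[2.0, 4.0], [1.0, 3.0]]
attention = [[0.5, 0.0], [1.0, 0.5]]
print(weight_feature_map(features, attention))  # [[1.0, 0.0], [1.0, 1.5]]
```

Pixels with attention close to zero are suppressed, so the subsequent recognition step sees mainly the pixels likely to belong to characters.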
Unlike the foregoing embodiments, the weighted feature map makes it possible to determine character positions, and the selected several characters make it possible to locate each expression, so that the expressions in a text region containing multiple expressions can be told apart.
In some disclosed embodiments, after determining, based on the several characters, the position within the text region of the expression containing the several characters, the processor 82 is further configured to re-execute, starting from the character following the several characters, the step of searching for a relational symbol within the text region and the subsequent steps.
Unlike the foregoing embodiments, locating expressions in the text region can continue by re-executing the step of searching for a relational symbol within the text region and the subsequent steps.
In some disclosed embodiments, the second preset condition includes: a fourth operation result of the several characters and the third operation result make the relational symbol hold; and/or the relational symbol includes any one of: equal to, approximately equal to, greater than, less than, not greater than.
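Checking whether two operation results make a relational symbol hold can be sketched with a lookup table of predicates. The symbols chosen for each relation, the name `relation_holds`, and the 5% relative tolerance used for "approximately equal to" are all assumptions made for this sketch; the application does not specify a tolerance.

```python
import math
import operator

# Hypothetical mapping from recognized relational symbols to predicates.
RELATIONS = {
    "=": operator.eq,
    "≈": lambda a, b: math.isclose(a, b, rel_tol=0.05),  # assumed tolerance
    ">": operator.gt,
    "<": operator.lt,
    "≤": operator.le,  # "not greater than"
}

def relation_holds(left_result, symbol, right_result):
    """True if `left_result symbol right_result` is a true statement."""
    check = RELATIONS.get(symbol)
    return check is not None and check(left_result, right_result)

print(relation_holds(3, "=", 3), relation_holds(5, "≤", 3))  # True False
```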
In some disclosed embodiments, the processor 82 is further configured to display an operation-correct mark at the expression position in the text region and, when no several characters can be selected, to display an operation-error mark at a preset position of the text region.
Unlike the foregoing embodiments, displaying the operation-correct mark or the operation-error mark enables automatic correction of expressions, which reduces the workload of correction personnel and improves correction efficiency.
In some disclosed embodiments, the characters include a relational symbol, and the processor 82 being configured to check whether multiple expressions exist in the text region includes: searching for a relational symbol within the text region; and determining, based on the characters on both sides of the relational symbol, whether multiple expressions exist in the text region.
Unlike the foregoing embodiments, whether multiple expressions exist in the text region can be determined based on the characters on both sides of the relational symbol.
In some disclosed embodiments, the processor 82 being configured to determine, based on the characters on the left side and the characters on the right side of the relational symbol, whether multiple expressions exist in the text region includes: when several relational symbols are found, obtaining a first operation result of the characters on the left side of the first relational symbol and a second operation result of the characters on the right side of the last relational symbol; if the first operation result and the second operation result meet the first preset condition, determining that one expression exists in the text region; and if the first operation result and the second operation result do not meet the first preset condition, determining that multiple expressions exist in the text region.
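The check described above can be sketched as follows, again assuming a plain character string, "=" as the only relational symbol, and integer operands joined by "+" or "-". The names `evaluate` and `has_multiple_expressions` are hypothetical, and treating a region with fewer than two relational symbols as a single expression is an assumption of the sketch.

```python
import re

def evaluate(expr):
    """Value of integers joined by + or -, or None if expr is malformed."""
    if not re.fullmatch(r"\d+([+-]\d+)*", expr):
        return None
    return sum(int(tok) for tok in re.findall(r"[+-]?\d+", expr))

def has_multiple_expressions(chars):
    """True if the text region appears to hold more than one expression."""
    rel_positions = [i for i, ch in enumerate(chars) if ch == "="]
    if len(rel_positions) < 2:
        return False
    # Compare the result left of the first "=" with the result right of the last "=".
    first_result = evaluate(chars[:rel_positions[0]])
    second_result = evaluate(chars[rel_positions[-1] + 1:])
    one_expression = first_result is not None and first_result == second_result
    return not one_expression

print(has_multiple_expressions("2+3=5=5"))     # False: one chained expression
print(has_multiple_expressions("1+2=34+5=9"))  # True: 3 and 9 differ
```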
Unlike the foregoing embodiments, the relationship between the characters on both sides of the relational symbols makes it possible to determine whether multiple expressions exist in the text region, so that the expressions in a text region containing multiple expressions can subsequently be located.
Referring to fig. 9, fig. 9 is a schematic framework diagram of an embodiment of a computer-readable storage device according to the present application. The computer-readable storage device 90 stores program instructions 901 executable by a processor, the program instructions 901 being used to implement any of the above-described expression detection methods.
In this scheme, whether multiple expressions exist in the text region is checked and, if so, the expressions in the text region are located using the recognized characters, so that each of the multiple expressions in the text region can be distinguished.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The foregoing description of the various embodiments tends to emphasize the differences between them; for what they have in common, the embodiments may be referred to one another, and these parts are not repeated here for brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is merely a division by logical function, and other divisions are possible in actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections via some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or as software functional units.
The integrated units, if implemented as software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage device. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage device, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned storage device includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.

Claims (9)

1. A method of detecting an expression, comprising:
detecting a text region of an arithmetic expression in an image to be detected, and identifying characters in the text region;
checking whether a plurality of formulas exist in the text region by using the characters;
locating the formulas in the text region using the characters based on the presence of the plurality of formulas within the text region;
wherein the character comprises a relational character; said locating said expression in said text region using said character comprises:
searching for a relational character within the text region;
acquiring a first operation result of the characters positioned on the left side of the relational character, and selecting a plurality of characters on the right side of the relational character; wherein the plurality of characters and the first operation result meet a first preset condition, and the first preset condition comprises: a second operation result of the plurality of characters and the first operation result make the relational character hold;
based on the number of characters, a formula location within the text region is determined for a formula that includes the number of characters.
2. The method of claim 1, wherein the identifying the character within the text region comprises:
extracting a regional feature map of the text region, and obtaining the attention value of each pixel point in the regional feature map by using the regional feature map based on an attention mechanism; wherein the larger the attention value is, the higher the probability that the pixel belongs to the character is;
weighting the pixel points corresponding to the regional feature map by using the attention value of the pixel points to obtain a weighted feature map;
identifying by using the weighted feature map to obtain characters in the text region;
the determining, based on the number of characters, a formula location within the text region of a formula containing the number of characters, comprising:
obtaining the character position of the last character in the text area by using the attention value;
and determining the expression position of the expression containing the characters in the text area by utilizing the character positions.
3. The method of claim 1, wherein after the determining, based on the number of characters, a formula position within the text region for a formula containing the number of characters, the method further comprises:
re-executing, starting from the character following the number of characters, the step of searching for a relational character within the text region and the subsequent steps.
4. The method of claim 1, wherein the relational character comprises any one of: equal to, about equal to, greater than, less than, not greater than.
5. The method of claim 1, further comprising at least one of:
displaying an operation-correct mark at the formula position of the text region;
displaying an operation-error mark at a preset position of the text region in the case that the number of characters are not selected.
6. The method of claim 1, wherein the character comprises a relational character, and wherein the checking whether there are a plurality of formulas within the text region comprises:
searching for a relational character within the text region;
based on the characters on both sides of the relational character, whether a plurality of formulas exist in the text region is determined.
7. The method of claim 6, wherein the determining whether a plurality of formulas exist within the text region based on the character to the left of the relational character and the character to the right of the relational character comprises:
in the case that a plurality of relational characters are found, acquiring a third operation result of the characters on the left side of the first relational character, and acquiring a fourth operation result of the characters on the right side of the last relational character;
if the third operation result and the fourth operation result meet a second preset condition, determining that one formula exists in the text region;
if the third operation result and the fourth operation result do not meet the second preset condition, determining that a plurality of formulas exist in the text region;
wherein the second preset condition includes: the third operation result and the fourth operation result make the relational character hold.
8. An electronic device comprising a memory and a processor coupled to each other, the memory having program instructions stored therein, and the processor being configured to execute the program instructions to implement the method of detecting an expression of any one of claims 1 to 7.
9. A storage device storing program instructions executable by a processor for implementing the method of detecting the expression of any one of claims 1 to 7.
CN202011644726.0A 2020-12-30 2020-12-30 Arithmetic detection method, electronic equipment and storage device Active CN112712075B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011644726.0A CN112712075B (en) 2020-12-30 2020-12-30 Arithmetic detection method, electronic equipment and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011644726.0A CN112712075B (en) 2020-12-30 2020-12-30 Arithmetic detection method, electronic equipment and storage device

Publications (2)

Publication Number Publication Date
CN112712075A CN112712075A (en) 2021-04-27
CN112712075B true CN112712075B (en) 2023-12-01

Family

ID=75548089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011644726.0A Active CN112712075B (en) 2020-12-30 2020-12-30 Arithmetic detection method, electronic equipment and storage device

Country Status (1)

Country Link
CN (1) CN112712075B (en)

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810493A (en) * 2012-11-06 2014-05-21 夏普株式会社 Method and apparatus for identifying mathematical formula
US8818033B1 (en) * 2012-04-27 2014-08-26 Google Inc. System and method for detecting equations
CN104268540A (en) * 2014-09-05 2015-01-07 宇龙计算机通信科技(深圳)有限公司 Equation processing method and device based on images and terminal
CN104751148A (en) * 2015-04-16 2015-07-01 同方知网数字出版技术股份有限公司 Method for recognizing scientific formulas in layout file
CN105404497A (en) * 2015-10-26 2016-03-16 北京锐安科技有限公司 Logic expression analysis method and apparatus
CN105913057A (en) * 2016-04-12 2016-08-31 中国传媒大学 Projection and structure characteristic-based in-image mathematical formula detection method
WO2017031716A1 (en) * 2015-08-26 2017-03-02 北京云江科技有限公司 Method for analyzing and recognizing handwritten mathematical formula structure in natural scene image
CN106980856A (en) * 2016-01-15 2017-07-25 上海谦问万答吧云计算科技有限公司 Formula identification method and system and symbolic reasoning computational methods and system
CN107301411A (en) * 2016-04-14 2017-10-27 科大讯飞股份有限公司 Method for identifying mathematical formula and device
CN108364009A (en) * 2018-02-12 2018-08-03 掌阅科技股份有限公司 Recognition methods, computing device and the computer storage media of two-dimensional structure formula
CN109241861A (en) * 2018-08-14 2019-01-18 科大讯飞股份有限公司 A kind of method for identifying mathematical formula, device, equipment and storage medium
CN109614944A (en) * 2018-12-17 2019-04-12 科大讯飞股份有限公司 A kind of method for identifying mathematical formula, device, equipment and readable storage medium storing program for executing
CN110210467A (en) * 2019-05-28 2019-09-06 广州华多网络科技有限公司 A kind of formula localization method, image processing apparatus, the storage medium of text image
CN110490056A (en) * 2019-07-08 2019-11-22 北京三快在线科技有限公司 The method and apparatus that image comprising formula is handled
CN110705399A (en) * 2019-09-19 2020-01-17 安徽七天教育科技有限公司 Method for automatically identifying mathematical formula
CN110929573A (en) * 2019-10-18 2020-03-27 平安科技(深圳)有限公司 Examination question checking method based on image detection and related equipment
CN111340020A (en) * 2019-12-12 2020-06-26 科大讯飞股份有限公司 Formula identification method, device, equipment and storage medium
CN111401353A (en) * 2020-03-17 2020-07-10 重庆邮电大学 Method, device and equipment for identifying mathematical formula
CN111738105A (en) * 2020-06-04 2020-10-02 科大讯飞股份有限公司 Formula identification method and device, electronic equipment and storage medium
CN111738169A (en) * 2020-06-24 2020-10-02 北方工业大学 Handwriting formula recognition method based on end-to-end network model
CN112101359A (en) * 2020-11-11 2020-12-18 广州华多网络科技有限公司 Text formula positioning method, model training method and related device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4181310B2 (en) * 2001-03-07 2008-11-12 昌和 鈴木 Formula recognition apparatus and formula recognition method
CN106295629B (en) * 2016-07-15 2018-06-15 北京市商汤科技开发有限公司 structured text detection method and system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
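A Deep Learning-Based Formula Detection Method for PDF Documents; Liangcai Gao et al.; 《2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)》; 553-558 *
ScanSSD: Scanning Single Shot Detector for Mathematical Formulas in PDF Document Images; Parag Mali et al.; 《Arxiv》; 1-8 *
Research and Implementation of a Printed Mathematical Formula Recognition System; 卢晓卫; 《China Master's Theses Full-text Database, Information Science and Technology Series》; Vol. 2011 (No. 12); I138-1720 *
Research on Mathematical Formula Recognition Algorithms in Printed Documents; 张自强; 《China Master's Theses Full-text Database, Information Science and Technology Series》; Vol. 2017 (No. 3); I138-5037 *
Research and Implementation of Self-Supervised Learning in Offline Handwritten Mathematical Formula Recognition; 姜斯文; 《China Master's Theses Full-text Database, Information Science and Technology Series》; Vol. 2020 (No. 8); I138-533 *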

Also Published As

Publication number Publication date
CN112712075A (en) 2021-04-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant