CN108241847A - Method and device for processing LaTeX-format formulas in text recognition - Google Patents

Method and device for processing LaTeX-format formulas in text recognition

Info

Publication number
CN108241847A
Authority
CN
China
Prior art keywords
formula
character
fragment
space character
left bracket
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611227736.8A
Other languages
Chinese (zh)
Other versions
CN108241847B (en
Inventor
白建国
熊蜀光
周迅溢
兴百桥
杨镜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xintang Sichuang Educational Technology Co Ltd
Original Assignee
Beijing Xintang Sichuang Educational Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xintang Sichuang Educational Technology Co Ltd
Priority to CN201611227736.8A
Publication of CN108241847A
Application granted
Publication of CN108241847B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Character Discrimination (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The embodiments of the present application provide a method and a device for processing LaTeX-format formulas in text recognition. The method includes: obtaining the number of formula delimiters in a formula in the recognized text, and judging whether that number is even; if it is even, determining the position of the formula head according to the type of the character before the first formula delimiter in each formula fragment; determining the position of the formula tail according to the type of the character after the last formula delimiter in each formula fragment; and deleting redundant formula delimiters to obtain a complete LaTeX-format formula. The embodiments allow LaTeX-format formula fragments to be merged into LaTeX-format formulas fully automatically, which saves the labor cost of image recognition and improves recognition efficiency.

Description

Method and device for processing LaTeX-format formulas in text recognition
Technical field
The present application belongs to the technical field of image recognition, and in particular relates to a method and a device for processing LaTeX-format formulas in text recognition.
Background
LaTeX (transliterated in Chinese as "La Taihe") is a typesetting system based on TeX, developed in the early 1980s by the American computer scientist Leslie Lamport. With this format, even users without knowledge of typesetting or programming can make full use of the power of TeX and produce printed material of book quality within days or even hours. This is especially evident when generating complex tables and mathematical formulas, so the system is well suited to producing scientific and mathematical documents of high print quality. It is equally suitable for generating all other types of documents, from simple letters to complete books.
In a traditional computer-aided instruction system, teachers generally need to enter a large number of paper questions and exercise-book questions into the computer system so that students can study online and teachers can tutor online. This question-entry process often consumes a great deal of manpower and material resources, yet progress is usually slow. Image recognition technology can complete the entry of most questions easily and efficiently, but because the formulas contained in a question cannot be recognized as a whole in one pass, the recognition result still needs manual follow-up correction, so the efficiency gain is very limited. If the recognized LaTeX-format formula fragments (each a part of a LaTeX-format formula separated off by formula delimiters) could be merged automatically, the labor cost of image recognition would be saved and recognition efficiency improved.
Therefore, how to process LaTeX-format formulas automatically in image recognition has become a technical problem in the prior art that urgently needs to be solved.
Summary of the invention
One of the technical problems solved by the embodiments of the present application is to provide a method and a device for processing LaTeX-format formulas in text recognition that allow LaTeX-format formula fragments to be merged into LaTeX-format formulas fully automatically, saving the labor cost of image recognition and improving recognition efficiency.
The embodiments of the present application provide a method for processing LaTeX-format formulas in text recognition, including:
obtaining the number of formula delimiters in a formula in the recognized text, and judging whether that number is even;
if it is even, determining the position of the head of the formula fragment according to the type of the character before the first formula delimiter of each formula fragment;
determining the position of the tail of the formula fragment according to the type of the character after the last formula delimiter of each formula fragment;
deleting redundant formula delimiters to obtain a complete LaTeX-format formula.
In a specific implementation of the present application, the method further includes:
if the number is odd, searching before each formula delimiter for a character that is not included in a formula fragment, or for a formula delimiter, and inserting one formula delimiter after that character or formula delimiter.
In a specific implementation of the present application, determining, if the number is even, the position of the head of the formula fragment according to the type of the character before the first formula delimiter of each formula fragment includes:
detecting the type of the first character before the first formula delimiter of each formula fragment;
if the first character is any one of a Chinese character, a formula delimiter or a punctuation mark, ending the forward search and determining that the character after the formula delimiter is the position of the head of the formula fragment;
if the first character is a digit, a letter or a mathematical symbol, exchanging the positions of the formula delimiter and the first character, and continuing to detect forward to determine the position of the head of the formula fragment;
if the first character is a right parenthesis, determining the position of the head of the formula fragment according to whether a left parenthesis is found by the forward search.
In a specific implementation of the present application, determining, if the first character is a right parenthesis, the position of the head of the formula fragment according to whether a left parenthesis is found by the forward search includes:
if the first character is a right parenthesis, judging whether the forward search finds a left parenthesis;
if no left parenthesis is found, ending the forward search and determining that the character after the formula delimiter is the position of the head of the formula fragment;
if a left parenthesis is found, and the characters between the right parenthesis and the left parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, inserting the formula delimiter before the left parenthesis.
In a specific implementation of the present application, determining the position of the tail of the formula fragment according to the type of the character after the last formula delimiter of each formula fragment includes:
detecting the type of the second character after the last formula delimiter of each formula fragment;
if the second character is any one of a Chinese character, a formula delimiter or a punctuation mark, ending the backward search and determining that the character before the formula delimiter is the position of the tail of the formula fragment;
if the second character is a letter, a digit or a mathematical symbol, exchanging the positions of the formula delimiter and the second character, and continuing to detect backward to determine the position of the tail of the formula fragment;
if the second character is a left parenthesis, determining the position of the tail of the formula fragment according to whether a right parenthesis is found by the backward search.
In a specific implementation of the present application, determining, if the second character is a left parenthesis, the position of the tail of the formula fragment according to whether a right parenthesis is found by the backward search includes:
if the second character is a left parenthesis, judging whether the backward search finds a right parenthesis;
if no right parenthesis is found, ending the backward search and determining that the character after the formula delimiter is the position of the tail of the formula fragment;
if a right parenthesis is found, and the characters between the right parenthesis and the left parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, inserting the formula delimiter after the right parenthesis.
In a specific implementation of the present application, the redundant formula delimiters are specifically two consecutive formula delimiters.
Corresponding to the above method, the present application also provides a device for processing LaTeX-format formulas in text recognition, including:
a quantity judgment module, configured to obtain the number of formula delimiters in a formula in the recognized text and to judge whether that number is even;
a head determining module, configured to, if the number is even, determine the position of the head of the formula fragment according to the type of the character before the first formula delimiter of each formula fragment;
a tail determining module, configured to determine the position of the tail of the formula fragment according to the type of the character after the last formula delimiter of each formula fragment;
a deletion module, configured to delete redundant formula delimiters to obtain a complete LaTeX-format formula.
In a specific implementation of the present application, the device further includes:
a symbol insertion module, configured to, if the number is odd, search before each formula delimiter for a character that is not included in a formula fragment, or for a formula delimiter, and insert one formula delimiter after that character or formula delimiter.
In a specific implementation of the present application, the head determining module includes:
a first character judging unit, configured to detect the type of the first character before the first formula delimiter of each formula fragment;
a first search ending unit, configured to, if the first character is any one of a Chinese character, a formula delimiter or a punctuation mark, end the forward search and determine that the character after the formula delimiter is the position of the head of the formula fragment;
a first character exchange unit, configured to, if the first character is a digit, a letter or a mathematical symbol, exchange the positions of the formula delimiter and the first character, and continue to detect forward to determine the position of the head of the formula fragment;
a left parenthesis searching unit, configured to, if the first character is a right parenthesis, determine the position of the head of the formula fragment according to whether a left parenthesis is found by the forward search.
In a specific implementation of the present application, the left parenthesis searching unit includes:
a first judging subunit, configured to, if the first character is a right parenthesis, judge whether the forward search finds a left parenthesis;
a first obtaining subunit, configured to, if no left parenthesis is found, end the forward search and determine that the character after the formula delimiter is the position of the head of the formula fragment;
a first non-obtaining subunit, configured to, if a left parenthesis is found and the characters between the right parenthesis and the left parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, insert the formula delimiter before the left parenthesis.
In a specific implementation of the present application, the tail determining module includes:
a second character judging unit, configured to detect the type of the second character after the last formula delimiter of each formula fragment;
a second search ending unit, configured to, if the second character is any one of a Chinese character, a formula delimiter or a punctuation mark, end the backward search and determine that the character before the formula delimiter is the position of the tail of the formula fragment;
a second character exchange unit, configured to, if the second character is a letter, a digit or a mathematical symbol, exchange the positions of the formula delimiter and the second character, and continue to detect backward to determine the position of the tail of the formula fragment;
a right parenthesis searching unit, configured to, if the second character is a left parenthesis, determine the position of the tail of the formula fragment according to whether a right parenthesis is found by the backward search.
In a specific implementation of the present application, the right parenthesis searching unit includes:
a second judging subunit, configured to, if the second character is a left parenthesis, judge whether the backward search finds a right parenthesis;
a second obtaining subunit, configured to, if no right parenthesis is found, end the backward search and determine that the character after the formula delimiter is the position of the tail of the formula fragment;
a second non-obtaining subunit, configured to, if a right parenthesis is found and the characters between the right parenthesis and the left parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, insert the formula delimiter after the right parenthesis.
In a specific implementation of the present application, the redundant formula delimiters are specifically two consecutive formula delimiters.
When the embodiments of the present application judge that the number of formula delimiters is even, the position of the formula head in each formula fragment is determined according to the type of the character before the first formula delimiter of that fragment, and the position of the formula tail is determined according to the type of the character after the last formula delimiter of that fragment. Redundant formula delimiters are then deleted to obtain a complete LaTeX-format formula. The present application allows LaTeX-format formula fragments to be merged into LaTeX-format formulas fully automatically, which saves the labor cost of image recognition and improves recognition efficiency.
Description of the drawings
In order to explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some of the embodiments described in the present application, and a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a flowchart of one embodiment of the method for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 2 is a flowchart of another embodiment of the method for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 3 is a flowchart of an embodiment of step S2 in the method for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 4 is a flowchart of an embodiment of step S24 in the method for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 5 is a flowchart of an embodiment of step S3 in the method for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 6 is a flowchart of an embodiment of step S34 in the method for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 7 is a structural diagram of one embodiment of the device for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 8 is a structural diagram of another embodiment of the device for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 9 is a structural diagram of an embodiment of the head determining module in the device for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 10 is a structural diagram of an embodiment of the left parenthesis searching unit in the head determining module of the device for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 11 is a structural diagram of an embodiment of the tail determining module in the device for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 12 is a structural diagram of an embodiment of the right parenthesis searching unit in the tail determining module of the device for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 13 is a schematic diagram of the hardware structure of an electronic device for the method for processing LaTeX-format formulas in text recognition provided by the present application;
Fig. 14 is a flowchart of a specific application scenario of the present application.
Detailed description
When the embodiments of the present application judge that the number of formula delimiters is even, the position of the formula head in each formula fragment is determined according to the type of the character before the first formula delimiter of that fragment, and the position of the formula tail is determined according to the type of the character after the last formula delimiter of that fragment. Redundant formula delimiters are then deleted to obtain a complete LaTeX-format formula. The present application allows LaTeX-format formula fragments to be merged into LaTeX-format formulas fully automatically, which saves the labor cost of image recognition and improves recognition efficiency.
Although the present application may have embodiments in many different forms, specific embodiments are shown in the drawings and described in detail herein, with the understanding that the disclosure of such embodiments is to be regarded as an example of the principles and is not intended to limit the application to the specific embodiments shown and described. In the following description, the same reference numerals are used to describe identical, similar or corresponding parts in the several views of the drawings.
As used herein, the terms "a" or "an" are defined as one or more than one. The term "plurality" is defined as two or more than two. The term "another" is defined as at least a second or more. The terms "including" and/or "having" are defined as comprising (that is, open language). The term "coupled" is defined as connected, although not necessarily directly and not necessarily mechanically. The terms "program", "computer program" or similar terms are defined as a sequence of instructions designed for execution on a computer system. A "program" or "computer program" may include a subroutine, a function, a procedure, an object method, an object implementation, an executable application, an applet, a servlet, source code, object code, a shared library/dynamic load library and/or other sequences of instructions designed for execution on a computer system.
Reference throughout this document to "one embodiment", "some embodiments", "an embodiment" or similar terms means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. Thus, the appearances of such phrases in various places throughout this specification do not necessarily all refer to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments without limitation.
As used herein, the term "or" is to be interpreted as inclusive, meaning any one or any combination. Therefore, "A, B or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions, steps or actions is in some way inherently mutually exclusive.
In order to enable those skilled in the art to better understand the technical solutions in the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are obviously only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application shall fall within the scope of protection of the present application.
The implementation of the present application is further described below with reference to the drawings.
Referring to Fig. 1, one embodiment of the present application provides a method for processing LaTeX-format formulas in text recognition, including:
S1: obtaining the number of formula delimiters in a formula in the recognized text, and judging whether that number is even.
For LaTeX-format formulas, every mathematical formula should be placed between formula delimiters, but in LaTeX-format formulas crawled from websites a complete formula is often split into several formula fragments.
For example, the crawled LaTeX-format formula "Question 1: |-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$" contains the formula fragments $$\frac{1}{2}$$, $$\sqrt{12}$$ and $$^{-1}$$, whereas the complete formula should be $$|-\frac{1}{2}|+\sqrt{12}-2^{-1}$$.
Because the crawled LaTeX-format formula consists of fragments of a formula, a question that was originally a single formula is now displayed as three formulas. In "Question 1: |-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$" there are 6 formula delimiters $$; they occur in pairs, and the part between each pair is a formula fragment in LaTeX format. The number of formula delimiters is counted and checked for evenness; in "|-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$" the 6 delimiters enclose 3 formula fragments.
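Step S1 can be sketched as follows. This is only a minimal illustration, assuming that the formula delimiter is the literal string "$$" as in the example above; the function names count_delimiters and has_even_delimiters are hypothetical and do not come from the patent.

```python
DELIM = "$$"  # assumed formula delimiter, taken from the Question 1 example


def count_delimiters(text: str) -> int:
    """Count non-overlapping occurrences of the formula delimiter."""
    return text.count(DELIM)


def has_even_delimiters(text: str) -> bool:
    """Return True when every delimiter can be paired with another one."""
    return count_delimiters(text) % 2 == 0


question_1 = r"|-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$"
print(count_delimiters(question_1), has_even_delimiters(question_1))
```

For Question 1 the count is 6, so processing would proceed directly to the head and tail steps.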
S2: if the number is even, determining the position of the head of the formula fragment according to the type of the character before the first formula delimiter of each formula fragment.
If the number of formula delimiters is even, the position of the head of the formula fragment can be determined according to the type of the character before the first formula delimiter of each formula fragment.
S3: determining the position of the tail of the formula fragment according to the type of the character after the last formula delimiter of each formula fragment.
If the number of formula delimiters is even, the position of the tail of the formula fragment can be determined according to the type of the character after the last formula delimiter of each formula fragment.
S4: deleting redundant formula delimiters to obtain a complete LaTeX-format formula.
The redundant formula delimiters among the formula fragments are deleted, joining the formulas on both sides into a complete LaTeX-format formula.
Therefore, the present application allows LaTeX-format formula fragments to be merged into LaTeX-format formulas fully automatically, saving the labor cost of image recognition and improving recognition efficiency.
In another specific embodiment of the present application, referring to Fig. 2, the method further includes:
S5: if the number is odd, searching before each formula delimiter for a character that is not included in a formula fragment, or for a formula delimiter, and inserting one formula delimiter after that character or formula delimiter.
If the number of formula delimiters is odd, then for each formula delimiter $$ the search proceeds forward from that delimiter to the first character that is not in a formula, or to the first formula delimiter $$, and a formula delimiter $$ is inserted after it.
Take the following Question 2 as an example:
Calculate:|-$$\frac{1}{2}$$|+$$ (Question 2)
For the first formula delimiter $$, the first character before it that is not in a formula is the colon ":", so after the first round of processing the question becomes:
Calculate:$$|-$$\frac{1}{2}$$|+$$
In the same way, after the whole formula has been processed it becomes:
Calculate:$$|-$$$$\frac{1}{2}$$$$|+$$
After this processing the number of formula delimiters $$ in the question has become even, and step S2 is then performed.
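The odd-count repair of step S5 can be sketched as below. The sketch rests on several assumptions that are not stated in the patent: the delimiter is "$$", a character counts as "not in a formula" when it is a CJK ideograph or one of a small hand-picked set of punctuation marks, and a delimiter with nothing suitable before it gets its partner at the very start of the text. The names is_non_formula_char and balance_delimiters are hypothetical. Because the assumed classifier treats ASCII letters as formula characters, the usage line writes the question with the Chinese word 计算 ("calculate"), which is itself an assumption about the untranslated question text.

```python
DELIM = "$$"  # assumed formula delimiter
PUNCTUATION = set(",.;:!?，。；：！？、")  # assumed punctuation set


def is_non_formula_char(ch: str) -> bool:
    """Assumed classifier: CJK ideographs and punctuation are outside formulas."""
    return "\u4e00" <= ch <= "\u9fff" or ch in PUNCTUATION


def balance_delimiters(text: str) -> str:
    """Give every delimiter a partner when the delimiter count is odd."""
    segments = text.split(DELIM)  # segments[i] precedes the i-th delimiter
    if (len(segments) - 1) % 2 == 0:
        return text  # already even, nothing to repair
    out = []
    for seg in segments[:-1]:
        cut = 0  # default: right after the previous delimiter (or text start)
        for i in range(len(seg) - 1, -1, -1):  # scan toward the start
            if is_non_formula_char(seg[i]):
                cut = i + 1  # insert just after the non-formula character
                break
        out.append(seg[:cut] + DELIM + seg[cut:] + DELIM)
    out.append(segments[-1])
    return "".join(out)


print(balance_delimiters(r"计算:|-$$\frac{1}{2}$$|+$$"))
```

Each original delimiter receives exactly one inserted partner, so the total count doubles and is therefore even, which matches the Question 2 walkthrough.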
In another specific embodiment of the present application, referring to Fig. 3, step S2 includes:
S21: detecting the type of the first character before the first formula delimiter of each formula fragment.
S22: if the first character is any one of a Chinese character, a formula delimiter or a punctuation mark, ending the forward search (that is, the search toward the start of the text) and determining that the character after the formula delimiter is the position of the head of the formula fragment.
Specifically, if the first character is any one of a Chinese character, a formula delimiter or a punctuation mark, the first character is not part of the formula fragment. For example, in "(1),$$a+b$$" the first character before the first formula delimiter $$ is ",", so the character "a" after the formula delimiter $$ is determined to be the position of the head of the formula fragment.
S23: if the first character is a digit, a letter or a mathematical symbol, exchanging the positions of the formula delimiter and the first character, and continuing to detect forward to determine the position of the head of the formula fragment.
Specifically, if the first character is a digit, a letter or a mathematical symbol, the first character is part of the formula fragment. For example, in "6+$$5+9$$" the first character before the first formula delimiter $$ is "+", so the positions of the formula delimiter and the first character are exchanged, giving "6$$+5+9$$". Detection then continues forward, and exchanging the formula delimiter with the next first character, "6", gives "$$6+5+9$$".
S24: if the first character is a right parenthesis, determining the position of the head of the formula fragment according to whether a left parenthesis is found by the forward search.
Specifically, referring to Fig. 4, step S24 includes:
S241: if the first character is a right parenthesis, judging whether the forward search finds a left parenthesis.
S242: if no left parenthesis is found, ending the forward search and determining that the character after the formula delimiter is the position of the head of the formula fragment.
S243: if a left parenthesis is found, and the characters between the right parenthesis and the left parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, inserting the formula delimiter before the left parenthesis.
In the embodiments of the present application, when the first character is a right parenthesis, a forward search is needed to decide whether the formula delimiter has to be moved forward to before the left parenthesis that precedes the right parenthesis; that is, the position of the head of the formula fragment is determined according to whether a left parenthesis can be found.
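Steps S21 to S24 can be sketched as a single loop that moves an opening delimiter toward the start of the text. This is an illustrative sketch under stated assumptions, not the patented implementation: the delimiter is "$$", the set of mathematical symbols is a small hand-picked list, any character that is neither a formula character nor a right parenthesis ends the search, and the search continues after the delimiter has been moved before a left parenthesis (the text does not say whether it should). The names is_formula_char and extend_head are hypothetical.

```python
DELIM = "$$"  # assumed formula delimiter
MATH_SYMBOLS = set("+-*/=<>|^_\\{}")  # assumed set of mathematical symbols


def is_formula_char(ch: str) -> bool:
    """Assumed classifier: ASCII letters, digits and listed math symbols."""
    return (ch.isascii() and ch.isalnum()) or ch in MATH_SYMBOLS


def extend_head(text: str, pos: int) -> str:
    """Move the opening delimiter found at index pos toward the text start."""
    before = text[:pos]
    rest = text[pos + len(DELIM):]
    i = len(before)  # the delimiter will be re-inserted at before[i]
    while i > 0:
        ch = before[i - 1]
        if is_formula_char(ch):  # S23: swap delimiter and character
            i -= 1
        elif ch == ")":  # S24: look for a matching left parenthesis
            left = before.rfind("(", 0, i - 1)
            inner = before[left + 1:i - 1]
            if left == -1 or not all(is_formula_char(c) for c in inner):
                break  # S242: nothing usable found, stop here
            i = left  # S243: delimiter goes before the left parenthesis
        else:  # S22: Chinese character, delimiter or punctuation
            break
    return before[:i] + DELIM + before[i:] + rest


print(extend_head("6+$$5+9$$", 2))
```

Repeated swapping is modeled by decrementing an index rather than rebuilding the string at every step; the final string is the same as after the character-by-character exchanges described in S23.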
In another specific embodiment of the application, referring to Fig. 5, step S3 includes:
S31: detect the type of the second character, that is, the character after the last formula space character of each formula fragment.
S32: if the second character is any of a Chinese character, a formula space character, or a punctuation mark, end the backward search and determine that the character before the formula space character is the position of the tail of the formula fragment.
Specifically, if the second character is any of a Chinese character, a formula space character, or a punctuation mark, the second character is not part of the formula fragment. For example, in "$$a+b$$," the second character after the last formula space character $$ is ",", so the character "b" before that formula space character $$ is determined to be the position of the tail of the formula fragment.
S33: if the second character is a letter, a digit, or a mathematical symbol, exchange the positions of the formula space character and the second character, and continue detecting backward to determine the position of the tail of the formula fragment.
Specifically, if the second character is a digit, a letter, or a mathematical symbol, the second character is part of the formula fragment. For example, in "$$5+9$$-2" the second character after the last formula space character $$ is "-", so exchanging the positions of the formula space character and the second character yields "$$5+9-$$2". Detection then continues backward to determine the position of the tail of the formula fragment, and exchanging the positions of the formula space character and the second character again yields "$$5+9-2$$".
S34: if the second character is a left parenthesis, determine the position of the tail of the formula fragment according to whether a backward search finds a right parenthesis.
Specifically, referring to Fig. 6, step S34 includes:
S341: if the second character is a left parenthesis, judge whether a backward search finds a right parenthesis.
S342: if no right parenthesis is found, end the backward search and determine that the character after the formula space character is the position of the tail of the formula fragment.
S343: if a right parenthesis is found, and the characters between the left parenthesis and the right parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, insert the formula space character after the right parenthesis.
In this embodiment of the application, when the second character is a left parenthesis, a backward search is needed to decide whether the formula space character should be moved backward to the position after the right parenthesis that follows the left parenthesis. That is, the position of the tail of the formula fragment is determined according to whether a right parenthesis can be found.
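Steps S31 to S34 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names extend_tail, MATH_SYMBOLS and is_formula_char are hypothetical, "backward" is read as toward the end of the text, only ASCII letters and digits count as formula characters, and when the parenthesis search fails the formula space character is simply left where it is.

```python
DELIM = "$$"
# Assumed set of "mathematical symbols"; the source does not enumerate them.
MATH_SYMBOLS = set("+-*/=^_{}\\|<>.")


def is_formula_char(ch: str) -> bool:
    """True for ASCII letters/digits and the assumed mathematical symbols."""
    return (ch.isascii() and ch.isalnum()) or ch in MATH_SYMBOLS


def extend_tail(text: str, pos: int) -> tuple[str, int]:
    """Move the closing delimiter at index pos toward the end of the text.

    Returns the rewritten text and the new index of the delimiter.
    """
    body = text[:pos]                # everything up to the closing delimiter
    rest = text[pos + len(DELIM):]   # everything after it
    i = 0
    while i < len(rest):
        if rest.startswith(DELIM, i):
            break                    # S32: another delimiter ends the search
        ch = rest[i]
        if is_formula_char(ch):
            i += 1                   # S33: swap delimiter and character
        elif ch == "(":
            close = rest.find(")", i + 1)
            inner = rest[i + 1:close]
            if close == -1 or not all(is_formula_char(c) for c in inner):
                break                # S342: no usable right parenthesis
            i = close + 1            # S343: delimiter goes after ")"
        else:
            break                    # S32: Chinese character or punctuation
    return body + rest[:i] + DELIM + rest[i:], pos + i
```

For instance, extend_tail("$$5+9$$-2", 5) walks over "-" and "2" and returns the text "$$5+9-2$$" with the delimiter at index 7, matching the example in the text. Nested parentheses are deliberately not handled in this sketch.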
In another specific embodiment of the application, the redundant formula space characters are specifically: two consecutive formula space characters.
After processing according to steps S1 to S3, two formula space characters $$ may end up adjacent to each other, i.e. "$$$$". This indicates that the text both before and after the "$$$$" is genuine formula content, so the "$$$$" can be deleted directly, joining the formulas on both sides.
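The deletion of adjacent formula space characters amounts to a plain string replacement. The sketch below assumes the delimiter is the literal "$$" and that heads and tails have already been adjusted; the function name remove_redundant_delimiters is hypothetical.

```python
DELIM = "$$"


def remove_redundant_delimiters(text: str) -> str:
    """Delete every pair of adjacent delimiters ("$$$$"), joining the
    formula fragments on either side into one formula."""
    return text.replace(DELIM + DELIM, "")
```

Because str.replace scans left to right without overlapping matches, a lone "$$" at the start or end of a formula is left untouched.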
Referring to Fig. 7, corresponding to the above method, another embodiment of the application provides a LaTeX-format formula processing device in text recognition, including:
A quantity judgment module 71, configured to obtain the number of formula space characters in a formula in text recognition and judge whether that number is even.
A head determining module 72, configured to, if the number is even, determine the position of the head of each formula fragment according to the type of the character before the first formula space character of that formula fragment.
A tail determining module 73, configured to determine the position of the tail of each formula fragment according to the type of the character after the last formula space character of that formula fragment.
A deletion module 74, configured to delete redundant formula space characters to obtain the complete LaTeX-format formula.
In a LaTeX-format formula, all mathematical content should be placed between formula space characters, but in LaTeX-format formulas crawled from websites, one complete formula is usually split into multiple formula fragments.
For example, the crawled LaTeX-format formula "Topic 1: |-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$" contains the formula fragments $$\frac{1}{2}$$, $$\sqrt{12}$$ and $$^{-1}$$, whereas the complete formula should be $$|-\frac{1}{2}|+\sqrt{12}-2^{-1}$$.
Because the crawled LaTeX-format formula consists of fragments of a formula, a topic that was originally one formula is displayed as three formulas. For example, "Topic 1: |-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$" contains 6 formula space characters $$. These 6 formula space characters form pairs, and the part between each pair is a formula fragment in LaTeX format. Whether the number of formula space characters is even is then judged; here the 6 formula space characters delimit 3 formula fragments in "|-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$".
If the number of formula space characters is even, the position of the head of each formula fragment can be determined according to the type of the character before the first formula space character of that formula fragment.
If the number of formula space characters is even, the position of the tail of each formula fragment can be determined according to the type of the character after the last formula space character of that formula fragment.
The redundant formula space characters in the formula fragments are deleted, joining the formulas on both sides, to obtain the complete LaTeX-format formula.
Therefore, the application can automatically merge LaTeX-format formula fragments into complete LaTeX-format formulas, which saves the labor cost of image recognition and improves recognition efficiency.
In another specific embodiment of the application, referring to Fig. 8, the device further includes:
A symbol insertion module 75, configured to, if the number is odd, search before each formula space character for a character not included in a formula fragment, or for a formula space character, and insert one formula space character after that character or formula space character.
If the number of formula space characters is odd, then for each formula space character $$, search forward from it for the first character that is not in a formula, or the first formula space character $$, and insert a formula space character $$ after that character.
Take topic 2 below as an example:
Calculate:|-$$\frac{1}{2}$$|+$$ (topic 2)
For the first formula space character $$, the first character before it that is not in a formula is the colon ":", so after the first round of processing the topic becomes:
Calculate:$$|-$$\frac{1}{2}$$|+$$
In the same way, after the entire formula has been processed it becomes:
Calculate:$$|-$$$$\frac{1}{2}$$$$|+$$
After this processing, the number of formula space characters $$ in the topic has become even, and the head determining module 72 is then executed.
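The odd-count handling described above can be sketched as follows. This is an illustration under assumptions rather than the patented implementation: the names balance_delimiters, MATH_SYMBOLS and is_formula_char are hypothetical, "forward" is read as toward the start of the text, and only ASCII letters, digits and a small symbol set count as formula content.

```python
DELIM = "$$"
# Assumed set of "mathematical symbols"; the source does not enumerate them.
MATH_SYMBOLS = set("+-*/=^_{}\\|<>.")


def is_formula_char(ch: str) -> bool:
    """True for ASCII letters/digits and the assumed mathematical symbols."""
    return (ch.isascii() and ch.isalnum()) or ch in MATH_SYMBOLS


def balance_delimiters(text: str) -> str:
    """If the number of delimiters is odd, insert one extra delimiter in
    front of each existing one, right after the nearest preceding
    non-formula character or delimiter."""
    parts = text.split(DELIM)
    n_delims = len(parts) - 1
    if n_delims % 2 == 0:
        return text                  # already even: nothing to insert
    for k in range(n_delims):
        segment = parts[k]           # the text just before delimiter k
        cut = len(segment)
        # Walk toward the start of the segment over formula characters.
        while cut > 0 and is_formula_char(segment[cut - 1]):
            cut -= 1
        # cut == 0 means we reached the previous delimiter (or text start).
        parts[k] = segment[:cut] + DELIM + segment[cut:]
    return DELIM.join(parts)
```

Applied to topic 2, "Calculate:|-$$\frac{1}{2}$$|+$$", the three delimiters become six, giving the string shown in the text. Since the number of inserted delimiters equals the original odd count, the result is always even.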
In another specific embodiment of the application, referring to Fig. 9, the head determining module 72 includes:
A first character judging unit 721, configured to detect the type of the first character, that is, the character before the first formula space character of each formula fragment.
A first search ending unit 722, configured to, if the first character is any of a Chinese character, a formula space character, or a punctuation mark, end the forward search and determine that the character after the formula space character is the position of the head of the formula fragment.
A first character exchange unit 723, configured to, if the first character is a digit, a letter, or a mathematical symbol, exchange the positions of the formula space character and the first character, and continue detecting forward to determine the position of the head of the formula fragment.
A left parenthesis searching unit 724, configured to, if the first character is a right parenthesis, determine the position of the head of the formula fragment according to whether a forward search finds a left parenthesis.
Specifically, if the first character is any of a Chinese character, a formula space character, or a punctuation mark, the first character is not part of the formula fragment. For example, in "(1),$$a+b$$" the first character before the first formula space character $$ is ",", so the character "a" after that formula space character $$ is determined to be the position of the head of the formula fragment.
Specifically, if the first character is a digit, a letter, or a mathematical symbol, the first character is part of the formula fragment. For example, in "6+$$5+9$$" the first character before the first formula space character $$ is "+", so exchanging the positions of the formula space character and the first character yields "6$$+5+9$$". Detection then continues forward to determine the position of the head of the formula fragment, and exchanging the positions of the formula space character and the first character again yields "$$6+5+9$$".
Specifically, referring to Fig. 10, the left parenthesis searching unit 724 includes:
A first judging subunit 724a, configured to, if the first character is a right parenthesis, judge whether a forward search finds a left parenthesis.
A first obtaining subunit 724b, configured to, if no left parenthesis is found, end the forward search and determine that the character after the formula space character is the position of the head of the formula fragment.
A first non-obtaining subunit 724c, configured to, if a left parenthesis is found, and the characters between the left parenthesis and the right parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, insert the formula space character before the left parenthesis.
In this embodiment of the application, when the first character is a right parenthesis, a forward search is needed to decide whether the formula space character should be moved forward to the position before the left parenthesis that precedes the right parenthesis. That is, the position of the head of the formula fragment is determined according to whether a left parenthesis can be found.
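The behaviour of units 721 to 724 can be sketched as a single function that moves an opening formula space character toward the start of the text. The sketch is illustrative only: extend_head, MATH_SYMBOLS and is_formula_char are hypothetical names, the symbol set is an assumption, and nested parentheses are not handled.

```python
DELIM = "$$"
# Assumed set of "mathematical symbols"; the source does not enumerate them.
MATH_SYMBOLS = set("+-*/=^_{}\\|<>.")


def is_formula_char(ch: str) -> bool:
    """True for ASCII letters/digits and the assumed mathematical symbols."""
    return (ch.isascii() and ch.isalnum()) or ch in MATH_SYMBOLS


def extend_head(text: str, pos: int) -> tuple[str, int]:
    """Move the opening delimiter at index pos toward the start of the text.

    Returns the rewritten text and the new index of the delimiter.
    """
    before = text[:pos]               # everything before the opening delimiter
    after = text[pos + len(DELIM):]   # the fragment body and the rest
    i = len(before)
    while i > 0:
        ch = before[i - 1]
        if is_formula_char(ch):
            i -= 1                    # unit 723: swap delimiter and character
        elif ch == ")":
            open_idx = before.rfind("(", 0, i - 1)
            inner = before[open_idx + 1:i - 1]
            if open_idx == -1 or not all(is_formula_char(c) for c in inner):
                break                 # 724b: no usable left parenthesis
            i = open_idx              # 724c: delimiter goes before "("
        else:
            break                     # unit 722: Chinese, punctuation or "$"
    return before[:i] + DELIM + before[i:] + after, i
```

With the example from the text, extend_head("6+$$5+9$$", 2) yields "$$6+5+9$$" with the delimiter at index 0, while "(1),$$a+b$$" is returned unchanged because the character before the delimiter is a punctuation mark.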
In another specific embodiment of the application, referring to Fig. 11, the tail determining module 73 includes:
A second character judging unit 731, configured to detect the type of the second character, that is, the character after the last formula space character of each formula fragment.
A second search ending unit 732, configured to, if the second character is any of a Chinese character, a formula space character, or a punctuation mark, end the backward search and determine that the character before the formula space character is the position of the tail of the formula fragment.
A second character exchange unit 733, configured to, if the second character is a letter, a digit, or a mathematical symbol, exchange the positions of the formula space character and the second character, and continue detecting backward to determine the position of the tail of the formula fragment.
A right parenthesis searching unit 734, configured to, if the second character is a left parenthesis, determine the position of the tail of the formula fragment according to whether a backward search finds a right parenthesis.
Specifically, if the second character is any of a Chinese character, a formula space character, or a punctuation mark, the second character is not part of the formula fragment. For example, in "$$a+b$$," the second character after the last formula space character $$ is ",", so the character "b" before that formula space character $$ is determined to be the position of the tail of the formula fragment.
Specifically, if the second character is a digit, a letter, or a mathematical symbol, the second character is part of the formula fragment. For example, in "$$5+9$$-2" the second character after the last formula space character $$ is "-", so exchanging the positions of the formula space character and the second character yields "$$5+9-$$2". Detection then continues backward to determine the position of the tail of the formula fragment, and exchanging the positions of the formula space character and the second character again yields "$$5+9-2$$".
Specifically, referring to Fig. 12, the right parenthesis searching unit 734 includes:
A second judging subunit 734a, configured to, if the second character is a left parenthesis, judge whether a backward search finds a right parenthesis.
A second obtaining subunit 734b, configured to, if no right parenthesis is found, end the backward search and determine that the character after the formula space character is the position of the tail of the formula fragment.
A second non-obtaining subunit 734c, configured to, if a right parenthesis is found, and the characters between the left parenthesis and the right parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, insert the formula space character after the right parenthesis.
In this embodiment of the application, when the second character is a left parenthesis, a backward search is needed to decide whether the formula space character should be moved backward to the position after the right parenthesis that follows the left parenthesis. That is, the position of the tail of the formula fragment is determined according to whether a right parenthesis can be found.
In another specific embodiment of the application, the redundant formula space characters are specifically: two consecutive formula space characters.
After processing by the above modules, two formula space characters $$ may end up adjacent to each other, i.e. "$$$$". This indicates that the text both before and after the "$$$$" is genuine formula content, so the "$$$$" can be deleted directly, joining the formulas on both sides.
Fig. 13 is a schematic diagram of the hardware structure of an electronic device for the LaTeX-format formula processing method in text recognition of the application. As shown in Fig. 13, the device includes:
One or more processors 1310 and a memory 1320; one processor 1310 is taken as an example in Fig. 13.
The device for the LaTeX-format formula processing method in text recognition may further include: an input device 1330 and an output device 1330.
The processor 1310, the memory 1320, the input device 1330 and the output device 1330 may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 13.
As a non-volatile computer-readable storage medium, the memory 1320 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the LaTeX-format formula processing method in text recognition in the embodiments of the application (for example, the list setup module 131 and the poster insertion module 132 shown in Fig. 13). By running the non-volatile software programs, instructions and modules stored in the memory 1320, the processor 1310 executes the various functional applications and data processing of the server, thereby implementing the LaTeX-format formula processing method in text recognition of the above method embodiments.
The memory 1320 may include a program storage area and a data storage area, where the program storage area can store an operating system and the application program required by at least one function, and the data storage area can store data created through use of the LaTeX-format formula processing device in text recognition, and so on. In addition, the memory 1320 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some embodiments, the memory 1320 optionally includes memory located remotely from the processor 1310, and such remote memory can be connected to the audio mode selector through a network. Examples of such networks include but are not limited to the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 1330 can receive input digits or character information and generate key signal inputs related to user settings and function control of the LaTeX-format formula processing device in text recognition. The output device 1330 may include devices such as a loudspeaker.
The one or more modules are stored in the memory 1320 and, when executed by the one or more processors 1310, perform the LaTeX-format formula processing method in text recognition of any of the above method embodiments.
The said goods can perform the method that the embodiment of the present application is provided, and has the corresponding function module of execution method and has Beneficial effect.The not technical detail of detailed description in the present embodiment, reference can be made to the method that the embodiment of the present application is provided.
The electronic equipment of the embodiment of the present application exists in a variety of forms, including but not limited to:
(1) mobile communication equipment:The characteristics of this kind equipment is that have mobile communication function, and to provide speech, data It communicates as main target.This Terminal Type includes:Smart mobile phone (such as iPhone), multimedia handset, functional mobile phone and low Hold mobile phone etc..
(2) super mobile personal computer equipment:This kind equipment belongs to the scope of personal computer, there is calculating and processing work( Can, generally also have mobile Internet access characteristic.This Terminal Type includes:PDA, MID and UMPC equipment etc., such as iPad.
(3) portable entertainment device:This kind equipment can show and play multimedia content.The kind equipment includes:Audio, Video player (such as iPod), handheld device, e-book and intelligent toy and portable car-mounted navigation equipment.
(4) server:The equipment for providing the service of calculating, the composition of server are total including processor, hard disk, memory, system Line etc., server is similar with general computer architecture, but due to needing to provide highly reliable service, in processing energy Power, stability, reliability, safety, scalability, manageability etc. are more demanding.
(5) Other electronic devices with data interaction functions.
The implementation of the application is further illustrated below through a specific application scenario.
Referring to Fig. 14, the method includes:
1401: receive the topic on which text recognition is to be performed.
1402: judge whether the topic contains a formula space character.
1403: if it does not contain the formula space character $$, no processing is needed.
1404: if it contains the formula space character $$, count the number N of formula space characters.
1405: judge whether the number N of formula space characters is even.
1406: if it is not even, find the position of the first character not in a formula before each formula space character $$, insert one formula space character $$ after it, and perform step 1407.
1407: if it is even, group the formula space characters in pairs to obtain the formula fragments in the formula.
1408: judge whether all formula fragments have been processed.
1409: if all formula fragments have been processed, find and delete the repeated formula space characters.
1410: if not all formula fragments have been processed, judge whether the formula space character in each formula fragment is at the beginning of the formula fragment in which it is located.
1411: if it is at the beginning of its formula fragment, judge whether the character before the formula space character $$ is content that belongs in a formula; if not, return to step 1408.
1412: if the character before the formula space character $$ is content that belongs in a formula, exchange the positions of the formula space character $$ and the preceding character, and return to step 1411.
1413: if it is not at the beginning of its formula fragment, judge whether the character after the formula space character $$ is content that belongs in a formula; if not, return to step 1408.
1414: if the character after the formula space character $$ is content that belongs in a formula, exchange the positions of the formula space character $$ and the following character, and return to step 1413.
For example, topic 1 "Calculate:|-$$\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$;" becomes, after a series of processing steps, successively:
Calculate:$$|-\frac{1}{2}$$|+$$\sqrt{12}$$-2$$^{-1}$$;
Calculate:$$|-\frac{1}{2}$$$$|+\sqrt{12}$$-2$$^{-1}$$;
Calculate:$$|-\frac{1}{2}$$$$|+\sqrt{12}$$$$-2^{-1}$$;
Calculate:$$|-\frac{1}{2}|+\sqrt{12}-2^{-1}$$;
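The overall flow for an even number of formula space characters can be sketched end to end as below. This is a simplified illustration, not the applicant's code: merge_fragments, MATH_SYMBOLS and is_formula_char are hypothetical names, the symbol set is an assumption, parenthesis handling is omitted for brevity, and all heads are processed before all tails, which matches the order of the intermediate results listed for topic 1.

```python
DELIM = "$$"
# Assumed set of "mathematical symbols"; the source does not enumerate them.
MATH_SYMBOLS = set("+-*/=^_{}\\|<>.")


def is_formula_char(ch: str) -> bool:
    """True for ASCII letters/digits and the assumed mathematical symbols."""
    return (ch.isascii() and ch.isalnum()) or ch in MATH_SYMBOLS


def merge_fragments(text: str) -> str:
    """Merge LaTeX formula fragments delimited by "$$" into whole formulas."""
    parts = text.split(DELIM)
    if len(parts) % 2 == 0:
        raise ValueError("odd number of delimiters; balance them first")
    # parts[0], parts[2], ... lie outside formulas; parts[1], parts[3], ...
    # are formula fragments.
    for k in range(1, len(parts), 2):          # head pass
        outside = parts[k - 1]
        cut = len(outside)
        while cut > 0 and is_formula_char(outside[cut - 1]):
            cut -= 1
        parts[k - 1], parts[k] = outside[:cut], outside[cut:] + parts[k]
    for k in range(1, len(parts), 2):          # tail pass
        outside = parts[k + 1]
        cut = 0
        while cut < len(outside) and is_formula_char(outside[cut]):
            cut += 1
        parts[k], parts[k + 1] = parts[k] + outside[:cut], outside[cut:]
    # Fragments that now touch are separated by "$$$$"; delete those pairs.
    return DELIM.join(parts).replace(DELIM + DELIM, "")
```

Run on topic 1, the head pass absorbs "|-", "|+" and "-2" into the fragments that follow them, and the final replacement removes the two "$$$$" pairs, giving the last line of the example above.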
For example, topic 2 "$$\frac{1}{4}$$a$$^{2}$$-9(b-c)$$^{2}$$ has one factor $$\frac{1}{2}$$a-3b+3c, and the other factor is ( )" becomes, after a series of processing steps, successively:
$$\frac{1}{4}$$a$$^{2}$$-9(b-c)$$^{2}$$ has one factor $$\frac{1}{2}$$a-3b+3c, and the other factor is ( )
$$\frac{1}{4}$$$$a^{2}$$-9(b-c)$$^{2}$$ has one factor $$\frac{1}{2}$$a-3b+3c, and the other factor is ( )
$$\frac{1}{4}$$$$a^{2}$$$$-9(b-c)^{2}$$ has one factor $$\frac{1}{2}$$a-3b+3c, and the other factor is ( )
$$\frac{1}{4}$$$$a^{2}$$$$-9(b-c)^{2}$$ has one factor $$\frac{1}{2}$$a-3b+3c, and the other factor is ( )
$$\frac{1}{4}$$$$a^{2}$$$$-9(b-c)^{2}$$ has one factor $$\frac{1}{2}$$a-3b+3c, and the other factor is ( )
$$\frac{1}{4}$$$$a^{2}$$$$-9(b-c)^{2}$$ has one factor $$\frac{1}{2}a-3b+3c$$, and the other factor is ( )
$$\frac{1}{4}a^{2}-9(b-c)^{2}$$ has one factor $$\frac{1}{2}a-3b+3c$$, and the other factor is ( )
The apparatus embodiments described above are merely exemplary, wherein the module illustrated as separating component can To be or may not be physically separate, the component shown as module may or may not be physics mould Block, you can be located at a place or can also be distributed on multiple network modules.It can be selected according to the actual needs In some or all of module realize the purpose of this embodiment scheme.Those of ordinary skill in the art are not paying creativeness Labour in the case of, you can to understand and implement.
Those skilled in the art will understand that embodiments of the application may be provided as a method, an apparatus (device), or a computer program product. Therefore, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical memory, etc.) that contain computer-usable program code.
The application is with reference to the method, apparatus (equipment) of embodiment and the flow chart and/or box of computer program product Figure describes.It should be understood that each flow and/or the side in flowchart and/or the block diagram can be realized by computer program instructions The combination of flow and/or box in frame and flowchart and/or the block diagram.These computer program instructions can be provided to logical With the processor of computer, special purpose computer, Embedded Processor or other programmable data processing devices to generate a machine Device so that the instruction generation performed by computer or the processor of other programmable data processing devices is used to implement in flow The device of function specified in one flow of figure or multiple flows and/or one box of block diagram or multiple boxes.
These computer program instructions, which may also be stored in, can guide computer or other programmable data processing devices with spy Determine in the computer-readable memory that mode works so that the instruction generation being stored in the computer-readable memory includes referring to Enable the manufacture of device, the command device realize in one flow of flow chart or multiple flows and/or one box of block diagram or The function of being specified in multiple boxes.
These computer program instructions can be also loaded into computer or other programmable data processing devices so that counted Series of operation steps are performed on calculation machine or other programmable devices to generate computer implemented processing, so as in computer or The instruction offer performed on other programmable devices is used to implement in one flow of flow chart or multiple flows and/or block diagram one The step of function of being specified in a box or multiple boxes.
Although the preferred embodiment of the application has been described, those skilled in the art once know basic creation Property concept, then additional changes and modifications may be made to these embodiments.So appended claims be intended to be construed to include it is excellent It selects embodiment and falls into all change and modification of the application range.Obviously, those skilled in the art can be to the application Various modification and variations are carried out without departing from spirit and scope.If in this way, these modifications and variations of the application Belong within the scope of the application claim and its equivalent technologies, then the application is also intended to exist comprising these modification and variations It is interior.

Claims (14)

1. A LaTeX-format formula processing method in text recognition, characterized by comprising:
Obtaining the number of formula space characters in a formula in text recognition, and judging whether the number of formula space characters is even;
If it is even, determining the position of the head of each formula fragment according to the type of the character before the first formula space character of that formula fragment;
Determining the position of the tail of each formula fragment according to the type of the character after the last formula space character of that formula fragment;
Deleting redundant formula space characters to obtain the complete LaTeX-format formula.
2. The method as described in claim 1, characterized in that the method further comprises:
If it is odd, searching before each formula space character for a character not included in a formula fragment, or for a formula space character, and inserting one formula space character after that character or formula space character.
3. The method as described in claim 1, characterized in that, if it is even, determining the position of the head of each formula fragment according to the type of the character before the first formula space character of that formula fragment comprises:
Detecting the type of the first character before the first formula space character of each formula fragment;
If the first character is any of a Chinese character, a formula space character, or a punctuation mark, ending the forward search and determining that the character after the formula space character is the position of the head of the formula fragment;
If the first character is a digit, a letter, or a mathematical symbol, exchanging the positions of the formula space character and the first character, and continuing to detect forward to determine the position of the head of the formula fragment;
If the first character is a right parenthesis, determining the position of the head of the formula fragment according to whether a forward search finds a left parenthesis.
4. The method as claimed in claim 3, characterized in that, if the first character is a right parenthesis, determining the position of the head of the formula fragment according to whether a forward search finds a left parenthesis comprises:
If the first character is a right parenthesis, judging whether a forward search finds a left parenthesis;
If no left parenthesis is found, ending the forward search and determining that the character after the formula space character is the position of the head of the formula fragment;
If a left parenthesis is found, and the characters between the left parenthesis and the right parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, inserting the formula space character before the left parenthesis.
5. The method as described in claim 1, characterized in that determining the position of the tail of each formula fragment according to the type of the character after the last formula space character of that formula fragment comprises:
Detecting the type of the second character after the last formula space character of each formula fragment;
If the second character is any of a Chinese character, a formula space character, or a punctuation mark, ending the backward search and determining that the character before the formula space character is the position of the tail of the formula fragment;
If the second character is a letter, a digit, or a mathematical symbol, exchanging the positions of the formula space character and the second character, and continuing to detect backward to determine the position of the tail of the formula fragment;
If the second character is a left parenthesis, determining the position of the tail of the formula fragment according to whether a backward search finds a right parenthesis.
6. The method as claimed in claim 5, characterized in that, if the second character is a left parenthesis, determining the position of the tail of the formula fragment according to whether a backward search finds a right parenthesis comprises:
If the second character is a left parenthesis, judging whether a backward search finds a right parenthesis;
If no right parenthesis is found, ending the backward search and determining that the character after the formula space character is the position of the tail of the formula fragment;
If a right parenthesis is found, and the characters between the left parenthesis and the right parenthesis are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, inserting the formula space character after the right parenthesis.
7. The method as claimed in claim 6, characterized in that the redundant formula space characters are specifically: two consecutive formula space characters.
8. a kind of La Taihe form formula manipulation devices in text identification, which is characterized in that including:
Quantity judgment module for obtaining the formula space character quantity of formula in text identification, and judges the formula interval Whether symbol quantity is even number;
Head determining module, for being such as even number, according to the character type before the first formula space character of each formula fragment Type determines the position on the head of formula fragment;
Tail portion determining module determines public affairs for the character types after the last formula space character according to each formula fragment The position of the tail portion of formula fragment;
Puncture module for deleting extra formula space character, obtains complete La Taihe forms formula.
9. device as claimed in claim 8, which is characterized in that described device further includes:
Symbol is inserted into module, for being such as odd number, the word being not included in before searching each formula space character in formula fragment It accords with either formula space character and a formula space character is inserted into after the character or formula space character.
10. device as claimed in claim 9, which is characterized in that the head determining module includes:
First character judging unit, for detecting the first character before the first formula space character of each formula fragment Type;
First searches end unit, if being Chinese for first character, any in formula space character, punctuation mark Kind, then terminate Look-ahead, determine position of the character for the head of formula fragment after the formula space character;
First character crosspoint if being number for first character, alphabetical or mathematic sign, exchanges the public affairs The position of formula space character and first character, and continue to detect the position for determining formula fragment head forward;
Whether left bracket searching unit if being right parenthesis for first character, left bracket is obtained according to Look-ahead, Determine the position on the head of the formula fragment.
11. device as claimed in claim 10, which is characterized in that the left bracket searching unit includes:
First judgment sub-unit if being right parenthesis for first character, judges whether Look-ahead obtains left bracket;
First obtains subelement, if if not obtaining left bracket for searching, terminates Look-ahead, determines between the formula Position of the character for the head of formula fragment after symbol;
First does not obtain subelement, if obtaining left bracket for searching, and the character between the right parenthesis and left bracket is Letter and/or mathematic sign and letter and/or mathematic sign and number, are inserted into the left side by the formula space character and include Before number.
12. The device as claimed in claim 11, characterized in that the tail determining module includes:
A second character judging unit, configured to detect the type of the second character after the last formula space character of each formula fragment;
A second search ending unit, configured to, if the second character is any one of a Chinese character, a formula space character, or a punctuation mark, terminate the backward search and determine that the character before the formula space character is the position of the tail of the formula fragment;
A second character exchanging unit, configured to, if the second character is a letter, a digit, or a mathematical symbol, exchange the positions of the formula space character and the second character, and continue detecting backward to determine the position of the tail of the formula fragment;
A right bracket searching unit, configured to, if the second character is a left bracket, determine the position of the tail of the formula fragment according to whether a right bracket is found by searching backward.
13. The device as claimed in claim 12, characterized in that the right bracket searching unit includes:
A second judgment subunit, configured to, if the second character is a left bracket, judge whether a right bracket is found by searching backward;
A second non-obtaining subunit, configured to, if no right bracket is found by the search, terminate the backward search and determine that the character after the formula space character is the position of the tail of the formula fragment;
A second obtaining subunit, configured to, if a right bracket is found by the search and the characters between the right bracket and the left bracket are letters and/or mathematical symbols, or letters and/or mathematical symbols together with digits, insert the formula space character after the right bracket.
14. The device as claimed in claim 13, characterized in that the redundant formula space characters are specifically: two consecutive formula space characters.
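The head and tail determination recited in claims 10 to 13 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the formula space character is `$`, that brackets are the ASCII characters `(` and `)`, and that the mathematical symbols are the small set in `MATH_SYMBOLS`; the function names `move_head`, `move_tail`, and `drop_redundant` are hypothetical. Continuing the scan after a bracket group has been absorbed, and stopping when the bracketed content does not qualify, are also assumptions, since the claims do not state either case.

```python
DELIM = "$"                          # assumed formula space character
LEFT, RIGHT = "(", ")"               # assumed bracket characters
MATH_SYMBOLS = set("+-*/=<>^_\\")    # assumed set of mathematical symbols


def is_formula_char(ch):
    """True for an ASCII digit, an ASCII letter, or an assumed math symbol."""
    return ch.isascii() and (ch.isalnum() or ch in MATH_SYMBOLS)


def is_formula_group(inner):
    """Bracket content qualifies if it holds only formula characters
    and is not made of digits alone (one reading of claims 11 and 13)."""
    return (all(is_formula_char(c) for c in inner)
            and any(not c.isdigit() for c in inner))


def move_head(text, pos):
    """Shift the opening delimiter at text[pos] left to the formula head."""
    assert text[pos] == DELIM
    body = text[:pos] + text[pos + 1:]   # text without that delimiter
    i = pos                              # delimiter is re-inserted before body[i]
    while i > 0:
        prev = body[i - 1]
        if is_formula_char(prev):
            i -= 1                       # exchange delimiter and character
        elif prev == RIGHT:
            left = body.rfind(LEFT, 0, i - 1)
            if left == -1 or not is_formula_group(body[left + 1:i - 1]):
                break                    # no usable left bracket: stop
            i = left                     # delimiter goes before the left bracket
        else:
            break                        # Chinese, delimiter, punctuation, ...
    return body[:i] + DELIM + body[i:]


def move_tail(text, pos):
    """Shift the closing delimiter at text[pos] right to the formula tail."""
    assert text[pos] == DELIM
    body = text[:pos] + text[pos + 1:]
    i = pos                              # body[i] is the character after the delimiter
    while i < len(body):
        nxt = body[i]
        if is_formula_char(nxt):
            i += 1                       # exchange delimiter and character
        elif nxt == LEFT:
            right = body.find(RIGHT, i + 1)
            if right == -1 or not is_formula_group(body[i + 1:right]):
                break                    # no usable right bracket: stop
            i = right + 1                # delimiter goes after the right bracket
        else:
            break
    return body[:i] + DELIM + body[i:]


def drop_redundant(text):
    """Remove two consecutive delimiters, merging adjacent fragments."""
    return text.replace(DELIM * 2, "")


print(move_head("已知x$=1$，求y", 3))   # 已知$x=1$，求y
print(move_tail("求$f$(x)+1的值", 3))   # 求$f(x)+1$的值
```

In this sketch a delimiter is moved by removing it and re-inserting it at the computed index, which has the same effect as the repeated character exchange described in the claims while keeping the loop simple.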
CN201611227736.8A 2016-12-27 2016-12-27 LaTeX format formula processing method and device in text recognition Active CN108241847B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611227736.8A CN108241847B (en) 2016-12-27 2016-12-27 LaTeX format formula processing method and device in text recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611227736.8A CN108241847B (en) 2016-12-27 2016-12-27 LaTeX format formula processing method and device in text recognition

Publications (2)

Publication Number Publication Date
CN108241847A true CN108241847A (en) 2018-07-03
CN108241847B CN108241847B (en) 2021-02-26

Family

ID=62702564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611227736.8A Active CN108241847B (en) 2016-12-27 2016-12-27 LaTeX format formula processing method and device in text recognition

Country Status (1)

Country Link
CN (1) CN108241847B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507067A (en) * 2019-01-31 2020-08-07 北京易真学思教育科技有限公司 Acquisition method for displaying formula picture, and method and device for transferring formula picture
CN113139547A (en) * 2020-01-20 2021-07-20 阿里巴巴集团控股有限公司 Text recognition method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5572625A (en) * 1993-10-22 1996-11-05 Cornell Research Foundation, Inc. Method for generating audio renderings of digitized works having highly technical content
CN101149790A (en) * 2007-11-14 2008-03-26 哈尔滨工程大学 Chinese printing style formula identification method
CN101329731A (en) * 2008-06-06 2008-12-24 南开大学 Automatic recognition method of mathematical formula in image
CN101388068A (en) * 2007-09-12 2009-03-18 汉王科技股份有限公司 Mathematical formula identifying and coding method
CN102033856A (en) * 2009-09-29 2011-04-27 北大方正集团有限公司 Formula composing method and system thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
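HUIFANG GUO et al.: "A method of adding an attribute into MathML for formula retrieval", 《IEEE》 *
TIAN Xuedong et al.: "Research on a mathematical formula extraction method based on statistical features", 《Computer Engineering》 *
CHEN Lihui et al.: "Research on a LaTeX-based Web mathematical formula extraction method", 《Computer Science》 *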

Also Published As

Publication number Publication date
CN108241847B (en) 2021-02-26

Similar Documents

Publication Publication Date Title
CN106534548B (en) Voice error correction method and device
CN111522994B (en) Method and device for generating information
CN105786980B (en) Method, device and equipment for merging different instances describing same entity
CN110781668B (en) Text information type identification method and device
CN106202028B (en) A kind of address information recognition methods and device
CN110611840B (en) Video generation method and device, electronic equipment and storage medium
CN108268441A (en) Sentence similarity computational methods and apparatus and system
CN104933084A (en) Method, apparatus and device for acquiring answer information
CN107168957A (en) A kind of Chinese word cutting method
CN113642659B (en) Training sample set generation method and device, electronic equipment and storage medium
CN110516233B (en) Data processing method, device, terminal equipment and storage medium
CN108255841A (en) A kind of method and its device of topic search
CN108921154A (en) Reading method, device, point read equipment and audio-video document correlating method
CN110489574A (en) A kind of multimedia messages recommended method, device and relevant device
CN104951439A (en) Electronic book and integration obtaining method and system for relevant electronic resources thereof
CN108241847A (en) LaTeX format formula processing method and device in text recognition
CN116245097A (en) Method for training entity recognition model, entity recognition method and corresponding device
CN109359308A (en) Machine translation method, device and readable storage medium
CN108133209A (en) Target area searching method and its device in a kind of text identification
CN113240485B (en) Training method of text generation model, text generation method and device
CN108133168A (en) Formula searching method and its device in a kind of text identification
CN108255798A (en) Input method and device for LaTeX format formulas
CN106502988A (en) The method and apparatus that a kind of objective attribute target attribute is extracted
CN111968624B (en) Data construction method, device, electronic equipment and storage medium
CN105608067A (en) Automatic knowledge extraction method and apparatus for network teaching system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant