CN1084503C - Method for automatically correcting truncating error of document and device thereof - Google Patents

Info

Publication number
CN1084503C
Authority
CN
China
Prior art keywords
character
mentioned
automatically
candidate matrix
scoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN96100537A
Other languages
Chinese (zh)
Other versions
CN1162158A (en)
Inventor
张照煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transpacific IP Pte Ltd.
Original Assignee
Industrial Technology Research Institute ITRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial Technology Research Institute ITRI
Priority to CN96100537A
Publication of CN1162158A
Application granted
Publication of CN1084503C
Anticipated expiration
Expired - Fee Related

Abstract

The invention relates to a method for automatically correcting character truncating (segmentation) errors in document recognition, and to a device built on that method, which together provide automatic correction when characters have been wrongly segmented. A vertical glyph structure table and a horizontal glyph structure table, listing the characters liable to segmentation errors, are established in advance. Depending on whether the document is written vertically or horizontally, the corresponding table is used to expand the candidate character matrix into an expanded candidate matrix. A language model then scores the character strings that can be combined from the expanded matrix, the highest-scoring string is selected, and the segmentation error is thereby corrected automatically.

Description

Method and device for automatically correcting character segmentation errors in document recognition
The present invention relates to an error correction method and device for document recognition, and in particular to a method and apparatus for automatically correcting character segmentation errors (rendered elsewhere on this page as "truncating errors") when recognizing Chinese-character documents. Its fields of application include Chinese form readers, printed/handwritten Chinese text recognition systems, online handwritten Chinese recognition in pen-based computing environments, manuscript-paper readers, and other Chinese-character document recognition systems.
Fig. 1 shows the processing flow of conventional Chinese-character document recognition. First, in step 10, an image capture device such as an ordinary scanner converts the text image of the document into an electronic signal. In practice the document may contain both printed and handwritten characters, so the character spacing is not necessarily uniform. Preprocessing in step 20 then separates graphics from text and segments the text, yielding a series of Chinese character images. In step 30, statistical or structural features are extracted from each character image and its feature values are computed. These feature values are then matched against the parameter models of the recognition character set, obtained by prior training (step 40), to find the one or more candidates with the highest similarity together with their similarity scores, which form the candidate matrix (step 50). Steps 10 to 50 make up the ordinary character recognition stage, whose result is the candidate matrix; to reach the document recognition stage, post-processing with a language model is still required.
Take the two-character word "烏鴉" (crow) as an example. In actual character recognition it may be read as "鳥鴉" (bird + crow), with a candidate matrix resembling the following:
bird 鳥 (20)    crow 烏 (17)
crow 鴉 (22)    refined 雅 (30)
The number to the right of each candidate is its similarity score; the smaller the value, the more similar the candidate is to the original glyph image (that is, the smaller the difference). As noted above, "鳥鴉" may come out as a closer match than "烏鴉". The post-processing performed in step 60 therefore uses a language model to correct such possible recognition errors, for example by using a dictionary to select "烏鴉" rather than "鳥鴉". The language model may use well-known statistical scoring, such as character frequency tables, word frequency tables, character-to-character succession tables, word-to-word succession tables, part-of-speech succession tables, word-cluster succession tables, or dictionary-based word-length and word-frequency scoring, expressed as probabilities or score values. Finally, step 70 selects the highest-scoring candidate string as the output.
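As a rough illustration of this post-processing step, the sketch below picks a string from a small candidate matrix using a dictionary as the language model. The function name `best_string`, the dictionary contents, and the distances are assumptions made for illustration (the distances are chosen so that raw recognition alone prefers the wrong string); the patent does not specify a concrete scoring formula.

```python
from itertools import product

# Candidate matrix: one row per segmented glyph image; each row lists
# (candidate, distance) pairs. A smaller distance means a closer match.
# The distances are hypothetical values chosen for this illustration.
CANDIDATES = [
    [("鳥", 17), ("烏", 20)],
    [("鴉", 22), ("雅", 30)],
]

# Toy language model: a set of known words (an assumption).
DICTIONARY = {"烏鴉"}


def best_string(matrix, dictionary):
    """Return the candidate string preferred by the toy dictionary model."""
    best_key, best_text = None, None
    for combo in product(*matrix):
        text = "".join(ch for ch, _ in combo)
        distance = sum(d for _, d in combo)
        # Dictionary words sort first; ties fall back to the smaller distance.
        key = (text not in dictionary, distance)
        if best_key is None or key < best_key:
            best_key, best_text = key, text
    return best_text
```

With an empty dictionary the function falls back to the string with the smallest total distance, which is the behaviour of recognition without post-processing.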
In document recognition, errors such as the confusion between "烏" and "鳥" are generally called substitution errors; they arise in the feature extraction and feature matching steps. There is a further kind of error, the character segmentation error, which arises in the segmentation step of preprocessing. Segmentation errors comprise split-type errors, such as "所" being recognized as "戶斤" or "鴉" as "牙鳥", and merge-type (compressive) errors, such as "京尤" being recognized as "就". For manuscript-paper documents with a fixed grid of visible or hidden cells, segmentation errors are not a serious problem; but for densely set Chinese documents, or for natural handwriting input without visible or hidden cells, segmentation errors are quite noticeable.
Currently known error detection and error correction techniques are all limited to substitution errors, as in Taiwan patents 81104438, 80102492, 80107315, and 83103817. For segmentation errors, current products and laboratory systems only provide manual correction tools, which is clearly not an effective solution in practical use.
The main object of the invention is to provide a method for automatically correcting character segmentation errors in document recognition, so as to deal effectively with segmentation errors in character recognition and improve recognition accuracy.
Another object of the invention is to provide a device for automatically correcting character segmentation errors in document recognition, which produces a highly accurate recognition result from the candidate matrix obtained by character recognition.
In accordance with these objects, the invention provides a method for automatically correcting character segmentation errors in document recognition, which corrects segmentation errors on the basis of the candidate matrix of a vertically written document, the candidate matrix being produced by character recognition. Using a vertical/horizontal glyph structure table that lists the glyphs that can be split and merged vertically/horizontally, a vertical/horizontal character split/merge device expands the candidate matrix into an expanded candidate matrix. A language model then scores the strings obtained by combining the expanded candidate matrix, and the highest-scoring string is selected, so that segmentation errors are corrected automatically.
The invention further provides a device for automatically correcting character segmentation errors in document recognition, which corrects segmentation errors on the basis of the candidate matrix of a vertically written document, the candidate matrix being produced by character recognition. The device comprises: a vertical character split/merge device, which receives the candidate matrix and, according to a vertical glyph structure table, expands it into an expanded candidate matrix representing the character splits and character merges possible in the candidate matrix; and a language model scoring device, which scores the strings obtained by combining the expanded candidate matrix and selects the highest-scoring string, so that segmentation errors are corrected automatically.
To make the above objects, features, and advantages of the invention easier to understand, a specific embodiment is described in detail below with reference to the accompanying drawings:
Brief description of the drawings:
Fig. 1 is a flow chart of a conventional document recognition method.
Fig. 2 is a flow chart of the method of the invention for automatically correcting character segmentation errors.
Fig. 3 is a block diagram of the device of the invention for automatically correcting character segmentation errors.
Fig. 4 is a table of example glyphs, according to the invention, that separate left-right and top-bottom.
Character segmentation errors generally arise in the preprocessing step of document recognition. Before the post-processing step is carried out, the method of the invention expands the candidate matrix into an expanded candidate matrix according to the possible splits and merges, so that segmentation errors can be corrected automatically.
According to the relative positions of their connected components, the glyph structures of Chinese characters can be divided into top-bottom separated (for example "召"), left-right separated (for example "所"), partially enclosed (for example "問"), and fully enclosed (for example "回") types. When a document recognition system performs text segmentation during preprocessing, it generally cuts by horizontal or vertical scanning, depending on the writing direction. Segmentation errors therefore occur most readily on top-bottom separated characters in vertical writing and on left-right separated characters in horizontal writing. Segmentation errors can also be classified by cause into split-type errors and merge-type errors. However, when the result of a split or merge is not a normal character, the character recognition stage may mistake it for some entirely unrelated normal character, which makes processing very difficult.
Accordingly, the characters in a vertically written document that may suffer segmentation errors, and that this embodiment is intended to handle, satisfy the following conditions:
(1) The character can be separated top to bottom into two or more consecutive parts, each of which forms a normal character.
(2) Characters whose separated parts form a frequently occurring character sequence are excluded, for example "二" ↔ "一一".
Likewise, the characters in a horizontally written document that may suffer segmentation errors, and that this embodiment is intended to handle, satisfy the following conditions:
(1) The character can be separated left to right into two or more consecutive parts, each of which forms a normal character.
(2) Characters whose separated parts form a frequently occurring character sequence are excluded, for example "好" ↔ "女子".
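The two conditions can be read as a filter applied when building a structure table. The sketch below is one possible reading; the function name `eligible_for_table`, the tiny character set, and the list of frequent sequences are hypothetical, since the source does not say how "frequently occurring" is measured.

```python
# Hypothetical second character set and frequent-sequence list.
SECOND_SET = {"戶", "斤", "女", "子", "一"}
FREQUENT_SEQUENCES = {"女子", "一一"}


def eligible_for_table(parts, second_set, frequent_sequences):
    """Check the two conditions for listing a glyph in a structure table."""
    # Condition 1: two or more consecutive parts, each a normal character.
    if len(parts) < 2 or any(p not in second_set for p in parts):
        return False
    # Condition 2: the parts must not spell a sequence that is itself common,
    # otherwise ordinary text would be merged by mistake.
    return "".join(parts) not in frequent_sequences
```

For instance, the parts of "所" ("戶", "斤") pass both conditions, while the parts of "好" ("女", "子") are rejected because "女子" is an ordinary word in its own right.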
In this embodiment, the example used is the characters of the 5,401-character Big-5 set (the first character set) whose parts, after separation, still belong to the 13,051-character Big-5 set (the second character set). Among them, 397, 14, and 1 characters can be separated top to bottom into two, three, and four consecutive parts respectively, and 1,570 and 38 characters can be separated left to right into two and three consecutive parts respectively. Fig. 4 lists some examples of left-right and top-bottom separation. The first and second character sets may be adjusted to suit the actual situation, and the first character set may of course be identical to the second.
From these correspondences, a vertical glyph structure table and a horizontal glyph structure table can be built, for use in correcting the recognition of vertically and horizontally written documents respectively. A glyph structure table can be represented as a list (tabular) structure or as a network structure; the two differ slightly in how the data are expressed. Taking "糊" as an example, it can be separated left to right into "米古月" or "米胡". The list structure itemizes each combination, whereas the network structure represents them by hierarchical subdivision.
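The difference between the two representations can be sketched with the "糊" example. The dictionary layouts and the helper `expand` below are assumptions made for illustration; the source only states that one form itemizes every combination and the other subdivides hierarchically.

```python
from itertools import product

# List (tabular) form: every decomposition is itemized explicitly.
FLAT_TABLE = {"糊": [("米", "胡"), ("米", "古", "月")]}

# Network form: each glyph stores only its direct subdivision;
# deeper decompositions are reached by following the links.
NETWORK_TABLE = {"糊": [("米", "胡")], "胡": [("古", "月")]}


def expand(glyph, network):
    """List every left-to-right decomposition reachable in the network form."""
    results = []
    for parts in network.get(glyph, []):
        # Each part may stay whole or be replaced by its own decompositions.
        options = [[(p,)] + expand(p, network) for p in parts]
        for choice in product(*options):
            results.append(tuple(ch for piece in choice for ch in piece))
    return results
```

Expanding the network form of "糊" yields the same two decompositions that the list form stores directly, so the choice between them is a trade-off between lookup simplicity and table size.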
Using the vertical and horizontal glyph structure tables, both split-type and merge-type segmentation errors can be handled. Fig. 2 shows the flow chart of the automatic correction method. The flow up to and including the character recognition stage is unchanged; that is, the candidate matrix is taken as the input. Depending on the writing format, vertically written and horizontally written documents are handled separately (step 52). For a vertically written document, vertical character split/merge processing (step 54) expands the N × M candidate matrix into an expanded candidate matrix, where N is the number of input characters and M is the number of candidates for each input character. In this processing, the top L candidates by similarity are split character by character and merged where possible, so that all possible segmentation errors are examined; L is a positive integer not greater than M. The adjustment of the similarity scores can be set according to actual needs. In this embodiment L = 1. When a character is split (C → C1, C2), the scores become C(SC) → C1(SC), C2(0); when characters are merged (C1, C2 → C), the scores become C1(SC1), C2(SC2) → C(SC1 + SC2 + 15), where SC, SC1, and SC2 are the similarity scores of the corresponding characters. Post-processing with a language model follows (step 60), which finds the highest-scoring string among the various combinations. In this way segmentation errors are corrected automatically and the correct result is output (step 70). Horizontally written documents are processed in the same way, so the description is not repeated here.
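The two score rules can be written down directly. The sketch below is a minimal illustration under stated assumptions: the function names and the dictionary layout of the structure table are hypothetical, and giving a score of 0 to every part after the first in a split of more than two parts is an extrapolation, since the source states the rule only for two parts.

```python
MERGE_PENALTY = 15  # constant used in the embodiment's merge rule

# Hypothetical structure table: whole glyph -> list of part tuples.
STRUCTURE_TABLE = {
    "就": [("京", "尤")],
    "所": [("戶", "斤")],
    "的": [("白", "勺")],
    "標": [("木", "票")],
}


def split_candidate(char, score, table):
    """C(SC) -> C1(SC), C2(0): the first part keeps the score, the rest get 0."""
    return [
        [(parts[0], score)] + [(p, 0) for p in parts[1:]]
        for parts in table.get(char, [])
    ]


def merge_candidates(first, second, table):
    """C1(SC1), C2(SC2) -> C(SC1 + SC2 + 15) if the pair forms a listed glyph."""
    (c1, sc1), (c2, sc2) = first, second
    return [
        (whole, sc1 + sc2 + MERGE_PENALTY)
        for whole, decompositions in table.items()
        if (c1, c2) in decompositions
    ]
```

Because a split leaves the total score unchanged while a merge adds 15, the rules by themselves slightly favour unmerged readings; it is the language model score that decides whether a merge is worthwhile.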
The processing steps above (character splitting, character merging, string combination, and language-model string scoring) may be carried out in an interleaved or a batch fashion. An interleaved order is, for example: original string combination → scoring → character splitting → string combination → scoring → character merging → string combination → scoring. A batch order is, for example: character splitting → character merging → string combination → scoring. In addition, both merging and splitting operate on the input candidate matrix; that is, the results of splitting are not merged again, and the results of merging are not split again.
The embodiment is now illustrated with an example. The input document fragment is:
"東京尤其就是電通所的目標"
The candidate matrix obtained from the character recognition stage is (one row per segmented image, three candidates per row; a "?" marks a candidate whose gloss is missing):
east (東) 34           cards 34           bundles 35
capital (京) 47        cooked 64          ? 64
outstanding (尤) 35    in particular 48   arts 58
its (其) 35            dustpan 51         calculates 54
capital (京) 52        cooked 58          ? 65
outstanding (尤) 43    in particular 52   arts 59
be (是) 29             fixed 42           foots 43
electricity (電) 35    hails 37           secondary rainbows 37
logical (通) 39        suitable 48        near by 53
family (戶) 52         table tennis 61    Yin 67
jin (斤) 55            liter 58           row 74
(的) 43                about 63           hooks 63
order (目) 32          months 48          times 60
mark (標) 35           stupefied 41       ? 43
The number to the right of each candidate is its similarity score; the smaller the value, the higher the similarity. Splitting expands the candidate matrix as follows:
(的) 43                about 63           hooks 63
→ white (白) 43
  spoon (勺) 0
mark (標) 35           stupefied 41       ? 43
→ wood (木) 35
  ticket (票) 0
Merging then gives:
capital (京) 47        cooked 64          ? 64
outstanding (尤) 35    in particular 48   arts 58
→ just (就) 97
capital (京) 52        cooked 58          ? 65
outstanding (尤) 43    in particular 52   arts 59
→ just (就) 110
family (戶) 52         table tennis 61    Yin 67
jin (斤) 55            liter 58           row 74
→ institute (所) 122
The top five strings obtained by combining and scoring the original candidate matrix are, in order:
1[2132] Tokyo especially the capital be the target of the logical family of electricity jin especially
2[2127] Tokyo especially the capital be the target of the logical family of electricity jin especially
3[2123] Tokyo especially the capital be the target of the logical family of secondary rainbow jin especially
4[2121] Tokyo especially the capital be the target of the logical family of electricity jin especially
5[2120] Tokyo especially the ancestor be the target of the logical family of electricity jin especially
The highest score among these belongs to No. 1 (score 2132). As for the new strings obtained from the expanded candidate matrix, some numerical examples follow:
A [2105] 東京尤其京尤是電通戶斤白勺目標
B [2099] 東京尤其京尤是電通戶斤的目木票
C [2113] 東就其京尤是電通戶斤的目標
D [2143] 東京尤其就是電通戶斤的目標
E [2160] 東京尤其就是電通所的目標
In A, "的" is split into "白勺" and the score falls. In B, "標" is split into "木票" and the score falls. In C, the first "京尤" is merged into "就" and the score falls. In D, the second "京尤" is merged into "就" and the score rises. In E, the second "京尤" is merged into "就" and "戶斤" is merged into "所"; the score not only rises but is the highest (2160), so this string is the correct output, and the segmentation errors have been corrected automatically.
Fig. 3 is a block diagram of the device for automatically correcting character segmentation errors. The vertical/horizontal character split/merge device 80 applies the appropriate splitting or merging to the input candidate matrix and produces the corresponding expanded candidate matrix. The language model scoring device 82 then scores the combined strings and selects the highest-scoring one as the post-processing result. Both the split/merge device 80 and the language model scoring device 82 can be implemented as computer programs.
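Since both components can be implemented in software, the following sketch shows one way the two might fit together, with the expanded candidate matrix held as a lattice of edges. Everything here is an assumption for illustration: the lattice representation, the names `build_lattice`, `best_path`, and `toy_score`, the second-ranked candidates in the sample matrix, and above all the toy language model, which merely rewards a known word and subtracts the accumulated distance. The source does not disclose its scoring formula.

```python
# Sample rows: top candidates and scores follow the worked example;
# the second-ranked characters are hypothetical stand-ins.
MATRIX = [[("戶", 52), ("乒", 61)], [("斤", 55), ("升", 58)], [("的", 43)]]
TABLE = {"所": [("戶", "斤")], "的": [("白", "勺")]}


def build_lattice(matrix, table, penalty=15):
    """Device 80 (sketch): add split and merge edges for the top candidate (L = 1)."""
    lattice = {i: [] for i in range(len(matrix))}
    for i, row in enumerate(matrix):
        for ch, sc in row:                      # original candidates
            lattice[i].append((i + 1, ch, sc))
        top, top_sc = row[0]
        for parts in table.get(top, []):        # split: total score unchanged
            lattice[i].append((i + 1, "".join(parts), top_sc))
        if i + 1 < len(matrix):                 # merge with the next row's top
            nxt, nxt_sc = matrix[i + 1][0]
            for whole, decomps in table.items():
                if (top, nxt) in decomps:
                    lattice[i].append((i + 2, whole, top_sc + nxt_sc + penalty))
    return lattice


def toy_score(text, distance, words=("所的",)):
    """Hypothetical language model: reward known words, penalise distance."""
    return 200 * sum(w in text for w in words) - distance


def best_path(lattice, n, score):
    """Device 82 (sketch): enumerate every string and keep the best-scoring one."""
    best = None
    stack = [(0, "", 0)]
    while stack:
        pos, text, dist = stack.pop()
        if pos == n:
            cand = (score(text, dist), text)
            if best is None or cand > best:
                best = cand
            continue
        for end, piece, d in lattice[pos]:
            stack.append((end, text + piece, dist + d))
    return best[1]
```

Calling `best_path(build_lattice(MATRIX, TABLE), len(MATRIX), toy_score)` selects the merged reading "所的" here. Exhaustive enumeration is used only to keep the sketch short; a practical system would more likely use dynamic programming over the lattice.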
Although the invention has been disclosed above by way of a specific embodiment, that embodiment is not intended to limit the invention. Any person skilled in the art may make minor modifications and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention shall be as defined by the appended claims.

Claims (21)

  1. A method for automatically correcting character segmentation errors in document recognition, for correcting segmentation errors on the basis of a candidate matrix of a vertically written document, the candidate matrix being produced by character recognition, characterized in that:
    using a vertical glyph structure table that represents the glyphs which segmentation errors may split or merge, a vertical character split/merge device expands the candidate matrix into an expanded candidate matrix; a language model then scores the strings obtained by combining the expanded candidate matrix; and the highest-scoring string is selected, whereby segmentation errors are corrected automatically.
  2. The method of claim 1, wherein the vertical glyph structure table is a table of correspondences between glyphs in a first character set and the parts into which they separate vertically, each part itself being a glyph in a second character set.
  3. The method of claim 2, wherein the vertical glyph structure table is represented as a list (tabular) structure.
  4. The method of claim 2, wherein the vertical glyph structure table is represented as a network structure.
  5. The method of claim 2, wherein the first character set may be identical to the second character set.
  6. The method of claim 1, wherein the vertical character split/merge device, using the vertical glyph structure table, performs character merging or character splitting on the top L rows of highest probability in the candidate matrix to produce the expanded candidate matrix, L being a positive integer not greater than the total number of rows of the candidate matrix.
  7. The method of claim 6, wherein the character splitting, character merging, combination, and scoring are carried out in an interleaved fashion to select the highest-scoring string.
  8. The method of claim 6, wherein the character splitting, character merging, combination, and scoring are carried out in a batch fashion to select the highest-scoring string.
  9. A device for automatically correcting character segmentation errors in document recognition, for correcting segmentation errors on the basis of a candidate matrix of a vertically written document, the candidate matrix being produced by character recognition, characterized by comprising:
    a vertical character split/merge device, which receives the candidate matrix and, according to a vertical glyph structure table, expands it into an expanded candidate matrix representing the character splits and character merges possible in the candidate matrix; and
    a language model scoring device, which scores the strings obtained by combining the expanded candidate matrix and selects the highest-scoring string, so that segmentation errors are corrected automatically.
  10. A method for automatically correcting character segmentation errors in document recognition, for correcting segmentation errors on the basis of a candidate matrix of a horizontally written document, the candidate matrix being produced by character recognition, characterized in that:
    using a horizontal glyph structure table that represents the glyphs which segmentation errors may split or merge, a horizontal character split/merge device expands the candidate matrix into an expanded candidate matrix; a language model then scores the strings obtained by combining the expanded candidate matrix; and the highest-scoring string is selected, whereby segmentation errors are corrected automatically.
  11. The method of claim 10, wherein the horizontal glyph structure table is a table of correspondences between glyphs in a first character set and the parts into which they separate horizontally, each part itself being a glyph in a second character set.
  12. The method of claim 11, wherein the horizontal glyph structure table is represented as a list (tabular) structure.
  13. The method of claim 11, wherein the horizontal glyph structure table is represented as a network structure.
  14. The method of claim 11, wherein the first character set may be identical to the second character set.
  15. The method of claim 10, wherein the horizontal character split/merge device, using the horizontal glyph structure table, performs character merging or character splitting from left to right on the top L rows of highest probability in the candidate matrix to produce the expanded candidate matrix, L being a positive integer not greater than the total number of rows of the candidate matrix.
  16. The method of claim 15, wherein the character splitting, character merging, combination, and scoring are carried out in an interleaved fashion to select the highest-scoring string.
  17. The method of claim 15, wherein the character splitting, character merging, combination, and scoring are carried out in a batch fashion to select the highest-scoring string.
  18. The method of claim 10, wherein the horizontal character split/merge device, using the horizontal glyph structure table, performs character merging or character splitting from right to left on the top L rows of highest probability in the candidate matrix to produce the expanded candidate matrix, L being a positive integer not greater than the total number of rows of the candidate matrix.
  19. The method of claim 18, wherein the character splitting, character merging, combination, and scoring are carried out in an interleaved fashion to select the highest-scoring string.
  20. The method of claim 18, wherein the character splitting, character merging, combination, and scoring are carried out in a batch fashion to select the highest-scoring string.
  21. A device for automatically correcting character segmentation errors in document recognition, for correcting segmentation errors on the basis of a candidate matrix of a horizontally written document, the candidate matrix being produced by character recognition, characterized by comprising:
    a horizontal character split/merge device, which receives the candidate matrix and, according to a horizontal glyph structure table, expands it into an expanded candidate matrix representing the character splits and character merges possible in the candidate matrix; and
    a language model scoring device, which scores the strings obtained by combining the expanded candidate matrix and selects the highest-scoring string, so that segmentation errors are corrected automatically.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN96100537A CN1084503C (en) 1996-04-09 1996-04-09 Method for automatically correcting truncating error of document and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN96100537A CN1084503C (en) 1996-04-09 1996-04-09 Method for automatically correcting truncating error of document and device thereof

Publications (2)

Publication Number Publication Date
CN1162158A CN1162158A (en) 1997-10-15
CN1084503C true CN1084503C (en) 2002-05-08

Family

ID=5116645

Family Applications (1)

Application Number Title Priority Date Filing Date
CN96100537A Expired - Fee Related CN1084503C (en) 1996-04-09 1996-04-09 Method for automatically correcting truncating error of document and device thereof

Country Status (1)

Country Link
CN (1) CN1084503C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101303731B (en) * 2007-05-09 2010-09-01 仁宝电脑工业股份有限公司 Method for generating printing line

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101311946B (en) * 2007-05-25 2010-10-27 仁宝电脑工业股份有限公司 Character identification method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN86100683A (en) * 1986-01-28 1987-08-19 中国人民解放军58026部队 A kind of ONLINE RECOGNITION device of handwritten Chinese character

Also Published As

Publication number Publication date
CN1162158A (en) 1997-10-15

Similar Documents

Publication Publication Date Title
CN1226717C (en) Automatic new term fetch method and system
CN1145872C (en) Method for automatically cutting and identiying hand written Chinese characters and system for using said method
EP1052593B1 (en) Form search apparatus and method
JP2713622B2 (en) Tabular document reader
CN1163841C (en) On-line hand writing Chinese character distinguishing device
US20040139384A1 (en) Removal of extraneous text from electronic documents
JPH0798765A (en) Direction-detecting method and image analyzer
US20100198827A1 (en) Method for finding text reading order in a document
WO2000020985A9 (en) Conversion of data representing a document to other formats for manipulation and display
WO2001071649A1 (en) Method and system for searching form features for form identification
CN1786965A (en) Method for acquiring news web page text information
CN1141666C (en) Online character recognition system for recognizing input characters using standard strokes
US20070133029A1 (en) Method of recognizing text information from a vector/raster image
CN1916940A (en) Template optimized character recognition method and system
CN1084503C (en) Method for automatically correcting truncating error of document and device thereof
CN1056933C (en) Chinese wrongly writen character automatic correcting method and device
CN1317664C (en) Confused stroke order library establishing method and on-line hand-writing Chinese character identifying and evaluating system
CN1437162A (en) Font recogtnizing method based on single Chinese characters
JPH08320914A (en) Table recognition method and device
CN1426017A (en) Method and its system for checking multiple electronic files
CN1955979A (en) Automatic extraction device, method and program of essay title and correlation information
CN1702682A (en) Document processing device and document processing method
CN1302415C (en) English-Chinese translation machine
JP2781150B2 (en) Character division method
JP4334068B2 (en) Keyword extraction method and apparatus for image document

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: YUDONG TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE

Effective date: 20070202

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20070202

Address after: Taiwan, China

Patentee after: Transpacific IP Pte Ltd.

Address before: Hsinchu County of Taiwan Province

Patentee before: Industrial Technology Research Institute

C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20020508

Termination date: 20110409