CN1501273A - Method of converting handwritten note into literal text and traveling equipment therefor - Google Patents

Method of converting handwritten note into literal text and traveling equipment therefor

Info

Publication number
CN1501273A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA021464901A
Other languages
Chinese (zh)
Other versions
CN1271537C (en)
Inventor
赖洪波
史敬威
关如冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Motorola Mobile Communication Technology Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN 02146490 (CN1271537C)
Publication of CN1501273A
Application granted
Publication of CN1271537C
Anticipated expiration
Expired - Fee Related (current legal status)

Abstract

The invention discloses a method for converting handwritten notes into literal text, applicable to a personal digital assistant (PDA) or a mobile device with PDA functions. The method comprises: dividing the output area into contiguous sub-regions of identical shape and size; recording the character information entered inside the handwriting input area and discarding any character information entered outside it; extracting character features from the resulting picture file; calling a handwriting recognition engine to recognize the character features and storing the recognition results in a buffer; and displaying the recognized text held in the buffer on the display screen. The method achieves a simple and efficient conversion while using minimal computing resources.

Description

A method for converting handwritten notes into literal text, and a mobile device therefor
Technical field
The present invention relates to image recognition technology, and in particular to a method, applied to a personal digital assistant (PDA) or a mobile device with PDA functions, for converting handwritten notes into literal text, and to a mobile device that implements the method.
Background art
PDAs are increasingly widely used because of their powerful applications and portability. A major reason for this popularity is that a PDA helps users record information. Users often need to jot down important matters as memos, and can do so with the information input and storage methods that a PDA provides. Information saved as an electronic file is both easy to carry and convenient for subsequent processing, so this function has become a major advantage of the PDA. At present, PDAs offer users two main methods of entering and saving information. The first is handwriting recognition: the user enters characters one at a time in the handwriting input area, and as soon as a character has been entered the PDA recognizes it, that is, converts it into literal text, and displays it in the output area. For a PDA with limited computing power, however, this recognition takes a certain amount of time, so the method is unsuitable when the user needs to enter many characters quickly. In a meeting or an interview, for example, the user may need to enter a large amount of text rapidly and continuously, and a long wait for recognition is intolerable.
To address this problem, PDAs provide a second method of entering and saving information: the shorthand application. With the shorthand application, the user can write continuously and without interruption with a stylus in the handwriting input area of the PDA, and the characters in the handwriting input area are automatically displayed in the output area. Unlike the first method, however, what is displayed in the output area is still the user's original handwriting trace, that is, handwritten notes. This method ensures that the user can record a large amount of information quickly, which is a great convenience.
However, the information that a user needs to enter quickly and continuously is in most cases very useful to that user, so the user will very much want to edit the handwritten notes, for example to add, delete or modify certain characters. This requires the PDA to perform character recognition on the handwritten notes as well, that is, to convert the image information formed by the handwritten notes into the text content it contains. Up to now, however, no method has made this conversion possible on a PDA. A user who needs to edit the information has to go through the handwriting recognition process of the first method again, which is still very inconvenient. Users therefore strongly wish to be able to recognize an entire set of handwritten notes as a whole, so as to make better use of the shorthand application.
Whole-image recognition of character images already has some mature applications on other equipment such as computers, for example optical character recognition (OCR) software used together with a scanner. The recognition process of such OCR software is basically divided into the following steps:
(1) scan and input the text image;
(2) pre-process the image, including skew correction and filtering of interference noise;
(3) analyze and understand the image layout;
(4) segment the image into lines and characters;
(5) select and extract features from each single-character image;
(6) classify patterns based on the features of the single-character image;
(7) assign a recognition result to the classified pattern;
(8) edit, correct and process the recognition result.
In the computer text image recognition process above, the algorithms of steps (2), (3) and (4) are relatively complex and require a large amount of computing resources. Because the hardware configuration of a typical PDA is modest, and in particular the data-processing capability of its processor is low, these complex algorithms cannot be run on a PDA. This is why, up to now, a PDA has been unable to convert handwritten notes entered through the shorthand application into literal text.
Summary of the invention
In view of this, one object of the present invention is to provide a simple and efficient conversion method that converts handwritten notes into literal text while using only a small amount of computing resources.
Another object of the present invention is to provide a device equipped with a conversion module that uses the above method.
The above objects of the present invention are achieved by the following technical solutions:
A conversion method for converting handwritten notes into literal text, applied to a PDA or a mobile device with PDA functions, comprises the following steps:
a. dividing the output area into contiguous sub-regions of identical shape and size; recording the character information entered inside the handwriting input area and discarding the character information entered outside it; converting the recorded information of each character, in order, into the contiguous sub-regions of the output area; and saving the resulting picture file;
b. extracting character features from the picture file;
c. calling a handwriting recognition engine to recognize the character features and saving the recognition results in a buffer;
d. displaying the literal text formed by recognition in the buffer on the display screen.
In the above conversion method, while the recorded information of each character is being converted in order into the contiguous sub-regions of the output area in step a, the character information may be compressed. The compression may use the line-extraction method, and may be one-fold compression (halving the size in each direction).
In the above conversion method, the size of each contiguous sub-region of the output area may be one quarter of the size of the handwriting input area. The contiguous sub-regions of the output area and the handwriting input area may be rectangular or square.
In the above conversion method, step d may further include presetting a dictionary of common recognition errors. While the literal text formed by recognition in the buffer is being displayed on the screen, the system calls the dictionary of common recognition errors to correct the recognition results in the buffer automatically, and displays the corrected results on the screen.
In the above conversion method, the handwriting recognition engine may be a Chinese character handwriting recognition engine.
A PDA or a mobile device with PDA functions which, in addition to a CPU, memory and a display screen electrically connected to a bus, further comprises a conversion module that executes the above conversion method for converting handwritten notes into literal text, the conversion module being electrically connected to the CPU, the memory and the display screen.
As can be seen from the technical solution of the present invention, compared with the prior-art OCR recognition method, the present invention discards the part of any character that extends beyond the handwriting input area and fixes the size of each handwritten character, so that the whole handwritten original is already divided into individual handwritten characters of standard size. No complex segmentation algorithm is needed. The steps that require a large amount of computing resources, namely pre-processing of the whole handwritten original (including skew correction and noise filtering), layout analysis and understanding, and line and character segmentation, are therefore omitted. This simplifies the process and increases processing speed, achieving the object of converting handwritten notes into literal text simply and efficiently with only a small amount of computing resources.
At the same time, the present invention enables batch recognition of an entire handwritten original on a PDA, which increases processing speed and greatly reduces the inconvenience of repeated input. Because the method does not need to add an extra handwriting recognition library for image recognition, the user can make full use of existing computing resources and avoid unnecessary extra overhead.
Description of drawings
Fig. 1 shows an example of the shape and position of the handwriting input area and the output area on a PDA according to the present invention;
Fig. 2 shows the image acquisition flow of the present invention;
Fig. 3 shows the standard grid of 16 × 16 pixel cells into which the whole output area is divided according to the present invention;
Fig. 4 shows an example of handwritten notes according to the present invention;
Fig. 5 shows an example of the shorthand file browsing window on a PDA according to the present invention;
Fig. 6 shows the flow of the present invention from image pre-processing to output of the final result;
Fig. 7 shows an example of the recognition-complete window on a PDA according to the present invention;
Fig. 8 is a schematic diagram of a device for converting handwritten notes into literal text according to the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and specific embodiments.
The method of the present invention for converting handwritten notes into literal text can be divided into four stages: image acquisition, image pre-processing, character recognition and post-processing. The distinguishing feature of the present invention is that some special processing is carried out during data acquisition, so that whole recognition of handwritten notes becomes possible even on a PDA with a modest hardware configuration.
In the shorthand applications of current PDAs, data acquisition works roughly as follows: with the PDA shorthand application open, the user writes directly in the handwriting input area with a stylus; when the user has finished writing in the handwriting input area, the written character is transferred to the output area and then saved as a picture file. Data acquisition in the present invention is much the same, except that some special processing is carried out during the process. How the present invention performs this special processing is described in detail below.
As shown in Fig. 1, assume that the handwriting input area of the PDA is 32 × 32 pixels. The example of the present invention uses a two-box handwriting input area, each box being 32 × 32 pixels. In practice the handwriting input area may also be rectangular, for example 32 × 24 pixels.
With the shorthand application open, the user enters characters with a stylus inside the handwriting input area. In general the handwriting input area is large enough for a typical user, who will not normally write beyond its boundary. If, however, the user inadvertently writes beyond the handwriting input area, then, so that no complex operation such as segmentation is needed later, the present invention discards the character information that falls outside the handwriting input area, that is, it simply ignores it. This is the first special processing of the present invention. The character information inside the handwriting input area is then recorded.
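The discard rule can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the stylus is sampled as integer (x, y) pixel coordinates with the origin at the top-left corner of a 32 × 32 input box; the function names `keep_points_inside` and `rasterize` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Point = Tuple[int, int]

INPUT_W, INPUT_H = 32, 32  # input-box size used in the embodiment


def keep_points_inside(points: List[Point],
                       width: int = INPUT_W,
                       height: int = INPUT_H) -> List[Point]:
    """Return only the sampled pen points that fall inside the input box."""
    return [(x, y) for (x, y) in points if 0 <= x < width and 0 <= y < height]


def rasterize(points: List[Point],
              width: int = INPUT_W,
              height: int = INPUT_H) -> List[List[int]]:
    """Mark each retained point as a black pixel (1) in a height x width bitmap."""
    bitmap = [[0] * width for _ in range(height)]
    for x, y in keep_points_inside(points, width, height):
        bitmap[y][x] = 1
    return bitmap


# A stroke that runs off the right edge: only the first two points survive.
stroke = [(30, 5), (31, 5), (32, 5), (33, 6), (-1, 0)]
inside = keep_points_inside(stroke)
```

Because out-of-range points are dropped at capture time, every recorded character is guaranteed to fit its box, which is what lets the later stages skip segmentation.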
After the user finishes entering a character, the system records the character information. Once the character in the handwriting input area is complete, the line-extraction method is used to display the character information from the handwriting input area in the output area. The line-extraction method used by the present invention is a known technique. It is essentially a lossy compression algorithm: for example, the odd-numbered lines are removed, which reduces the storage required without affecting recognition. For instance, a line segment may be represented in binary as 10 11 00 10 01 00 01; after line-extraction compression only the values at even positions (counting from zero) are kept, so the line segment becomes 1101000. After one-fold line-extraction compression in both the horizontal and vertical directions, a handwritten character is reduced to one quarter of its original input size, so that the simplified character fits in a grid cell of 16 × 16 pixels.
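The compression step described above, which drops alternate lines, can be sketched in a few lines of Python. This is only an illustration under the assumption that positions are counted from zero and that positions 0, 2, 4, ... are kept, which is the reading that reproduces the binary example in the text; the function names are hypothetical.

```python
def drop_alternate(values):
    """Keep the elements at positions 0, 2, 4, ... and drop the others."""
    return values[::2]


def compress_bitmap(bitmap):
    """Halve a bitmap in both directions by dropping every other row and column."""
    return [drop_alternate(row) for row in drop_alternate(bitmap)]


# The line segment from the text, written without spaces.
line = "10110010010001"
compressed_line = drop_alternate(line)

# A 32 x 32 character bitmap becomes 16 x 16, one quarter of the original area.
big = [[(x + y) % 2 for x in range(32)] for y in range(32)]
small = compress_bitmap(big)
```

Here `compressed_line` is `"1101000"`, matching the example in the description. Slicing was chosen because it needs no arithmetic per pixel, which suits a device with little processing power.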
As shown in Fig. 3, as the second special processing of the present invention, the position of each character in the output area is fixed and its size is fixed at 16 × 16 pixels, so that the output area contains a series of contiguous 16 × 16 pixel sub-regions. As shown in Fig. 4, each time the user enters a character in the handwriting input area, it is compressed by the line-extraction method to one quarter of its original size and then displayed in a standard 16 × 16 pixel sub-region of the output area. Every character entered in the handwriting input area is processed in the same way, until all characters have been entered and placed in order in the 16 × 16 pixel sub-regions of the output area. When the user's input fills a screen, or the whole input process is finished, the handwritten notes in the output area are saved as a picture file.
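A minimal sketch of the fixed-grid placement follows. The 16 × 16 cell size comes from the text; the output-area dimensions (10 columns by 2 rows), the left-to-right, top-to-bottom ordering and the function names are assumptions made only for this illustration.

```python
CELL = 16  # cell size in pixels, as in the embodiment


def cell_origin(index, columns):
    """Top-left pixel (x, y) of the index-th cell, filling rows left to right."""
    row, col = divmod(index, columns)
    return col * CELL, row * CELL


def place_character(canvas, cell_bitmap, index, columns):
    """Copy a 16 x 16 character bitmap into its fixed cell of the output canvas."""
    x0, y0 = cell_origin(index, columns)
    for dy, row in enumerate(cell_bitmap):
        for dx, pixel in enumerate(row):
            canvas[y0 + dy][x0 + dx] = pixel


# Hypothetical output area: 10 cells wide and 2 cells high.
columns, rows = 10, 2
canvas = [[0] * (columns * CELL) for _ in range(rows * CELL)]
solid = [[1] * CELL for _ in range(CELL)]
place_character(canvas, solid, 11, columns)  # twelfth character: row 1, column 1
```

Since the position of the n-th character depends only on n, the saved picture carries its own layout and no layout analysis is needed when it is read back.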
Thanks to these two special processing steps, the picture file produced by the present invention differs from that produced by the prior art: it can easily be divided into contiguous character images of standard size. This eliminates the steps that a general recognition process must perform, namely image pre-processing (including skew correction and noise filtering), layout analysis and understanding, and line and character segmentation. Because large amounts of resources are no longer needed for these complex calculations, the subsequent process from pre-processing to display of the final recognition result can be completed easily.
Next, the handwritten notes saved during data acquisition are pre-processed. As shown in Fig. 5, if the user chooses to open the recognition page, the steps shown in Fig. 6 are carried out. The saved picture file is read first and then segmented. Because every handwritten character in the output area lies at a fixed screen position and has the standard size of 16 × 16 pixels, complex processes such as skew correction, noise filtering, layout analysis and understanding, and line and character segmentation can be omitted.
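With fixed cell positions, segmentation reduces to plain slicing. The sketch below assumes the saved picture has already been loaded as a list of pixel rows whose width and height are multiples of 16; the picture size used in the example and the name `split_into_cells` are hypothetical.

```python
def split_into_cells(picture, cell=16):
    """Cut a picture (a list of pixel rows) into cell x cell blocks, row by row."""
    cells = []
    for y0 in range(0, len(picture), cell):
        for x0 in range(0, len(picture[0]), cell):
            cells.append([row[x0:x0 + cell] for row in picture[y0:y0 + cell]])
    return cells


# Hypothetical 48 x 32 picture holding 3 x 2 character cells.
picture = [[0] * 48 for _ in range(32)]
picture[20][40] = 1  # one black pixel inside the last cell
cells = split_into_cells(picture)
```

The loop does no image analysis at all, which is the point of the approach: the cost of segmentation is a fixed number of slices per character.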
The actual image pre-processing is a known technique. For example, each pixel of each handwritten character can be classified as black or white and represented by 1 or 0 respectively; analyzing every pixel in this way yields a binary sequence, which serves as the basis for the subsequent character recognition.
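The black-or-white decision can be sketched as a simple threshold. The text does not say how pixels are stored, so the grayscale range 0 to 255, the threshold of 128 and the function name are all assumptions made for illustration.

```python
def to_binary_sequence(cell, threshold=128):
    """Flatten a grayscale cell row by row: dark pixels become '1', light ones '0'."""
    return "".join("1" if value < threshold else "0"
                   for row in cell
                   for value in row)


# A tiny 2 x 2 example instead of a full 16 x 16 cell.
sample = [[0, 255],
          [200, 10]]
sequence = to_binary_sequence(sample)
```

For a full 16 × 16 cell the result is a 256-character string, one bit per pixel, which is the form passed on to recognition in this sketch.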
Character recognition is also a known technique. For each piece of character information read during pre-processing, that is, each binary sequence, the system calls the handwriting recognition engine to recognize it and saves the internal code of the recognized Chinese character in the buffer. Repeating this process for every character in the output area yields the complete literal text of the whole set of handwritten notes.
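The recognition loop itself is short, as the sketch below suggests. The patent does not describe the engine's interface, so the engine is modeled here as any callable that maps a binary sequence to a character; `toy_engine` is a hypothetical lookup-table stand-in and not a real recognizer.

```python
from typing import Callable, Iterable, List


def recognize_all(sequences: Iterable[str],
                  engine: Callable[[str], str]) -> List[str]:
    """Call the engine once per character and collect the results in a buffer."""
    buffer: List[str] = []
    for sequence in sequences:
        buffer.append(engine(sequence))
    return buffer


def toy_engine(sequence: str) -> str:
    """Placeholder engine: looks the sequence up in a tiny table."""
    table = {"1001": "A", "0110": "B"}
    return table.get(sequence, "?")


buffer = recognize_all(["1001", "0110", "1111"], toy_engine)
text = "".join(buffer)
```

Keeping the whole result in one buffer, rather than emitting each character as it is recognized, is what makes the sentence-level correction in post-processing possible.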
In post-processing, to improve recognition of the Chinese characters the user has entered, the system can also provide automatic error correction. Error correction is possible precisely because the present invention recognizes all input content together. In prior-art handwriting input each character is recognized on its own, so the logical relationship between the entered characters is lost and it is impossible to judge whether a character has been recognized wrongly. With the method of the present invention, the buffer holds a complete sentence, so some obvious errors can be removed. Many error-correction algorithms exist; existing artificial intelligence (AI) techniques could be used to identify errors, for example, but these methods are relatively cumbersome. The present invention can therefore set up a dictionary of common word recognition errors. For example, a misrecognized sentence that reads literally "order sky, how are you" is corrected automatically to "today, how are you". After the whole document has been recognized, the system performs a phrase-level correction pass over the buffer, searching for words such as "order sky" and replacing any it finds with "today". When error correction is complete, the literal text in the buffer is displayed in the output area as shown in Fig. 7, and the user can then choose to delete the original handwritten notes.
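A dictionary-based correction pass of this kind can be sketched as a sequence of substring replacements. The Chinese strings below are an assumption: they are a plausible original for the phrases that this translation renders as "order sky" and "today", and are not confirmed by the source. The names `COMMON_ERRORS` and `correct` are hypothetical.

```python
# Hypothetical error dictionary: misrecognized phrase -> intended phrase.
# "令天" / "今天" is a guess at the example behind "order sky" / "today".
COMMON_ERRORS = {
    "令天": "今天",
}


def correct(text: str, errors: dict) -> str:
    """Replace every known misrecognized phrase in the buffered text."""
    for wrong, right in errors.items():
        text = text.replace(wrong, right)
    return text


recognized = "令天你好吗"
corrected = correct(recognized, COMMON_ERRORS)
```

A lookup table like this only catches errors that were anticipated in advance, but it costs one pass over the buffer per entry, which fits the low-resource goal stated in the description.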
The method of the present invention for converting handwritten notes into literal text has been described in detail above. In addition, the present invention provides a device for converting handwritten notes into literal text: an existing PDA to which a conversion module is added that uses the method of the present invention to convert handwritten notes into literal text. A schematic diagram is shown in Fig. 8. As can be seen in Fig. 8, the conversion module is electrically connected to the CPU, the touch screen and the memory, and with their help carries out the steps of the above conversion method, including image acquisition, image pre-processing, character recognition and post-processing, thereby converting handwritten notes into literal text.
In the data acquisition of the present invention, compression algorithms other than the line-extraction method may also be used, provided that they reduce the storage required without affecting character recognition. Nor is the present invention limited to PDAs: it is applicable to any resource-constrained mobile or handheld device that has a shorthand function. It should therefore be understood that the above is only a detailed description of one embodiment of the present invention and is not intended to limit its scope of protection.

Claims (9)

1. A conversion method for converting handwritten notes into literal text, applied to a personal digital assistant (PDA) or a mobile device with PDA functions, comprising the following steps:
a. dividing the output area into contiguous sub-regions of identical shape and size; recording the character information entered inside the handwriting input area and discarding the character information entered outside it; converting the recorded information of each character, in order, into said contiguous sub-regions of the output area; and saving the resulting picture file;
b. extracting character features from said picture file;
c. calling a handwriting recognition engine to recognize the character features and saving the recognition results in a buffer;
d. displaying the literal text formed by recognition in said buffer on the display screen.
2. The conversion method for converting handwritten notes into literal text according to claim 1, characterized in that, while the recorded information of each character is converted in order into said contiguous sub-regions of the output area in step a, said character information is compressed.
3. The conversion method for converting handwritten notes into literal text according to claim 2, characterized in that said compression uses the line-extraction method.
4. The conversion method for converting handwritten notes into literal text according to claim 2, characterized in that said compression is one-fold compression.
5. The conversion method for converting handwritten notes into literal text according to claim 4, characterized in that the size of each contiguous sub-region of the output area is one quarter of the size of the handwriting input area.
6. The conversion method for converting handwritten notes into literal text according to claim 1, characterized in that the contiguous sub-regions of the output area and the handwriting input area are rectangular or square.
7. The conversion method for converting handwritten notes into literal text according to claim 1, characterized in that step d further comprises presetting a dictionary of common recognition errors; while the literal text formed by recognition in said buffer is being displayed on the screen, the system calls said dictionary of common recognition errors to correct the recognition results in the buffer automatically, and displays the corrected recognition results on the screen.
8. The conversion method for converting handwritten notes into literal text according to claim 1, characterized in that said handwriting recognition engine is a Chinese character handwriting recognition engine.
9. A PDA or a mobile device with PDA functions, comprising at least a CPU, memory and a display screen electrically connected to a bus, characterized in that it further comprises a conversion module that executes the conversion method of claim 1 for converting handwritten notes into literal text, the conversion module being electrically connected to the CPU, the memory and the display screen.
CN 02146490 2002-11-12 2002-11-12 Method of converting handwritten note into literal text and traveling equipment therefor Expired - Fee Related CN1271537C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 02146490 CN1271537C (en) 2002-11-12 2002-11-12 Method of converting handwritten note into literal text and traveling equipment therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 02146490 CN1271537C (en) 2002-11-12 2002-11-12 Method of converting handwritten note into literal text and traveling equipment therefor

Publications (2)

Publication Number Publication Date
CN1501273A true CN1501273A (en) 2004-06-02
CN1271537C CN1271537C (en) 2006-08-23

Family

ID=34232760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 02146490 Expired - Fee Related CN1271537C (en) 2002-11-12 2002-11-12 Method of converting handwritten note into literal text and traveling equipment therefor

Country Status (1)

Country Link
CN (1) CN1271537C (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011079432A1 (en) * 2009-12-29 2011-07-07 Nokia Corporation Method and apparatus for generating a text image
CN102736830A (en) * 2011-04-13 2012-10-17 联想移动通信科技有限公司 Handwriting input method and terminal equipment
CN102289322A (en) * 2011-08-25 2011-12-21 盛乐信息技术(上海)有限公司 Method and system for processing handwriting
CN106325596A (en) * 2016-08-17 2017-01-11 广州视睿电子科技有限公司 Automatic error correction method and system for writing handwriting
WO2018032697A1 (en) * 2016-08-17 2018-02-22 广州视睿电子科技有限公司 Automatic error correction method and system for handwriting
CN106325596B (en) * 2016-08-17 2019-04-30 广州视睿电子科技有限公司 A kind of written handwriting automatic error correction method and system
WO2020125345A1 (en) * 2018-12-17 2020-06-25 掌阅科技股份有限公司 Electronic book note processing method, handwriting reading device, and storage medium
CN110348306A (en) * 2019-06-06 2019-10-18 上海学印教育科技有限公司 A kind of hand-written inputting method and system
CN113608656A (en) * 2021-08-20 2021-11-05 掌阅科技股份有限公司 Note processing method, electronic device and storage medium

Also Published As

Publication number Publication date
CN1271537C (en) 2006-08-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: LIANXIANG (BEIJING) CO. LTD.; HUMANTEC INDUSTRIAL

Free format text: FORMER OWNER: LIANXIANG (BEIJING) CO. LTD.

Effective date: 20080627

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20080627

Address after: No. 6, Pioneer Road, Beijing, Haidian District: 100085

Co-patentee after: Lenovo Mobile Communication Technology Ltd.

Patentee after: Lenovo (Beijing) Co., Ltd.

Address before: No. 6, Pioneer Road, Haidian District information industry base, Beijing, China: 100085

Patentee before: Lenovo (Beijing) Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20060823

Termination date: 20201112