CN101424979A - Method for enhancing intellectualization of input method - Google Patents

Info

Publication number
CN101424979A
Authority
CN
China
Prior art keywords
cursor
chinese character
literal
character
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA200810227547XA
Other languages
Chinese (zh)
Inventor
宋熠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vimicro Corp
Original Assignee
Vimicro Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vimicro Corp
Priority to CNA200810227547XA
Publication of CN101424979A
Legal status: Pending

Landscapes

  • Character Discrimination (AREA)

Abstract

The invention provides a method for improving the intelligence of an input method, comprising: performing optical character recognition (OCR) on the text before the cursor to obtain the Chinese characters preceding the cursor; and matching the obtained characters against the Chinese characters being entered at the cursor, so that candidates with a high matching degree are displayed at the front of the candidate character list.

Description

A method for improving the intelligence of an input method
Technical field
The present invention relates to a matching method for input methods, and in particular to a method of input-method matching based on OCR results.
Technical background
In computer applications, Chinese character input is an indispensable function, and pinyin input methods are among the most common Chinese input methods. When an operator enters Chinese characters with a pinyin input method, the input speed depends on how fast the operator types and also on the matching ability of the input method itself. A more intelligent input method makes the first candidate the content the operator most wants to enter, which reduces the time spent selecting characters and thereby effectively increases input speed.
In the prior art, an input method can match the content about to be entered against the content entered previously, yielding more efficient candidate phrases. For characters that already exist in a document, however, text-entry software such as Word and MSN in many cases provides no interface through which the input method can obtain the characters before the cursor. When a document is being revised and entry restarts after the cursor position has changed, the input method cannot perform intelligent matching, which greatly reduces the quality of the match.
Summary of the invention
In view of this, the object of the present invention is to provide an intelligent input method that uses OCR (optical character recognition) to intelligently match the characters already present before the cursor against the content about to be entered.
To achieve the above object, the method of the present invention for improving the intelligence of an input method comprises the following steps:
Performing optical character recognition on the text before the cursor position to obtain the Chinese characters preceding the cursor;
Matching the obtained Chinese characters against the Chinese characters to be input at the cursor, and displaying the candidates with a high matching degree at the front of the candidate character list.
Further, performing optical character recognition on the text before the cursor position comprises:
Identifying the layout direction of the text: if the layout is vertical, the Chinese characters above the cursor are recognized; if the layout is horizontal, the Chinese characters to the left of the cursor are recognized.
Further, the layout direction is identified by recognizing several characters above the cursor and several characters to the left of the cursor, and obtaining the center position of each character:
If the centers of the horizontally adjacent characters lie on the same horizontal line, the layout is taken to be horizontal;
If the centers of the vertically adjacent characters lie on the same vertical line, the layout is taken to be vertical.
Further, the layout direction is identified by recognizing several characters to the left of the cursor and obtaining the center position of each: if all the centers lie on the same horizontal line, the layout is taken to be horizontal; otherwise it is taken to be vertical.
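The center-alignment test in the preceding paragraph can be sketched as follows. This is only an illustration under stated assumptions: the bounding-box format, the function names and the pixel `tolerance` are hypothetical and do not come from the patent, which does not say how close to "the same horizontal line" the centers must be.

```python
from typing import List, Tuple

# Hypothetical box format: (left, top, right, bottom) in screen pixels.
Box = Tuple[int, int, int, int]


def center(box: Box) -> Tuple[float, float]:
    """Return the (x, y) center of a character bounding box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def is_horizontal_layout(left_boxes: List[Box], tolerance: float = 2.0) -> bool:
    """Treat the layout as horizontal when the centers of the characters
    found to the left of the cursor share (almost) the same y coordinate.

    `tolerance` is an assumed allowance, in pixels, for OCR jitter.
    """
    if len(left_boxes) < 2:
        # Assumption: with fewer than two characters there is nothing to
        # compare, so fall back to the more common horizontal layout.
        return True
    ys = [center(box)[1] for box in left_boxes]
    return max(ys) - min(ys) <= tolerance


# Three characters on one line, the last one offset by a single pixel.
row = [(0, 0, 29, 29), (30, 0, 59, 29), (60, 1, 89, 30)]
print(is_horizontal_layout(row))
```

A caller would recognize the characters above the cursor instead when this function returns `False`.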
Further, the layout direction is identified by recognizing several characters above the cursor and several characters to the left of the cursor, and obtaining the distance between characters:
If the horizontal distance between characters is less than the vertical distance between characters, the layout is taken to be horizontal;
If the horizontal distance between characters is greater than the vertical distance between characters, the layout is taken to be vertical.
Further, the optical character recognition of the text before the cursor position is specifically performed by:
Determining the font and character size at the cursor;
Determining the screenshot size from the current screen resolution and the layout direction of the text, and performing optical character recognition on the screenshot content.
Further, the three Chinese characters before the cursor are obtained. If one Chinese character is to be input, a match between the three characters before the cursor and the character to be input takes priority over a match between the one character before the cursor and the character to be input, and over a match between the two characters before the cursor and the character to be input.
Further, the two Chinese characters before the cursor are obtained. A match between the two characters before the cursor and two characters to be input takes priority over a match between the one character before the cursor and one character to be input.
Further, if the recognition result before the cursor is an adjective, the ranking of nouns and adjectives among the candidate characters is raised; if the recognition result before the cursor is an adverb, the ranking of adjectives and verbs among the candidate characters is raised.
Further, if the recognition result before the cursor is a punctuation mark, no matching is performed on it.
Because the present invention uses OCR, it greatly improves the match between the characters already present before the cursor and the characters to be entered, and improves user satisfaction.
Description of drawings
Fig. 1 is a flowchart of a specific embodiment of the present invention.
Embodiment
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawing and specific embodiments.
The method of the present invention for improving the intelligence of an input method uses OCR to recognize the characters before the cursor, which improves the matching degree of the input method.
Fig. 1 is a flowchart of a specific embodiment of the present invention, which comprises the following steps:
Step 101: identify the layout direction of the text.
If the layout is vertical, the Chinese characters above the cursor are recognized; if the layout is horizontal, the Chinese characters to the left of the cursor are recognized.
The layout direction can be identified by any of the following methods:
1) Recognize several characters above the cursor and several characters to the left of the cursor, and obtain the center position of each character:
If the centers of the horizontally adjacent characters lie on the same horizontal line, the layout is taken to be horizontal;
If the centers of the vertically adjacent characters lie on the same vertical line, the layout is taken to be vertical.
2) Recognize several characters to the left of the cursor and obtain the center position of each: if all the centers lie on the same horizontal line, the layout is taken to be horizontal; otherwise it is taken to be vertical.
3) Recognize several characters above the cursor and several characters to the left of the cursor, and obtain the distance between characters:
If the horizontal distance between characters is less than the vertical distance between characters, the layout is taken to be horizontal;
If the horizontal distance between characters is greater than the vertical distance between characters, the layout is taken to be vertical.
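Method 3) above compares character spacing in the two directions, on the reasoning that characters along the reading direction sit closer together than neighbouring lines or columns. A minimal sketch follows, assuming center-to-center distance as the measure of "distance between characters" and a hypothetical `(left, top, right, bottom)` box format; the patent specifies neither, and the tie case (equal spacing) is resolved here as horizontal purely by assumption.

```python
from typing import List, Tuple

# Hypothetical box format: (left, top, right, bottom) in screen pixels.
Box = Tuple[int, int, int, int]


def mean_gap(positions: List[float]) -> float:
    """Average distance between consecutive positions along one axis."""
    if len(positions) < 2:
        raise ValueError("at least two characters are needed to measure spacing")
    ordered = sorted(positions)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def layout_from_spacing(left_boxes: List[Box], above_boxes: List[Box]) -> str:
    """Return "horizontal" or "vertical" from the character spacing.

    left_boxes:  characters recognized to the left of the cursor.
    above_boxes: characters recognized above the cursor.
    """
    horizontal_gap = mean_gap([(l + r) / 2 for l, _, r, _ in left_boxes])
    vertical_gap = mean_gap([(t + b) / 2 for _, t, _, b in above_boxes])
    # Assumption: equal spacing is treated as horizontal layout.
    return "horizontal" if horizontal_gap <= vertical_gap else "vertical"


# Characters 30 px apart along the line, lines 45 px apart.
left = [(0, 100, 29, 129), (30, 100, 59, 129), (60, 100, 89, 129)]
above = [(90, 10, 119, 39), (90, 55, 119, 84)]
print(layout_from_spacing(left, above))
```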
Step 102: perform optical character recognition on the text before the cursor position to obtain the Chinese characters preceding the cursor.
The present invention captures a screenshot at the cursor position in real time, performs OCR on the preceding text, and matches the recognition result, to achieve a more efficient input method. Because the captured region is small and precisely positioned, the processing load is also relatively small.
In a specific embodiment, the screenshot processing first determines the font, character size and character color at the cursor, then determines the screenshot size from the current screen resolution and the layout direction of the text, and performs OCR on the screenshot content.
For example, at a resolution of 1024 × 768, a size-four character occupies approximately 29 × 29 pixels, so if only the one Chinese character before the cursor is needed, a capture of about 30 × 30 pixels is sufficient.
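The sizing rule in this example can be sketched as a small helper that returns the screen rectangle to capture. Everything beyond the 29-pixel character size and the roughly 30 × 30 result is an assumption made for illustration: the function name, the one-pixel `margin`, the meaning of the cursor coordinates (top of the caret for horizontal text, left end of the caret for vertical text) and the clamping to the screen edges are not specified in the patent.

```python
from typing import Tuple


def capture_region(
    cursor_x: int,
    cursor_y: int,
    char_px: int,
    n_chars: int,
    direction: str,
    screen_w: int,
    screen_h: int,
    margin: int = 1,
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the area to screenshot.

    char_px is the on-screen size of one character, e.g. 29 for the
    size-four type at 1024 x 768 mentioned in the text.
    """
    if direction not in ("horizontal", "vertical"):
        raise ValueError("direction must be 'horizontal' or 'vertical'")
    side = char_px + margin  # 29 + 1 gives the ~30 px from the example
    if direction == "horizontal":
        # Horizontal layout: take the characters to the left of the cursor.
        left, top = cursor_x - n_chars * side, cursor_y
        right, bottom = cursor_x, cursor_y + side
    else:
        # Vertical layout: take the characters above the cursor.
        left, top = cursor_x, cursor_y - n_chars * side
        right, bottom = cursor_x + side, cursor_y
    # Keep the rectangle inside the screen.
    return (max(0, left), max(0, top), min(screen_w, right), min(screen_h, bottom))


# One preceding character at 1024 x 768: a 30 x 30 pixel capture.
print(capture_region(200, 100, 29, 1, "horizontal", 1024, 768))
# prints (170, 100, 200, 130)
```

The returned rectangle could then be handed to whatever screen-capture and OCR facilities the platform provides; those calls are deliberately left out of the sketch.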
Step 103: match the obtained Chinese characters against the Chinese characters to be input at the cursor, and display the candidates with a high matching degree at the front of the candidate character list.
Different recognition results are handled differently:
When the recognition result indicates that a punctuation mark precedes the cursor, no matching is performed on it;
When the recognition result indicates that an adjective precedes the cursor, noun and adjective candidates are moved forward and verb candidates are moved back. For example, when entering "lovely user", once "lovely" has been entered and "user" is being typed, the candidate "support" should rank behind "user".
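The part-of-speech handling above can be sketched as a stable re-ranking of the candidate list. The tag names, the `PREFERRED_POS` table and the function name are hypothetical; the adjective rule comes from this step, the adverb rule from the summary of the invention, and the Chinese words are an assumed rendering of the "user" and "support" candidates in the example.

```python
from typing import Dict, List, Optional, Set, Tuple

# Which candidate parts of speech to promote, keyed by the part of speech
# of the text recognized before the cursor.
PREFERRED_POS: Dict[str, Set[str]] = {
    "adj": {"noun", "adj"},   # after an adjective: nouns and adjectives first
    "adv": {"adj", "verb"},   # after an adverb: adjectives and verbs first
}


def rerank_by_pos(
    candidates: List[Tuple[str, str]],
    previous_pos: Optional[str],
) -> List[str]:
    """Reorder (word, pos) candidates according to the preceding text.

    Punctuation ("punct"), unknown tags and None have no entry in
    PREFERRED_POS, so the original order is returned unchanged.
    """
    preferred = PREFERRED_POS.get(previous_pos or "", set())
    # sorted() is stable, so candidates keep their relative order
    # within the promoted and non-promoted groups.
    ranked = sorted(candidates, key=lambda c: 0 if c[1] in preferred else 1)
    return [word for word, _ in ranked]


# "support" (verb) and "user" (noun), initially with the verb first.
candidates = [("拥护", "verb"), ("用户", "noun")]
print(rerank_by_pos(candidates, "adj"))    # noun promoted after an adjective
print(rerank_by_pos(candidates, "punct"))  # punctuation: order unchanged
```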
The following priorities can be used for matching the preceding text with the text being entered:
A match of two characters with two characters takes priority over a match of one character with one character. For example, in "The flowers bloom luxuriantly", once "fresh flowers" has been entered and the two characters for "bloom" are being typed, matching to "bloom" (n. + v.) is more appropriate, so the match priority of "peanut" should be lower than that of "bloom".
Intelligent matching only needs the three characters before the cursor, so the recognition error rate is low, the matching success rate is relatively high, and very few system resources are used, which greatly improves the user's experience.
Different numbers of Chinese characters have different match priorities. If one Chinese character is to be input, a match between the three characters before the cursor and the character to be input takes priority over a match between the one character before the cursor and the character to be input, and over a match between the two characters before the cursor and the character to be input.
A match between the two characters before the cursor and two characters to be input takes priority over a match between the one character before the cursor and one character to be input.
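One way to read these priority rules is that a candidate supported by a longer stretch of the preceding text should rank higher. The sketch below scores each candidate by the longest suffix of the context (at most three characters) that forms a known phrase with it. The `lexicon` of phrases, the function names and the sample words are hypothetical, and ranking a two-character context above a one-character context for a single candidate character is an assumption: the text only states that three characters outrank both.

```python
from typing import List, Set


def context_score(context: str, candidate: str, lexicon: Set[str],
                  max_context: int = 3) -> int:
    """Length of the longest context suffix that, joined with the
    candidate, is a known phrase; 0 when nothing matches."""
    for n in range(min(max_context, len(context)), 0, -1):
        if context[-n:] + candidate in lexicon:
            return n
    return 0


def rank_by_context(context: str, candidates: List[str], lexicon: Set[str],
                    max_context: int = 3) -> List[str]:
    """Stable sort: candidates matching more preceding characters first."""
    return sorted(
        candidates,
        key=lambda c: -context_score(context, c, lexicon, max_context),
    )


# Assumed rendering of the example: "fresh flowers" precedes the cursor;
# "peanut" matches only the last character, "bloom" matches both.
lexicon = {"花生", "鲜花盛开"}
print(rank_by_context("鲜花", ["生", "盛开"], lexicon))
```

Because the sort is stable, candidates with no contextual match keep the order the input method originally gave them.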
The present invention is not limited to pinyin input methods and is equally applicable to other types of input method.
The above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A method for improving the intelligence of an input method, comprising the following steps:
Performing optical character recognition on the text before the cursor position to obtain the Chinese characters preceding the cursor;
Matching the obtained Chinese characters against the Chinese characters to be input at the cursor, and displaying the candidates with a high matching degree at the front of the candidate character list.
2. The method according to claim 1, characterized in that performing optical character recognition on the text before the cursor position comprises:
Identifying the layout direction of the text: if the layout is vertical, the Chinese characters above the cursor are recognized; if the layout is horizontal, the Chinese characters to the left of the cursor are recognized.
3. The method according to claim 2, characterized in that the layout direction is identified by recognizing several characters above the cursor and several characters to the left of the cursor, and obtaining the center position of each character:
If the centers of the horizontally adjacent characters lie on the same horizontal line, the layout is taken to be horizontal;
If the centers of the vertically adjacent characters lie on the same vertical line, the layout is taken to be vertical.
4. The method according to claim 3, characterized in that the layout direction is identified by recognizing several characters to the left of the cursor and obtaining the center position of each: if all the centers lie on the same horizontal line, the layout is taken to be horizontal; otherwise it is taken to be vertical.
5. The method according to claim 2, characterized in that the layout direction is identified by recognizing several characters above the cursor and several characters to the left of the cursor, and obtaining the distance between characters:
If the horizontal distance between characters is less than the vertical distance between characters, the layout is taken to be horizontal;
If the horizontal distance between characters is greater than the vertical distance between characters, the layout is taken to be vertical.
6. The method according to claim 1, characterized in that the optical character recognition of the text before the cursor position is specifically performed by:
Determining the font and character size at the cursor;
Determining the screenshot size from the current screen resolution and the layout direction of the text, and performing optical character recognition on the screenshot content.
7. The method according to claim 1, characterized in that the three Chinese characters before the cursor are obtained, and if one Chinese character is to be input, a match between the three characters before the cursor and the character to be input takes priority over a match between the one character before the cursor and the character to be input, and over a match between the two characters before the cursor and the character to be input.
8. The method according to claim 1, characterized in that the two Chinese characters before the cursor are obtained, and a match between the two characters before the cursor and two characters to be input takes priority over a match between the one character before the cursor and one character to be input.
9. The method according to claim 1, characterized in that if the recognition result before the cursor is an adjective, the ranking of nouns and adjectives among the candidate characters is raised; and if the recognition result before the cursor is an adverb, the ranking of adjectives and verbs among the candidate characters is raised.
10. The method according to claim 1, characterized in that if the recognition result before the cursor is a punctuation mark, no matching is performed on it.
CNA200810227547XA 2008-11-27 2008-11-27 Method for enhancing intellectualization of input method Pending CN101424979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA200810227547XA CN101424979A (en) 2008-11-27 2008-11-27 Method for enhancing intellectualization of input method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA200810227547XA CN101424979A (en) 2008-11-27 2008-11-27 Method for enhancing intellectualization of input method

Publications (1)

Publication Number Publication Date
CN101424979A true CN101424979A (en) 2009-05-06

Family

ID=40615620

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA200810227547XA Pending CN101424979A (en) 2008-11-27 2008-11-27 Method for enhancing intellectualization of input method

Country Status (1)

Country Link
CN (1) CN101424979A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609108A (en) * 2012-02-15 2012-07-25 张群 Electronic equipment for inputting characters and method for inputting characters into electronic equipment
CN103076962A (en) * 2012-12-27 2013-05-01 华为技术有限公司 Search hint generating method and device
CN103076962B (en) * 2012-12-27 2016-11-23 华为技术有限公司 A kind of Search Hints generates method and apparatus

Similar Documents

Publication Publication Date Title
CN104598577B (en) A kind of extracting method of Web page text
US11341322B2 (en) Table detection in spreadsheet
CN110334346A (en) A kind of information extraction method and device of pdf document
RU2007138952A (en) LIST OF AUTOMATIC FILLING AND HANDWRITING
US20070040707A1 (en) Separation of Components and Characters in Chinese Text Input
EP2110758B1 (en) Searching method based on layout information
CN101354727A (en) Method and apparatus for establishing links between digital document catalog and text
CN104750678A (en) Image text recognizing translation glasses and method
CN102970596A (en) Method and system for realizing multi-language font display of set top box and set top box
CN102194117A (en) Method and device for detecting page direction of document
CN112115111A (en) OCR-based document version management method and system
CN106227808A (en) A kind of method removing mail interference information and method for judging rubbish mail
CN107562480A (en) A kind of POS multi-lingual implementation method and its system
CN104331400B (en) A kind of Mongolian code conversion method and device
CN204537126U (en) A kind of image text identification translation glasses
CN104182966A (en) Automatic splicing method of regular shredded paper
CN102682457A (en) Rearrangement method for performing adaptive screen reading on print media image
CN101424979A (en) Method for enhancing intellectualization of input method
CN102103612A (en) Information extraction method and device
CN105955986A (en) Character converting method and apparatus
CN102033614A (en) Intelligently combined formula input method and system
Prakash et al. Information extraction in unstructured multilingual web documents
CN109308146A (en) A kind of character string adaptivenon-uniform sampling display methods and system based on control property
CN101673406B (en) Method and device for setting font
CN110032999A (en) A kind of low resolution licence plate recognition method that Hanzi structure is degenerated

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20090506