Summary of the invention
According to one aspect, the present invention is a method of binarizing a grayscale text image, comprising the steps of: reducing the resolution of a full-size grayscale image to generate a reduced-resolution grayscale image; performing a first binarization step on the reduced-resolution grayscale image to generate a first binary image; analyzing the first binary image to define a text region; and performing a second binarization step on the text region of the full-size grayscale image to generate a second binary image.
The method may further comprise the step of calculating a skew angle in the first binary image and applying the skew angle to the text region of the full-size grayscale image.
The method may further comprise the step of defining specific text attributes in the second binary image.
The method may further comprise the step of inputting the specific text attributes to an OCR engine.
The step of analyzing the first binary image to define the text region may comprise performing a horizontal or vertical profile analysis of the first binary image.
The step of analyzing the first binary image to define the text region may comprise defining a rectangular box around the text region.
The step of reducing the resolution of the full-size grayscale image to generate the reduced-resolution grayscale image may reduce the size of the full-size grayscale image to 1/16 of its original size, or smaller.
The step of reducing the resolution of the full-size grayscale image to generate the reduced-resolution grayscale image may reduce the size of the full-size grayscale image to 1/14 of its original size, or smaller.
According to another aspect, the present invention is an optical character recognition system comprising: a microprocessor; a read-only memory (ROM) operatively coupled to the microprocessor; a programmable memory operatively coupled to the microprocessor; and a camera module operatively coupled to the microprocessor; wherein the microprocessor is operative to execute code stored in the ROM to reduce the resolution of a full-size grayscale image obtained by the camera module and stored in the programmable memory, thereby generating a reduced-resolution grayscale image, to perform a first binarization step on the reduced-resolution grayscale image to generate a first binary image, to analyze the first binary image to define a text region, and to perform a second binarization step on the text region of the full-size grayscale image to generate a second binary image.
The microprocessor may further be operative to calculate a skew angle in the first binary image and to apply the skew angle to the text region of the full-size grayscale image.
The microprocessor may further be operative to define specific text attributes in the second binary image.
The microprocessor may further be operative to recognize characters in the second binary image.
In this specification, including the claims, the terms "comprises", "comprising" or similar terms are intended to mean a non-exclusive inclusion, such that a method or apparatus that comprises a list of elements does not include those elements solely, but may also include other elements not listed.
Detailed description of the embodiments
Referring to the drawings, in which like reference characters designate like or corresponding elements throughout the several views, Fig. 1 is a schematic flow diagram illustrating an embodiment of the present invention. First, at step S1, a full-size grayscale image 105 of a text is reduced to generate a reduced-resolution grayscale image 110. Then, in a first binarization step S2, the reduced-resolution grayscale image 110 is binarized to generate a first binary image 115. The binarization can be performed using one of numerous standard thresholding techniques known in the art. Next, a deskew step S3, which calculates text skew angles, is performed on the first binary image 115. Then, at step S4, text regions are defined in the reduced-resolution grayscale image 110. Then, at step S5, the text regions and skew angles calculated using the reduced-resolution grayscale image 110 are applied to the full-size grayscale image 105, thereby defining the text regions of the full-size grayscale image 105. At step S6, the text regions of the full-size grayscale image 105 are binarized to generate a second binary image 120. At step S7, the second binary image 120 is analyzed to define specific text attributes, such as individual text lines, words and characters. Finally, the specific text attributes are input to an OCR engine 125 for character recognition.
In an embodiment of the present invention, the full-size grayscale image 105 is an image that may be obtained from various devices, such as a digital photocopier, facsimile machine, printer or digital camera. Step S1 of reducing the resolution of the full-size grayscale image 105 results in the loss of some of the information (i.e., pixels) contained in the full-size grayscale image 105. According to the present invention, however, this loss of information is traded for an increase in processing speed. For example, the reduced-resolution grayscale image 110 may be only 1/16 the size of the full-size grayscale image 105. The time required to binarize the reduced-resolution grayscale image 110 at step S2 should therefore be only 1/16 of the time required to binarize the full-size grayscale image 105. Moreover, the loss of information in the reduced-resolution grayscale image 110 is generally acceptable, because the deskew step S3 and the step S4 of defining text regions do not generally require a high-resolution image in order to achieve an acceptable result.
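The size reduction of step S1 can be illustrated with a minimal Python sketch. It assumes that "1/16 of the size" refers to pixel count and that the reduction is done by averaging each 4 x 4 block of pixels; the text specifies neither, and the function name `downscale` and the test image are hypothetical.

```python
def downscale(image, factor):
    """Reduce resolution by averaging each factor x factor block of pixels.

    `image` is a list of rows of grayscale values (0-255). Rows and columns
    that do not fill a complete block are dropped. Block averaging is an
    assumption; the source does not name a reduction method.
    """
    height = len(image) // factor
    width = len(image[0]) // factor
    reduced = []
    for by in range(height):
        row = []
        for bx in range(width):
            total = 0
            for dy in range(factor):
                for dx in range(factor):
                    total += image[by * factor + dy][bx * factor + dx]
            row.append(total // (factor * factor))
        reduced.append(row)
    return reduced


# Hypothetical 32 x 64 test image; a factor of 4 per axis keeps 1/16 of the pixels.
full = [[(x + y) % 256 for x in range(64)] for y in range(32)]
small = downscale(full, 4)
pixel_ratio = (len(small) * len(small[0])) / (len(full) * len(full[0]))
```

Because a per-pixel threshold visits every pixel once, the 1/16 pixel ratio computed here is what underlies the 1/16 time estimate given for step S2.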
The deskew step S3 can be performed using one of various methods known in the prior art. For example, prior-art deskewing methods are described in U.S. Pat. No. 4,926,490 to Mano, U.S. Pat. No. 5,187,753 to Bloomberg et al., U.S. Pat. No. 5,854,854 to Cullen et al. and U.S. Pat. No. 6,373,997 to Wayner et al., all of which are incorporated herein by reference.
The step S4 of defining text regions can likewise be performed using one of various methods known in the prior art, such as vertical and horizontal profile methods. Examples of prior-art methods of defining text regions are disclosed in U.S. Pat. No. 4,562,594 to Bednar et al., U.S. Pat. No. 4,776,024 to Katoh et al. and U.S. Pat. No. 5,172,422 to Tan, all of which are incorporated herein by reference.
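A horizontal and vertical profile analysis of the kind mentioned for step S4 might be sketched as follows. This is an illustrative simplification and not the method of any of the cited patents: it assumes a single text region, uses the convention that 0 marks a text pixel and 1 marks background, and the function name `text_bounding_box` is hypothetical.

```python
def text_bounding_box(binary):
    """Return (top, left, bottom, right), inclusive, enclosing all text pixels.

    `binary` is a list of rows in which 0 marks a text pixel and 1 marks
    background. Returns None when the image contains no text pixel.
    """
    # Horizontal profile: number of text pixels in each row.
    row_profile = [sum(1 for v in row if v == 0) for row in binary]
    # Vertical profile: number of text pixels in each column.
    col_profile = [
        sum(1 for row in binary if row[x] == 0) for x in range(len(binary[0]))
    ]
    rows = [y for y, count in enumerate(row_profile) if count > 0]
    cols = [x for x, count in enumerate(col_profile) if count > 0]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]


# Hypothetical 4 x 6 first binary image with a small cluster of text pixels.
binary = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
box = text_bounding_box(binary)
```

The returned tuple corresponds to a rectangular box around the text region. A practical implementation would typically also split the profiles at empty gaps to separate multiple regions, which this sketch does not attempt.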
Step S5 applies the skew angles and text regions calculated using the reduced-resolution grayscale image 110 to the full-size grayscale image 105, thereby defining the text regions of the full-size grayscale image 105. Depending on the amount of text in the full-size grayscale image 105, the area of the text regions may be, for example, only 1/4 of the total area of the full-size grayscale image 105, or less.
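The mapping of a text region from the reduced-resolution image back to the full-size image in step S5 can be sketched as a simple coordinate scaling. The sketch below assumes the same integer reduction factor on both axes and omits the skew correction entirely; the names `scale_box` and `area_fraction` and the example dimensions are hypothetical.

```python
def scale_box(box, factor, full_height, full_width):
    """Map an inclusive (top, left, bottom, right) box to full-size coordinates.

    Each reduced-resolution pixel is assumed to cover a factor x factor block
    of full-size pixels, so the far edges extend to the end of the last block.
    """
    top, left, bottom, right = box
    full_top = top * factor
    full_left = left * factor
    full_bottom = min((bottom + 1) * factor - 1, full_height - 1)
    full_right = min((right + 1) * factor - 1, full_width - 1)
    return full_top, full_left, full_bottom, full_right


def area_fraction(box, full_height, full_width):
    """Fraction of the full-size image area covered by the box."""
    top, left, bottom, right = box
    box_area = (bottom - top + 1) * (right - left + 1)
    return box_area / (full_height * full_width)


# A box found in a 4 x 6 reduced image, mapped to a 16 x 24 full-size image.
full_box = scale_box((1, 1, 2, 3), 4, 16, 24)
fraction = area_fraction(full_box, 16, 24)
```

In this example the text region covers one quarter of the full-size image, which matches the 1/4 figure used as an example in the text.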
The second binarization step S6 then binarizes only the text regions of the full-size grayscale image 105. Thus, following the above example in which the area of the text regions is less than 1/4 of the total area of the full-size grayscale image 105, the time required to perform the second binarization step S6 should be less than 1/4 of the time required to binarize the entire full-size grayscale image 105.
At step S7, the binarized text regions obtained at step S6 are analyzed to define specific text attributes, such as lines, words or individual characters. For example, a rectangular box can be placed around each individual character so that the OCR engine 125 can recognize individual characters efficiently.
The binarization of the full-size grayscale image 105 and of the reduced-resolution grayscale image 110 can be performed using any of the various binarization methods well known in the prior art. Examples of such prior-art binarization methods are disclosed in U.S. Pat. No. 5,956,421 to Tanaka et al. and U.S. Pat. No. 6,438,265 to Heiper et al., both of which are incorporated herein by reference.
Referring to Fig. 2, a grid is shown that illustrates another binarization method that can be used to perform the binarization steps of the present invention. Each square in the grid of Fig. 2 represents one grayscale pixel. Numbered pixels 410 are shown surrounding a target pixel, which is represented by an asterisk at the center of Fig. 2. The target pixel represents a grayscale pixel that is to be binarized according to the method of the present invention. An algorithm is used to evaluate the grayscale values of the numbered pixels 410 in four directions around the target pixel. In each of the four directions i (where i = 0, 1, 2, 3), the average grayscale value of the numbered pixels 410 is calculated. A maximum average grayscale value iMax and a minimum average grayscale value iMin are then determined from these four averages. If the grayscale value of the target pixel is closer to iMax than to iMin, the target pixel is assigned a value of, for example, one (1) to represent a background pixel. Otherwise, if the grayscale value of the target pixel is closer to iMin than to iMax, the target pixel is assigned a value of, for example, zero (0) to represent a text pixel.
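The four-direction method described above can be illustrated with the following Python sketch. Because Fig. 2 is not reproduced here, the geometry is an assumption: the four directions are taken to be the horizontal, vertical and two diagonal lines through the target pixel, with up to `reach` pixels sampled on each side. The tie-breaking rule (a target equally close to iMax and iMin is treated as background) is also an assumption, as are the names `DIRECTIONS`, `binarize_pixel` and `reach`.

```python
# Assumed geometry: four lines through the target pixel, given as (dy, dx)
# steps: horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def binarize_pixel(image, y, x, reach=2):
    """Return 1 (background) or 0 (text) for the pixel at row y, column x.

    `image` is a list of rows of grayscale values, with larger values assumed
    to be brighter. Neighbors falling outside the image are ignored.
    """
    height, width = len(image), len(image[0])
    averages = []
    for dy, dx in DIRECTIONS:
        values = []
        for step in range(-reach, reach + 1):
            if step == 0:
                continue  # the target pixel itself is not averaged
            ny, nx = y + step * dy, x + step * dx
            if 0 <= ny < height and 0 <= nx < width:
                values.append(image[ny][nx])
        if values:
            averages.append(sum(values) / len(values))
    if not averages:
        return 1  # assumption: a pixel with no neighbors is background
    i_max, i_min = max(averages), min(averages)
    target = image[y][x]
    # Closer to iMax: background (1). Closer to iMin: text (0).
    # Ties are resolved as background, which is an assumption.
    return 1 if abs(target - i_max) <= abs(target - i_min) else 0


# Hypothetical 5 x 5 image: bright background (200) with a dark vertical
# stroke (20) in the middle column.
image = [[20 if col == 2 else 200 for col in range(5)] for _ in range(5)]
result_row = [binarize_pixel(image, 2, col) for col in range(5)]
```

Comparing the target against directional averages rather than a single global threshold lets the decision adapt to local contrast, which is presumably why a stroke is detected here: the average along the stroke is dark while the averages across it are bright.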
Other kinds of binarization algorithms can also be used within the scope of the present invention, according to the needs of a particular system. For example, the grayscale values of neighboring pixels located on the border of a rectangle centered on a target pixel can be evaluated to determine whether the target pixel should be identified as foreground text or as background.
Referring to Fig. 3, there is illustrated an electronic system or device that can be used to implement the above method of the present invention, in the form of, for example, a mobile telephone 301. The telephone 301 includes a radio frequency communications unit 302 coupled to be in communication with a processor 303. An input interface in the form of a screen 305 and a keyboard 306 is also coupled to be in communication with the processor 303. There is also an alarm 315 for providing an alert signal when the mobile telephone 301 receives an incoming call or message.
The processor 303 includes an encoder/decoder 311 with an associated read-only memory (ROM) 312 that stores data for encoding and decoding voice or other signals that may be transmitted or received by the mobile telephone 301. The processor 303 also includes a microprocessor 313 that is coupled, by a common data and address bus 317, to the encoder/decoder 311, a read-only memory (ROM) 314, a camera module 319, a random access memory (RAM) 304, a static programmable memory 316 and a removable SIM module 318. The static programmable memory 316 and the SIM module 318 can each store, among other information, a user's address book files, text messages, selected grayscale images obtained by the camera module 319 and the associated binarized files.
The radio frequency communications unit 302 is a combined receiver and transmitter having a common antenna 307. The communications unit 302 has a transceiver 308 coupled to the antenna 307 via a radio frequency amplifier 309. The transceiver 308 is also coupled to a combined modulator/demodulator 310, which couples the communications unit 302 to the processor 303.
The microprocessor 313 has a plurality of interfaces for coupling to, for example, the alarm 315, the keyboard 306, the screen 305 and the camera module 319. The ROM 314 stores code for processing grayscale images obtained by the camera module 319, such as the images containing text characters described above.
Thus, a user of the telephone 301 can capture a full-size grayscale image 105 of a document containing text, such as an image of a business card, and the telephone 301 will store the full-size grayscale image 105 in the RAM 304, the static programmable memory 316 and/or the removable SIM module 318. The user of the telephone 301 can then issue a command, for example using the keyboard 306, requesting that all of the text appearing in the full-size grayscale image 105 be recognized and sent to the user's address book.
The command to recognize the text in the full-size grayscale image 105 is processed by the microprocessor 313. Using code stored in the ROM 312, the microprocessor 313 then executes the method of the present invention described above to binarize the text regions in the full-size grayscale image 105, and then defines specific text attributes from the second binary image 120. The OCR engine 125, which can be embedded together with its associated code in the microprocessor 313 and the ROM 312, then performs character recognition and sends the resulting recognized text to, for example, the user's address book.
The present invention is therefore an effective system and method for improving the efficiency and speed of text image binarization. The improvements in efficiency and speed are obtained by processing only the minimum number of pixels needed in a series of binarization steps. The deskew step S3 and the text region detection step S4 are performed on the reduced-resolution grayscale image 110. Then only the specific text regions of the original full-size grayscale image 105 are binarized and input to the OCR engine 125.
It should be understood that the above description is intended to be illustrative and not restrictive. Although the present invention has been fully described in connection with its preferred embodiments and with reference to the accompanying drawings, it should be noted that various changes and modifications will be apparent to those skilled in the art. Such changes and modifications are to be understood as being included within the scope of the present invention as defined by the claims.