CN1987894A

CN1987894A - Adaptive binarization method, device and storage medium for documents

Info

Publication number: CN1987894A
Application number: CN 200510138132
Authority: CN
Inventors: 曾旭; 李献; 肖其林
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-12-22
Filing date: 2005-12-22
Publication date: 2007-06-27
Anticipated expiration: 2025-12-22
Also published as: CN100561504C

Abstract

The invention relates to a method, a device, and a storage medium for adaptive binarization of documents. The method for adaptively binarizing a gray-level document image comprises: a dividing step of dividing the gray-level document image into blocks; a first determining step of determining background blocks and text blocks among the divided blocks according to features of the blocks; a second determining step of determining the background pixels among the pixels included in each text block determined in the first determining step; a first calculating step of calculating a block threshold surface representing the threshold of each block, wherein the threshold of a background block determined in the first determining step is calculated from all the pixels included in it, and the threshold of a text block determined in the first determining step is calculated from the background pixels included in it, as determined in the second determining step; and a binarizing step of binarizing the gray-level document image.

Description

Adaptive binarization method, device, and storage medium for documents

Technical field

The present invention relates generally to document image processing, and in particular to optical character recognition (OCR). More specifically, it relates to a method, a device, and a storage medium for adaptive binarization of documents.

Background art

Binarization of the document image is the first step in a document image analysis system such as an optical character recognition system. The output of the thresholding operation is a binary image in which one state represents the foreground objects, that is, the printed text, and the complementary state corresponds to the background. Binarization methods fall into two classes, global and local thresholding techniques; see Ø. D. Trier and A. K. Jain, Goal-directed evaluation of binarization methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(12): 1191-1201, 1995; Yibing Yang and Hong Yan, An adaptive logical method for binarization of degraded document images, Pattern Recognition, 33(5): 787-807, 2000; and Xiangyun Ye, Mohamed Cheriet, and Ching Y. Suen, Stroke-model-based character extraction from gray-level document images, IEEE Transactions on Image Processing, 10(8): 1152-1161, 2001.

In many cases, the gray levels of background pixels differ considerably from those of object pixels. Thresholding is then a simple and effective tool for separating objects from the background. In these cases, the simplest and earliest approach is global thresholding. The most common global thresholding techniques, such as the Otsu method and the Kapur method (see Ø. D. Trier and A. K. Jain, Goal-directed evaluation of binarization methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(12): 1191-1201, 1995), are based on histogram analysis. The criterion for determining the threshold is to best separate the gray levels corresponding to the peaks of the histogram, where the peaks correspond to the pixels of different parts of the image, such as the background or the objects.

Various factors adversely affect the quality of the binary image. These factors include:

1. variation of the background intensity caused by uneven illumination and improper storage;

2. very low local contrast caused by smear, smudges, and shadows during capture of the document image;

3. non-stationary and signal-dependent noise;

4. variation of contrast between the foreground and background regions of the image;

5. bleed-through of text from the reverse side of the document; and

6. low resolution of the gray-level image.

All these variables make the binarization process very complicated. Selecting the "best" binarization method is very difficult, and OCR methods (and other image processing methods) are correspondingly less effective.

An ideal adaptive thresholding method, applied to an unevenly illuminated page, should produce the same result as a global thresholding method applied to a perfectly evenly illuminated page. The brightness of each pixel is normalized to compensate for more or less illumination, and only then is the pixel judged to be black or white. The problem is then how to determine the background illumination at each point.

Fig. 1 is roughly based on this description. It shows a technique that computes a threshold that varies over the whole image according to the background illumination.

First, the technique divides the image into smaller blocks (step S102) and then calculates a threshold for each block (step S104) in one of several ways, for example:

* Compute the histogram of each block, and select a threshold for each block from the peaks of its histogram;

* Compute the threshold from the positions of large gradient, which are extracted using the local maxima of the pixel gradient;

* Obtain the threshold by interpolation from the regions identified as character boundaries, where the character boundaries are extracted with a second-derivative method.

Then a threshold is assigned to each point in the image by interpolating between the values chosen for the blocks (step S106). Finally, the image is binarized using the resulting thresholds (step S108). The result is much better than that of global thresholding.

The prior-art literature on binarization concentrates mainly on particular special cases and has the following shortcomings:

1. global thresholding techniques suit high-quality document images but do not necessarily work well on poor-quality documents;

2. local thresholding techniques do not work well on small images, because the image is first subsampled into smaller blocks and then enlarged again, and border padding can seriously affect the final result;

3. some local-average thresholding techniques, such as the Niblack method, often amplify noise and tend to misclassify large background areas as text;

4. these techniques often require edge detection, interval elimination, and/or post-processing to remove "ghost" objects;

5. local thresholding techniques require reading the image several times, are very slow, and are not suitable for high-quality images.

In addition, when looking for a better method, it should be kept in mind that the advantage of thresholding lies in its simplicity of computation and implementation. If it is made too complicated for the sake of better results, segmentation methods can surpass it in accuracy.

Summary of the invention

Therefore, an object of the present invention is to provide a method and a device for efficient adaptive binarization.

To achieve this object, the inventors propose a single, simple pre-processing step that avoids the problems other local adaptive thresholding methods have with small images or high-quality images, and that implicitly converts the gray-level pixel thresholding problem into a block feature classification problem. The inventors propose and have verified that various common text features are all suitable for block feature extraction, and that the choice has only a negligible influence on the final binarization result of the present scheme. The inventors likewise propose and have verified that the block feature classification problem is of minor importance in this scheme, and that a common global thresholding method is sufficient for block feature classification.

For the block threshold assignment step, the inventors propose and have verified that, in this binarization scheme, a local average obtained with a second-derivative method yields a clear binarization result even for poor-quality images, whereas other similar methods do not achieve results as good. This is especially important for digital camera OCR.

The inventors have also designed a new adaptive offset value which, in this binarization scheme, helps in processing sparse text images.

Specifically, a method for adaptive binarization of a gray-level document image is provided, comprising: a dividing step of dividing the gray-level document image into blocks; a first determining step of determining background blocks and text blocks among the divided blocks according to features of the blocks; a second determining step of determining the background pixels among the pixels included in each text block determined in the first determining step; a first calculating step of calculating a block threshold surface representing the threshold of each block, wherein the threshold of a background block determined in the first determining step is calculated from all the pixels included in that background block, and the threshold of a text block determined in the first determining step is calculated from the background pixels included in that text block, the background pixels being determined in the second determining step; and a binarizing step of binarizing the gray-level document image using the block threshold surface calculated in the first calculating step.

The present invention also provides a device for adaptive binarization of a gray-level document image, comprising: dividing means for dividing the gray-level document image into blocks; first determining means for determining background blocks and text blocks among the divided blocks according to features of the blocks; second determining means for determining the background pixels among the pixels included in each text block determined by the first determining means; first calculating means for calculating a block threshold surface representing the threshold of each block, wherein the threshold of a background block determined by the first determining means is calculated from all the pixels included in that background block, and the threshold of a text block determined by the first determining means is calculated from the background pixels included in that text block, the background pixels being determined by the second determining means; and binarizing means for binarizing the gray-level document image using the block threshold surface calculated by the first calculating means.

A storage medium is also provided, characterized in that program code for implementing the above method is stored therein.

The present invention has one or more of the following advantages:

1. the solution balances high speed and high reliability in processing document images;

2. it works well on low-quality images, such as digital camera images, and on sparse text images containing large background areas;

3. it works very well on various types of characters, such as East Asian characters, and produces sharp results;

4. its computational cost is kept as small as possible, so simple images incur no extra computation.

Description of drawings

Other objects, features, and advantages of the present invention will become clearer after reading the following detailed description of the preferred embodiments. The accompanying drawings, which form part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain its principles. In the drawings:

Fig. 1 is a flowchart of a general local adaptive method;

Fig. 2 is a flowchart of the overall process of the present invention;

Fig. 3 is a sample of a low-quality document image;

Fig. 4 shows the result of applying the present invention to the sample image of Fig. 3, together with the global thresholding result of the Otsu method;

Fig. 5 is a block diagram of an example of a computer system in which the method and device of the present invention can be implemented;

Fig. 6 is a flowchart of a sub-process of Fig. 2, showing the details of calculating the block threshold surface;

Fig. 7 illustrates the effect of the processing of Fig. 6;

Fig. 8 is a flowchart of a sub-process of Fig. 2, showing the details of calculating the threshold surface;

Fig. 9 illustrates the effect of the processing of Fig. 8;

Fig. 10 is a schematic diagram of the entire process of the present invention;

Fig. 11 is a block diagram of the device of the present invention;

Fig. 12 is a block diagram of the block threshold generator shown in Fig. 11.

Embodiments

Example computer system

The method and device of the present invention can be implemented in any information processing apparatus, for example a personal computer (PC), a notebook computer, a digital camera, or a single-chip microcomputer (SCM) embedded in a scanner, copier, facsimile machine, or the like. A person of ordinary skill in the art can readily implement the method and device of the present invention in software, hardware, and/or firmware. It should be noted in particular that, as is evident to a person of ordinary skill in the art, carrying out any step or combination of steps of the method, or any component or combination of components of the device, may require input/output devices, storage devices, and a microprocessor such as a CPU. These devices are not necessarily mentioned in the description below, but they are in fact used.

As an example of such an information processing apparatus, the block diagram of Fig. 5 shows a computer system in which the method and device of the present invention can be implemented. The computer system shown in Fig. 5 is for illustration only and is not intended to limit the scope of the invention.

From the hardware point of view, the computer 1 comprises a CPU 6, a hard disk (HD) 5, a RAM 7, a ROM 8, and an input/output device 12. The input/output device may include input devices such as a keyboard, touchpad, trackball, and mouse; output devices such as a printer and monitor; and input/output units such as a floppy disk drive, optical disk drive, and communication ports.

From the software point of view, the computer mainly comprises an operating system (OS) 9, input/output drivers 11, and various application programs 10. Any commercially available operating system can be used, such as the Windows series (Windows is a trademark owned by Microsoft) or a Linux-based operating system. The input/output drivers drive the respective input/output devices. The application programs can be any application, such as a word processor or an image processing program, including any existing program that can be used in the present invention and application programs written specifically for the present invention that can call such existing programs.

Thus, in the present invention, the method and device can be implemented in the hardware of the computer by means of the operating system, the application programs, and the input/output drivers.

In addition, the computer 1 can be connected to a digital device 3 and an application device 2. The digital device serves as an image source and can be a camera, a scanner, or a digitizer for converting an analog image into a digital image. The results obtained by the device and method of the present invention are output to the application device 2, which performs suitable operations based on those results. The application device may also be implemented as another application program (in combination with hardware) in the computer 1 for further processing of the image.

Method and device for adaptive binarization of a document image

Briefly, the invention provides a computer-implemented method and device for adaptive binarization of a document image. Fig. 2 is the main flowchart of the present invention, illustrating an embodiment of the new binarization method as carried out by an embodiment of the device of the present invention shown in Fig. 11. The schematic diagram of Fig. 10 illustrates the main steps of an embodiment of the method (without the global thresholding branch).

The device of the present invention comprises: a block feature extractor 1104 for dividing the gray-level document image into blocks and calculating a block feature image, each pixel of which represents the feature of one block; a block mask generator 1106 for calculating a block mask by thresholding the block feature image, the block mask distinguishing background blocks from text blocks; a block threshold generator 1110 for calculating a block threshold surface based on the block mask and the gray-level document image; a first expander 1108 and a second expander 1112 for expanding, by interpolation, the block mask and the block threshold surface, respectively, to match the gray-level document image; an offset calculator 1116 for calculating an offset value based on the expanded block threshold surface and the expanded block mask; an image threshold generator 1114 for generating the final threshold surface by adjusting the expanded block threshold surface with the offset value; and a binarizer 1120 for binarizing the gray-level document image with the adjusted expanded block threshold surface.

In a preferred embodiment, the device may further comprise a quality evaluator 1102 for judging the image quality of the gray-level document image, and a global thresholder 1118 for globally thresholding the gray-level document image when the quality evaluator judges the image quality to be high. In that case, the block feature extractor, the block mask generator, the block threshold generator, the first and second expanders, the offset calculator, and the image threshold generator are configured to operate when the quality evaluator judges the image quality not to be high, and the binarizer 1120 is configured as follows: if the quality evaluator judges the image quality to be high, it binarizes the gray-level document image with the threshold obtained by the global thresholder; if the quality evaluator judges the image quality not to be high, it binarizes the gray-level document image with the adjusted expanded block threshold surface.

A sample document and its binarization result are given in Fig. 3 and Fig. 4.

In step 201, the input gray-level image I (for example W × H pixels, see Fig. 10) is first processed by the quality evaluator 1102. The quality evaluator 1102 may use the Otsu method, because this method can evaluate the image quality (see Xiangyun Ye, Mohamed Cheriet, and Ching Y. Suen, Stroke-model-based character extraction from gray-level document images, IEEE Transactions on Image Processing, 10(8): 1152-1161, 2001) and at the same time threshold the image globally, and it is very fast to run. In the Otsu method, the class separability measure is the ratio of the between-class variance to the total variance:

$$
\eta = \frac{\sigma_B^2}{\sigma_T^2}, \tag{1}
$$

The optimal threshold t* is determined as:

$$
t^{*} = \arg\max_{t \in \{0, 1, \ldots, l-1\}} \sigma_B^2, \tag{2}
$$

where:

$$
\begin{aligned}
\sigma_B^2 &= \frac{[\mu_T\,\omega(t) - \mu(t)]^2}{\omega(t)\,[1 - \omega(t)]}, \\
\sigma_T^2 &= \sum_{i=0}^{l-1} (i - \mu_T)^2\, p_i, \\
\mu_T &= \sum_{i=0}^{l-1} i\, p_i, \\
\mu(t) &= \sum_{i=0}^{t} i\, p_i, \\
\omega(t) &= \sum_{i=0}^{t} p_i.
\end{aligned} \tag{3}
$$

Here, l is the total number of gray levels in the image, and p_i is the probability of occurrence of gray level i. The maximum value of η, denoted η*, can serve as a measure of image quality. With regard to class separability, the threshold η₁ = 0.83 is chosen empirically (step 202). If η* ≥ η₁, the image is classified as a high-quality image and can be thresholded globally by the global thresholder 1118 using t* or another global binarization method (step 208). The binarizer 1120 can then generate the binary image from the global threshold so obtained (step 209).

If η* < η₁, the image is classified as a low-quality image and is processed by the more complex local adaptive thresholding method implemented by the remaining components shown in Fig. 10.
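As an illustration only, the Otsu quantities of equations (1) to (3) and the quality test of step 202 might be sketched in Python as follows. The function names `otsu_quality` and `is_high_quality`, and the choice of a plain list of per-level pixel counts as input, are assumptions made for this sketch and are not part of the described device.

```python
def otsu_quality(hist):
    """Return (t_star, eta_star) for a non-empty gray-level histogram.

    hist[i] is the number of pixels with gray level i (assumed input format).
    """
    total = sum(hist)
    levels = len(hist)
    # Global mean mu_T and total variance sigma_T^2 of equation (3).
    mu_T = sum(i * c for i, c in enumerate(hist)) / total
    sigma_T2 = sum((i - mu_T) ** 2 * c for i, c in enumerate(hist)) / total

    t_star, best_sigma_B2 = 0, 0.0
    seen = 0        # number of pixels with gray level <= t
    weighted = 0    # sum of gray levels of those pixels
    for t in range(levels):
        seen += hist[t]
        weighted += t * hist[t]
        if seen == 0 or seen == total:
            continue  # one of the two classes is empty
        omega = seen / total      # omega(t)
        mu = weighted / total     # mu(t)
        sigma_B2 = (mu_T * omega - mu) ** 2 / (omega * (1.0 - omega))
        if sigma_B2 > best_sigma_B2:
            t_star, best_sigma_B2 = t, sigma_B2

    # eta* is the maximum of sigma_B^2 / sigma_T^2 over t, equation (1).
    eta_star = best_sigma_B2 / sigma_T2 if sigma_T2 > 0 else 0.0
    return t_star, eta_star


def is_high_quality(hist, eta_1=0.83):
    """Step 202: compare eta* with the empirical threshold 0.83."""
    return otsu_quality(hist)[1] >= eta_1
```

A strongly bimodal histogram gives a value of η* close to 1 and takes the global branch, while a flat histogram gives a lower value and is routed to the local adaptive branch.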

In step 203, the block feature extractor 1104 divides the image into non-overlapping square blocks of size wb × wb (wb is chosen empirically and may be, for example, 16). The block feature extractor 1104 extracts block features to distinguish text blocks from background blocks. Any commonly used text feature can be used, for example the block variance, the transient difference (see J. Sauvola and M. Pietikäinen, Adaptive document image binarization, Pattern Recognition, 33: 225-236, 2000), the five other text features listed in the article Finding text regions using localized measures (Paul Clark and Majid Mirmehdi, Proceedings of the 11th British Machine Vision Conference, pages 675-684, 2000), or a combination of these.

The block variance feature is used here as an example. On the assumption that pixels in a block containing text have higher variance than those in background regions, a variance image V is computed from adjacent blocks of wb × wb pixels. V contains the block variance V_b(x, y) of each block. That is, the original image I of W × H pixels is reduced to a block feature image V of W/wb × H/wb pixels, in which each pixel corresponds to one block b of wb × wb pixels and has a value equal to the variance of the pixels in that block.

In one embodiment of the present invention, the variance V_b of each block of the gray-level image can be calculated by the following formula:

$$
V_b(x, y) = \frac{\sum_{i=1}^{n} (\bar{I}_b - I_{bi})^2}{n}, \tag{4}
$$

where n is the number of pixels in each block, I_{bi} is the value of pixel i in block b, and \bar{I}_b is the value for block b in the block-mean image \bar{I}, that is, the mean of the pixels in block b.

In a preferred embodiment, the variance of each block can be determined by the following formula:

$$
V_b(x, y) = \frac{\sum_{i=1}^{n} I_{bi}^2}{n} - \bar{I}_b^2. \tag{5}
$$
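Formula (5) follows from formula (4) by expanding the square. The short derivation below makes this explicit, assuming (as the surrounding text indicates) that \bar{I}_b is the mean of the n pixel values I_{bi} of block b:

$$
\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n} (\bar{I}_b - I_{bi})^2
&= \bar{I}_b^2 - 2\,\bar{I}_b \cdot \frac{1}{n}\sum_{i=1}^{n} I_{bi} + \frac{1}{n}\sum_{i=1}^{n} I_{bi}^2 \\
&= \bar{I}_b^2 - 2\,\bar{I}_b^2 + \frac{1}{n}\sum_{i=1}^{n} I_{bi}^2 \\
&= \frac{1}{n}\sum_{i=1}^{n} I_{bi}^2 - \bar{I}_b^2 .
\end{aligned}
$$

The second form needs only the running sums of I_{bi} and I_{bi}^2, so each block can be processed in a single pass over its pixels.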

From the features extracted in this way by the block feature extractor, the block feature image V is obtained (see Fig. 10). That is, each pixel of the block feature image V is the feature (such as the variance) of one block of the original image I.
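A minimal sketch of the block variance feature of step 203 is given below, using the single-pass form of formula (5). The function name `block_variance_image`, the representation of the image as a list of rows, and the decision to ignore any partial blocks at the right and bottom edges are assumptions of this sketch.

```python
def block_variance_image(image, wb=16):
    """Reduce a gray-level image (list of rows) to the block variance image V.

    Each entry of V is the variance of one non-overlapping wb x wb block.
    Partial blocks at the right and bottom edges are ignored (assumption).
    """
    height, width = len(image), len(image[0])
    n = wb * wb
    V = []
    for by in range(0, height - wb + 1, wb):
        row = []
        for bx in range(0, width - wb + 1, wb):
            s = 0    # sum of pixel values in the block
            s2 = 0   # sum of squared pixel values in the block
            for y in range(by, by + wb):
                for x in range(bx, bx + wb):
                    v = image[y][x]
                    s += v
                    s2 += v * v
            mean = s / n
            row.append(s2 / n - mean * mean)  # formula (5)
        V.append(row)
    return V
```

For a W × H input this returns a list of H // wb rows with W // wb entries each, matching the reduced size of V described above.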

The block mask generator 1106 then calculates the block mask image from the block feature image V (step 204). That is, the block feature image V is thresholded, for example with the Otsu method. Other common thresholding methods can be used in place of the Otsu method. The key point is that gray-level pixel classification is converted into block feature classification, which greatly reduces complexity and computation. Experience shows that global binarization is sufficient for thresholding the block feature image. Local adaptive thresholding or other more complex methods could also be used, but their results differ only slightly from those of the global method, so they are unnecessary.

This yields the block mask image Vmask, a binary image that distinguishes text blocks from background blocks. For example, as shown in Fig. 10, "1" denotes a text block (the ring-shaped shaded part in a text block represents a stroke) and "0" denotes a background block.

Next, the block threshold generator 1110 obtains the block threshold surface Bt 604 (of the same size as V and Vmask) based on Vmask (step 205). Fig. 6 shows the details of this step, and Fig. 12 shows the details of the block threshold generator 1110. Here, for each pixel (or point) (x, y), V(x, y), Vmask(x, y), and Bt(x, y) correspond to the same non-overlapping block b in the original gray-level image I.

In step 601, if Vmask(x, y) has been marked as a background block by the block mask generator 1106 ("0" in Vmask, see Fig. 10), Bt(x, y) is assigned the mean of all pixels in the background block (step 602); this value can be obtained by the averaging device 1202. If Vmask(x, y) has been marked as a text block by the block mask generator 1106 ("1" in Vmask, see Fig. 10), Bt(x, y) is assigned the mean of the background pixels (for example, the white pixels) in the text block b (step 603); this value can likewise be obtained by the averaging device 1202. The background pixels in b are marked by the edge detector 1204. The edge detector 1204 may use a second-derivative method, which is an effective way of detecting whether a pixel lies on the dark side or the bright side of an edge. Here, the method is used to detect and mark the inner spaces between character strokes. In one embodiment of the present invention, the edge detector 1204 can be configured to use the LOG operator. The LOG (Laplacian of Gaussian) operator is a commonly used second-derivative operator. Applying it yields information on whether a given pixel lies on the dark side or the bright side of an edge. This is done by convolving the LOG operator with the block, so that each pixel obtains a convolved value. An example of a LOG operator, a 9 × 9 template, is given below:

0	-1	-1	-2	-2	-2	-1	-1	0
-1	-2	-4	-5	-5	-5	-4	-2	-1
-1	-4	-5	-3	0	-3	-5	-4	-1
-2	-5	-3	12	24	12	-3	-5	-2
-2	-5	0	24	40	24	0	-5	-2
-2	-5	-3	12	24	12	-3	-5	-2
-1	-4	-5	-3	0	-3	-5	-4	-1
-1	-2	-4	-5	-5	-5	-4	-2	-1
0	-1	-1	-2	-2	-2	-1	-1	0

A small trick is used for the convolution. Since the position of the block in the original image is known, and the position of each pixel in the block is known, the position of each pixel in the original image is also known. The LOG operator is therefore convolved with the original image, centered on each pixel of the block, so that no boundary effect arises (convolving the LOG operator directly with the isolated block would produce a boundary effect).

As a result of the convolution, each pixel is assigned a value. If the value is positive, the pixel is marked as a bright pixel (background). If the value is negative, the pixel is marked as a dark pixel (character stroke). In this way the background and the text within the block can be separated.
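The marking of bright and dark pixels can be sketched as follows, using the 9 × 9 LOG template given in the text and evaluating it on the full image rather than on the isolated block. The function names, the replication of edge pixels at the true image border, and the treatment of a response of exactly zero (flat areas) as background are assumptions of this sketch; the text only specifies the positive and negative cases.

```python
# 9 x 9 LOG template from the text (symmetric, coefficients sum to zero).
LOG_9X9 = [
    [0, -1, -1, -2, -2, -2, -1, -1, 0],
    [-1, -2, -4, -5, -5, -5, -4, -2, -1],
    [-1, -4, -5, -3, 0, -3, -5, -4, -1],
    [-2, -5, -3, 12, 24, 12, -3, -5, -2],
    [-2, -5, 0, 24, 40, 24, 0, -5, -2],
    [-2, -5, -3, 12, 24, 12, -3, -5, -2],
    [-1, -4, -5, -3, 0, -3, -5, -4, -1],
    [-1, -2, -4, -5, -5, -5, -4, -2, -1],
    [0, -1, -1, -2, -2, -2, -1, -1, 0],
]


def log_response(image, x, y):
    """LOG response at pixel (x, y), computed on the full image.

    The template is symmetric, so correlation and convolution coincide.
    Coordinates outside the image are clamped to the border (assumption).
    """
    height, width = len(image), len(image[0])
    acc = 0
    for ky in range(9):
        yy = min(max(y + ky - 4, 0), height - 1)
        for kx in range(9):
            xx = min(max(x + kx - 4, 0), width - 1)
            acc += LOG_9X9[ky][kx] * image[yy][xx]
    return acc


def mark_background(image, bx, by, wb):
    """Return a wb x wb grid of booleans for the block whose top-left pixel
    is (bx, by): True for bright-side (background) pixels, False for
    dark-side (stroke) pixels. A zero response counts as background here.
    """
    return [[log_response(image, bx + x, by + y) >= 0 for x in range(wb)]
            for y in range(wb)]
```

Because neighbors are read from the whole image, pixels near the block edge see their true surroundings, which is the point of the trick described above.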

The flow shown in Fig. 6 is very important for obtaining a clearer binarization result. If the text block thresholds were merely interpolated from the surrounding background blocks (as in Mauritius Seeger and Christopher R. Dance, Binarising camera images for OCR, ICDAR, Sep 2001), the binarization result would likely be blurred (as shown in Fig. 7(b)) for low-quality images (for example, an image containing East Asian characters captured by a digital camera, as shown in Fig. 7(a)). The proposed local average based on the second-derivative method solves this problem, as shown in Fig. 7(c). Note that the three patches in this figure were all cropped from the corresponding full images; Fig. 7(a) was not thresholded directly.
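The assignment of steps 601 to 603 can be sketched as below. The function name `block_threshold_surface` and the `is_background(x, y)` callback, which stands in for the marking produced by the edge detector, are hypothetical. The fallback to the mean of all pixels when a text block has no pixel marked as background is also an assumption; the text does not cover that case.

```python
def block_threshold_surface(image, vmask, wb, is_background):
    """Compute Bt from the gray-level image and the block mask Vmask.

    vmask[by][bx] is 0 for a background block and 1 for a text block.
    is_background(x, y) reports whether image pixel (x, y) was marked as a
    background pixel (hypothetical callback for the edge detector output).
    """
    bt = []
    for by, mask_row in enumerate(vmask):
        bt_row = []
        for bx, is_text in enumerate(mask_row):
            coords = [(x, y)
                      for y in range(by * wb, (by + 1) * wb)
                      for x in range(bx * wb, (bx + 1) * wb)]
            if is_text:
                # Step 603: use only the background pixels of a text block.
                values = [image[y][x] for x, y in coords if is_background(x, y)]
            else:
                # Step 602: use all pixels of a background block.
                values = [image[y][x] for x, y in coords]
            if not values:
                # No background pixel found in a text block (assumed fallback).
                values = [image[y][x] for x, y in coords]
            bt_row.append(sum(values) / len(values))
        bt.append(bt_row)
    return bt
```

In a text block the stroke pixels are excluded from the average, so Bt stays close to the local paper brightness instead of being pulled down by the ink.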

Then, in step 206, the first expander 1108 and the second expander 1112 expand the block mask Vmask and the block threshold surface Bt to match the resolution of the original image I, that is, they assign the values of the block mask and the block threshold surface to the pixels of the original image I. The resulting images are Bu (the expanded block threshold surface image 801, that is, the preliminary threshold surface of the original image) and Tmask (the expanded block mask image 802). In one embodiment of the present invention, the first expander 1108 uses nearest-neighbor interpolation to expand the binary image Vmask, and the second expander 1112 uses bilinear interpolation to expand Bt. Tmask is a binary image (Tmask(x, y) = 1 denotes a text region and Tmask(x, y) = 0 denotes a background region).
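The two expansions of step 206 might be sketched as follows. The function names are hypothetical, and the bilinear variant assumes that each value of Bt sits at the center of its block and that positions outside the outermost block centers are clamped; the text does not specify these details.

```python
def expand_nearest(vmask, wb):
    """Expand the block mask Vmask to Tmask by nearest-neighbor interpolation:
    every pixel takes the label of the block that contains it."""
    rows, cols = len(vmask), len(vmask[0])
    return [[vmask[y // wb][x // wb] for x in range(cols * wb)]
            for y in range(rows * wb)]


def expand_bilinear(bt, wb):
    """Expand the block threshold surface Bt to Bu by bilinear interpolation
    between block centers (assumed sampling convention)."""
    rows, cols = len(bt), len(bt[0])
    out = []
    for y in range(rows * wb):
        # Position of the pixel center in block coordinates, clamped.
        fy = min(max((y + 0.5) / wb - 0.5, 0.0), rows - 1.0)
        y0 = int(fy)
        y1 = min(y0 + 1, rows - 1)
        ty = fy - y0
        row = []
        for x in range(cols * wb):
            fx = min(max((x + 0.5) / wb - 0.5, 0.0), cols - 1.0)
            x0 = int(fx)
            x1 = min(x0 + 1, cols - 1)
            tx = fx - x0
            top = bt[y0][x0] * (1 - tx) + bt[y0][x1] * tx
            bottom = bt[y1][x0] * (1 - tx) + bt[y1][x1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out
```

Nearest-neighbor expansion keeps Tmask strictly binary, while bilinear expansion gives Bu a smooth transition between neighboring block thresholds.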

Then the offset calculator 1116 uses the expanded block threshold surface Bu generated by the second expander 1112, together with the background and text (foreground) parts of Tmask, to determine a global offset value d, which is the mean distance between foreground and background, as shown in Fig. 8.

As shown in step 207, d is determined by the following formula (step 803):

$$
d = \frac{\sum_{(w, h) \in S} \left( I(w, h) - Bu(w, h) \right)}{|S|}, \tag{6}
$$

where:

$$
S = \{ (w, h) \mid Bu(w, h) > I(w, h) \ \text{and} \ Tmask(w, h) = 1 \}, \tag{7}
$$

and |S| is the size of the set S, that is, the number of its elements.

Here, Tmask(w, h) = 1 means that pixel (w, h) in the expanded block mask Tmask is a foreground pixel, and Bu(w, h) > I(w, h) means that the value of pixel (w, h) in the original image I is less than the preliminary threshold (the threshold before adjustment, that is, the threshold in the expanded block threshold surface Bu). That is, pixel (w, h) in the original image I is preliminarily judged (before adjustment with d) to be a background pixel.

In other words, for each element of the set S (a pixel (w, h)), the block it belongs to has been judged to be a foreground block, but according to the expanded block threshold surface Bu the pixel itself should be a background pixel. This means that the element is a pixel adjacent to true foreground pixels. For example, Fig. 7(a) can be regarded as one block that was judged to be a foreground block in the block mask calculation step; its true foreground pixels are the pixels in the strokes, the other pixels are actually background pixels, and only these other pixels of the block constitute the set S. The process carried out by the offset calculator 1116, shown in Fig. 8, is very important for removing ghost pixels in sparse text images (as shown in Fig. 9). Without this second partition in calculating S, d would be too small, as in Mauritius Seeger and Christopher R. Dance, Binarising camera images for OCR, ICDAR, Sep 2001, and the binarization result would degrade severely, as shown in Fig. 9(b).
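A sketch of the offset calculation over the set S is given below. The function name `global_offset` is hypothetical. Note one deliberate assumption: the sketch averages Bu − I, so that d comes out as a positive distance as the prose describes, whereas formula (6) as printed averages I − Bu, which has the opposite sign on S. Returning 0 for an empty S is also an assumption.

```python
def global_offset(image, bu, tmask):
    """Mean distance d between Bu and I over the set S.

    S contains the pixels that lie in an expanded text block (Tmask == 1)
    and whose gray level is below the preliminary threshold (Bu > I).
    The sign convention (Bu - I, hence d >= 0) is an assumption.
    """
    total = 0.0
    count = 0
    for row_i, row_bu, row_mask in zip(image, bu, tmask):
        for i, b, m in zip(row_i, row_bu, row_mask):
            if m == 1 and b > i:
                total += b - i
                count += 1
    return total / count if count else 0.0
```

Restricting the average to text blocks keeps large empty background areas from shrinking d, which is the effect the text attributes to the second partition.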

Finally, as shown in step 804, image threshold generator 1114 performs the following calculation to obtain the final threshold surface T(w, h):

T(w, h) = Bu(w, h) - q d \qquad (8)

where (w, h) are the coordinates of each pixel in the original image.

Here, q is an empirical constant that may range from 1 to 2 and is preferably 1.5. Binarizer 1120 can then generate the binary image (step 209) based on the threshold surface T.
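The final adjustment and the subsequent binarization can be sketched as below. This is only an illustration: the function names are hypothetical, and the polarity of the comparison (pixels darker than the threshold are marked as foreground, value 1) is an assumption, since this passage does not state which side of T is treated as text.

```python
import numpy as np

def threshold_surface(Bu, d, q=1.5):
    """Final threshold surface T(w, h) = Bu(w, h) - q * d; q = 1.5 is the preferred value."""
    return np.asarray(Bu, dtype=np.float64) - q * d

def binarize(I, T):
    """Compare each pixel of I with the per-pixel threshold T.

    Assumption: dark text on a light background, so pixels below T become 1.
    """
    return (np.asarray(I, dtype=np.float64) < T).astype(np.uint8)
```

With Bu = 150 everywhere, d = 20 and q = 1.5, the threshold becomes 120 at every pixel, so under the assumed polarity a pixel of value 100 is marked as foreground and a pixel of value 130 is not.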

In the above description, the quality of the image is judged first (step 201) and, based on the result of the judgment (step 202), the image is binarized either with the global threshold obtained in step 208 or with the threshold surface T obtained in steps 203 to 207 (step 209). In another embodiment, however, steps 201 and 202 can be omitted, and the method is configured to binarize the image using only the threshold surface T (step 209).
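The selection between the two binarization paths can be sketched as follows. Everything here is illustrative: the function name, its parameters, and the comparison polarity are assumptions, and the quality judgment of step 201 and the global threshold of step 208 are taken as precomputed inputs rather than implemented.

```python
import numpy as np

def binarize_document(I, T_surface, global_threshold=None,
                      high_quality=False, skip_quality_check=False):
    """Choose between global and adaptive binarization (steps 201, 202, 208, 209).

    Assumption: pixels darker than the applicable threshold are marked 1.
    """
    I = np.asarray(I, dtype=np.float64)
    if not skip_quality_check and high_quality:
        # High-quality image: use the single global threshold from step 208.
        if global_threshold is None:
            raise ValueError("global_threshold is required for high-quality images")
        return (I < global_threshold).astype(np.uint8)
    # Otherwise, or when steps 201 and 202 are omitted: use the threshold surface T.
    return (I < np.asarray(T_surface, dtype=np.float64)).astype(np.uint8)
```

Setting `skip_quality_check=True` corresponds to the alternative embodiment in which only the threshold surface T is used.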

Storage medium

The object of the present invention can also be achieved by running a program or a set of programs on any information processing apparatus that communicates with any image source and subsequent processing device as described above. The information processing apparatus, image source, and subsequent processing device can be well-known general-purpose devices. Therefore, the object of the present invention can also be achieved merely by providing program code that implements the method or apparatus. That is, a storage medium storing program code that implements the method constitutes the present invention.

Those skilled in the art can readily implement the method in any programming language. A detailed description of the program code is therefore omitted here.

Obviously, the storage medium can be any type of storage medium known to those skilled in the art or developed in the future, so there is no need to enumerate the various storage media here.

Although the present invention has been described in conjunction with specific steps and structures, it is not limited to the details described herein. This application is intended to cover all changes, modifications, and variations that do not depart from the spirit and scope of the invention.

Claims

1. A method for adaptive binarization of a gray-level document image, the method comprising:

a dividing step of dividing the gray-level document into blocks;

a first determining step of determining background blocks and text blocks among the blocks according to a feature of the divided blocks;

a second determining step of determining background pixels among the pixels included in each text block determined in the first determining step;

a first calculating step of calculating a block threshold surface representing the threshold of each block, wherein the threshold of a background block determined in the first determining step is calculated based on all pixels included in that background block, and the threshold of a text block determined in the first determining step is calculated based on the background pixels included in that text block, the background pixels being determined in the second determining step; and

a binarizing step of binarizing the gray-level document image using the block threshold surface calculated in the first calculating step.

2. the method for claim 1, wherein described first determining step comprises the steps:

The computing block characteristic image, wherein each pixel of this block feature image is represented the feature of each piece; And

By this block feature image is carried out Threshold Segmentation computing block mask, wherein, each pixel of piece mask represents that corresponding piece is background piece or text block.

3. The method of claim 2, wherein the binarizing step comprises the steps of:

expanding the block threshold surface and the block mask by interpolation to match the gray-level document image;

adjusting the expanded block threshold surface with an offset value, the offset value being calculated based on the expanded block threshold surface and the expanded block mask; and

binarizing the gray-level document image using the adjusted expanded block threshold surface.

4. the method for claim 1, wherein in described first calculation procedure, the threshold value that is confirmed as the piece of background piece in described first determining step is based on the mean value calculation of the pixel in this piece, and,

In described first calculation procedure, the threshold value that is confirmed as the piece of text block in described first determining step is based on the mean value calculation of the background pixel in this piece.

5. the method for claim 1, wherein in described second determining step, the background pixel of described text block is based on that bright part in the text block determines, described bright part is come mark by the second derivative method.

6. method as claimed in claim 5, wherein, described second derivative method is used the LOG operator.

7. method as claimed in claim 2, wherein, the step of computing block mask realizes with the Otsu method.

8. the method for claim 1, wherein the feature of each piece is the variance of pixel value included in this piece.

9. method as claimed in claim 3, wherein, the step of launching described block threshold value face realizes that with bilinear interpolation the step of launching described mask realizes with nearest neighbor element method of interpolation.

10. the method for claim 1 also comprises:

The 3rd determining step is determined the picture quality of described gray level file and picture; And

The second binaryzation step if judge the picture quality height in described the 3rd determining step, is then come the described gray level file and picture of binaryzation with global threshold;

Wherein, if judge that in described the 3rd determining step picture quality is not high, then carries out described partiting step, described first determining step, described second determining step, described first calculation procedure and described binaryzation step.

11. method as claimed in claim 10, wherein, picture quality is based on that the Otsu method judges in described the 3rd determining step.

12. An apparatus for adaptive binarization of a gray-level document image, the apparatus comprising:

dividing means for dividing the gray-level document into blocks;

first determining means for determining background blocks and text blocks among the divided blocks according to a feature of the blocks;

second determining means for determining background pixels among the pixels included in each text block determined by the first determining means;

first calculating means for calculating a block threshold surface representing the threshold of each block, wherein the threshold of a background block determined by the first determining means is calculated based on all pixels included in that background block, and the threshold of a text block determined by the first determining means is calculated based on the background pixels included in that text block, the background pixels being determined by the second determining means; and

binarizing means for binarizing the gray-level document image using the block threshold surface calculated by the first calculating means.

13. The apparatus of claim 12, wherein the first determining means comprises:

means for calculating a block feature image, wherein each pixel of the block feature image represents the feature of a respective block; and

means for calculating a block mask by thresholding the block feature image, wherein each pixel of the block mask indicates whether the corresponding block is a background block or a text block.

14. The apparatus of claim 13, wherein the binarizing means comprises:

means for expanding the block threshold surface and the block mask by interpolation to match the gray-level document image;

means for adjusting the expanded block threshold surface with an offset value, the offset value being calculated based on the expanded block threshold surface and the expanded block mask; and

means for binarizing the gray-level image using the adjusted block threshold surface.

15. The apparatus of claim 12, wherein, in the first calculating means, the threshold of a block determined to be a background block by the first determining means is calculated based on the mean value of the pixels in that block, and

in the first calculating means, the threshold of a block determined to be a text block by the first determining means is calculated based on the mean value of the background pixels in that block.

16. The apparatus of claim 12, wherein, in the second determining means, the background pixels of the text block are determined based on the bright part of the text block, the bright part being marked by a second-derivative calculator.

17. The apparatus of claim 16, wherein the second-derivative calculator uses the LoG operator.

18. The apparatus of claim 13, wherein the means for calculating the block mask is implemented with an Otsu device.

19. The apparatus of claim 12, wherein the feature of each block is the variance of the pixel values included in that block.

20. The apparatus of claim 14, wherein the means for expanding the block threshold surface is implemented with bilinear interpolation, and the means for expanding the block mask is implemented with nearest-neighbor interpolation.

21. The apparatus of claim 12, further comprising:

third determining means for determining the image quality of the gray-level document image; and

second binarizing means for binarizing the gray-level document image with a global threshold if the third determining means judges the image quality to be high;

wherein, if the third determining means judges that the image quality is not high, the dividing means, the first determining means, the second determining means, the first calculating means, and the binarizing means operate.

22. The apparatus of claim 21, wherein the image quality in the third determining means is judged based on an Otsu device.

23. A computer-readable storage medium, characterized in that it stores program code for implementing the method of any one of claims 1 to 11.