CN100430958C

CN100430958C - Recognition distance adjusting device and method, and text line recognition device and method

Info

Publication number: CN100430958C
Application number: CNB2005100928158A
Authority: CN
Inventors: 孙俊; 堀田悦伸; 胜山裕; 直井聪
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2005-08-18
Filing date: 2005-08-18
Publication date: 2008-11-05
Anticipated expiration: 2025-08-18
Also published as: JP5211449B2; JP2007052782A; CN1916938A

Abstract

A method for adjusting the initial recognition distance of a candidate character comprises calculating a structural feature value of a training sample of the candidate character, and then adjusting the initial recognition distance according to the calculated structural feature value.

Description

Method and apparatus for adjusting the initial recognition distance of a candidate character

Technical field

The present invention relates to a character recognition device and a character recognition method, and more specifically to a text line recognition device and method for recognizing the characters in a degraded text line.

Background technology

With the growing use of digital cameras and digital video cameras for capturing document images, the recognition of degraded text lines is receiving more and more attention. Recognition of a degraded text line comprises two parts: single-character recognition and text line segmentation. The two parts are closely intertwined.

For text line segmentation, recognition-based segmentation is the most widely used approach. Fig. 1 is a schematic diagram of a conventional recognition-based segmentation method. The input image is first binarized, and the strokes of the text are then obtained by connected component analysis of the binary image (top row of Fig. 1). For connected component analysis algorithms, see Gonzalez, "Digital Image Processing (Second Edition)", p. 435, Publishing House of Electronics Industry, translated by Ruan Qiuqi, Ruan Yuzhi et al. Each connected component can be regarded as a basic segmented character (middle row of Fig. 1), and a combination of connected components as a combined segmented character (bottom row of Fig. 1). In this document, basic and combined segmented characters are both referred to as characters, even though they may be only connected components or radicals with no meaning as text. Character recognition is then performed on every basic and combined segmented character, yielding a recognition distance for each. A text line can be decomposed into many segmentation paths, each made up of a different combination of basic and combined segmented characters; the recognition distance of a segmentation path is the sum of the recognition distances of the basic and combined segmented characters that form it. The segmentation result for the text line is determined by the segmentation path with the smallest sum of recognition distances. Once segmentation is complete, the recognition results of the basic and combined segmented characters on that path are the final recognition results for the characters.
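The minimum-distance path search described above can be sketched as a small dynamic program. This is only an illustration under stated assumptions: the function name `best_segmentation`, the `recognize` callback, the `max_merge` limit and the toy distance table are hypothetical and do not come from the patent; the three toy distances reuse the values quoted for Fig. 1.

```python
from typing import Callable, List, Tuple

def best_segmentation(
    n_components: int,
    recognize: Callable[[int, int], Tuple[str, float]],
    max_merge: int = 3,
) -> Tuple[float, List[str]]:
    """Find the segmentation path with the smallest sum of recognition distances.

    recognize(i, j) returns (label, distance) for the candidate character built
    from connected components i..j-1. max_merge (an assumption) caps how many
    components may be combined into one candidate.
    """
    inf = float("inf")
    best = [inf] * (n_components + 1)      # best[j]: minimum total for the first j components
    back = [(-1, "")] * (n_components + 1) # back[j]: (start index, label) of the last candidate
    best[0] = 0.0
    for j in range(1, n_components + 1):
        for i in range(max(0, j - max_merge), j):
            label, dist = recognize(i, j)
            if best[i] + dist < best[j]:
                best[j] = best[i] + dist
                back[j] = (i, label)
    # Walk the back-pointers to recover the labels along the best path.
    labels: List[str] = []
    j = n_components
    while j > 0:
        i, label = back[j]
        labels.append(label)
        j = i
    return best[n_components], labels[::-1]

# Toy table: two strokes with distances 19 and 26, merged candidate with distance 21.
toy = {(0, 1): ("ノ", 19.0), (1, 2): ("Dian", 26.0), (0, 2): ("Ha", 21.0)}
total, labels = best_segmentation(2, lambda i, j: toy[(i, j)])
```

With these toy values the merged candidate wins because 21 is smaller than 19 + 26, which is the behaviour the text relies on.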

As shown in Fig. 1, the segmentation path formed by "Ha", "リ" and "The" has the smallest recognition distance sum, 72, so these characters are output as the final segmentation and recognition result.

As Fig. 1 shows, the value of the recognition distance matters not only for the recognition result but also for correct segmentation. For example, in Fig. 1 the minimum recognition distance for "Ha" is 21, while the recognition distances of the two strokes of this character are 19 and 26 respectively. If the sum of the recognition distances of these two strokes were less than 21, the character would still be wrongly segmented into the two parts "ノ" and "Dian", even though the recognition result for "Ha" itself is correct.

Many articles and patents on text line segmentation already exist, for example:

Y. Lu, "Machine Printed Character Segmentation - An Overview", Pattern Recognition, Vol. 28, No. 1, pp. 67-80, January 1995.

S. W. Lee, D. J. Lee, H. S. Park, "A New Methodology for Gray-Scale Character Segmentation and Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 10, pp. 1045-1050, October 1996.

U.S. Patent No. 6,327,385 to Kamitani, "Character segmentation device and character segmentation system".

U.S. Patent No. 5,692,069 to Hanson, "Apparatus for performing character segmentation using slant histograms".

U.S. Patent No. 5,172,422 to Tan, "Fast character segmentation of skewed text lines for optical character recognition".

Most of these articles and patents deal with touching characters, and most of them operate on binary images.

For degraded text line images, conventional binarization often causes severe stroke breakage (loss of stroke pixels) or stroke touching, so recognition performs poorly. The method based on the dual eigenspace recognizes degraded characters well. It extracts features directly from the grayscale character image. Fig. 2 is a flowchart of character recognition using the dual eigenspace method. The input is a normalized character image. First, the features of the character image are extracted using a first dictionary (dictionary 1 in Fig. 2). Next, using a second dictionary (dictionary 2 in Fig. 2), the character image is pre-classified into M candidate classes. Then a third dictionary (dictionary 3 in Fig. 2) performs fine classification on the input character features, assigning them to one of the M classes. Finally, the system outputs the recognized character code and the recognition distance.

The dual eigenspace method needs no binarization for recognition: it operates directly on the grayscale image, and the binarization result is used only for pre-segmentation. Because it extracts features directly from the grayscale image and bypasses binarization, it is more robust to noise caused by image degradation. However, using the dual eigenspace method directly in recognition-based segmentation raises some problems.

Fig. 3 is a schematic diagram illustrating a defect of the prior-art character recognition method. In Fig. 3, the top row is a text line image. The second row is the binarization result; the binary image is used for pre-segmentation, and the dashed boxes in the figure mark the pre-segmentation result. The third row shows the normalized grayscale images of the basic segmented characters, with the recognition result and corresponding recognition distance below each segmented image. The fourth row shows the normalized grayscale image of the combined segmented character "Open", together with its recognition result and recognition distance. With a conventional recognition-based segmentation method, "Open" is split into four parts, because the sum of the recognition distances of its four basic segmented characters is 5.39 + 61.01 + 45.69 + 20.37 = 132.46, whereas the recognition distance of "Open" itself is 409.71, which is greater than the sum for its four parts. The text line is therefore recognized as "year 1 time 1! 11 く".

At present, no patent or article addresses the segmentation of degraded text lines.

Summary of the invention

The present invention has been made in view of the above. Its object is to use structural features of characters to adjust the original recognition distance so that it is better suited to segmentation, thereby solving the problems that arise when the dual eigenspace method is used for segmentation.

According to one aspect of the present invention, there is provided a method of adjusting the initial recognition distance of a candidate character, comprising: a structural feature value calculation step of calculating a structural feature value of a training sample of the candidate character; and an adjustment step of adjusting the initial recognition distance according to the structural feature value calculated in the structural feature value calculation step. The structural feature value is the sparseness value of the character stroke pixels of the training sample of the candidate character, the sparseness value of the character strokes of the training sample, or the average row-and-column stroke segment count of the character strokes of the training sample. The adjustment step adjusts the initial recognition distance by multiplication or by an operation that includes multiplication. Here, the sparseness value of the character stroke pixels is the ratio of the area of the minimum bounding square of the training sample to the number of character stroke pixels of the training sample; the sparseness value of the character strokes is the ratio of the area of the minimum bounding square of the training sample to the n-th power of the number of character strokes of the training sample, where n is a positive integer; and the average row-and-column stroke segment count is the number obtained by counting the stroke segments in each row and each column of the training sample image and averaging the counts.

According to another aspect of the present invention, there is provided a device for adjusting the initial recognition distance of a candidate character, comprising: a structural feature value calculation unit that calculates a structural feature value of a training sample of the candidate character; and an adjustment unit that adjusts the initial recognition distance according to the structural feature value calculated by the structural feature value calculation unit. The structural feature value is the sparseness value of the character stroke pixels of the training sample of the candidate character, the sparseness value of the character strokes of the training sample, or the average row-and-column stroke segment count of the character strokes of the training sample. The adjustment unit adjusts the initial recognition distance by multiplication or by an operation that includes multiplication. Here, the sparseness value of the character stroke pixels is the ratio of the area of the minimum bounding square of the training sample to the number of character stroke pixels of the training sample; the sparseness value of the character strokes is the ratio of the area of the minimum bounding square of the training sample to the n-th power of the number of character strokes of the training sample, where n is a positive integer; and the average row-and-column stroke segment count is the number obtained by counting the stroke segments in each row and each column of the training sample image and averaging the counts.

Preferably, the training sample of the candidate character is obtained from the character code of the candidate character.

Preferably, the operation that includes multiplication is the multiplication of the logarithm of the structural feature value by the initial recognition distance.

Preferably, the structural feature value varies with character structure in the same direction as the recognition distance does, and the adjustment step (device) adjusts the initial recognition distance by division or by an operation that includes division.

The present invention overcomes the problems of the prior art and segments and recognizes text lines correctly, which is a significant technical effect.

Description of drawings

The accompanying drawings are included to further explain the present invention and, together with the description, serve to explain its principles.

Fig. 1 is a schematic diagram of a conventional recognition-based segmentation method;

Fig. 2 is a flowchart of character recognition using the dual eigenspace method;

Fig. 3 is a schematic diagram illustrating a defect of the prior-art character recognition method;

Fig. 4 is a flowchart of one embodiment of the present invention;

Fig. 5 is a schematic diagram showing the properties and calculation of the character shape feature;

Fig. 6 shows the adjusted recognition distances corresponding to Fig. 3.

Embodiment

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. These embodiments are merely illustrative and do not limit the scope of protection of the present invention.

Fig. 4 is a flowchart of one embodiment of the present invention. For an input normalized character image 401, feature extraction unit 402 uses first dictionary 403 to extract the features of the image:

Y = U^{T} (X - \bar{X}),    (1)

where X = [x_{1}, x_{2}, ..., x_{w*h}]^{T} represents the normalized character image of width w and height h, and \bar{X} = [\bar{x}_{1}, \bar{x}_{2}, ..., \bar{x}_{w*h}]^{T} is the mean of the normalized character images of all training samples. U = [u_{1}, u_{2}, ..., u_{n}] is a transformation matrix, where u_{i} = [u_{i1}, u_{i2}, ..., u_{iw*h}]^{T} is a column vector forming one component of the matrix U; its meaning and how it is obtained are covered in the reference cited below. u_{i1} is the first component of the vector u_{i}, and so on, and n is the feature dimension. First dictionary 403 consists of U and \bar{X}. The feature extraction method used in formula (1) is principal component analysis (PCA). For the implementation of PCA, see R. O. Duda, P. E. Hart and D. G. Stork, "Pattern Classification", second edition, A Wiley-Interscience Publication, John Wiley & Sons, 2001, pp. 115-117 and 568-569.
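As a minimal sketch of formula (1), the projection can be written in plain Python. The function name `extract_feature` and the toy numbers are hypothetical; the basis vectors are passed as a list of the columns of U, and how U and the mean image are trained is outside this sketch.

```python
from typing import List

def extract_feature(x: List[float], mean: List[float], basis: List[List[float]]) -> List[float]:
    """Compute Y = U^T (X - mean).

    x     : flattened normalized character image of length w*h
    mean  : mean of the flattened training images, same length
    basis : the n columns u_1..u_n of U, each of length w*h
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    # Each output component is the dot product of one basis vector with the centered image.
    return [sum(uk * ck for uk, ck in zip(u, centered)) for u in basis]

# Toy 2x2 image (w*h = 4) projected onto n = 2 basis vectors.
x = [3.0, 5.0, 7.0, 9.0]
mean = [1.0, 1.0, 1.0, 1.0]
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.5, 0.5, 0.0]]
y = extract_feature(x, mean, basis)
```

The result has one component per basis vector, so the feature dimension n is set by the number of columns kept in U.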

Coarse classification unit 404 compares the extracted feature Y with the features of each character class stored in advance in second dictionary 405. Many feature comparison algorithms exist; one of them is based on Euclidean distance: D_{i} = ||Y - Y_{i}||, where D_{i} is the Euclidean distance between the feature Y and the feature Y_{i} of the i-th character class. If the number of candidate character classes output by the coarse classification unit is M, the M character classes with the smallest Euclidean distances are selected as the coarse classification output.
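The Euclidean-distance variant of coarse classification can be sketched as follows. The function name `coarse_classify`, the dictionary layout and the toy class features are assumptions made for illustration; the patent only specifies that the M classes with the smallest D_{i} are kept.

```python
import math
from typing import Dict, List, Tuple

def coarse_classify(
    y: List[float],
    class_features: Dict[str, List[float]],
    m: int,
) -> List[Tuple[str, float]]:
    """Return the m character classes whose stored feature Y_i is closest to y."""
    distances = [(code, math.dist(y, feature)) for code, feature in class_features.items()]
    distances.sort(key=lambda pair: pair[1])  # smallest Euclidean distance first
    return distances[:m]

# Hypothetical second dictionary with three classes.
second_dictionary = {"A": [3.0, 4.0], "B": [1.0, 0.0], "C": [0.0, 2.0]}
candidates = coarse_classify([0.0, 0.0], second_dictionary, m=2)
```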

Feature reconstruction unit 406 uses third dictionary 407 to reconstruct M reconstructed features corresponding to the M candidate classes. Third dictionary 407 stores, for each character class, the transformation matrix

{\tilde{U}}_{i} = [u_{1}^{i}, u_{2}^{i}, ..., u_{n_{1}}^{i}]

and the mean feature vector C_{i}. The reconstructed feature {\hat{Y}}_{i} is obtained by formula (2):

η_{i} = {\tilde{U}}_{i}^{T} (Y - C_{i}),

{\hat{Y}}_{i} = {\tilde{U}}_{i}^{T} η_{i} + C_{i}    (2)

Here {\tilde{U}}_{i} is a matrix obtained by calculation (the algorithm is described in the article cited below), and {\tilde{U}}_{i}^{T} is the transpose of {\tilde{U}}_{i}. For details of the reconstruction, see:

J. Sun, Y. Hotta, Y. Katsuyama, S. Naoi, "Low resolution character recognition by dual eigenspace and synthetic degraded patterns", ACM 1st Hardcopy Document Processing Workshop, pp. 15-22, 2004.

Fine recognition unit 408 in Fig. 4 calculates the difference between the original feature Y and each of the M reconstructed features {\hat{Y}}_{i}. The character class with the smallest difference is the result of fine recognition; the code of the corresponding character class (the candidate character) is output as recognized character code 409, and the smallest difference is output as initial recognition distance 410. Note that in the conventional method shown in Fig. 2, distance 410 is taken as the final recognition distance, which may lead to segmentation errors.
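The fine recognition step can be sketched as a minimum search over the reconstructed features. The patent does not say how the "difference" is measured; the sketch below assumes Euclidean distance, and the function name `fine_recognize` and the toy features are hypothetical. The reconstructed features are taken as given inputs.

```python
import math
from typing import Dict, List, Tuple

def fine_recognize(
    y: List[float],
    reconstructed: Dict[str, List[float]],
) -> Tuple[str, float]:
    """Return (recognized character code, initial recognition distance).

    reconstructed maps each candidate character code to its reconstructed feature.
    The difference measure is assumed to be Euclidean distance.
    """
    differences = ((code, math.dist(y, y_hat)) for code, y_hat in reconstructed.items())
    return min(differences, key=lambda pair: pair[1])

# Hypothetical reconstructed features for M = 2 candidates.
code, initial_distance = fine_recognize(
    [1.0, 1.0],
    {"B": [1.0, 4.0], "C": [1.0, 2.0]},
)
```

The returned distance plays the role of initial recognition distance 410, which the later adjustment step modifies before it is used for segmentation.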

In the present embodiment, character shape feature value calculation unit 411 calculates the shape feature of the recognized character. Briefly, the character shape feature is one kind of character structural feature and can be regarded as a description of the stroke complexity of the character: the more complex the strokes, the smaller the shape feature value; the simpler the strokes, the larger the value. The input to the shape feature value calculation unit is the recognized character code and the binary character image corresponding to this code (the training sample corresponding to the candidate character, also called the training sample of the candidate character). The binary character image is retrieved using recognized character code 409 in Fig. 4: the binary character images and the character code of each class of image are stored in advance in a storage medium (such as a hard disk), so that all character images for a code can be retrieved from the code, and vice versa. If more than one binary image is retrieved, the character shape feature value is the mean or weighted mean of the shape feature values Wg of all the binary character images; when a weighted mean is used, the weight of each binary image should be stored with it in the storage medium. In the present embodiment, the calculation of the character shape feature is illustrated using the sparseness of the character stroke pixels. Specifically, the character shape feature is calculated by the following formula:

Wg = s × s / (number of character stroke pixels)

where s is the side length of the minimum bounding square of the character in the binary image. The character stroke pixels are the points in the binary image that represent character strokes; scanning the image from top to bottom and left to right and checking the value of each point shows whether that point is a character stroke pixel. In a binary image each point takes one of only two values, 0 or 1, where 0 corresponds to background and 1 to stroke, so counting the 1s gives the number of stroke pixels. The minimum bounding square is determined by finding the leftmost, rightmost, topmost and bottommost positions of the character stroke pixels. Let these four values be xs, xe, ys and ye respectively. They uniquely determine the bounding rectangle of the character stroke image, whose width is w = xe - xs + 1 and whose height is h = ye - ys + 1. The side length of the minimum bounding square is the larger of the width and the height. If w > h, the minimum bounding square is obtained by extending the bounding rectangle by (w - h)/2 pixels both upward and downward in the height direction. If h > w, it is obtained by extending the bounding rectangle by (h - w)/2 pixels both to the left and to the right in the width direction.
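The pixel sparseness calculation, including the mean or weighted mean over several training samples, can be sketched as below. The function names `pixel_sparseness` and `shape_feature` and the toy images are hypothetical, and normalizing the weighted mean by the sum of the weights is an assumption. Only the side length s is needed, so the square itself is not constructed.

```python
from typing import List, Optional, Sequence

def pixel_sparseness(img: Sequence[Sequence[int]]) -> float:
    """Wg = s*s / (number of stroke pixels) for a binary image (1 = stroke, 0 = background)."""
    stroke = [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v == 1]
    if not stroke:
        raise ValueError("image contains no stroke pixels")
    rows = [r for r, _ in stroke]
    cols = [c for _, c in stroke]
    w = max(cols) - min(cols) + 1  # w = xe - xs + 1
    h = max(rows) - min(rows) + 1  # h = ye - ys + 1
    s = max(w, h)                  # side of the minimum bounding square
    return s * s / len(stroke)

def shape_feature(
    samples: Sequence[Sequence[Sequence[int]]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Mean, or weighted mean when weights are given, of Wg over the training samples."""
    values: List[float] = [pixel_sparseness(img) for img in samples]
    if weights is None:
        return sum(values) / len(values)
    return sum(v * wt for v, wt in zip(values, weights)) / sum(weights)

# A thin vertical bar (simple structure) and a filled block (dense structure).
bar = [[0, 1, 0] for _ in range(5)]
block = [[1, 1, 1, 1] for _ in range(4)]
wg_bar = pixel_sparseness(bar)
wg_block = pixel_sparseness(block)
```

The thin bar yields the larger value, in line with the statement that simple strokes give a large shape feature and dense strokes a small one.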

The above example uses pixel sparseness to illustrate the calculation of the character shape feature. The invention is not limited to this, however: any feature that distinguishes characters with complex structure from characters with simple structure can be used, such as stroke sparseness. Specifically, the character shape feature can be calculated by the following formula:

Wg = s × s / (number of character strokes)^n

where n is an integer greater than 1 that can be determined empirically, preferably between 4 and 10.
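A minimal sketch of the stroke sparseness formula follows. The function name `stroke_sparseness` is hypothetical, the stroke count is taken as a given input because the text does not say how strokes are counted, and the default n = 4 is simply the lower end of the preferred range.

```python
def stroke_sparseness(s: int, stroke_count: int, n: int = 4) -> float:
    """Wg = s*s / stroke_count**n, where s is the side of the minimum bounding square."""
    if stroke_count < 1:
        raise ValueError("stroke_count must be at least 1")
    return (s * s) / (stroke_count ** n)

# Same bounding square, different stroke counts (illustrative values only).
one_stroke = stroke_sparseness(64, 1)
four_strokes = stroke_sparseness(64, 4)
```

Raising the stroke count to the n-th power makes the value fall off quickly as the number of strokes grows, which widens the gap between simple and complex characters.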

The character shape feature is used because, at the same level of degradation, characters with simple stroke structure such as "1" and "く" have very small recognition distances, whereas characters with complex stroke structure such as "ease" and "Open" have larger ones. This can be seen in Fig. 3. The character shape feature behaves the opposite way: its value is small for characters with complex stroke structure and large for characters with simple stroke structure, as the examples in Fig. 5 show. The character shape feature can therefore be used to compensate for the influence of character structure differences on the recognition distance. The invention is not limited to shape features; any other structural feature can be used as long as it compensates for the influence of character structure differences on the recognition distance. In this document the term character shape feature should therefore be interpreted broadly, as a structural feature whose value varies with character structure in the opposite direction to the recognition distance. It can also be, for example, the average row-and-column stroke segment count: each row of the character image is scanned and its strokes counted, then each column is scanned and its strokes counted, and the resulting mean also represents the stroke complexity of the character.

Specifically, the stroke segments in one row of the image are counted as follows. The row is scanned from left to right, and the first pixel with value 1 is recorded; this is the left edge of a stroke. Scanning continues, and the pixel where the value changes from 1 back to 0 is recorded; this is the right edge of the stroke. A left edge and a right edge together correspond to one stroke segment. The search continues for the next pixel that changes from 0 to 1 (the left edge of the second stroke segment) and the next that changes from 1 to 0 (the right edge of the second stroke segment), and so on until the scan ends. The stroke counts of the columns are obtained in the same way.
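The row and column scan described above can be sketched as a run count over each line of the binary image. The function names `count_stroke_segments` and `average_stroke_segments` are hypothetical, and averaging over all rows and columns together is an assumption, since the text only says that the counts are averaged.

```python
from typing import Sequence

def count_stroke_segments(line: Sequence[int]) -> int:
    """Count runs of 1s in one row or column (each 0-to-1 transition starts a segment)."""
    segments = 0
    previous = 0
    for value in line:
        if value == 1 and previous == 0:
            segments += 1  # left (or top) edge of a new stroke segment
        previous = value
    return segments

def average_stroke_segments(img: Sequence[Sequence[int]]) -> float:
    """Mean stroke segment count over every row and every column of a binary image."""
    row_counts = [count_stroke_segments(row) for row in img]
    col_counts = [count_stroke_segments(col) for col in zip(*img)]
    counts = row_counts + col_counts
    return sum(counts) / len(counts)

# A small "U"-like shape: rows have 2, 2 and 1 segments, each column has 1.
u_shape = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]
avg = average_stroke_segments(u_shape)
```

A segment that runs to the end of a line is still counted, because the count is triggered at the segment's leading edge.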

Recognition distance adjustment unit 413 uses the character shape feature to adjust the initial recognition distance. Specifically, the adjustment can use the formula R1 = Wg × R, where R1 is final recognition distance 414 in Fig. 4 and R is the initial recognition distance in Fig. 4. Plain multiplication is not the only option; an operation that includes multiplication also works, for example taking the logarithm (log) of the shape feature value and then multiplying by the initial recognition distance. The reasoning is as follows: for the initial recognition distances R of two different character images with the same degree of degradation, R is larger for the image with complex character structure and smaller for the image with simple character structure, which makes segmentation difficult. Wg and R vary with character structure in opposite directions: Wg is larger for simple character images and smaller for characters with complex structure, so an operation such as multiplication can eliminate the sensitivity of R to structure.
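A minimal sketch of the adjustment, covering both plain multiplication and the logarithm variant. The function name `adjust_distance` and the sample values are hypothetical, and the natural logarithm is an assumption because the text does not state the base.

```python
import math

def adjust_distance(r: float, wg: float, use_log: bool = False) -> float:
    """Return R1 = Wg * R, or R1 = log(Wg) * R when use_log is True."""
    if wg <= 0:
        raise ValueError("shape feature value must be positive")
    factor = math.log(wg) if use_log else wg
    return factor * r

# Illustrative values only: a simple character (small R, large Wg) and a
# complex character (large R, small Wg) end up with comparable adjusted distances.
simple_adjusted = adjust_distance(5.0, 40.0)
complex_adjusted = adjust_distance(100.0, 2.0)
```

One point to keep in mind with the logarithm variant: a shape feature value of exactly 1 gives a factor of zero, so a practical implementation would probably need to guard against that case.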

In the above embodiment, a shape feature is used to adjust and compensate the initial recognition distance. In other embodiments, a different structural feature can be used whose variation with character structure follows the same trend as that of the initial recognition distance; that is, like the recognition distance, its value is larger when the character structure is complex and smaller when the character structure is simple. In this case the initial recognition distance can be adjusted by division or by an operation that includes division. In short, any operation that eliminates the sensitivity of R to structure will do. Those skilled in the art can clearly make various modifications in light of the present invention; for brevity they are not described one by one here.

Fig. 6 shows the final recognition distances of the character images in Fig. 3. After adjustment, the final recognition distance of "Open" is 471, whereas the sum of the recognition distances of its four constituent parts is 678. "Open" is therefore segmented correctly.
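The effect of the adjustment on the segmentation decision can be checked with the figures quoted in the text. The helper `choose_segmentation` is hypothetical; the distances are the ones given for Fig. 3 and Fig. 6, and for the adjusted case only the sum of the four parts (678) is stated.

```python
from typing import List

def choose_segmentation(whole_distance: float, part_distances: List[float]) -> str:
    """Return "whole" if keeping the character intact gives the smaller total distance."""
    return "whole" if whole_distance < sum(part_distances) else "split"

# Initial distances (Fig. 3): the four parts sum to less than the whole character.
initial_parts = [5.39, 61.01, 45.69, 20.37]
initial_choice = choose_segmentation(409.71, initial_parts)

# Adjusted distances (Fig. 6): the whole character now has the smaller distance.
adjusted_choice = choose_segmentation(471.0, [678.0])
```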

Although the examples in the present invention use Japanese characters, the invention is not limited to Japanese; its principle also applies to other scripts such as Chinese and Korean. In addition, the text line in the above embodiments contains several characters, but a text line may in fact contain only one character, and the invention applies to that case as well.

It should be understood that, as will be apparent to those skilled in the art, various variations and modifications can be made within the spirit and scope of the present invention. The scope of protection of the present invention should therefore be given its broadest interpretation and covers these variations and modifications, provided that they fall within the scope of the claims of the present invention and their equivalents.

Claims

1. A method of adjusting the initial recognition distance of a candidate character, the method comprising the following steps:

a structural feature value calculation step of calculating a structural feature value of a training sample of the candidate character; and

an adjustment step of adjusting the initial recognition distance according to the structural feature value calculated in the structural feature value calculation step, characterized in that the structural feature value is the sparseness value of the character stroke pixels of the training sample of the candidate character, the sparseness value of the character strokes of the training sample of the candidate character, or the average row-and-column stroke segment count of the character strokes of the training sample of the candidate character, and the adjustment step adjusts the initial recognition distance by multiplication or by an operation that includes multiplication,

wherein the sparseness value of the character stroke pixels of the training sample of the candidate character is the ratio of the area of the minimum bounding square of the training sample of the candidate character to the number of character stroke pixels of the training sample of the candidate character; the sparseness value of the character strokes of the training sample of the candidate character is the ratio of the area of the minimum bounding square of the training sample of the candidate character to the n-th power of the number of character strokes of the training sample of the candidate character, n being a positive integer; and the average row-and-column stroke segment count of the character strokes is the number obtained by counting the stroke segments in each row and each column of the training sample image and averaging the counts.

2, the method for the initial identification distance of adjustment candidate characters according to claim 1 is characterized in that, obtains the training sample of described candidate characters by the character code of described candidate characters.

3, the method for the initial identification distance of adjustment candidate characters according to claim 1, it is characterized in that, the described computing that contains multiplication for architectural feature value that described architectural feature value calculation procedure is calculated through taking the logarithm after and the multiplying each other of described initial identification distance.

4, the method for the initial identification distance of adjustment candidate characters according to claim 1, it is characterized in that, described architectural feature value is identical with respect to the variation tendency of charcter topology with described decipherment distance with respect to the variation tendency of charcter topology, and described set-up procedure adopts the computing of being divided by or containing division that described initial identification distance is adjusted.

5, the method for the initial identification distance of adjustment candidate characters according to claim 1, it is characterized in that, when described candidate characters had a plurality of training sample, the architectural feature value that described architectural feature value calculation procedure is calculated was the average or weighted mean of the architectural feature value that these a plurality of training samples calculate at all.

6. A device for adjusting the initial recognition distance of a candidate character, the device comprising:

a structural feature value calculation unit that calculates a structural feature value of a training sample of the candidate character; and

an adjustment unit that adjusts the initial recognition distance according to the structural feature value calculated by the structural feature value calculation unit, characterized in that the structural feature value is the sparseness value of the character stroke pixels of the training sample of the candidate character, the sparseness value of the character strokes of the training sample of the candidate character, or the average row-and-column stroke segment count of the character strokes of the training sample of the candidate character, and the adjustment unit adjusts the initial recognition distance by multiplication or by an operation that includes multiplication.

7. The device for adjusting the initial recognition distance of a candidate character according to claim 6, characterized in that the training sample of the candidate character is obtained from the character code of the candidate character.

8. The device for adjusting the initial recognition distance of a candidate character according to claim 6, characterized in that the device further comprises a storage unit that stores the character code of the candidate character in association with the training sample of the candidate character, or stores the character code of the candidate character, the training sample of the candidate character and the weight of the training sample of the candidate character in association with one another.

9. The device for adjusting the initial recognition distance of a candidate character according to claim 6, characterized in that the operation that includes multiplication is the multiplication of the logarithm of the structural feature value calculated by the structural feature value calculation unit by the initial recognition distance.

10. The device for adjusting the initial recognition distance of a candidate character according to claim 6, characterized in that the structural feature value varies with character structure in the same direction as the recognition distance does, and the adjustment unit adjusts the initial recognition distance by division or by an operation that includes division.

11. The device for adjusting the initial recognition distance of a candidate character according to claim 6, characterized in that, when the candidate character has a plurality of training samples, the structural feature value calculated by the structural feature value calculation unit is the mean or weighted mean of the structural feature values calculated for all of the plurality of training samples.