CN100347723C - Off-line handwritten Chinese character segmentation method combining geometric cost and semantic discrimination cost - Google Patents

Off-line handwritten Chinese character segmentation method combining geometric cost and semantic discrimination cost Download PDF

Info

Publication number
CN100347723C
CN100347723C CNB2005100121952A CN200510012195A
Authority
CN
China
Prior art keywords
character
image
distance
run
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2005100121952A
Other languages
Chinese (zh)
Other versions
CN1719454A (en)
Inventor
丁晓青
蒋焰
付强
刘长松
彭良瑞
方驰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CNB2005100121952A priority Critical patent/CN100347723C/en
Publication of CN1719454A publication Critical patent/CN1719454A/en
Application granted granted Critical
Publication of CN100347723C publication Critical patent/CN100347723C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Character Input (AREA)

Abstract

The present invention relates to an off-line handwritten Chinese character segmentation method that fuses geometric costs with semantic-recognition costs, and belongs to the field of character recognition. First, a text-line image of input off-line handwritten Chinese characters is analyzed to extract stroke segments, which are merged into sub-character blocks; the geometric cost of each sub-character merge is computed at the same time. A set of candidate segmentations is then generated from these geometric costs, each candidate is evaluated with a bigram language model to obtain its semantic-recognition cost, and finally the geometric costs and the semantic-recognition costs are combined to select the optimal segmentation-recognition scheme. Applied to the segmentation and recognition of handwritten envelope addresses, the method reaches a segmentation accuracy of 93%; it substantially improves on traditional segmentation methods and offers guidance for segmentation problems in other scripts and domains.

Description

Off-line handwritten Chinese character segmentation method combining geometric cost and semantic-recognition cost
Technical field
The off-line handwritten Chinese character segmentation method combining geometric cost and semantic-recognition cost belongs to the field of character recognition.
Background art
Optical character recognition (OCR) has long been an active topic in pattern recognition, and Chinese character OCR now has a development history of more than two decades. Off-line Chinese character recognition means recognizing Chinese character images obtained by a scanner, digital camera or video camera (Fig. 3), and the recognition of off-line handwritten Chinese characters has always been a difficult point: natural handwriting varies considerably from writer to writer, and, unlike on-line handwritten input, it provides no additional information.
A key problem in off-line Chinese character recognition is character segmentation, because existing classifiers generally make recognition decisions only on isolated character images, so the system must first determine which parts of the image belong to the same character. Separating adjacent Chinese characters is no problem for the human eye, but it is not easy for a machine. Traditional segmentation techniques basically rely on geometric features for their decisions, and some methods additionally use the confidence information supplied by the recognizer.
Judging from how humans actually read, however, this information alone is not enough. In separating the images of adjacent characters, the brain also weighs how tightly the adjacent characters combine, and tends to accept the segmentation that is more acceptable syntactically and semantically.
Chinese characters have rich structure, and a considerable portion of them have a left-right composition. With the usual left-to-right writing style, an unavoidable problem of off-line Chinese character segmentation is that a character of left-right structure gets split apart; for example, the character 村 ("village") may be cut into 木 and 寸. For such a cut the recognizer often reports very high confidence as well, so methods that rely purely on classifier confidence are unreliable. Whether they use geometric costs or recognizer confidence, existing methods all ignore what is in fact the most significant cue in Chinese character segmentation: semantic information.
Language models have been studied for a long time, and statistical language models have achieved clear results in practice, particularly in the post-processing of speech recognition and Chinese character recognition, mainly by making recognition decisions with hidden Markov models (HMMs) based on bigram or trigram grammars.
The present invention mainly uses an HMM based on a bigram language model to obtain the semantic-recognition confidence of a character segmentation, and then combines the geometric cost of the segmentation with the confidence information of character recognition to obtain a unified segmentation-recognition cost. This procedure in effect organically combines the three processes of character segmentation, recognition and post-processing, so that once the segmentation is decided, the recognition and post-processing of the characters are completed at the same time. Experiments show that the invention is fast and effective. The invention also proposes a model that unifies the segmentation, recognition and post-processing of off-line handwritten Chinese characters; this approach to Chinese character segmentation has not appeared in other literature and has a degree of novelty.
Summary of the invention
The objective of the invention is to achieve high-accuracy segmentation of off-line handwritten Chinese characters. The method operates on an input line image of off-line handwritten Chinese characters (Fig. 4 shows the general process of acquiring a handwritten line image). It first extracts the stroke segments of the characters by image-analysis methods, then merges the stroke segments into sub-characters while extracting a series of geometric features and parameter estimates. Using these geometric costs, a K-shortest-path optimization finds the K segmentations of best geometric cost; each candidate segmentation is then evaluated with a bigram language model to obtain the corresponding semantic-recognition confidence, and finally the two costs are combined to pick out the best-scoring character segmentation.
The invention comprises the following parts: theoretical derivation, parameter estimation, feature extraction, candidate-set generation and classifier decision.
1 Theory
(1) HMM semantic-recognition confidence based on a statistical language model
$x_1 x_2 \ldots x_n$ --- the character images obtained by segmenting the line image;
$N_{Cand}$ --- the number of recognition candidates that the recognition kernel returns for a single character image; this is a performance parameter of the recognition kernel and hence a constant, independent of the input character image;
$c_{i,j}$ --- the $j$-th candidate recognition result returned by the recognition kernel for the $i$-th character image $x_i$ (the kernel sorts the candidates in ascending order of recognition distance);
The usual post-processing of Chinese character recognition maximizes the posterior probability of the character string given the image sequence, i.e. it takes as the recognition result the string that maximizes the posterior probability:
$$(k_1^*, k_2^*, \ldots, k_n^*) = \arg\max_{k_1, k_2, \ldots, k_n} P(c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n} \mid x_1, x_2, \ldots, x_n)$$
Using Bayes' formula, we have
$$P(c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n} \mid x_1, x_2, \ldots, x_n) = \frac{P(x_1, x_2, \ldots, x_n \mid c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n})\, P(c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n})}{P(x_1, x_2, \ldots, x_n)} \qquad (1)$$
Post-processing for character recognition generally evaluates this formula under the following assumptions:
1. Given the character string, the occurrences of the character images are mutually independent: each image depends only on the character it represents and not on the other characters, so
$$P(x_1, x_2, \ldots, x_n \mid c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n}) = \prod_{i=1}^{n} P(x_i \mid c_{i,k_i}) = \prod_{i=1}^{n} \frac{P(c_{i,k_i} \mid x_i)\, P(x_i)}{P(c_{i,k_i})} \qquad (2)$$
where the second equality follows from Bayes' formula.
2. Without reference to a corpus, each Chinese character is basically equally likely to occur, so the prior probability of each character is taken to be approximately the same, i.e. every character is assumed to have an equal probability of appearing;
3. $P(c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n})$ is assumed to satisfy the bigram model, i.e. each Chinese character is influenced only by the character immediately before it, so
$$P(c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n}) = P(c_{1,k_1}) \prod_{i=2}^{n} P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \qquad (3)$$
From the viewpoint of statistical linguistics a trigram model actually fits reality better; the corresponding formula is
$$P(c_{1,k_1}, c_{2,k_2}, \ldots, c_{n,k_n}) = P(c_{1,k_1})\, P(c_{2,k_2} \mid c_{1,k_1}) \prod_{i=3}^{n} P(c_{i,k_i} \mid c_{i-2,k_{i-2}}, c_{i-1,k_{i-1}}) \qquad (4)$$
Considering time and space complexity, however, the bigram model is sufficient.
Combining (1), (2) and (3) gives
$$P(c_{1,k_1}, \ldots, c_{n,k_n} \mid x_1, \ldots, x_n) = \frac{\prod_{i=1}^{n} P(c_{i,k_i} \mid x_i) \left( P(c_{1,k_1}) \prod_{i=2}^{n} P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right) \prod_{i=1}^{n} P(x_i)}{P(x_1, x_2, \ldots, x_n) \prod_{i=1}^{n} P(c_{i,k_i})} \qquad (5)$$
By assumption 2, $\prod_{i=1}^{n} P(c_{i,k_i})$ is a constant, and $P(x_1, x_2, \ldots, x_n)$ is also a constant once the character image sequence is given, so the formula above shows that
$$P(c_{1,k_1}, \ldots, c_{n,k_n} \mid x_1, \ldots, x_n) \propto \prod_{i=1}^{n} P(c_{i,k_i} \mid x_i) \left( P(c_{1,k_1}) \prod_{i=2}^{n} P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right)$$
(the symbol "$\propto$" means "is proportional to").
OCR post-processing then picks from the recognition candidate set $c_{i,j}$, $1 \le i \le n$, $1 \le j \le N_{Cand}$, the optimal state sequence, where "optimal" means maximizing $\prod_{i=1}^{n} P(c_{i,k_i} \mid x_i) \left( P(c_{1,k_1}) \prod_{i=2}^{n} P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right)$; to a certain extent this quantity reflects the semantic-recognition confidence of the recognition.
If
$$k_1^* k_2^* \ldots k_n^* = \arg\max_{1 \le k_1, k_2, \ldots, k_n \le N_{Cand}} \prod_{i=1}^{n} P(c_{i,k_i} \mid x_i) \left( P(c_{1,k_1}) \prod_{i=2}^{n} P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right),$$
then the optimal candidate sequence $c_{1,k_1^*} c_{2,k_2^*} \ldots c_{n,k_n^*}$ can be chosen.
This is exactly the HMM problem of estimating the optimal state sequence for a given observation sequence. The Viterbi method is the classic algorithm for this problem; its time complexity is $O(n N_{Cand}^2)$, where $N_{Cand}$ is the number of recognition candidates and $n$ is the number of character images after merging.
The Viterbi algorithm is described as follows:
Let $Q[n][N_{Cand}]$ be a two-dimensional array in which $Q[t][j]$ stores the logarithm of the probability of the most likely candidate selection from the first character image up to node $c_{t,j}$, for $1 \le t \le n$, $1 \le j \le N_{Cand}$. A further two-dimensional pointer array $Path[n][N_{Cand}]$ records the computation.
Initialization ($t = 1$, $1 \le j \le N_{Cand}$):
$Path[1][j] = NULL$
$Q[1][j] = \log P(c_{1,j}) + \log P(c_{1,j} \mid x_1)$
Recursion ($2 \le t \le n$, $1 \le j \le N_{Cand}$):
$$Q[t][j] = \max_{1 \le l \le N_{Cand}} \{ Q[t-1][l] + \log P(c_{t,j} \mid c_{t-1,l}) \} + \log P(c_{t,j} \mid x_t)$$
$$l^* = \arg\max_{1 \le l \le N_{Cand}} \{ Q[t-1][l] + \log P(c_{t,j} \mid c_{t-1,l}) \}$$
$Path[t][j]$ points to node $c_{t-1,l^*}$, i.e. the parent node of $c_{t,j}$ is $c_{t-1,l^*}$.
Termination ($t = n$): $j^* = \arg\max_{1 \le j \le N_{Cand}} Q[n][j]$, and the optimal final node is $c_{n,j^*}$.
Backtracking along the path indicated by $Path[n][j^*]$ and outputting each node on the path gives the recognition result of the optimal state sequence, together with the log-probability $Q[n][j^*]$ of the most likely path.
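A minimal Python sketch of the Viterbi recursion just described, assuming the recognition confidences P(c_{t,j}|x_t), the character priors and the bigram transition probabilities are supplied as plain dictionaries and lists; all names and data layouts are illustrative, not taken from the patent.

```python
import math

def viterbi(conf, prior, bigram, cands):
    """conf[t][j]  : P(c_{t,j} | x_t) for image t, candidate j
       prior[c]    : P(c), prior probability of character c
       bigram[a][b]: P(b | a), bigram transition probability
       cands[t][j] : the j-th candidate character of image t"""
    n, ncand = len(cands), len(cands[0])
    Q = [[0.0] * ncand for _ in range(n)]
    path = [[None] * ncand for _ in range(n)]
    for j in range(ncand):                       # initialization, t = 0
        Q[0][j] = math.log(prior[cands[0][j]]) + math.log(conf[0][j])
    for t in range(1, n):                        # recursion
        for j in range(ncand):
            best = max(range(ncand),
                       key=lambda l: Q[t-1][l] + math.log(bigram[cands[t-1][l]][cands[t][j]]))
            Q[t][j] = (Q[t-1][best]
                       + math.log(bigram[cands[t-1][best]][cands[t][j]])
                       + math.log(conf[t][j]))
            path[t][j] = best
    j_star = max(range(ncand), key=lambda j: Q[n-1][j])   # termination
    seq = [j_star]
    for t in range(n - 1, 0, -1):                # backtracking
        seq.append(path[t][seq[-1]])
    seq.reverse()
    return [cands[t][j] for t, j in enumerate(seq)], Q[n-1][j_star]
```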
(2) Fusing the geometric cost with the HMM semantic-recognition cost
$s_1, s_2, \ldots, s_l$ --- the sequence of sub-character images;
$x_1, x_2, \ldots, x_n$ --- the sequence of character images after the sub-character images have been merged;
$N_{Cand}$ --- the number of recognition candidates returned by the recognition kernel for a single character image; as before, a constant;
$c_{i,j}$ --- the $j$-th candidate recognition result returned by the recognition kernel for the $i$-th character image $x_i$ (the kernel sorts the candidates in ascending order of recognition distance);
Unlike post-processing with a known segmentation, the quantity to be maximized here is the joint posterior probability, given the stroke-segment merging result, of the sub-character merging scheme and the recognition result, $P(x_1 x_2 \ldots x_n, c_{1,k_1} c_{2,k_2} \ldots c_{n,k_n} \mid s_1 s_2 \ldots s_l)$.
By the same kind of identity,
$$P(x_1 \ldots x_n, c_{1,k_1} \ldots c_{n,k_n} \mid s_1 \ldots s_l) = P(c_{1,k_1} \ldots c_{n,k_n} \mid x_1 \ldots x_n, s_1 \ldots s_l)\, P(x_1 \ldots x_n \mid s_1 \ldots s_l).$$
Once the merged images are given, the recognition process depends only on the merged images $x_1, x_2, \ldots, x_n$ and not on the stroke-segment merging result, so $s_1 s_2 \ldots s_l$ can be dropped from the condition of the first factor, which simplifies the expression to
$$P(x_1 \ldots x_n, c_{1,k_1} \ldots c_{n,k_n} \mid s_1 \ldots s_l) = P(c_{1,k_1} \ldots c_{n,k_n} \mid x_1 \ldots x_n)\, P(x_1 \ldots x_n \mid s_1 \ldots s_l) \qquad (6)$$
The posterior probability to be maximized thus splits into two parts: the first part is the semantic-recognition confidence discussed above, and the second part is the geometric cost of merging the sub-character images. Substituting (5) into (6) gives
$$P(x_1 \ldots x_n, c_{1,k_1}, \ldots, c_{n,k_n} \mid s_1 \ldots s_l) = \frac{\prod_{i=1}^{n} P(c_{i,k_i} \mid x_i) \left( P(c_{1,k_1}) \prod_{i=2}^{n} P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right) \prod_{i=1}^{n} P(x_i)}{P(x_1, x_2, \ldots, x_n) \prod_{i=1}^{n} P(c_{i,k_i})}\, P(x_1 \ldots x_n \mid s_1 \ldots s_l) \qquad (7)$$
Segmentation, recognition and post-processing are thereby merged under one unified model. In practice, however, different segmentation paths yield different numbers of character images, and the more characters a segmentation contains, the smaller its posterior probability tends to be, so the posterior probabilities of segmentation paths with different character counts cannot be compared directly. To eliminate the influence of the different character counts, the geometric mean of $P(x_1 \ldots x_n, c_{1,k_1}, \ldots, c_{n,k_n} \mid s_1 \ldots s_l)$, i.e.
$$\left( P(x_1 \ldots x_n, c_{1,k_1}, \ldots, c_{n,k_n} \mid s_1 \ldots s_l) \right)^{1/n},$$
is taken as the objective function.
First take the logarithm of (7):
$$\log P(x_1 \ldots x_n, c_{1,k_1}, \ldots, c_{n,k_n} \mid s_1 \ldots s_l) = \sum_{i=1}^{n} \log P(x_i) - \log P(x_1, x_2, \ldots, x_n) - \sum_{i=1}^{n} \log P(c_{i,k_i}) + \log P(x_1 \ldots x_n \mid s_1 \ldots s_l) + \sum_{i=1}^{n} \log P(c_{i,k_i} \mid x_i) + \log P(c_{1,k_1}) + \sum_{i=2}^{n} \log P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \qquad (8)$$
The geometric mean corresponds to the arithmetic mean of (8), i.e. (8) divided by $n$ serves as the cost function:
$$\frac{1}{n} \log P(x_1 \ldots x_n, c_{1,k_1}, \ldots, c_{n,k_n} \mid s_1 \ldots s_l) = \frac{1}{n} \left( \sum_{i=1}^{n} \log P(x_i) - \log P(x_1, x_2, \ldots, x_n) - \sum_{i=1}^{n} \log P(c_{i,k_i}) \right) + \frac{1}{n} \log P(x_1 \ldots x_n \mid s_1 \ldots s_l) + \frac{1}{n} \left( \sum_{i=1}^{n} \log P(c_{i,k_i} \mid x_i) + \log P(c_{1,k_1}) + \sum_{i=2}^{n} \log P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right) \qquad (9)$$
Computing this expression directly is still not easy, and further approximations are needed to simplify the model.
For $P(c_{i,k_i})$: following the assumption made in the post-processing of Chinese character recognition, with the corpus unknown each character is taken to occur in the text with the same prior probability, i.e. $P(c_{i,k_i})$ is a constant, so $\frac{1}{n} \sum_{i=1}^{n} \log P(c_{i,k_i})$ is also a constant and has no influence on the maximization.
For $P(x_1, x_2, \ldots, x_n)$, the probability that the line of character images occurs: the appearance of the character images can be approximated as an independent process, the images in front having essentially no influence on the appearance of the images behind, so $P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i)$, and therefore
$$\frac{1}{n} \sum_{i=1}^{n} \log P(x_i) - \frac{1}{n} \log P(x_1, x_2, \ldots, x_n) = \frac{1}{n} \log \frac{\prod_{i=1}^{n} P(x_i)}{P(x_1, x_2, \ldots, x_n)} = 0.$$
Combining the analysis above, the quantity to be maximized in (9) reduces to
$$T = \frac{1}{n} \left( \sum_{i=1}^{n} \log P(c_{i,k_i} \mid x_i) + \log P(c_{1,k_1}) + \sum_{i=2}^{n} \log P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right) + \frac{1}{n} \log P(x_1 x_2 \ldots x_n \mid s_1 s_2 \ldots s_l) \qquad (10)$$
Let
$$\bar{H} = \frac{1}{n} \left( \sum_{i=1}^{n} \log P(c_{i,k_i} \mid x_i) + \log P(c_{1,k_1}) + \sum_{i=2}^{n} \log P(c_{i,k_i} \mid c_{i-1,k_{i-1}}) \right), \qquad \bar{G} = \frac{1}{n} \log P(x_1 x_2 \ldots x_n \mid s_1 s_2 \ldots s_l).$$
Then $T = \bar{H} + \bar{G}$, and the value of $T$ is the criterion for judging the optimal segmentation: $\bar{H}$ can be regarded as the average semantic-recognition cost of the merging scheme, and $\bar{G}$ as the average geometric cost corresponding to that merging scheme.
Because the geometric cost supplied by the system is a measure of distance, a monotonically decreasing function is chosen to approximate $P(x_1 x_2 \ldots x_n \mid s_1 s_2 \ldots s_l)$:
$$P(x_1 x_2 \ldots x_n \mid s_1 s_2 \ldots s_l) = \lambda e^{-\lambda \left( \frac{g}{\bar{g}} - 1 \right)} \qquad (11)$$
where $\lambda$ is a positive constant, $g$ is the geometric cost of merging the sub-character blocks $s_1 s_2 \ldots s_l$ into the character images $x_1 x_2 \ldots x_n$, and $\bar{g}$ is the minimum geometric cost over all possible merging schemes of $s_1 s_2 \ldots s_l$. Then $\bar{G} = \frac{1}{n} \log \left( \lambda e^{-\lambda (g / \bar{g} - 1)} \right)$, which is called the normalized average geometric cost.
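A small Python sketch of the combined criterion $T = \bar{H} + \bar{G}$ under the exponential mapping (11); the function names and the way each path's geometric cost g, semantic cost and character count are supplied are illustrative assumptions, not definitions from the patent.

```python
import math

def average_geometric_cost(g, g_min, n, lam=1.0):
    """Normalized average geometric cost G_bar of one candidate path (eq. 11);
    g is the path's geometric cost, g_min the minimum over all candidate paths."""
    return math.log(lam * math.exp(-lam * (g / g_min - 1.0))) / n

def combined_cost(h_bar, g, g_min, n, lam=1.0):
    """T = H_bar + G_bar; H_bar is the average semantic-recognition cost,
    e.g. the best Viterbi log-probability of the path divided by n."""
    return h_bar + average_geometric_cost(g, g_min, n, lam)

# candidate paths: (H_bar, geometric cost g, number of merged characters n)
paths = [(-4.2, 130.0, 7), (-3.8, 150.0, 6), (-5.1, 120.0, 8)]
g_min = min(g for _, g, _ in paths)
best = max(paths, key=lambda p: combined_cost(p[0], p[1], g_min, p[2]))
print(best)
```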
2 Parameter estimation
The algorithm requires a series of parameters to be estimated.
First, the prior probabilities of the characters and the inter-character transition probabilities must be estimated for the bigram model. For this purpose a corpus must be collected whose content is consistent with the line images to be segmented, so that the semantic constraints are captured for the target domain: the prior probability of each character and the transition probabilities between characters computed from different domains are different, and if a suitable corpus cannot be selected, the performance of the system suffers considerably.
For example, if the line images to be segmented are handwritten envelope addresses, then an address database must be collected as the corpus.
The following notation is used:
$N_c$ --- the number of times the Chinese character $c$ occurs in the corpus;
$N_{c_1 c_2}$ --- the number of times the two-character string $c_1 c_2$ occurs in the corpus;
$N$ --- the total number of Chinese characters in the corpus;
$P(c)$ --- the probability that the character $c$ occurs in the corpus;
$P(c_2 \mid c_1)$ --- the probability that character $c_2$ immediately follows character $c_1$ in the corpus;
$P_{smooth}(\cdot)$ --- the probability after smoothing;
$M$ --- the number of distinct Chinese characters in the corpus; the national first-level Chinese character standard contains 3755 commonly used characters, so $M = 3755$ is simply taken.
(1) Estimating the character prior, i.e. $P(c)$
Count the total number of occurrences $N_c$ of each character in the corpus and the total character count $N$; then
$$\hat{P}(c) = \frac{N_c}{N}$$
is the estimate of $P(c)$.
(2) Estimating the character transition probabilities
For the bigram model, since $P(c_2 \mid c_1) = \frac{P(c_1 c_2)}{P(c_1)}$, first count the number of occurrences $N_{c_1 c_2}$ of the pair $c_1 c_2$ in the corpus and then use $\frac{N_{c_1 c_2}}{N_{c_1}}$ as the estimate of $P(c_2 \mid c_1)$.
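A brief Python sketch of these counting estimates, assuming the corpus is given as a list of text lines; the function and variable names are illustrative.

```python
from collections import Counter

def estimate_bigram_model(lines):
    """Maximum-likelihood estimates of P(c) and P(c2|c1) from a corpus."""
    unigrams, bigrams = Counter(), Counter()
    for line in lines:
        unigrams.update(line)
        bigrams.update(zip(line, line[1:]))
    total = sum(unigrams.values())
    prior = {c: n / total for c, n in unigrams.items()}          # P(c) = N_c / N
    trans = {(c1, c2): n / unigrams[c1]                          # P(c2|c1) = N_c1c2 / N_c1
             for (c1, c2), n in bigrams.items()}
    return prior, trans
```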
(3) Data smoothing
Another important issue here is data smoothing, because not every Chinese character appears in the corpus, so the prior and transition probabilities of some characters cannot be estimated directly. Various methods for smoothing have been proposed in statistical language modeling. The basic idea of smoothing for sparse data is that when the training data are insufficient, the probabilities are adjusted in some way to obtain more accurate estimates. Since the estimates in (1) and (2) underestimate unseen events as zero, and all probabilities must sum to one, the events that were observed are correspondingly overestimated. Smoothing therefore raises small (or zero) probabilities and lowers large ones, making the distribution more even overall; it not only solves the zero-probability problem but also improves the overall performance of the language model.
Smoothing techniques combine the higher-order n-gram model with a lower-order model, in two ways: back-off and interpolation.
Taking the bigram as an example, the back-off form is
$$P_{smooth}(c_2 \mid c_1) = \begin{cases} P(c_2 \mid c_1) & \text{if } N_{c_1 c_2} > 0 \\ \gamma(c_1)\, P(c_2) & \text{if } N_{c_1 c_2} = 0 \end{cases}$$
That is, if the pair $c_1 c_2$ occurs in the corpus, $P(c_2 \mid c_1)$ is used to approximate the transition probability; otherwise the model backs off to the lower-order model $P(c_2)$, where $\gamma(c_1)$ is a normalizing factor.
Interpolation instead takes a linear combination of the higher-order and lower-order models:
$$P_{smooth}(c_2 \mid c_1) = P(c_2 \mid c_1) + \gamma(c_1)\, P(c_2)$$
The difference between interpolation and back-off is that, when handling probabilities greater than zero, the former uses the information of the lower-order model while the latter does not; what they have in common is that both use the lower-order model when handling zero probabilities. The use of lower-order model information is thus the key point of smoothing techniques.
Commonly used smoothing methods include Jelinek-Mercer smoothing, Katz smoothing, absolute discounting, Witten-Bell smoothing and Kneser-Ney smoothing. The performance of the various methods changes with the size of the corpus, the order of the n-gram model and the content of the corpus, with the size of the training corpus having the largest effect on smoothing performance.
(4) Estimating the character confidence
The confidence reflects how trustworthy the recognition result actually is; it is computed after character recognition has finished.
$x$ --- the character image to be recognized;
$c_j(x)$ --- the $j$-th candidate character returned by the recognition kernel for image $x$ (the kernel sorts the candidates in ascending order of recognition distance, so $c_1(x)$ is the first-choice candidate of image $x$);
$d_j(x)$ --- the recognition distance corresponding to the $j$-th candidate $c_j(x)$ of image $x$, so $d_1(x)$ is the recognition distance of the first-choice candidate of image $x$;
$N_{Cand}$ --- the number of recognition candidates returned by the recognition kernel for a single character image; a constant that depends only on the recognition kernel itself;
$P(c_j(x) \mid x)$ --- the confidence that image $x$ is recognized as $c_j(x)$; this is the quantity to be estimated.
In general the classifier returns, for an image $x$, the recognition candidates $c_1(x), c_2(x), \ldots, c_{N_{Cand}}(x)$ together with the corresponding recognition distances $d_1(x), d_2(x), \ldots, d_{N_{Cand}}(x)$. How to compute $P(c_j(x) \mid x)$, $1 \le j \le N_{Cand}$, is a somewhat involved problem; the following approaches can be considered, and which one to adopt is a compromise among computation time, space complexity and accuracy.
1. Empirical distance formulas
$$P(c_j(x) \mid x) = \frac{1 / d_j(x)}{\sum_{k=1}^{N_{Cand}} 1 / d_k(x)}, \quad 1 \le j \le N_{Cand}, \quad \text{or}$$
$$P(c_j(x) \mid x) = \frac{\dfrac{1}{d_j(x) - d_1(x) + 1}}{\sum_{k=1}^{N_{Cand}} \dfrac{1}{d_k(x) - d_1(x) + 1}}, \quad 1 \le j \le N_{Cand}, \quad \text{or}$$
$$P(c_j(x) \mid x) = \frac{1 / d_j^2(x)}{\sum_{k=1}^{N_{Cand}} 1 / d_k^2(x)}, \quad 1 \le j \le N_{Cand}$$
2. Confidence estimation based on a Gaussian model
$$P(c_j(x) \mid x) = \frac{\exp\left( -\frac{d_j(x)}{\theta} \right)}{\sum_{k=1}^{N_{Cand}} \exp\left( -\frac{d_k(x)}{\theta} \right)} \qquad (\theta \text{ must be estimated})$$
The estimated confidence can be corrected with a confusion matrix (which depends on the recognition kernel itself and must also be estimated); the corrected confidence is obtained from
$$P(c_j(x) \mid x) = \sum_{k=1}^{N_{Cand}} P(c_k(x) \mid x)\, P(c_j(x) \mid c_k(x)).$$
If the confidence is estimated in this way, the variance parameter $\theta$ and the classifier's confusion matrix must be estimated; this is described in the implementation section. In general, the choice of estimation method must weigh the time cost against the storage cost.
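A compact Python sketch of the Gaussian (softmax-style) confidence estimate and the confusion-matrix correction above; theta and the confusion matrix are assumed to have been estimated beforehand, and all names and data layouts are illustrative.

```python
import math

def gaussian_confidence(distances, theta):
    """P(c_j(x)|x) = exp(-d_j/theta) / sum_k exp(-d_k/theta)."""
    weights = [math.exp(-d / theta) for d in distances]
    z = sum(weights)
    return [w / z for w in weights]

def corrected_confidence(conf, cands, confusion):
    """Correct each candidate's confidence with the confusion matrix:
    P'(c_j|x) = sum_k P(c_k|x) * P(recognized c_j | true class c_k)."""
    return [sum(conf[k] * confusion[cands[k]][cands[j]] for k in range(len(cands)))
            for j in range(len(cands))]
```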
3 Feature extraction
Feature extraction first obtains the stroke parameters by analyzing the line image, then extracts the basic stroke segments of the Chinese characters, merges the basic stroke segments into sub-characters, and at the same time produces the geometric-cost scores.
(1) Line-image parameter extraction stage
This stage extracts three parameters:
$w_s$ --- the stroke width;
$\bar{w}_c$ --- the average character width;
$\bar{h}_c$ --- the average character height.
1. Extracting the stroke width $w_s$
The stroke width is the width of the written strokes. First, a histogram analysis is made of the horizontal black runs of the text line (a horizontal black run is a rectangular region of black pixels occupying consecutive positions in the X direction, one pixel high, whose width is the length of the run). The horizontal axis of the histogram is the horizontal black-run length and the vertical axis is the number of horizontal black runs of that length. Let $p$ be the run length with the largest count in the histogram and $hist(p)$ that count (i.e. the maximum ordinate of the histogram is $hist(p)$ and its abscissa is $p$). Then take
$$w_s = \frac{(p-1)\, hist(p-1) + p\, hist(p) + (p+1)\, hist(p+1)}{hist(p-1) + hist(p) + hist(p+1)}.$$
(Fig. 6 shows the histogram analysis of the image in Fig. 5(a), with $p = 4$ and $hist(p) = 690$.)
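A short Python sketch of the stroke-width estimate from the horizontal black-run histogram, assuming the line image is a binary NumPy array with black pixels equal to 1; this is an illustrative reading of the formula above, not the patent's code.

```python
import numpy as np

def horizontal_black_runs(img):
    """Yield the lengths of all horizontal runs of black (1) pixels."""
    for row in img:
        run = 0
        for px in row:
            if px:
                run += 1
            elif run:
                yield run
                run = 0
        if run:
            yield run

def stroke_width(img):
    hist = np.bincount(list(horizontal_black_runs(img)))
    p = int(hist.argmax())                    # most frequent run length
    lo, hi = max(p - 1, 0), min(p + 1, len(hist) - 1)
    lengths = np.arange(lo, hi + 1)
    return float((lengths * hist[lo:hi + 1]).sum() / hist[lo:hi + 1].sum())
```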
2. Estimating the average character width $\bar{w}_c$
The average character width reflects the writing style of the text line and has a direct influence on character segmentation. First project the text-line image in the vertical direction to obtain a projection profile whose abscissa corresponds one-to-one with the abscissa of the text line and whose ordinate is the total number of black pixels in the vertical direction at the corresponding abscissa (Fig. 7 shows the projection profile of Fig. 5(a)). Perform a horizontal black-run analysis along the horizontal axis of the projection profile (ordinate 0) and take the mean length of all those horizontal black runs as the estimate of the average character width. When the character spacing in the text line is so small that the strokes of adjacent characters overlap, the black-run statistics can instead be taken along the horizontal line $y = 2 w_s + 1$ of the projection profile; the mean of those runs gives a better estimate of $\bar{w}_c$.
(Fig. 8 shows the vertical projection profile of Fig. 5(b); this is the case of severe touching.)
3. Extracting the average character height $\bar{h}_c$
Extracting the average character height is comparatively simple: divide the line image in the horizontal direction into several equal parts (generally five), and take the average of the heights of all the small images as the average character height (as in Fig. 9).
(2) Stroke-segment extraction stage
A stroke segment is one of the four basic stroke elements of Chinese characters (horizontal, vertical, left-falling and right-falling); extracting stroke segments effectively overcomes the touching of characters. The extraction method uses black-run tracking: first find a horizontal black run in the image as the start of some stroke segment, then track that horizontal black run downwards row by row until tracking ends, yielding one stroke segment.
Tracking proceeds as follows: in the row below the current horizontal black run, take the horizontal range that covers the horizontal position of the current run extended by one pixel on each side, and find all horizontal black runs within that range; then, according to the average width of the runs already in the segment and the segment direction fitted from those runs, select one of the horizontal black runs to join the run sequence of the existing segment and update the segment's information. This process is described in detail in the implementation.
Figures 10 and 11 show the results of stroke-segment extraction.
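A simplified Python sketch of the run-tracking idea: it collects horizontal black runs row by row and chains a run to a run in the next row that overlaps its one-pixel-widened range. The overlap test and the greedy choice of a single successor are simplifying assumptions; the patent's full method also uses the segment's average run width and fitted direction when choosing the successor.

```python
def runs_by_row(img):
    """runs[y] = list of (x_start, x_end) horizontal black runs in row y."""
    runs = []
    for row in img:
        row_runs, start = [], None
        for x, px in enumerate(row):
            if px and start is None:
                start = x
            elif not px and start is not None:
                row_runs.append((start, x - 1)); start = None
        if start is not None:
            row_runs.append((start, len(row) - 1))
        runs.append(row_runs)
    return runs

def track_segments(img):
    """Greedily chain runs downwards into stroke segments (lists of (y, run))."""
    runs, used, segments = runs_by_row(img), set(), []
    for y0, row in enumerate(runs):
        for r0 in row:
            if (y0, r0) in used:
                continue
            seg, y, cur = [(y0, r0)], y0, r0
            used.add((y0, r0))
            while y + 1 < len(runs):
                lo, hi = cur[0] - 1, cur[1] + 1       # widened search range
                nxt = [r for r in runs[y + 1]
                       if r[0] <= hi and r[1] >= lo and (y + 1, r) not in used]
                if not nxt:
                    break
                cur = nxt[0]; y += 1
                used.add((y, cur)); seg.append((y, cur))
            segments.append(seg)
    return segments
```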
(3) Stroke-segment merging stage
After stroke-segment extraction, the segments must be merged further. Let $R_i$ and $R_j$ be the bounding rectangles of two adjacent stroke segments (see Fig. 12).
$(x_{i,1}, y_{i,1})$ --- the coordinates of the top-left corner of $R_i$;
$(x_{i,2}, y_{i,2})$ --- the coordinates of the bottom-right corner of $R_i$;
$(x_{j,1}, y_{j,1})$ --- the coordinates of the top-left corner of $R_j$;
$(x_{j,2}, y_{j,2})$ --- the coordinates of the bottom-right corner of $R_j$;
$D_H(R_i, R_j)$ --- the horizontal distance between the right side of $R_i$ and the left side of $R_j$ (note that the sign of $D_H(R_i, R_j)$ indicates direction: in Fig. 12 the left edge of $R_i$ lies to the right of the left edge of $R_j$, so the value is negative; conversely, if the left edge of $R_i$ lies to the left of the left edge of $R_j$, the value is positive);
$D_V(R_i, R_j)$ --- the vertical distance between the bottom side of $R_i$ and the top side of $R_j$ (again the sign indicates direction: in Fig. 12 the bottom edge of $R_i$ lies below the top edge of $R_j$, so the value is negative; conversely, if the bottom edge of $R_i$ lies above the top edge of $R_j$, the value is positive);
$width(R_i)$ --- the width of $R_i$;
$width(R_j)$ --- the width of $R_j$.
Stroke segments are merged according to the following rules (see the sketch after this list):
1. If $R_i$ and $R_j$ satisfy, in the horizontal direction, that $R_i$ contains $R_j$ or $R_j$ contains $R_i$, merge segments $i$ and $j$;
2. If $R_i$ and $R_j$ satisfy $D_H(R_i, R_j) < 0$ (i.e. the left edge of $R_i$ lies to the right of the left edge of $R_j$) and $\frac{-D_H(R_i, R_j)}{width(R_i)} > T_1$ or $\frac{-D_H(R_i, R_j)}{width(R_j)} > T_1$, merge segments $i$ and $j$, where $T_1$ is a predefined threshold, generally 0.7;
3. If $R_i$ and $R_j$ satisfy $D_H(R_i, R_j) < 0$ and both $\frac{-D_H(R_i, R_j)}{width(R_i)} > T_2$ and $\frac{-D_H(R_i, R_j)}{width(R_j)} > T_2$, merge segments $i$ and $j$, where $T_2$ is a predefined threshold, generally 0.5.
The algorithm is described in detail in the implementation section; Fig. 13 shows the stroke-segment merging result for Fig. 11.
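An illustrative Python check of the three merge rules, assuming each bounding rectangle is given as (x1, y1, x2, y2) and that D_H is computed as the signed gap between the right edge of the left-hand rectangle and the left edge of the other; that sign convention is an assumption consistent with the description above.

```python
def should_merge(ri, rj, t1=0.7, t2=0.5):
    """ri, rj: bounding boxes (x1, y1, x2, y2) of two adjacent stroke segments."""
    wi, wj = ri[2] - ri[0], rj[2] - rj[0]
    # rule 1: horizontal containment
    if (ri[0] <= rj[0] and rj[2] <= ri[2]) or (rj[0] <= ri[0] and ri[2] <= rj[2]):
        return True
    d_h = rj[0] - ri[2]          # signed horizontal gap (negative = overlap)
    if d_h >= 0:
        return False
    overlap = -d_h
    # rule 2: large overlap relative to either rectangle
    if overlap / wi > t1 or overlap / wj > t1:
        return True
    # rule 3: moderate overlap relative to both rectangles
    return overlap / wi > t2 and overlap / wj > t2
```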
(4) Sub-character scoring
The result of stroke-segment merging, ordered from left to right, is denoted $s_1, s_2, \ldots, s_N$; these are the sub-character images after stroke-segment merging. To complete the segmentation, these sub-character images must in turn be merged appropriately.
The geometric cost of merging $(s_k, s_{k+1}, \ldots, s_{k+n_k-1})$ is assessed as follows.
$w_k$ --- the width of the character image obtained by merging the sub-character images $s_k, s_{k+1}, \ldots, s_{k+n_k-1}$ (Fig. 21);
$h_k$ --- the height of the character image obtained by merging the sub-character images $s_k, s_{k+1}, \ldots, s_{k+n_k-1}$ (Fig. 21);
1. Character width
If the width of $(s_k, s_{k+1}, \ldots, s_{k+n_k-1})$ is $w_k$ and the average character width of the text line is $\bar{w}_c$ (estimated in step 3-2), then
$$S(w_k) = \begin{cases} a \left( \frac{w_k}{\bar{w}_c} - 1 \right)^2 & \text{if } \frac{w_k}{\bar{w}_c} > 1 \\ b \left( \frac{w_k}{\bar{w}_c} - 1 \right)^2 & \text{if } \frac{w_k}{\bar{w}_c} \le 1 \end{cases}$$
where $a$ and $b$ are predefined parameters, taken as $a = 100$, $b = 400$.
2. Character aspect ratio
$$S(r) = \min \left\{ c \left( \frac{r}{\bar{r}} - 1 \right)^2, 100 \right\}, \quad \text{generally } c = 100,$$
where $r$ is the aspect ratio of the character $(s_k, s_{k+1}, \ldots, s_{k+n_k-1})$, $r = \frac{w_k}{h_k}$; $\bar{r}$ is the average aspect ratio of the characters, $\bar{r} = \frac{\bar{w}_c}{\bar{h}_c}$; $\bar{w}_c$ is the average character width of the text line and $\bar{h}_c$ the average character height, both taken from the line-parameter estimates above.
3. Distance between the sub-characters inside a character
The distances between the sub-characters inside a character reflect how tightly the sub-characters in the character combine. In ordinary handwriting the sub-characters belonging to the same Chinese character are relatively close together, whereas two sub-characters that do not belong to the same character are relatively far apart.
However, a distance measure based only on the sub-characters' bounding rectangles cannot fully reflect the affinity between sub-characters, so better distance measures must be adopted. Three different measures are used:
i. Horizontal bounding-rectangle distance: the horizontal distance between the sub-characters' bounding rectangles (the areas enclosed by the rectangles may overlap), as in Fig. 14(b);
ii. Euclidean distance: for a pixel 1 at $(x_1, y_1)$ and a pixel 2 at $(x_2, y_2)$, the Euclidean distance between the two pixels is defined as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. The Euclidean distance between sub-character A and sub-character B is defined as the minimum of the Euclidean distances over all pairs of black pixels, one in A and one in B;
iii. Average run distance: as shown in Fig. 14(d), the mean of the lengths of all the horizontal (white) runs between the two sub-characters is taken as the average run distance.
Based on the three distances above, the internal distance of $(s_k, s_{k+1}, \ldots, s_{k+n_k-1})$ is defined as $d_{in} = \sum_{i=k}^{k+n_k-2} d_{i,i+1}$, where the distance between sub-characters $s_i$ and $s_j$ is $d_{i,j} = \sum_n \gamma_n d_{ij}^n$, i.e. the weighted sum of the three distances above with weighting coefficients $\gamma_n$; in general taking all $\gamma_n = 1$ is sufficient.
The internal-distance score is defined as
$$S(d_{in}) = \begin{cases} 0 & \text{if } d_{in} < 0 \\ 100 & \text{if } d_{in} > \frac{w_k}{4} \text{ or } d_{in} > \frac{\bar{w}_c}{2} \\ \frac{400\, d_{in}}{w_k} & \text{otherwise} \end{cases}$$
(Fig. 14(a) shows the original sub-character images, Fig. 14(b) the horizontal bounding-rectangle distance between sub-characters, Fig. 14(c) the Euclidean distance between sub-characters, and Fig. 14(d) the run distance.)
4. Distance to the characters before and after
In a line of handwritten text, the distance between the sub-characters inside a character is generally smaller than the distance between sub-characters of different characters; therefore the distances between the character and the sub-characters of the characters before and after it are also assessed and scored.
Suppose the distances between the character $(s_k, s_{k+1}, \ldots, s_{k+n_k-1})$ and the sub-characters before and after it are $D_L$ and $D_R$ respectively:
$D_L$ --- the horizontal distance between the bounding rectangle of sub-character $s_k$ and that of sub-character $s_{k-1}$;
$D_R$ --- the horizontal distance between the bounding rectangle of sub-character $s_{k+n_k-1}$ and that of sub-character $s_{k+n_k}$;
If $k = 1$, then $D_L = \infty$; if $k + n_k - 1 = N$, then $D_R = \infty$. Finally take $D = \min(D_L, D_R)$.
The score for the character's neighbor distance is $S(D)$:
$$S(D) = \begin{cases} 100 & \text{if } D < -\bar{w}_c \\ \dfrac{25\, \bar{w}_c + 100\, \bar{D} - 75\, D}{\bar{D} + \bar{w}_c} & \text{if } -\bar{w}_c \le D \le \bar{D} \\ \dfrac{25\, (D_{max} - D)}{D_{max} - \bar{D}} & \text{if } \bar{D} \le D \le D_{max} \\ 0 & \text{if } D > D_{max} \end{cases}$$
where $\bar{w}_c$ is the average character width in the text line, and $\bar{D}$ and $D_{max}$ are respectively the average and the maximum horizontal distance between sub-character bounding rectangles in the text line.
5. Connectivity estimation of the sub-characters
Define the connectivity $C_{ij}$ between sub-character $s_i$ and sub-character $s_j$ (the defining formula appears only as an image in the source and is not reproduced here).
The connectivity of the character $(s_k, s_{k+1}, \ldots, s_{k+n_k-1})$ is represented by three quantities:
$C_I$ --- the internal connectivity, i.e. the degree of connection among the sub-characters inside the character;
$C_L$ --- the left connectivity, i.e. the connectivity between sub-character $s_k$ and sub-character $s_{k-1}$;
$C_R$ --- the right connectivity, i.e. the connectivity between sub-character $s_{k+n_k-1}$ and sub-character $s_{k+n_k}$;
$$C_I = \begin{cases} \dfrac{1}{n_k - 1} \sum_{i=k}^{k+n_k-2} C_{i,i+1} & n_k > 1 \\ 1 & n_k = 1 \end{cases} \qquad C_L = \begin{cases} C_{k,k-1} & k > 1 \\ 1 & k = 1 \end{cases} \qquad C_R = \begin{cases} C_{k+n_k-1,\, k+n_k} & k + n_k - 1 < N \\ 1 & k + n_k - 1 = N \end{cases}$$
The connectivity score is
$$S(C) = 100 \times \left[ 1 - \frac{1}{2} \left( 1 + C_I - \frac{C_R + C_L}{2} \right) \right].$$
The overall score is the weighted average of the five scores above:
$$S = \frac{\sum_i \alpha_i S_i}{\sum_i \alpha_i}.$$
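A condensed Python sketch of the overall geometric scoring of one candidate merge, wiring together the width score, the aspect-ratio score and the weighted average of all five partial scores described above; the equal default weights are an assumption in line with the text, and the remaining partial scores are assumed to be computed elsewhere and passed in.

```python
def geometric_cost(scores, weights=None):
    """Weighted average S = sum(alpha_i * S_i) / sum(alpha_i) of the five
    partial scores (width, aspect ratio, internal distance, neighbor
    distance, connectivity); lower scores mean a more plausible character."""
    weights = weights or [1.0] * len(scores)
    return sum(a * s for a, s in zip(weights, scores)) / sum(weights)

def width_score(w_k, w_avg, a=100.0, b=400.0):
    """S(w_k) from item 1: penalize widths far from the line's average width."""
    ratio = w_k / w_avg
    return (a if ratio > 1 else b) * (ratio - 1.0) ** 2

def aspect_score(w_k, h_k, w_avg, h_avg, c=100.0):
    """S(r) from item 2, capped at 100."""
    r, r_avg = w_k / h_k, w_avg / h_avg
    return min(c * (r / r_avg - 1.0) ** 2, 100.0)
```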
4 Candidate set generation
Current off-line Chinese character segmentation methods can basically cut apart touching characters with geometric techniques, but the number of units obtained after cutting exceeds the actual number of characters, so the pieces must be merged, and the final result is obtained by a suitable merging algorithm. Traditional segmentation methods based on geometric features simply take the segmentation of minimum geometric cost computed over the generated geometric costs by a shortest-path algorithm. From the viewpoint of this invention, however, geometric cost is not the best criterion for judging a segmentation, and relying purely on classifier confidence is also unreliable; the contextual constraints must be considered, and the model that reflects these semantic constraints is the hidden Markov model (HMM) introduced above. Since the final result is given by combining the geometric cost and the semantic-recognition cost, a candidate set of segmentations must be generated according to some criterion; the geometric cost and the semantic-recognition cost of each element of the candidate set are computed, and the optimal segmentation result is obtained by combining them.
For the $N$ segmented sub-character images $s_1, s_2, \ldots, s_N$, a directed graph $(V, E)$ is built with $N + 1$ nodes labeled $Node_1, Node_2, Node_3, \ldots, Node_{N+1}$, i.e. $V = \{Node_1, Node_2, Node_3, \ldots, Node_{N+1}\}$. For any $Node_i$ there are directed edges from $Node_i$ to $Node_{i+1}, Node_{i+2}, Node_{i+3}, \ldots$; the edge $Node_i \to Node_{i+j}$ corresponds to merging blocks $i, i+1, \ldots, i+j-1$, i.e. merging the sub-character images $s_i, s_{i+1}, \ldots, s_{i+j-1}$, and the cost of the edge is the geometric cost of that merge. Every path in the segmentation graph from the start node $Node_1$ to the end node $Node_{N+1}$ corresponds to one way of merging the sub-character images $s_1, s_2, \ldots, s_N$, i.e. to one segmentation of the line image, and is therefore called a segmentation path. For example, strokes are extracted from the line image of Fig. 16(a) and merged into sub-characters; Fig. 17(a) shows the corresponding sub-character blocks, and Fig. 20 shows the segmentation graph built from them, where each arc represents the merging of several sub-characters.
Traditional methods directly search this segmentation graph for the optimal path from start to end and take it as the segmentation. Here, instead, a "neighborhood" of that optimal solution is searched: if the geometric cost function correctly reflects the cohesion between the sub-characters, then even when the segmentation of optimal geometric cost is not the correct solution, it should differ little from the correct segmentation. The search is therefore extended to the sub-optimal paths of the segmentation graph, and once a sub-optimal path is given, the semantic-recognition confidence based on the bigram model is extracted and used to "evaluate" the candidate segmentation.
The method used to search for sub-optimal paths is the K-shortest-path algorithm, which computes the K best segmentation paths of the segmentation graph in ascending order of path cost (every path runs from the start node $Node_1$ to the end node $Node_{N+1}$).
$N_{Node}$ --- the number of nodes of the graph, $N_{Node} = N + 1$, where $N$ is the number of segmented sub-character images;
$N_{Edge}$ --- the number of edges;
$Start$ --- the start node (i.e. $Node_1$);
$End$ --- the end node (i.e. $Node_{N+1}$);
$K$ --- the number of optimal paths to compute;
$\pi_k(v)$ --- the path from the start node $Start$ to node $v$ whose total cost ranks $k$-th, where $v$ ranges over the node set $Node_1, Node_2, Node_3, \ldots, Node_{N+1}$ and the paths reaching $v$ are ordered by ascending cost, so $\pi_1(v)$ is the shortest path from the start node $Start$ to node $v$; taking $v = Node_{N+1}$, $\pi_1(Node_{N+1})$ represents the shortest path from start to end;
$\Gamma^{-1}(v)$ --- the set of predecessor nodes of $v$, i.e. the set of nodes that may connect to $v$; for any $u \in \Gamma^{-1}(v)$ the edge $u \to v$ exists;
$a \cdot b$ --- the concatenation of two paths, where the end of path $a$ is the start of path $b$; the concatenated path $a \cdot b$ starts at the start of path $a$ and ends at the end of path $b$;
$C[v]$ --- the candidate path set of node $v$.
The algorithm then proceeds as follows:
First, compute $\pi_1(v)$ for all $v \in V$, i.e. compute the shortest path from the start node $Start$ to each node.
For each $v \in V$, compute $\pi_k(v)$, $2 \le k \le K$, recursively. Suppose $\pi_1(v), \pi_2(v), \ldots, \pi_{k-1}(v)$ have been computed; they are used to compute $\pi_k(v)$ as follows.
If $k = 2$, initialize the candidate path set $C[v]$: for each element $u$ of the predecessor set $\Gamma^{-1}(v)$, find the shortest path from the start node $Start$ to node $u$ and construct the new path $\pi_1(u) \cdot v$, adding it to the candidate path set of $v$, i.e. $C[v] \leftarrow \{ \pi_1(u) \cdot v \mid u \in \Gamma^{-1}(v) \}$;
If $k > 2$, consider the path $\pi_{k-1}(v)$ and let $u_0$ be its predecessor of $v$, i.e. $\pi_{k-1}(v)$ reaches $v$ through node $u_0$. It can be proved that there is an integer $l$ with $1 \le l \le k-1$ such that the portion of $\pi_{k-1}(v)$ from the start node $Start$ to $u_0$ coincides with $\pi_l(u_0)$; that is, $\pi_{k-1}(v) = \pi_l(u_0) \cdot v$, the path $\pi_l(u_0)$ extended from its endpoint $u_0$ to node $v$. For this integer $l$, compute $\pi_{l+1}(u_0)$.
Then remove the path $\pi_l(u_0) \cdot v$ from the candidate path set $C[v]$ and add $\pi_{l+1}(u_0) \cdot v$ to it, i.e. $C[v] \leftarrow (C[v] - \{ \pi_l(u_0) \cdot v \}) \cup \{ \pi_{l+1}(u_0) \cdot v \}$; then take the shortest path in $C[v]$, which can be proved to be $\pi_k(v)$.
Applying the algorithm above recursively yields the K best segmentation paths; it can be proved that the worst-case time complexity of the algorithm is bounded in terms of $N_{Edge}$, the number of arcs, and $N_{Node}$, the number of nodes.
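A compact Python sketch of generating the K lowest-cost segmentation paths of the segmentation graph. For brevity it uses a simple best-first enumeration over partial paths rather than the recursive candidate-set formulation above; it is an illustrative equivalent for this acyclic graph, not the patent's algorithm. Edges are assumed to be given as a dict mapping a node to its outgoing (next_node, geometric_cost) pairs.

```python
import heapq

def k_best_paths(edges, start, end, k):
    """Return up to k (cost, [nodes...]) paths from start to end in ascending
    order of total geometric cost, enumerated best-first."""
    heap = [(0.0, [start])]       # (cost so far, partial path)
    results = []
    while heap and len(results) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == end:
            results.append((cost, path))
            continue
        for nxt, edge_cost in edges.get(node, []):
            heapq.heappush(heap, (cost + edge_cost, path + [nxt]))
    return results

# toy segmentation graph over 4 sub-characters (nodes 1..5); each edge
# (i -> j) merges sub-characters i..j-1 with the given geometric cost
edges = {1: [(2, 10.0), (3, 4.0)], 2: [(3, 9.0), (4, 3.0)],
         3: [(4, 8.0), (5, 2.0)], 4: [(5, 7.0)]}
print(k_best_paths(edges, 1, 5, 3))
```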
5 Decision
For an input line image, the sub-characters are first extracted and the corresponding geometric cost estimates are produced; the procedure of section 4 is then used to generate several candidate sub-character merging schemes according to the geometric costs.
For each candidate scheme, the Viterbi algorithm is used to compute the semantic-recognition cost of that way of merging the sub-characters.
Following the derivation above, for the best K paths of each line image the normalized average geometric cost $\bar{G}_k$ and the average semantic-recognition cost $\bar{H}_k$ of each path are computed; letting $T_k = \bar{H}_k + \bar{G}_k$, the candidate path with the largest $T$ value is chosen as the final segmentation. Note that when the segmentation is decided, the recognition result of each image and the post-processing result are produced at the same time.
Description of drawings
Fig. 1(a): for the "sample library of line images with correct character segmentation", the samples are divided into two parts: one part is used as training samples to compute the parameters, and the other part is used as test samples to evaluate the performance of the method of the invention;
Fig. 1(b): from the "corpus of the field to which the objects to be segmented belong", the semantic constraints are computed, i.e. the prior probabilities of the characters and the inter-character transition probabilities (see step 2-1 of the implementation);
Fig. 1(c): from the "single-character image sample library of off-line handwritten Chinese characters", the variance parameter θ and the confusion matrix are estimated; they are used to estimate the recognition confidence of each recognized character (see steps 2-2-1 and 2-2-2 of the implementation);
Fig. 2(a): the process of computing the parameter λ that fuses the geometric cost with the semantic-recognition cost (see step 9-1 of the implementation);
Fig. 2(b): the process of segmenting off-line handwritten Chinese characters with the method of the invention;
Fig. 3: documents of handwritten Chinese characters are scanned into the computer and stored as images;
Fig. 4: the scanned document image must be denoised and binarized; since the invention processes line images, the text lines containing Chinese characters must also be extracted from the document image as the objects on which the invention operates;
Fig. 5(a): an example of an extracted line image in which the characters are fairly well separated;
Fig. 5(b): an example of an extracted line image with severe touching between characters;
Fig. 6: the histogram analysis of the character stroke width of the image in Fig. 5(a); horizontal black runs of length 4 occur most often, so in step 3-1 of the implementation p = 4 and the corresponding hist(p) = 690;
Fig. 7: the vertical projection profile of the black pixels of Fig. 5(a) (see step 3-2 of the implementation);
Fig. 8: the vertical projection profile of the black pixels of Fig. 5(b) (see step 3-2 of the implementation);
Fig. 9: the estimation process of the average character height $\bar{h}_c$ in step 3-3 of the implementation;
Figs. 10 and 11: results of stroke-segment extraction; each small rectangle encloses one extracted stroke segment (see step 4 of the implementation).
Fig. 12: $R_i$ and $R_j$ are the bounding rectangles of two adjacent stroke segments; the figure marks the horizontal distance $D_H(R_i, R_j)$ between the right side of $R_i$ and the left side of $R_j$, and the vertical distance $D_V(R_i, R_j)$ between the bottom side of $R_i$ and the top side of $R_j$ (see step 5 of the implementation);
Fig. 13: the result of merging the stroke segments of Fig. 11 into sub-characters (see step 5 of the implementation);
Fig. 14(a)-(d): Fig. 14(a) shows the original sub-character images, Fig. 14(b) the horizontal bounding-rectangle distance between sub-characters, Fig. 14(c) the Euclidean distance between sub-characters, and Fig. 14(d) the horizontal run distance between sub-characters;
Fig. 15: the score for the neighbor distance D of a character, i.e. the mapping from D to S(D) (see step 6-4 of the implementation);
Figs. 16(a)-(c): line images to be segmented, extracted from envelopes;
Figs. 17(a)-(c): the sub-characters obtained from Figs. 16(a)-(c) by stroke-segment extraction and merging; each sub-character is enclosed by a bounding rectangle;
Figs. 18(a)-(c): the results obtained from Figs. 17(a)-(c) by scoring with the geometric cost, taking the sub-character merge of optimal geometric cost and merging the sub-character blocks accordingly;
Figs. 19(a)-(c): the results obtained from Figs. 17(a)-(c) by merging the sub-character blocks according to the method described in this application;
Fig. 20: the segmentation graph built from the sub-character segmentation result of Fig. 17(a) (see step 7-1 of the implementation);
Fig. 21: the general process of sub-character merging (see step 6 of the implementation);
Fig. 22: explains step 7-3 of the implementation, a step that avoids repeatedly recognizing identical character images;
Specific implementation method
The invention is characterized in that it is implemented by an image acquisition device and a computer connected to it, and comprises the following steps carried out in sequence:
Step 1: collect with the image acquisition device enough training samples for the following purposes, and build the corresponding libraries
● a single-character image sample library of off-line handwritten Chinese characters;
● a sample library of line images with correct character segmentation (see Fig. 1a): the correct segmentation of each extracted line-image sample is labeled in advance, and the samples are then divided into two parts, one part used as training samples to compute the parameters and the other part used as test samples to evaluate the performance of the method described in this application;
● a corpus of the field to which the objects to be segmented belong;
Step 2: parameter estimation
Steps 2-1 and 2-2 are carried out on the given "corpus of the field related to the line images to be segmented" and are used to compute the semantic constraints of the field related to the samples to be segmented (Fig. 1b). The following notation is used:
$N_c$ --- the number of times the Chinese character $c$ occurs in the corpus;
$N_{c_1 c_2}$ --- the number of times the two-character string $c_1 c_2$ occurs in the corpus;
$N$ --- the total number of Chinese characters in the corpus;
$P(c)$ --- the probability that the character $c$ occurs in the corpus;
$P(c_2 \mid c_1)$ --- the probability that character $c_2$ immediately follows character $c_1$ in the corpus;
$P_{smooth}(\cdot)$ --- the probability after smoothing;
$M$ --- the number of distinct Chinese characters in the corpus; the national first-level Chinese character standard contains 3755 commonly used characters, so $M = 3755$ can simply be taken;
Step 2-1: estimate on the corpus $P(c)$, the prior probability that character $c$ occurs, and also the inter-character transition probability $P(c_2 \mid c_1)$:
Step 2-1-1: $P(c) = \frac{N_c}{N}$, where $N_c$ is the counted total number of occurrences of each character in the corpus;
Step 2-1-2: for the bigram model, $P(c_2 \mid c_1) = \frac{N_{c_1 c_2}}{N_{c_1}}$, where $N_{c_1 c_2}$ is the counted number of occurrences of the pair $c_1 c_2$ in the corpus;
Step 2-1-3: the following simple procedure is adopted to smooth the data for the bigram model:
$$P_{smooth}(c_2 \mid c_1) = \begin{cases} P(c_2 \mid c_1) & \text{if } N_{c_1 c_2} > 0 \\ \varepsilon & \text{if } N_{c_1 c_2} = 0 \text{ and } N_{c_2} = 0 \\ \frac{1}{M} & \text{if } N_{c_1 c_2} = 0 \text{ and } N_{c_2} > 0 \end{cases}$$
where $\varepsilon$ is a very small positive number given in advance, taken as $\varepsilon = 10^{-9}$;
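A direct Python rendering of this simple smoothing rule; `unigrams` and `bigrams` are assumed to be Counter-style dictionaries of character and character-pair counts, as built inside the earlier bigram-estimation sketch.

```python
def p_smooth(c1, c2, unigrams, bigrams, m=3755, eps=1e-9):
    """Smoothed bigram probability P_smooth(c2 | c1) as in step 2-1-3."""
    if bigrams.get((c1, c2), 0) > 0:
        return bigrams[(c1, c2)] / unigrams[c1]   # P(c2|c1) = N_c1c2 / N_c1
    if unigrams.get(c2, 0) == 0:                  # c2 never seen at all
        return eps
    return 1.0 / m                                # pair unseen, c2 seen
```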
Step 2-2 is computed on the single-character handwritten Chinese character image samples of the "single-character image sample library of off-line handwritten Chinese characters" (Fig. 1c), where the correct character corresponding to each sample image is known. The following notation is used:
$N_{Sample}$ --- the number of image samples in the off-line handwritten Chinese character image sample library;
$x_i$ --- the $i$-th sample image;
$d_j(x_i)$ --- the recognition distance of the $j$-th candidate character returned by the recognition kernel for the $i$-th sample image $x_i$; the kernel sorts the candidates in ascending order of recognition distance, so $d_1(x_i)$ is the recognition distance of the first-choice candidate of the $i$-th sample image $x_i$;
$N_{Cand}$ --- the number of recognition candidates returned by the recognition kernel for a single character image; this is a performance parameter of the kernel and hence a constant, independent of the input character image;
$L_i$ --- the position at which the correct character of the $i$-th character image $x_i$ appears in its recognition candidate set, the candidates being sorted in ascending order of recognition distance;
Step 2-2-1: compute the variance parameter of the off-line handwritten Chinese character recognition kernel, denoted θ
First, recognize every image sample in the "image sample library of isolated off-line handwritten Chinese characters" of step 1, obtaining for each image its N_Cand recognition candidates and the corresponding recognition distances;
From the distances returned by the kernel, compute for the i-th sample image x_i the difference y_ij = d_j(x_i) − d_1(x_i) between the distance of the j-th candidate and that of the first-choice candidate. Then minimize the following expression to obtain the estimate of θ:
$E = \frac{1}{2N_{sample}}\sum_{i=1}^{N_{sample}}\left\{\sum_{j=2}^{L_i}\left[\exp\left(-\frac{y_{ij}}{\theta}\right)-1\right]^2+\sum_{j=L_i+1}^{N_{Cand}}\exp\left(-\frac{2y_{ij}}{\theta}\right)\right\}$
The minimization may be done exhaustively: take 10000 points between 0 and 100 (0.01, 0.02, 0.03, ..., 99.9, 100), substitute each as θ into the expression above, and take as the estimate the θ giving the smallest E;
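A sketch of this exhaustive search in Python, assuming the distance differences y_ij and the rank L_i of the correct character are already available for every training image; the array layout is an assumption.

```python
import numpy as np

def estimate_theta(y, L, n_cand, grid=np.arange(0.01, 100.01, 0.01)):
    """y[i]: array of length N_Cand with y[i][j-1] = d_j(x_i) - d_1(x_i);
    L[i]: 1-based rank of the correct character of sample i in its candidate list.
    Returns the grid value of theta that minimises the criterion E of step 2-2-1."""
    best_theta, best_E = None, float("inf")
    for theta in grid:
        E = 0.0
        for yi, Li in zip(y, L):
            e = np.exp(-np.asarray(yi, dtype=float) / theta)
            E += np.sum((e[1:Li] - 1.0) ** 2)       # candidates ranked 2 .. L_i
            E += np.sum(e[Li:n_cand] ** 2)          # candidates ranked L_i+1 .. N_Cand
        E /= 2.0 * len(y)
        if E < best_E:
            best_E, best_theta = E, theta
    return best_theta
```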
To compute the confusion matrix of the recognition kernel, the following notation is used:
ω_input(x) --- the true class of image x;
c_j(x) --- the j-th candidate recognition result for image x returned by the kernel; the candidates are sorted in ascending order of recognition distance, so c_1(x) is the first-choice candidate of x;
{ω_input(x) = ω} --- the set of image samples whose true character is ω;
#{ω_input(x) = ω} --- the number of image samples whose true character is ω;
Step 2-2-2: compute the confusion matrix
The confusion matrix is an M × M matrix, where M is the number of distinct Chinese characters in the corpus; the national standard level-1 character set contains 3755 characters, so we may set M = 3755. If all characters are arranged in an arbitrary but fixed order char_1, char_2, ..., char_M, then the element in row α and column β of the confusion matrix is P(char_β|char_α), the probability that a sample whose true class is char_α is recognized by the kernel as char_β;
The confusion probability matrix of the recognition kernel is computed from
$P(char_\beta|char_\alpha)=\frac{1}{\#\{\omega_{input}(x)=char_\alpha\}}\sum_{x\in\{\omega_{input}(x)=char_\alpha\}}\sum_{j=1}^{N_{Cand}}P(c_j(x)=char_\beta|x)$
where #{ω_input(x) = char_α} is the number of image samples whose true character is char_α;
$P(c_j(x)=char_\beta|x)=\frac{\exp(-d_j(x)/\theta)}{\sum_{i=1}^{N_{Cand}}\exp(-d_i(x)/\theta)}$ is the confidence, given by the recognition kernel, that image x is recognized as char_β;
and $\sum_{x\in\{\omega_{input}(x)=char_\alpha\}}\sum_{j=1}^{N_{Cand}}P(c_j(x)=char_\beta|x)$ is the sum of the recognition confidences for char_β over all images whose true character is char_α and whose candidate set contains char_β;
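A sketch of the confusion-matrix computation of step 2-2-2, assuming the kernel's candidate labels and distances for each training image are already stored; the helper names are assumptions.

```python
import numpy as np

def confusion_matrix(samples, char_to_idx, theta):
    """samples: iterable of (true_char, cand_chars, cand_dists) for each training image,
    where cand_chars / cand_dists are its N_Cand recognition candidates and distances.
    Returns an M x M matrix P with P[alpha, beta] ~ P(char_beta | char_alpha)."""
    M = len(char_to_idx)
    P = np.zeros((M, M))
    count = np.zeros(M)                       # #{omega_input(x) = char_alpha}
    for true_char, cands, dists in samples:
        a = char_to_idx[true_char]
        count[a] += 1
        conf = np.exp(-np.asarray(dists, dtype=float) / theta)
        conf /= conf.sum()                    # Gaussian-model confidence of each candidate
        for ch, p in zip(cands, conf):
            P[a, char_to_idx[ch]] += p        # accumulate confidence mass on char_beta
    return P / np.maximum(count, 1)[:, None]
```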
Besides the four items above, we also need to estimate the parameter λ that fuses the geometric cost and the semantic-recognition cost (Fig. 2a); it is computed from the line-image training samples, and its estimation is described in the last part;
Step 3: line-image parameter extraction
This step extracts the parameters of the line image, namely the stroke width, the average character width and the average character height; the following parameters are estimated:
w_s --- the stroke width;
\bar{w}_c --- the average character width;
\bar{h}_c --- the average character height;
Step 3-1: estimating the stroke width w_s, i.e. the width of the handwriting
First, a histogram analysis of the horizontal black runs of the text line is performed. A horizontal black run is a rectangular region occupied by consecutive black pixels in the horizontal direction; its height is one pixel and its width is the run length. The horizontal axis of the histogram is the horizontal black run length, and the vertical axis is the number of horizontal black runs of that length. Let p be the run length with the largest count in the histogram and hist(p) that count; that is, the maximum of the histogram is hist(p) and its abscissa is p.
Then take
$w_s=\frac{(p-1)\,hist(p-1)+p\,hist(p)+(p+1)\,hist(p+1)}{hist(p-1)+hist(p)+hist(p+1)}$
(Fig. 6 shows the histogram analysis of the image of Fig. 5(a), with p = 4 and hist(p) = 690.)
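A sketch of the stroke-width estimate of step 3-1 in Python on a binary image; the convention that 1 marks a black pixel is an assumption.

```python
import numpy as np

def stroke_width(img):
    """img: 2-D 0/1 numpy array of the text line, 1 = black pixel."""
    hist = {}
    for row in img:
        run = 0
        for v in np.append(row, 0):           # trailing 0 flushes the last run
            if v:
                run += 1
            elif run:
                hist[run] = hist.get(run, 0) + 1
                run = 0
    p = max(hist, key=hist.get)                # most frequent horizontal run length
    num = sum(q * hist.get(q, 0) for q in (p - 1, p, p + 1))
    den = sum(hist.get(q, 0) for q in (p - 1, p, p + 1))
    return num / den
```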
Step 3-2: estimating the average character width \bar{w}_c
The average character width reflects the writing style of the text line and directly influences character segmentation. First, project the line image vertically to obtain a projection profile whose abscissa corresponds one-to-one with the abscissa of the text line and whose ordinate is the total number of black pixels in the corresponding column (Fig. 7 shows the projection of Fig. 5(a)). A horizontal black-run analysis is then performed along the horizontal axis of the projection (ordinate 0), and the mean length of all these runs is taken as the estimate of the average character width. When the character spacing in the line is so small that strokes of neighboring characters overlap, the black-run statistics can instead be taken along the horizontal line y = 2w_s + 1 of the projection; the mean of those run lengths gives a better estimate of \bar{w}_c.
(Fig. 8 shows the vertical projection of Fig. 5(b), which is a case of severe touching.)
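A sketch of the average-character-width estimate of step 3-2, under one plausible reading of the description: the black runs of the projection profile are taken above level 0 in the normal case and above level 2w_s + 1 for heavily touching lines.

```python
import numpy as np

def average_char_width(img, stroke_w, touching=False):
    """img: 2-D 0/1 array (1 = black); stroke_w: estimate from step 3-1.
    Returns the mean length of the horizontal black runs of the projection profile."""
    proj = img.sum(axis=0)                        # column-wise black pixel counts
    level = 2 * stroke_w + 1 if touching else 0   # threshold for heavily touching lines
    mask = proj > level
    runs, run = [], 0
    for m in np.append(mask, False):              # trailing False flushes the last run
        if m:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    return float(np.mean(runs)) if runs else 0.0
```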
Step 3-3: estimating the average character height \bar{h}_c
The average character height is extracted simply: divide the line image horizontally into several equal parts (usually five), and average the heights of these sub-images to obtain the average character height \bar{h}_c (see Fig. 9).
Step 4: stroke segment extraction
A stroke segment is one of the four basic stroke elements of Chinese characters (horizontal, vertical, left-falling, right-falling); extracting stroke segments effectively overcomes the touching of characters.
Stroke segments are extracted by black-run tracking: first find a horizontal black run in the image as the start of a stroke segment, then track this run row by row from top to bottom until tracking ends, yielding one stroke segment.
The tracking works as follows: within a certain range of the row below the current horizontal black run (normally the horizontal extent of the current run shifted by a few pixels on each side; we shift one pixel on each side), find all horizontal black runs; then, using the information of the existing stroke segment, such as the mean width of its runs and the segment direction obtained by fitting a straight line to its runs, choose one horizontal black run to add to the segment's run list and update the segment's information. This process is described in detail by the following steps:
Step 4-1: when scanning the horizontal black runs in top-to-bottom order, if a run does not belong to the run list of any existing stroke segment, treat it as the start of a new stroke segment and add it to the run list of this new segment;
Step 4-2: for the black run most recently added to a segment's run list, search the next row for new horizontal black runs within the horizontal extent of this run extended by one pixel on each side; if a horizontal run has black pixels extending into this region, extract that run; if no black run appears in this region, the segment is finished, and we return to step 4-1 to search for new segments;
Step 4-3: decide how to add the extracted black runs to the segment's run list. Two cases are distinguished: if exactly one horizontal black run was extracted in the previous step, go to step 4-3-1; if two or more were extracted, go to step 4-3-2;
Step 4-3-1:
If only one horizontal black run was extracted:
■ if the mean width of the segment's existing horizontal runs is at least twice the width of the new run, the end point of the stroke segment is judged to have been reached and extraction of this segment finishes;
■ if the width of the new run is at least three times the mean width of the segment's existing runs, the position is judged to be a stroke crossing; the segment direction is then predicted from the existing run information, and a run centered on the predicted direction and extending half of the segment's mean run width on each side is added to the run list as the new run of the segment;
■ if the new run satisfies neither of the two conditions above, it is judged to be a normal run of the stroke segment and is added directly to the segment's run list;
Step 4-3-2:
If two or more horizontal black runs were extracted, first predict the segment direction from the existing run information, take the extracted run lying on the predicted direction as the candidate run, and then repeat the three judgments of step 4-3-1 to update the run list;
The prediction used in steps 4-3-1 and 4-3-2 is as follows: compute the midpoint of every horizontal black run tracked so far for the segment, fit a straight line through these midpoints by least squares, and use the fitted line to predict the segment direction (a sketch of this fit is given at the end of step 4);
Step 4-4: determining the attribute of a stroke segment
For each extracted segment, first compute its height and width:
■ if the mean width of all the segment's horizontal runs is greater than a given threshold (we usually set it to 10 pixels) and the segment width is greater than the segment height, the segment is judged to be a horizontal stroke.
■ Otherwise we set a small step length, denoted MinStepLength (we usually take 3 pixels):
◆ compute the midpoints of the horizontal black runs of row i and of row i + MinStepLength of the segment;
■ if the two midpoints coincide, the corresponding angle is added to the angle accumulator;
■ if the abscissa of the midpoint of row i is greater than that of the midpoint of row i + MinStepLength, the corresponding angle is added to the angle accumulator;
■ if the abscissa of the midpoint of row i is less than that of the midpoint of row i + MinStepLength, the corresponding angle is added to the angle accumulator;
◆ scan downward from the first row of the segment until the remaining distance to the bottom of the segment equals MinStepLength, i.e. over (segment height − MinStepLength) rows, add up all the angles and compute the average angle;
■ if the average angle is greater than zero and smaller than a predefined value α_1 (usually α_1 = 10 degrees), the segment is judged to be a vertical stroke;
■ if the average angle is greater than α_1 and smaller than a predefined value α_2 (usually α_2 = 88 degrees), it is judged to be a left-falling stroke;
■ if the average angle is greater than α_2 and smaller than a predefined value α_3 (usually α_3 = 98 degrees), it is judged to be a horizontal stroke;
■ if the average angle is greater than α_3 and smaller than a predefined value α_4 (usually α_4 = 176 degrees), it is judged to be a right-falling stroke;
■ if the average angle is greater than α_4, it is judged to be a vertical stroke;
If no new run is found, tracking of the current segment ends; if no new segment is found, stroke segment extraction ends. When a segment has been extracted, its attribute (horizontal, vertical, left-falling or right-falling) is also determined.
Figures 10 and 11 show the results of stroke segment extraction; each small stroke segment is enclosed by a small rectangle.
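A sketch of the least-squares midpoint fit used in steps 4-3-1 and 4-3-2 to predict the direction of a stroke segment; the representation of a segment as a list of (row, x_start, x_end) runs is an assumption, and the run tracking itself is not reproduced here.

```python
import numpy as np

def predict_direction(runs):
    """runs: list of (row, x_start, x_end) horizontal black runs of one stroke segment.
    Fits x_mid = a * row + b through the run midpoints by least squares;
    the fitted line predicts where the segment continues on the next row."""
    rows = np.array([r for r, _, _ in runs], dtype=float)
    mids = np.array([(x0 + x1) / 2.0 for _, x0, x1 in runs], dtype=float)
    if len(runs) < 2:
        return 0.0, float(mids[0])            # a single run: assume vertical continuation
    a, b = np.polyfit(rows, mids, 1)          # least-squares straight-line fit
    return float(a), float(b)

# predicted midpoint on the next row: next_mid = a * (last_row + 1) + b
```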
Step 5: stroke segment merging
After stroke segment extraction, the segments must be further merged into sub-characters. Let R_i and R_j be the bounding rectangles of two adjacent stroke segments (see Fig. 12):
(x_{i,1}, y_{i,1}) --- the coordinates of the top-left corner of R_i;
(x_{i,2}, y_{i,2}) --- the coordinates of the bottom-right corner of R_i;
(x_{j,1}, y_{j,1}) --- the coordinates of the top-left corner of R_j;
(x_{j,2}, y_{j,2}) --- the coordinates of the bottom-right corner of R_j;
D_H(R_i, R_j) --- the horizontal distance between the right side of R_i and the left side of R_j (note that the sign of D_H(R_i, R_j) indicates direction: in Fig. 12 the left edge of R_i lies to the right of the left edge of R_j, so the value is negative; conversely, if the left edge of R_i lies to the left of the left edge of R_j, the value is positive);
D_V(R_i, R_j) --- the vertical distance between the bottom side of R_i and the top side of R_j (again signed: in Fig. 12 the bottom edge of R_i lies below the top edge of R_j, so the value is negative; otherwise it is positive);
width(R_i) --- the width of R_i;
width(R_j) --- the width of R_j;
Stroke segments are merged according to the following three rules:
(1) if R_i and R_j are such that, in the horizontal direction, R_i contains R_j or R_j contains R_i, then segments i and j are merged;
(2) if R_i and R_j satisfy D_H(R_i, R_j) < 0 (i.e. the left edge of R_i lies to the right of the left edge of R_j) and $\frac{-D_H(R_i,R_j)}{width(R_i)}>T_1$ or $\frac{-D_H(R_i,R_j)}{width(R_j)}>T_1$, then segments i and j are merged; here T_1 is a predefined threshold, usually 0.7;
(3) if R_i and R_j satisfy D_H(R_i, R_j) < 0 and both $\frac{-D_H(R_i,R_j)}{width(R_i)}>T_2$ and $\frac{-D_H(R_i,R_j)}{width(R_j)}>T_2$, then segments i and j are merged; here T_2 is a predefined threshold, usually 0.5;
The segment merging algorithm consists of the following steps:
Step 5-1: initialization; the segments are sorted from left to right by their horizontal positions;
Step 5-2: search for all segments that should be merged under rule (1); whenever a pair satisfying the rule is found, merge it and return to step 5-1; otherwise go to step 5-3;
Step 5-3: search for all segments that should be merged under rule (2); whenever a pair satisfying the rule is found, merge it and return to step 5-1; otherwise go to step 5-4;
Step 5-4: search for all segments that should be merged under rule (3); whenever a pair satisfying the rule is found, merge it; when no more merges are possible, the procedure ends;
Figure 13 shows the segment merging result for Figure 11.
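A sketch of the three merging rules of step 5 on bounding rectangles given as (x1, y1, x2, y2). The signed horizontal distance follows one reading of the convention above, only adjacent boxes are tested, and the control flow of steps 5-1 to 5-4 is slightly simplified; all of these are assumptions of the sketch.

```python
def d_h(ri, rj):
    # signed horizontal distance: negative when ri and rj overlap horizontally
    return rj[0] - ri[2]

def width(r):
    return r[2] - r[0]

def should_merge(ri, rj, rule, t1=0.7, t2=0.5):
    if rule == 1:                            # rule (1): horizontal containment
        return (ri[0] <= rj[0] and rj[2] <= ri[2]) or (rj[0] <= ri[0] and ri[2] <= rj[2])
    d = d_h(ri, rj)
    if d >= 0:
        return False
    a, b = -d / width(ri), -d / width(rj)
    return (a > t1 or b > t1) if rule == 2 else (a > t2 and b > t2)

def merge_segments(boxes):
    """boxes: list of (x1, y1, x2, y2) bounding rectangles of the stroke segments."""
    boxes = sorted(boxes)                    # step 5-1: left-to-right order
    for rule in (1, 2, 3):                   # steps 5-2 to 5-4
        merged = True
        while merged:
            merged = False
            for i in range(len(boxes) - 1):
                if should_merge(boxes[i], boxes[i + 1], rule):
                    b = boxes.pop(i + 1)
                    a = boxes.pop(i)
                    boxes.insert(i, (min(a[0], b[0]), min(a[1], b[1]),
                                     max(a[2], b[2]), max(a[3], b[3])))
                    merged = True
                    break
    return boxes
```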
Step 6: geometric cost evaluation of character merging, comprising six sub-steps:
The result of stroke segment merging, ordered from left to right, is denoted s_1, s_2, ..., s_N; these are the sub-character images obtained after segment merging. To complete the segmentation, these sub-character images must be merged appropriately; we now describe the merging of sub-character images (Fig. 21 illustrates this process).
We write (s_k, s_{k+1}, ..., s_{k+n_k-1}) for the character image formed by merging the sub-character images s_k, s_{k+1}, ..., s_{k+n_k-1}, and evaluate the geometric cost of this merge as follows:
w_k --- the width of the merged character image (s_k, s_{k+1}, ..., s_{k+n_k-1}) (Fig. 21);
h_k --- the height of the merged character image (s_k, s_{k+1}, ..., s_{k+n_k-1}) (Fig. 21);
Step 6-1: character width score, denoted S(w_k)
If the width of (s_k, s_{k+1}, ..., s_{k+n_k-1}) is w_k and the average character width of the text line is \bar{w}_c (estimated in step 3-2), then
$S(w_k)=\begin{cases} a\left(\frac{w_k}{\bar{w}_c}-1\right)^2 & \text{if } \frac{w_k}{\bar{w}_c}>1 \\ b\left(\frac{w_k}{\bar{w}_c}-1\right)^2 & \text{if } \frac{w_k}{\bar{w}_c}\le 1 \end{cases}$
where a and b are predefined parameters; we take a = 100, b = 400;
Step 6-2: character aspect-ratio score, denoted S(r)
$S(r)=\min\left\{c\left(\frac{r}{\bar{r}}-1\right)^2,\;100\right\}$, where usually c = 100;
r is the aspect ratio of the character (s_k, s_{k+1}, ..., s_{k+n_k-1}), r = w_k / h_k;
\bar{r} is the average aspect ratio of a character, \bar{r} = \bar{w}_c / \bar{h}_c;
\bar{w}_c is the estimated average character width of the text line (step 3-2);
\bar{h}_c is the estimated average character height of the text line (step 3-3);
Step 6-3: score for the distances between the sub-characters inside a character
(i) bounding-rectangle horizontal distance: the horizontal distance between the bounding rectangles of the sub-characters (the rectangles may overlap) (Fig. 14b);
(ii) Euclidean distance: for two pixels with coordinates (x_1, y_1) and (x_2, y_2), the Euclidean distance is $\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}$; the Euclidean distance between sub-character A and sub-character B is defined as the minimum of the Euclidean distances between all black pixels of A and all black pixels of B;
(iii) average run distance: as shown in Fig. 14(d), the average run distance is the mean length of all the horizontal (white) runs between the two sub-characters;
Using these three distances, the internal distance of (s_k, s_{k+1}, ..., s_{k+n_k-1}) is defined as $d_{in}=\sum_{i=k}^{k+n_k-2}d_{i,i+1}$, where the distance between sub-characters s_i and s_j is $d_{i,j}=\sum_n\gamma_n d_{ij}^n$, i.e. a weighted average of the three distances above with weighting coefficients γ_n; in general it is sufficient to set all γ_n = 1;
The internal distance score is defined as
$S(d_{in})=\begin{cases} 0 & \text{if } d_{in}<0 \\ 100 & \text{if } d_{in}>\frac{w_k}{4} \text{ or } d_{in}>\frac{\bar{w}_c}{2} \\ \frac{400\,d_{in}}{w_k} & \text{otherwise} \end{cases}$
(Fig. 14(a) shows the original sub-character images, Fig. 14(b) the bounding-rectangle horizontal distance between sub-characters, Fig. 14(c) the Euclidean distance between sub-characters, and Fig. 14(d) the run distance.)
Step 6-4: score for the distances to the preceding and following characters, denoted S(D)
Suppose the distances between the character (s_k, s_{k+1}, ..., s_{k+n_k-1}) and the sub-characters before and after it are D_L and D_R respectively:
D_L --- the horizontal distance between the bounding rectangle of sub-character s_k and that of sub-character s_{k-1} (defined as in step 5);
D_R --- the horizontal distance between the bounding rectangle of sub-character s_{k+n_k-1} and that of sub-character s_{k+n_k} (defined as in step 5);
If k = 1 then D_L = ∞; if k + n_k − 1 = N then D_R = ∞; finally take D = min(D_L, D_R);
The score for the distance to the neighboring characters is S(D):
$S(D)=\begin{cases} 100 & \text{if } D<-\bar{w}_c \\ \frac{25\bar{w}_c+100\bar{D}-75D}{\bar{D}+\bar{w}_c} & \text{if } -\bar{w}_c\le D\le\bar{D} \\ \frac{25(D_{max}-D)}{D_{max}-\bar{D}} & \text{if } \bar{D}\le D\le D_{max} \\ 0 & \text{if } D>D_{max} \end{cases}$
where \bar{w}_c is the average character width of the text line, and \bar{D} and D_max are respectively the average and the maximum horizontal distance between the bounding rectangles of adjacent sub-characters in the text line;
Step 6-5: connectivity score, denoted S(C)
Let C_{ij} denote the connectivity between sub-character s_i and sub-character s_j;
The connectivity of the character (s_k, s_{k+1}, ..., s_{k+n_k-1}) is described by three quantities:
C_I --- the internal connectivity;
C_L --- the left connectivity;
C_R --- the right connectivity;
The internal connectivity is the degree of connection among the sub-characters inside the character; the left and right connectivities are the connectivities between sub-characters s_k and s_{k-1} and between s_{k+n_k-1} and s_{k+n_k}, respectively;
$C_I=\begin{cases} \frac{1}{n_k-1}\sum_{i=k}^{k+n_k-2}C_{i,i+1} & n_k>1 \\ 1 & n_k=1 \end{cases}$
$C_L=\begin{cases} C_{k,k-1} & k>1 \\ 1 & k=1 \end{cases}$
$C_R=\begin{cases} C_{k+n_k-1,\,k+n_k} & k+n_k-1<N \\ 1 & k+n_k-1=N \end{cases}$
The connectivity score is $S(C)=100\times\left[1-\frac{1}{2}\left(1+C_I-\frac{C_R+C_L}{2}\right)\right]$
Step 6-6: compute the overall score as the geometric cost
The overall score is the weighted average of the five scores above, $S=\frac{\sum_i\alpha_i S_i}{\sum_i\alpha_i}$; we usually take α_1 = 5, α_2 = 2, α_3 = 3, α_4 = 5, α_5 = 2.
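A sketch that assembles the five scores of step 6 into the overall geometric cost; the distance and connectivity inputs are assumed to be pre-computed from the images, since their extraction is described separately above.

```python
def width_score(w, w_bar, a=100.0, b=400.0):
    r = w / w_bar - 1.0
    return (a if w > w_bar else b) * r * r                 # S(w_k), step 6-1

def aspect_score(w, h, w_bar, h_bar, c=100.0):
    return min(c * ((w / h) / (w_bar / h_bar) - 1.0) ** 2, 100.0)   # S(r), step 6-2

def inner_dist_score(d_in, w, w_bar):
    if d_in < 0:
        return 0.0
    if d_in > w / 4.0 or d_in > w_bar / 2.0:
        return 100.0
    return 400.0 * d_in / w                                 # S(d_in), step 6-3

def neighbour_dist_score(D, w_bar, d_bar, d_max):
    if D < -w_bar:
        return 100.0
    if D <= d_bar:
        return (25 * w_bar + 100 * d_bar - 75 * D) / (d_bar + w_bar)
    if D <= d_max:
        return 25.0 * (d_max - D) / (d_max - d_bar)
    return 0.0                                              # S(D), step 6-4

def connectivity_score(c_i, c_l, c_r):
    return 100.0 * (1.0 - 0.5 * (1.0 + c_i - (c_r + c_l) / 2.0))    # S(C), step 6-5

def geometric_cost(scores, alphas=(5, 2, 3, 5, 2)):
    # step 6-6: weighted average of the five scores
    return sum(a * s for a, s in zip(alphas, scores)) / sum(alphas)
```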
Step 7: from the estimated geometric costs, compute the K segmentations with the smallest geometric cost
After step 6 the line image has been segmented into a sequence of sub-character images (see Fig. 17(a)(b)(c), which show the results of segmenting the line images of Fig. 16(a)(b)(c) into sub-character images; each rectangle encloses one sub-character image and is its bounding rectangle), together with the geometric costs of merging sub-character blocks. The sub-character images still have to be merged into character images to complete the character segmentation. Using the merging costs of the sub-character blocks obtained in step 6, we generate a number of possible sub-character merging schemes (each scheme corresponds to one segmentation result of the line image); these schemes are evaluated in step 8, where the optimal scheme is produced;
Step 7-1: constructing the segmentation graph
For the N segmented sub-character images s_1, s_2, ..., s_N, build a directed graph (V, E) with N+1 nodes labelled Node_1, Node_2, Node_3, ..., Node_{N+1}, i.e. V = {Node_1, Node_2, Node_3, ..., Node_{N+1}}. For any Node_i there are directed edges from Node_i to Node_{i+1}, Node_{i+2}, Node_{i+3}, ...; the edge Node_i → Node_{i+j} corresponds to merging blocks i, i+1, ..., i+j-1, i.e. to merging the sub-character images s_i, s_{i+1}, ..., s_{i+j-1}, and the cost of the edge is the geometric cost of this merge. Every path in the segmentation graph from the start node Node_1 to the end node Node_{N+1} corresponds to one way of merging the sub-character images s_1, s_2, ..., s_N, i.e. to one segmentation of the line image; we therefore call it a segmentation path.
(See Fig. 20: strokes are extracted from the line image of Fig. 16a and merged into sub-characters, Fig. 17a shows the corresponding sub-character blocks, and Fig. 20 shows the segmentation graph built from them; each arc represents the merging of several sub-characters.)
Step 7-2: generating the K segmentation paths with the smallest geometric cost:
We now describe how to compute, in the segmentation graph (V, E), the first K paths from the start node Node_1 to the end node Node_{N+1} in ascending order of cost. Each such path corresponds to one segmentation of the line image, i.e. to one merging scheme of the sub-character images s_1, s_2, ..., s_N; merging the sub-characters according to this scheme completes the segmentation of the line image.
The algorithm proceeds as follows.
Given the graph (V, E), define:
N_Node --- the number of nodes of the graph, N_Node = N + 1, where N is the number of segmented sub-character images;
N_Edge --- the number of edges;
Start --- the start node (i.e. Node_1);
End --- the end node (i.e. Node_{N+1});
K --- the number of optimal paths to be computed;
π_k(v) --- the path from the start node Start to node v (where v can be any node of {Node_1, Node_2, Node_3, ..., Node_{N+1}}) whose total cost ranks k-th when the path costs are arranged in ascending order; π_1(v) is thus the shortest path from Start to v, and taking v = Node_{N+1}, π_1(Node_{N+1}) is the shortest path from the start node to the end node;
Γ^{-1}(v) --- the set of predecessor nodes of v, i.e. the set of nodes that may be connected to v; for any u ∈ Γ^{-1}(v) the edge u → v exists;
a·b --- the concatenation of two paths, where the end node of path a is the start node of path b; the concatenated path a·b starts at the start node of a and ends at the end node of b;
C[v] --- the candidate path set of node v;
The computation then proceeds as follows:
Step 7-2-1: for every v ∈ V compute π_1(v), i.e. the shortest path from the start node Start to each node;
Step 7-2-2: for every v ∈ V compute π_k(v), 2 ≤ k ≤ K, recursively;
Suppose π_1(v), π_2(v), ..., π_{k-1}(v) have been computed; we describe how π_k(v) is obtained from them.
If k = 2, initialize the candidate path set C[v]: for every element u of the predecessor set Γ^{-1}(v) of v, find the shortest path from Start to node u, construct the new path π_1(u)·v, and add it to the candidate path set of v, i.e. C[v] ← {π_1(u)·v | u ∈ Γ^{-1}(v)};
If k > 2, consider the path π_{k-1}(v) and let u_0 be its predecessor of v, i.e. π_{k-1}(v) reaches v through node u_0. It can be shown that there exists an integer l with 1 ≤ l ≤ k−1 such that the portion of π_{k-1}(v) from Start to u_0 coincides with π_l(u_0), that is, π_{k-1}(v) = π_l(u_0)·v. For this integer l we then compute π_{l+1}(u_0) (since 1 ≤ l ≤ k−1 we have 2 ≤ l+1 ≤ k, so this recursive computation is feasible).
We then remove the path π_l(u_0)·v from the candidate set C[v] and add π_{l+1}(u_0)·v to it, i.e. C[v] ← (C[v] − {π_l(u_0)·v}) ∪ {π_{l+1}(u_0)·v}; finally we take the shortest path in C[v]; it can be shown that this shortest path is exactly π_k(v).
Applying the above algorithm recursively yields the first K segmentation paths;
The choice of K in practice is a compromise between time complexity and accuracy. In fact an exactly correct segmentation is not strictly required: a near-optimal segmentation can still guarantee a very high character recognition rate. We usually take K = 200; more generally an adaptive K can be chosen (typically K equal to 10 times the number of sub-character blocks).
Since the candidate set is generated according to the geometric cost criterion, an effective geometric cost function keeps the candidate set as small as possible while letting the correct segmentation appear near the front; a sketch of the candidate generation is given below.
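A sketch of candidate generation on the segmentation graph. Instead of reproducing the recursive bookkeeping of step 7-2 literally, it uses an equivalent best-first enumeration of the K cheapest Node_1 → Node_{N+1} paths, which is valid because all geometric costs are non-negative; cost(i, j) is an assumed callback returning the geometric cost of merging s_i .. s_{j-1}.

```python
import heapq

def k_best_segmentations(n_sub, cost, K):
    """Enumerate the K cheapest Node_1 -> Node_{N+1} paths of the segmentation graph.
    n_sub: number N of sub-character images;
    cost(i, j): geometric cost (>= 0) of merging s_i .. s_{j-1}, 1 <= i < j <= N+1.
    Returns a list of (total cost, [node indices]) in ascending order of cost."""
    start, goal = 1, n_sub + 1
    frontier = [(0.0, [start])]              # partial paths ordered by accumulated cost
    results = []
    while frontier and len(results) < K:
        c, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:                     # the k-th pop at the goal is the k-th cheapest path
            results.append((c, path))
            continue
        for nxt in range(node + 1, goal + 1):
            heapq.heappush(frontier, (c + cost(node, nxt), path + [nxt]))
    return results
```

In each returned path, consecutive node indices (p, q) mean that sub-characters s_p .. s_{q-1} form one merged character.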
Step 7-3: character recognition
In fact the time spent generating the K candidates is small; the time bottleneck is the recognition classifier. The above procedure can therefore be further optimized in a concrete application: although the K candidate segmentations differ as wholes, every segmentation path shares some segments with the others, so it is unnecessary to recognize every path separately or to recompute all confidences. The following method is used:
CharCandidatesSet --- the image recognition candidate set; each of its elements contains N_Cand recognition candidates and the N_Cand corresponding recognition confidences;
CharCandidatesSetNum --- the number of elements in the image recognition candidate set;
Given the sub-characters s_1 s_2 ... s_N, build an (N+1) × (N+1) lookup table LookupTable, initialize all its elements to −1, empty the image recognition candidate set CharCandidatesSet, and set CharCandidatesSetNum = 0;
For 1 ≤ k ≤ K:
For the k-th candidate segmentation path (the candidate paths are ordered by ascending geometric cost):
For each merge of the form s_p s_{p+1} ... s_q (1 ≤ p ≤ q ≤ N), look up the element LookupTable(p, q+1) (the element in row p, column q+1 of LookupTable);
If LookupTable(p, q+1) = −1, this merge has not yet been recognized: recognize the image obtained from this combination, obtain its N_Cand candidates, estimate the confidence of each candidate (step 8-1), add the candidates and their confidences as one element to CharCandidatesSet, set LookupTable(p, q+1) = CharCandidatesSetNum, and then increase CharCandidatesSetNum by 1, i.e. CharCandidatesSetNum = CharCandidatesSetNum + 1;
If LookupTable(p, q+1) ≠ −1, this merge has already been considered and nothing needs to be done;
End For
Figure 22 explains the process of step 7-3 in detail. The benefit of this process is that repeated work by the recognition kernel is avoided, which saves a great deal of time. If a later step needs the recognition result of the merged block s_p s_{p+1} ... s_q, it only has to look up the element in row p, column q+1 of LookupTable to obtain the index of this block's recognition result in CharCandidatesSet, and thus retrieve the recognition result and the corresponding confidences stored at that position;
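A sketch of this caching scheme: each merged block s_p .. s_q is recognized at most once and its candidates and confidences are shared by all K candidate paths; recognize_block is an assumed interface to the recognition kernel, not part of the patent.

```python
def recognize_paths(paths, recognize_block):
    """paths: the K candidate segmentation paths, each a list of node indices [1, ..., N+1];
    recognize_block(p, q): assumed kernel interface returning the candidates and
    confidences for the image obtained by merging sub-characters s_p .. s_q.
    Returns (lookup, char_candidates), where lookup[(p, q+1)] indexes the block's result."""
    lookup = {}                              # plays the role of LookupTable
    char_candidates = []                     # plays the role of CharCandidatesSet
    for path in paths:
        for p, q1 in zip(path, path[1:]):    # edge p -> q1 merges s_p .. s_{q1-1}
            if (p, q1) not in lookup:        # recognize each distinct block only once
                char_candidates.append(recognize_block(p, q1 - 1))
                lookup[(p, q1)] = len(char_candidates) - 1
    return lookup, char_candidates
```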
Step 8: for the K candidate segmentation schemes that are optimal in the geometric sense given above, compute the semantic-recognition cost of each sub-character merging scheme with the bigram language model:
Step 8-1: character confidence estimation
x --- the character image to be recognized;
c_j(x) --- the j-th candidate character for image x returned by the recognition kernel (the candidates are sorted in ascending order of recognition distance, so c_1(x) is the first-choice candidate of image x);
d_j(x) --- the recognition distance of the j-th candidate character c_j(x) of image x; d_1(x) is thus the distance of the first-choice candidate of image x;
N_Cand --- the number of recognition candidates the kernel returns for a single character image; as noted in step 2-2, this value is constant and depends only on the recognition kernel;
P(c_j(x)|x) --- the confidence that image x is recognized as c_j(x); this is the quantity to be estimated;
For all the candidates obtained in step 7-3, one of the following two methods can be chosen, according to the concrete needs, to estimate the candidate confidences:
1. Empirical distance formulas:
$P(c_j(x)|x)=\frac{1/d_j(x)}{\sum_{k=1}^{N_{Cand}}1/d_k(x)},\;1\le j\le N_{Cand}$, or
$P(c_j(x)|x)=\frac{\frac{1}{d_j(x)-d_1(x)+1}}{\sum_{k=1}^{N_{Cand}}\frac{1}{d_k(x)-d_1(x)+1}},\;1\le j\le N_{Cand}$, or
$P(c_j(x)|x)=\frac{1/d_j^2(x)}{\sum_{k=1}^{N_{Cand}}1/d_k^2(x)},\;1\le j\le N_{Cand}$
2. Confidence estimation based on a Gaussian model: $P(c_j(x)|x)=\frac{\exp(-d_j(x)/\theta)}{\sum_{k=1}^{N_{Cand}}\exp(-d_k(x)/\theta)}$ (θ was computed in step 2-2-1); the previously computed confusion matrix (step 2-2-2) can be used to refine the estimated confidence, the corrected confidence being $P(c_j(x)|x)=\sum_{k=1}^{N_{Cand}}P(c_k(x)|x)P(c_j(x)|c_k(x))$;
The confidence estimation method can be chosen flexibly from the above as needed; in the present invention we adopt the confidence estimation based on the Gaussian model.
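A sketch of the Gaussian-model confidence and its optional confusion-matrix correction, operating on the distances and candidate labels returned by the kernel; array layouts are assumptions.

```python
import numpy as np

def gaussian_confidence(dists, theta):
    """P(c_j(x)|x) = exp(-d_j/theta) / sum_k exp(-d_k/theta)."""
    e = np.exp(-np.asarray(dists, dtype=float) / theta)
    return e / e.sum()

def corrected_confidence(conf, cand_idx, confusion):
    """Refine the confidences with the confusion matrix of step 2-2-2.
    conf: raw confidences of the N_Cand candidates;
    cand_idx: row/column indices of the candidate characters in the confusion matrix;
    confusion[k, j] ~ P(char_j | char_k)."""
    sub = confusion[np.ix_(cand_idx, cand_idx)]   # P(c_j | c_k) restricted to the candidates
    return conf @ sub                             # sum_k P(c_k|x) P(c_j|c_k)
```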
Step 8-2: computing the bigram-based semantic-recognition confidence cost
For each of the K line-image segmentation paths already obtained, the average semantic-recognition confidence cost is computed with the following method; the notation is:
n_k --- merging the sub-character images according to the k-th segmentation path yields n_k merged character images in total;
image_{k,t} --- the t-th character image obtained by merging the sub-character images according to the k-th segmentation path, where 1 ≤ k ≤ K and 1 ≤ t ≤ n_k;
c_j(image_{k,t}) --- the j-th recognition candidate of the character image image_{k,t} given by the recognition kernel, where 1 ≤ j ≤ N_Cand, 1 ≤ k ≤ K, 1 ≤ t ≤ n_k, with corresponding recognition confidence P(c_j(image_{k,t}) | image_{k,t});
Since the recognition of the character images and the estimation of the confidences were completed in step 7-3, in this step the required recognition results and confidences can be obtained from CharCandidatesSet by looking up LookupTable;
For the k-th candidate segmentation path, 1 ≤ k ≤ K, the semantic-recognition cost is computed with the following Viterbi method:
Let Q[n_k][N_Cand] be a two-dimensional array, where Q[t][j] stores the logarithm of the probability of the most likely candidate assignment from some candidate of the first character image up to the node c_j(image_{k,t}); in addition, a two-dimensional pointer array Path[n_k][N_Cand] is used to record the computation.
Initialization: t = 1, 1 ≤ j ≤ N_Cand:
Path[1][j] = NULL
Q[1][j] = log P(c_j(image_{k,1})) + log P(c_j(image_{k,1}) | image_{k,1})
Recursion: for 2 ≤ t ≤ n_k and 1 ≤ j ≤ N_Cand compute Q[t][j]:
$Q[t][j]=\max_{1\le l\le N_{Cand}}\{Q[t-1][l]+\log P(c_j(image_{k,t})|c_l(image_{k,t-1}))\}+\log P(c_j(image_{k,t})|image_{k,t})$
In addition, find the l that maximizes Q[t−1][l] + log P(c_j(image_{k,t}) | c_l(image_{k,t-1})), denoted l*, i.e.
$l^*=\arg\max_{1\le l\le N_{Cand}}\{Q[t-1][l]+\log P(c_j(image_{k,t})|c_l(image_{k,t-1}))\}$
Then let Path[t][j] point to the node c_{l*}(image_{k,t-1}), i.e. the parent of node c_j(image_{k,t}) is c_{l*}(image_{k,t-1});
Termination: t = n_k
Finally find $j^*=\arg\max_{1\le j\le N_{Cand}}Q[n_k][j]$, trace back the path indicated by Path[n_k][j*], and output the character at each node of the path; the resulting character string is the character recognition result. Together with the optimal string we also obtain Q[n_k][j*], the log-probability of the most likely path; this value divided by n_k is taken as the semantic-recognition cost of this segmentation path, i.e. the average semantic-recognition cost is $H_k=\frac{Q[n_k][j^*]}{n_k}$;
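A sketch of this Viterbi computation for one segmentation path; cands and conf hold the candidate characters and confidences of each merged image, and prior and bigram are assumed to be the probability functions estimated in step 2.

```python
import math

def _log(p):
    # guard against zero probabilities (smoothing normally prevents them)
    return math.log(max(p, 1e-300))

def semantic_recognition_cost(cands, conf, prior, bigram):
    """Viterbi over one segmentation path (step 8-2).
    cands[t][j]: j-th candidate character of the t-th merged image;
    conf[t][j]:  its confidence P(c_j | image_{k,t});
    prior(c):    P(c);  bigram(c1, c2): smoothed P(c2 | c1).
    Returns (recognized string, average semantic-recognition cost H_k)."""
    n, m = len(cands), len(cands[0])
    Q = [[0.0] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    for j in range(m):                                    # initialisation (t = 1)
        Q[0][j] = _log(prior(cands[0][j])) + _log(conf[0][j])
    for t in range(1, n):                                 # recursion
        for j in range(m):
            scores = [Q[t - 1][l] + _log(bigram(cands[t - 1][l], cands[t][j]))
                      for l in range(m)]
            l_star = max(range(m), key=lambda l: scores[l])
            Q[t][j] = scores[l_star] + _log(conf[t][j])
            back[t][j] = l_star
    j = max(range(m), key=lambda jj: Q[n - 1][jj])        # termination: j*
    best, chars = Q[n - 1][j], []
    for t in range(n - 1, -1, -1):                        # trace back Path
        chars.append(cands[t][j])
        j = back[t][j]
    return "".join(reversed(chars)), best / n             # H_k = Q[n_k][j*] / n_k
```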
Step 9: combine the geometric cost and the semantic cost and produce the final result
Step 9-1: estimate the fusion parameter λ of the geometric cost and the semantic-recognition cost:
The computation of the fusion parameter λ is shown in Fig. 2a.
The following notation is used:
N_L --- the number of line images for which the sub-characters and the correct sub-character merging are given (i.e. the number of training samples);
n_{i,k} --- the number of characters in the k-th candidate segmentation of the i-th training sample;
n_{i,0} --- the number of characters in the correct segmentation of the i-th training sample;
g_{i,k} --- the geometric cost of the k-th candidate segmentation path of the i-th training sample; g_{i,1} is therefore the minimum geometric cost over all segmentations of the i-th training sample;
G_{i,k} --- the average geometric cost (normalized value) of the k-th candidate segmentation of the i-th training sample;
H_{i,k} --- the average semantic-recognition cost of the k-th candidate segmentation of the i-th training sample;
g_{i,0} --- the geometric cost of the entirely correct segmentation of the i-th training sample;
G_{i,0} --- the average geometric cost (normalized value) of the entirely correct segmentation of the i-th training sample;
H_{i,0} --- the average semantic-recognition cost of the entirely correct segmentation of the i-th training sample;
We select N_L training line images (Fig. 1a) and process each line image following steps 3 through 8, thus obtaining n_{i,k}, g_{i,k}, H_{i,k} for 1 ≤ i ≤ N_L, 1 ≤ k ≤ K. The geometric cost estimated in step 6 is normalized and averaged as
$G_{i,k}=\frac{1}{n_{i,k}}\log\left(\lambda e^{-\lambda(g_{i,k}/g_{i,1}-1)}\right),\;1\le i\le N_L,\;1\le k\le K$
In the same way we obtain the normalized geometric cost G_{i,0}, 1 ≤ i ≤ N_L, of the correct segmentation and its average semantic-recognition cost H_{i,0}, 1 ≤ i ≤ N_L (using the method of step 8-2). Write $T_i^k(\lambda)=H_{i,k}+G_{i,k}$, 1 ≤ k ≤ K, and let $T_i^0(\lambda)$ be the T value of the entirely correct segmentation of sample i, i.e. $T_i^0(\lambda)=H_{i,0}+G_{i,0}$.
Minimizing $N(\lambda)=\sum_{i=1}^{N_L}\#\{T_i^k(\lambda)>T_i^0(\lambda)\,|\,1\le k\le K\}$ gives the estimate of the weighting coefficient λ;
where $\#\{T_i^k(\lambda)>T_i^0(\lambda)\,|\,1\le k\le K\}$ denotes, for the given λ, the number of candidate paths among the K candidate segmentation paths of the i-th sample image whose T value is larger than that of the correct segmentation; the minimization can again use the exhaustive search employed to minimize θ;
Step 9-2: compute the optimal segmentation-recognition path with the fusion parameter λ (Fig. 2b)
For a general line image to be segmented, we compute the K candidate segmentation paths according to steps 3 through 8, and for every path the average recognition-semantic cost H_k, 1 ≤ k ≤ K, and the normalized average geometric cost $G_k=\frac{1}{n_k}\log\left(\lambda e^{-\lambda(g_k/g_1-1)}\right)$, 1 ≤ k ≤ K (where g_k, 1 ≤ k ≤ K, is the geometric cost of each segmentation path obtained in step 6). The combined cost is then T_k = H_k + G_k, 1 ≤ k ≤ K; taking $k^*=\arg\max_{1\le k\le K}T_k$, the k*-th candidate segmentation is output as the optimal segmentation.
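A sketch of this final fusion step, assuming the per-path geometric costs, semantic costs, and character counts have already been computed.

```python
import math

def best_segmentation(g, H, n, lam):
    """g[k]: geometric cost of the k-th candidate path (g[0] is the smallest);
    H[k]: its average semantic-recognition cost; n[k]: its number of characters;
    lam: the fusion parameter lambda estimated in step 9-1.
    Returns the index k* of the candidate path with the largest combined value T_k."""
    T = []
    for gk, Hk, nk in zip(g, H, n):
        # G_k = (1/n_k) * log(lam * exp(-lam * (g_k / g_1 - 1))), written in log form
        Gk = (math.log(lam) - lam * (gk / g[0] - 1.0)) / nk
        T.append(Hk + Gk)
    return max(range(len(T)), key=lambda k: T[k])
```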
Experimental results of the method described in this application
The following data were prepared for testing:
1. To compute the recognition confidences of the classifier, the parameter θ must be computed on character image samples of known class; we used the 50 recognition candidates provided by the THOCR Chinese character recognition kernel. The samples are handwritten Chinese character samples collected by the Intelligent Image and Text Processing Laboratory of the Department of Electronic Engineering, Tsinghua University.
2. We collected handwritten envelope images of about 4,000 envelopes, processed part of them with the CEARS program provided by the Intelligent Image and Text Laboratory of the Department of Electronic Engineering, Tsinghua University, and extracted the address lines of the envelopes (containing both the geographical address and the organization name). After excluding incorrectly extracted address lines, 1141 line images remained as experiment samples, each with its correct character segmentation given manually in advance; 908 of them were selected as training samples for computing the parameter λ, and the other 233 served as test samples.
3. The task implemented is the segmentation of handwritten postal address lines. As the training corpus we used the postal address database purchased from Space Post Information Company (the Beijing address database), which contains 180,000 address records covering organization names and physical addresses, about 370,000 entries in total.
Following steps 2 through 9 described above, we obtained θ = 2.322 (step 2-2-1) and λ = 51.85 (step 9-1), and let the recognition kernel output the top ten recognition candidates for each image to be recognized (i.e. N_Cand = 10; steps 2-2 and 8-1).
We compare two indices, R_L and R_C:
R_L denotes the line segmentation accuracy; a line is correctly segmented when every character of the line image is correctly segmented:
R_L = (number of correctly segmented lines) / (total number of test lines);
R_C denotes the character segmentation accuracy, i.e. the proportion of individual characters that are correctly segmented:
R_C = (number of correctly segmented characters) / (total number of characters);
Test results (Intel Pentium 4, 2.8 GHz, 512 MB RAM):

Method | Correctly segmented characters | Correctly segmented lines | Character segmentation accuracy (%) | Line segmentation accuracy (%)
Segmentation by the minimum-geometric-cost path | 2492 | 55 | 82.7 | 23.6
Segmentation of this application | 2814 | 147 | 93.3 | 63.1

Note: the 233 test samples contain 3013 Chinese characters in total.
The average processing time per line is under 300 milliseconds; this includes all the time spent on merging character images, recognition, post-processing, and producing the optimal segmentation-recognition result.
Compared with the traditional approach of searching only for the segmentation with the optimal geometric cost (Figs. 18(a)(b)(c) show the segmentation results obtained for Figs. 17(a)(b)(c) using the geometric cost alone, while Figs. 19(a)(b)(c) show the results produced by our method), the method proposed by the present invention achieves high-accuracy segmentation of off-line handwritten Chinese characters. In terms of the time index, the method can segment and recognize handwritten document images efficiently in real time.
When λ = 0, i.e. when the geometric cost is not taken into account, relying on the semantic information alone does not give the best result. This is because isolated-character recognition tolerates a certain amount of noise, so in some cases a correct recognition result can be obtained even without an exactly correct segmentation; for segmentations whose semantic-recognition costs are close to each other, their geometric costs must be compared further.
Conclusion
In summary, the off-line handwritten Chinese character segmentation method based on the fusion of geometric cost and semantic-recognition cost proposed by the present invention has the following advantages:
1) The over-segmentation of off-line handwritten Chinese characters, realized on the basis of extracted geometric and statistical image features, effectively overcomes the touching of handwritten characters.
2) The geometric evaluation of the sub-characters of each character adopted by the present invention reflects the affinity between the sub-characters of each character well, and provides a correct basis for the subsequent merging.
3) The present invention generates segmentation candidates by computing the K shortest paths. Traditional methods take only the segmentation with the optimal geometric cost and cannot guarantee that the best segmentation is found; the K-shortest-path method overcomes this deficiency and enlarges the search space.
4) The present invention proposes a new viewpoint: the semantic-recognition cost is the strongest criterion for evaluating the validity of a segmentation, and the geometric cost is considered on this basis to decide the segmentation. This makes the off-line handwritten Chinese character segmentation and recognition process realized here closer to the human cognitive process. The framework provided by the present invention also has a certain guiding significance: geometric or semantic costs of any form can be fused following the idea provided by the present invention.
5) The present invention generalizes well: on the one hand, it can be extended from Chinese to the segmentation of off-line handwritten characters of English or other languages (a higher-order language model may be needed); on the other hand, it can easily be applied to other fields that require efficient off-line handwritten character segmentation, since it suffices to compute the relevant parameters in advance from a corpus of that field and substitute them into the framework of the model.
6) The present invention has a certain rejection capability for documents outside the target domain: with the bigram model trained on address lines, when a name line is erroneously extracted as an address line, the semantic cost becomes very high, so such erroneously extracted lines can be rejected correctly, making the whole system more intelligent.
This method proposes a unified model of Chinese character segmentation, recognition and post-processing, and offers a new approach to off-line Chinese character segmentation.

Claims (1)

1. An off-line handwritten Chinese character segmentation method fusing geometric cost and semantic-recognition cost, characterized in that it is implemented by an image capture device and a computer connected to it, and comprises the following steps in sequence:
Step 1: collect sufficient training samples for the following purposes with the image capture device and build the corresponding libraries
● an image sample library of isolated off-line handwritten Chinese characters;
● a line-image sample library with the correct character segmentation given: the correct segmentation of each pre-extracted line image is annotated in advance, and the samples are then split into two parts, one part used as training samples for parameter estimation and the other used as test samples for evaluating the performance of the method described in this application;
● a text corpus of the domain to which the lines to be segmented belong;
Step 2: parameter estimation
Steps 2-1 and 2-2 are carried out on the given "corpus of the domain of the line images to be segmented" and are used to compute the semantic constraints of that domain. The following notation is used:
N_c --- the number of times the Chinese character c occurs in the corpus;
N_{c1c2} --- the number of times the two-character sequence c_1c_2 occurs in the corpus;
N --- the total number of Chinese characters in the corpus;
P(c) --- the probability that character c occurs in the corpus;
P(c_2|c_1) --- the probability that character c_2 immediately follows character c_1 in the corpus;
P_smooth(·) --- the probability after smoothing;
M --- the number of distinct Chinese characters in the corpus;
Step 2-1: estimate P(c), the prior probability of character c, on the corpus, and also estimate the inter-character transition probability P(c_2|c_1):
Step 2-1-1: P(c) = N_c / N, where N_c is the total number of occurrences of character c counted in the corpus;
Step 2-1-2: for the bigram model, P(c_2|c_1) = N_{c_1c_2} / N_{c_1}, where N_{c_1c_2} is the counted number of occurrences of the sequence c_1c_2 in the corpus;
Step 2-1-3: the following simple smoothing is applied to the bigram model:
$P_{smooth}(c_2|c_1)=\begin{cases} P(c_2|c_1) & \text{if } N_{c_1c_2}>0 \\ \varepsilon & \text{if } N_{c_1c_2}=0 \text{ and } N_{c_2}=0 \\ 1/M & \text{if } N_{c_1c_2}=0 \text{ and } N_{c_2}>0 \end{cases}$
where ε = 10^{-9};
Step 2-2 is carried out on the isolated handwritten character image samples in the "image sample library of isolated off-line handwritten Chinese characters", where the correct character of each sample image is known. The following notation is used:
N_sample --- the number of image samples in the off-line handwritten Chinese character image sample library;
x_i --- the i-th sample image;
d_j(x_i) --- the recognition distance of the j-th candidate character returned by the recognition kernel for the i-th sample image x_i; the kernel sorts the candidates in ascending order of recognition distance, so d_1(x_i) is the distance of the first-choice candidate of x_i;
N_Cand --- the number of recognition candidates the kernel returns for a single character image; this is a performance parameter of the kernel and is therefore a constant, independent of the input image;
L_i --- the position at which the correct character of the i-th image x_i appears in its recognition candidate set, the candidates being ordered by ascending recognition distance;
Step 2-2-1: compute the variance parameter of the off-line handwritten Chinese character recognition kernel, denoted θ
First, recognize every image sample in the "image sample library of isolated off-line handwritten Chinese characters" of step 1, obtaining for each image its N_Cand recognition candidates and the corresponding recognition distances;
From the distances returned by the kernel, compute for the i-th sample image x_i the difference y_ij = d_j(x_i) − d_1(x_i) between the distance of the j-th candidate and that of the first-choice candidate. Then minimize the following expression to obtain the estimate of θ:
$E = \frac{1}{2N_{sample}}\sum_{i=1}^{N_{sample}}\left\{\sum_{j=2}^{L_i}\left[\exp\left(-\frac{y_{ij}}{\theta}\right)-1\right]^2+\sum_{j=L_i+1}^{N_{Cand}}\exp\left(-\frac{2y_{ij}}{\theta}\right)\right\}$
The minimization is done exhaustively: take 10000 points between 0 and 100 (0.01, 0.02, 0.03, ..., 99.9, 100), substitute each as θ into the expression above, and take as the estimate the θ giving the smallest E;
To compute the confusion matrix of the recognition kernel, the following notation is used:
ω_input(x) --- the true class of image x;
c_j(x) --- the j-th candidate recognition result for image x returned by the kernel; the candidates are sorted in ascending order of recognition distance, so c_1(x) is the first-choice candidate of x;
{ω_input(x) = ω} --- the set of image samples whose true character is ω;
#{ω_input(x) = ω} --- the number of image samples whose true character is ω;
Step 2-2-2: compute the confusion matrix
The confusion matrix is an M × M matrix, where M is the number of distinct Chinese characters in the corpus. If all characters are arranged in an arbitrary but fixed order char_1, char_2, ..., char_M, then the element in row α and column β of the confusion matrix is P(char_β|char_α), the probability that a sample whose true class is char_α is recognized by the kernel as char_β;
The confusion probability matrix of the recognition kernel is computed from
$P(char_\beta|char_\alpha)=\frac{1}{\#\{\omega_{input}(x)=char_\alpha\}}\sum_{x\in\{\omega_{input}(x)=char_\alpha\}}\sum_{j=1}^{N_{Cand}}P(c_j(x)=char_\beta|x)$
where #{ω_input(x) = char_α} is the number of image samples whose true character is char_α;
$P(c_j(x)=char_\beta|x)=\frac{\exp(-d_j(x)/\theta)}{\sum_{i=1}^{N_{Cand}}\exp(-d_i(x)/\theta)}$ is the confidence, given by the recognition kernel, that image x is recognized as char_β;
and $\sum_{x\in\{\omega_{input}(x)=char_\alpha\}}\sum_{j=1}^{N_{Cand}}P(c_j(x)=char_\beta|x)$ is the sum of the recognition confidences for char_β over all images whose true character is char_α and whose candidate set contains char_β;
Besides the four items above, we also need to estimate the parameter λ that fuses the geometric cost and the semantic-recognition cost; it is computed from the line-image training samples, and its estimation is described in the last part;
Step 3: the character row image parameter is extracted
This step is finished the extraction to the row image parameter, comprises stroke width, character mean breadth and character average height, relates to the estimation of following parameter:
w s---stroke width;
Figure C2005100121950005C1
---the character mean breadth:
Figure C2005100121950005C2
---the character average height;
Step 3-1: estimating the stroke width w_s, i.e. the width of the written strokes
First, a histogram analysis of the horizontal black runs of the text line is carried out; a horizontal black run is a rectangular region of consecutive black pixels in the X direction, one pixel high, whose width is the run length; the horizontal axis of the histogram is the horizontal black-run length and the vertical axis is the number of horizontal black runs of that length; let p be the run length with the largest count in the histogram and hist(p) the corresponding count, i.e. the maximum of the histogram ordinate is hist(p) and the corresponding abscissa is p;
Then take $w_s = \frac{(p-1)\,\mathrm{hist}(p-1) + p\,\mathrm{hist}(p) + (p+1)\,\mathrm{hist}(p+1)}{\mathrm{hist}(p-1) + \mathrm{hist}(p) + \mathrm{hist}(p+1)}$;
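A minimal sketch of the run-length histogram estimate of step 3-1, assuming the text line is given as a binary NumPy array with black pixels equal to 1; the helper names are illustrative only.

```python
import numpy as np

def horizontal_black_runs(binary_line):
    """Yield the lengths of all horizontal runs of black (value 1) pixels."""
    for row in binary_line:
        run = 0
        for v in row:
            if v:
                run += 1
            elif run:
                yield run
                run = 0
        if run:
            yield run

def stroke_width(binary_line):
    """Estimate w_s from the run-length histogram around its peak p."""
    lengths = np.fromiter(horizontal_black_runs(binary_line), dtype=int)
    if lengths.size == 0:
        return 0.0
    hist = np.bincount(lengths)
    p = int(hist.argmax())
    lo, hi = max(p - 1, 0), min(p + 1, len(hist) - 1)
    ks = np.arange(lo, hi + 1)
    return float((ks * hist[ks]).sum() / hist[ks].sum())
```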
Step 3-2: estimating the average character width $\bar{w}_c$
The average character width reflects the writing style of the text line and has a direct influence on character segmentation; first, the text line image is projected in the vertical direction to obtain a projection profile whose abscissa corresponds one-to-one to the abscissa of the text line and whose ordinate is the total number of black pixels in the corresponding column; a horizontal black-run analysis is then performed on the profile along the horizontal axis, i.e. at the level where the ordinate is 0, and the mean length of all these runs is taken as the estimate of the average character width; when the character spacing in the text line is so small that strokes of neighbouring characters overlap, the run statistics are instead computed on the horizontal level y = 2w_s + 1 of the projection profile, and their mean gives a better estimate of the average character width $\bar{w}_c$;
Step 3-3: estimating the average character height $\bar{h}_c$
The extraction of the average character height is comparatively simple: the line image is divided horizontally into several equal parts (five as a rule), and the heights of these small images are averaged to give the average character height $\bar{h}_c$;
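The sketch below illustrates steps 3-2 and 3-3 under the same binary-array assumption; note that the per-part height in `mean_char_height` is taken here as the height of the ink bounding box within each part, which is an interpretation of the text rather than a detail stated in it.

```python
import numpy as np

def projection_runs(profile, level):
    """Lengths of maximal intervals where the projection exceeds `level`."""
    runs, run = [], 0
    for value in profile:
        if value > level:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    return runs

def mean_char_width(binary_line, w_s, touching=False):
    """Step 3-2: mean run length of the vertical projection profile; for lines
    with touching characters the runs are measured at the level 2*w_s + 1."""
    profile = binary_line.sum(axis=0)                # black pixels per column
    level = 2 * w_s + 1 if touching else 0
    runs = projection_runs(profile, level)
    return float(np.mean(runs)) if runs else 0.0

def mean_char_height(binary_line, parts=5):
    """Step 3-3: split the line into equal horizontal parts (five by default)
    and average the ink height of each part."""
    heights = []
    for chunk in np.array_split(binary_line, parts, axis=1):
        rows = np.where(chunk.any(axis=1))[0]
        if rows.size:
            heights.append(rows.max() - rows.min() + 1)
    return float(np.mean(heights)) if heights else 0.0
```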
Step 4: stroke segment extraction stage
A stroke segment is one of the four basic stroke elements of Chinese characters, namely the horizontal, vertical, left-falling and right-falling strokes; stroke segment extraction effectively overcomes the problem of touching characters;
The stroke segment extraction method is based on black-run tracking; the idea is as follows: first a horizontal black run is found in the image as the start of a stroke segment, and this run is then tracked downwards row by row until the tracking terminates, which yields one stroke segment;
The tracking idea is as follows: in the row below the current horizontal black run, take the horizontal range covering the position of the current run extended by one pixel to the left and to the right, and find all horizontal black runs within this range; then, according to the average width of the runs already belonging to the stroke segment and the stroke direction fitted from those runs, select one of the horizontal black runs to join the existing run record of the segment and update the segment information; we describe this process in detail, and it comprises the following steps in order:
Step 4-1: while scanning the horizontal black pixel runs from top to bottom, if a run is not in the black-run record of any existing stroke segment, it is taken as the start of a new stroke segment, and the horizontal black run is added to the black-run record of this new segment;
Step 4-2: for the black run most recently added to a segment record, search in the next row, within the horizontal range of that run extended by one pixel on each side, for new horizontal black runs; if a black run extends into this region, extract it; if no black run appears in this region, the extraction of this stroke segment is finished, and we return to step 4-1 to search for a new stroke segment;
Step 4-3: for the extracted black runs, decide how they are added to the black-run record of the stroke segment; two cases are distinguished: if exactly one horizontal black run was extracted in the previous step, go to step 4-3-1; if two or more horizontal black runs were extracted, go to step 4-3-2;
Step 4-3-1:
If only one horizontal black run was extracted:
■ if the average width of the horizontal black runs already belonging to this segment is at least twice the width of the new horizontal black run, the end point of the stroke segment is judged to have been reached and the extraction of this segment is finished;
■ if the width of the new horizontal black run is at least three times the average width of the horizontal black runs already belonging to this segment, it is judged to be a stroke crossing point; the stroke direction is then predicted from the horizontal-run information of the existing segment, and a run centred on the predicted direction and extending by half of the segment's average horizontal run width on each side is added to the record as the new black run of the segment;
■ if the width of the new horizontal black run satisfies neither of the two conditions above, it is judged to be a normal run of the stroke segment and is added directly to the segment's horizontal black-run record;
Step 4-3-2:
If two or more horizontal black runs were extracted, the stroke direction is first predicted from the horizontal-run information of the existing segment; the run extracted along the predicted direction is taken as the candidate horizontal black run, and the three decisions of step 4-3-1 are then repeated to update the black-run record;
The prediction method used in steps 4-3-1 and 4-3-2 is as follows: compute the midpoint of every horizontal black run already tracked for the stroke segment, fit a straight line through these midpoints by the least-squares principle, and use the fitted line to predict the direction of the stroke segment;
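A small sketch of this least-squares prediction, treating each run midpoint as a (row, column) pair; the function names are illustrative.

```python
import numpy as np

def fit_stroke_line(run_midpoints):
    """Least-squares fit x = a*y + b through the midpoints (row y, column x)
    of the runs already tracked for a stroke segment."""
    ys = np.array([y for y, _ in run_midpoints], dtype=float)
    xs = np.array([x for _, x in run_midpoints], dtype=float)
    if len(ys) < 2 or np.ptp(ys) == 0:
        x0 = float(xs.mean()) if len(xs) else 0.0
        return 0.0, x0                               # degenerate: vertical line
    a, b = np.polyfit(ys, xs, 1)
    return float(a), float(b)

def predict_midpoint(run_midpoints, next_row):
    """Column where the fitted line places the stroke in `next_row`; used in
    steps 4-3-1 and 4-3-2 to choose among candidate runs."""
    a, b = fit_stroke_line(run_midpoints)
    return a * next_row + b
```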
Step 4-4: determining the attribute of the stroke segment:
For each extracted stroke segment we first compute its height and width:
■ if the average width of all horizontal black runs of the segment is greater than a given threshold and the segment width is greater than the segment height, the segment is judged to be a horizontal stroke;
■ otherwise we set a small step length, denoted MinStepLength:
◆ compute the midpoints of the horizontal black runs of row i and row i + MinStepLength ("+" denotes addition):
● if the two midpoints coincide, the angle given by formula C2005100121950007C1 is added to the angle accumulator;
● if the abscissa of the midpoint of row i is greater than that of the midpoint of row i + MinStepLength, the angle given by formula C2005100121950007C2 is added to the accumulator;
● if the abscissa of the midpoint of row i is less than that of the midpoint of row i + MinStepLength, the angle given by formula C2005100121950007C3 is added to the accumulator;
◆ scan downwards from the first row of each segment in steps of MinStepLength rows until the remaining segment height is exhausted, add up all the angles and take the average angle:
● if the average angle is greater than zero and smaller than a predefined value α_1, the segment is judged to be a vertical stroke;
● if the average angle is greater than α_1 and smaller than a predefined value α_2, it is judged to be a left-falling stroke;
● if the average angle is greater than α_2 and smaller than a predefined value α_3, it is judged to be a horizontal stroke;
● if the average angle is greater than α_3 and smaller than a predefined value α_4, it is judged to be a right-falling stroke;
● if the average angle is greater than α_4, it is judged to be a vertical stroke;
If no new run is found, the tracking of the current segment is finished; if no new stroke segment is found, stroke segment extraction is finished; once a stroke segment has been extracted, its attribute, i.e. horizontal, vertical, left-falling or right-falling, is also determined;
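For illustration, a sketch of the attribute decision of step 4-4; the angle increments themselves are not reproduced here (their formulas appear only as images above), so the average angle and the thresholds α_1 … α_4 are taken as inputs rather than computed.

```python
def classify_stroke(avg_run_width, seg_width, seg_height, avg_angle,
                    width_thresh, a1, a2, a3, a4):
    """Decide the stroke attribute of an extracted segment.  avg_angle is the
    mean of the accumulated angles; a1 < a2 < a3 < a4 are the predefined
    thresholds alpha_1..alpha_4 of step 4-4."""
    if avg_run_width > width_thresh and seg_width > seg_height:
        return "horizontal"
    if 0 < avg_angle < a1:
        return "vertical"
    if a1 <= avg_angle < a2:
        return "left-falling"
    if a2 <= avg_angle < a3:
        return "horizontal"
    if a3 <= avg_angle < a4:
        return "right-falling"
    return "vertical"                                # avg_angle >= a4
```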
Step 5: stroke segment merging
After stroke segment extraction is finished, the stroke segments must be further merged into sub-characters; let R_i and R_j be the bounding rectangles of two adjacent stroke segments;
(x_{i,1}, y_{i,1}) --- the coordinates of the upper-left corner of R_i;
(x_{i,2}, y_{i,2}) --- the coordinates of the lower-right corner of R_i;
(x_{j,1}, y_{j,1}) --- the coordinates of the upper-left corner of R_j;
(x_{j,2}, y_{j,2}) --- the coordinates of the lower-right corner of R_j;
D_H(R_i, R_j) --- the horizontal distance between the right side of R_i and the left side of R_j; the sign of D_H(R_i, R_j) indicates the direction: if the left edge of R_i lies to the right of the left edge of R_j, the value is negative; conversely, if the left edge of R_i lies to the left of the left edge of R_j, the value is positive;
D_V(R_i, R_j) --- the vertical distance between the bottom side of R_i and the top side of R_j; the sign of D_V(R_i, R_j) indicates the direction: if the bottom edge of R_i lies below the top edge of R_j, the value is negative; conversely, if the bottom edge of R_i lies above the top edge of R_j, the value is positive;
width(R_i) --- the width of R_i;
width(R_j) --- the width of R_j;
Stroke segments are merged according to the following three rules:
1) if R_i and R_j satisfy, in the horizontal direction, that R_i contains R_j or R_j contains R_i, then stroke segments i and j are merged;
2) if R_i and R_j satisfy D_H(R_i, R_j) < 0, i.e. the left edge of R_i lies to the right of the left edge of R_j, and $\frac{-D_H(R_i,R_j)}{\mathrm{width}(R_i)} > T_1$ or $\frac{-D_H(R_i,R_j)}{\mathrm{width}(R_j)} > T_1$, then stroke segments i and j are merged, where T_1 is a predefined threshold, typically 0.7;
3) if R_i and R_j satisfy D_H(R_i, R_j) < 0, and $\frac{-D_H(R_i,R_j)}{\mathrm{width}(R_i)} > T_2$ and $\frac{-D_H(R_i,R_j)}{\mathrm{width}(R_j)} > T_2$, then stroke segments i and j are merged, where T_2 is a predefined threshold, typically 0.5;
The stroke-segment merging algorithm comprises the following steps in order:
Step 5-1: initialization; sort the stroke segments from left to right by their horizontal position;
Step 5-2: search for all stroke segments that should be merged under rule (1); if a pair satisfying the condition is found, merge them and return to step 5-1; otherwise go to step 5-3;
Step 5-3: search for all stroke segments that should be merged under rule (2); if a pair satisfying the condition is found, merge them and return to step 5-1; otherwise go to step 5-4;
Step 5-4: search for all stroke segments that should be merged under rule (3); if a pair satisfying the condition is found, merge them; the merging then ends;
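A sketch of steps 5-1 to 5-4, assuming each stroke segment is reduced to its bounding box (x1, y1, x2, y2). The control flow is simplified: only neighbouring boxes in left-to-right order are tested, and each rule is exhausted before moving to the next, whereas the original returns to step 5-1 after every merge.

```python
def d_h(a, b):
    """Horizontal distance between the right side of box a and the left side
    of box b (negative when they overlap); boxes are (x1, y1, x2, y2)."""
    return b[0] - a[2]

def contains_horizontally(a, b):
    return a[0] <= b[0] and b[2] <= a[2]

def should_merge(a, b, rule, t1=0.7, t2=0.5):
    wa, wb = a[2] - a[0], b[2] - b[0]
    overlap = -d_h(a, b)
    if rule == 1:
        return contains_horizontally(a, b) or contains_horizontally(b, a)
    if rule == 2:
        return overlap > 0 and (overlap / wa > t1 or overlap / wb > t1)
    return overlap > 0 and (overlap / wa > t2 and overlap / wb > t2)

def merge_segments(boxes):
    """Repeatedly merge neighbouring boxes under rules 1, 2 and 3."""
    boxes = sorted(boxes)                            # left to right by x1
    for rule in (1, 2, 3):
        merged = True
        while merged:
            merged = False
            for i in range(len(boxes) - 1):
                if should_merge(boxes[i], boxes[i + 1], rule):
                    a, b = boxes[i], boxes[i + 1]
                    boxes[i:i + 2] = [(min(a[0], b[0]), min(a[1], b[1]),
                                       max(a[2], b[2]), max(a[3], b[3]))]
                    merged = True
                    break
    return boxes
```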
Step 6: evaluation of the geometric cost of a character merge, comprising six sub-steps
The results of stroke-segment merging, sorted from left to right, are denoted s_1, s_2, ..., s_N; these are the sub-character images produced by the stroke-segment merging; to complete the segmentation operation, these sub-character images must in turn be merged appropriately;
We now introduce the merging process of the sub-character images;
We use (s_k, s_{k+1}, ..., s_{k+n_k-1}) to denote the character image formed by merging the sub-character images s_k, s_{k+1}, ..., s_{k+n_k-1}; the geometric cost of the merge (s_k, s_{k+1}, ..., s_{k+n_k-1}) is assessed in the following way:
w_k --- the width of the character image (s_k, s_{k+1}, ..., s_{k+n_k-1}) obtained after merging the sub-character images s_k, s_{k+1}, ..., s_{k+n_k-1};
h_k --- the height of the character image (s_k, s_{k+1}, ..., s_{k+n_k-1}) obtained after merging the sub-character images s_k, s_{k+1}, ..., s_{k+n_k-1};
Step 6-1: scoring the character width w_k; the width score is denoted S(w_k)
Let w_k be the width of (s_k, s_{k+1}, ..., s_{k+n_k-1}) and let $\bar{w}_c$ be the average character width of the text line obtained in step 3-2; then
$$S(w_k) = \begin{cases} a\left(\dfrac{w_k}{\bar{w}_c} - 1\right)^2 & \text{if } \dfrac{w_k}{\bar{w}_c} > 1 \\[2ex] b\left(\dfrac{w_k}{\bar{w}_c} - 1\right)^2 & \text{if } \dfrac{w_k}{\bar{w}_c} \le 1 \end{cases}$$
where a and b are predefined parameters; we generally take a = 100 and b = 400;
Step 6-2: scoring the character aspect ratio r; the aspect-ratio score is denoted S(r): $S(r) = \min\left\{c\left(\frac{r}{\bar r} - 1\right)^2, 100\right\}$, where we generally take c = 100;
r is the aspect ratio of the character (s_k, s_{k+1}, ..., s_{k+n_k-1}): $r = \frac{w_k}{h_k}$;
$\bar r$ is the average character aspect ratio: $\bar r = \frac{\bar{w}_c}{\bar{h}_c}$;
$\bar{w}_c$ is the average character width of the text line, estimated as in step 3-2;
$\bar{h}_c$ is the average character height of the text line, estimated as in step 3-3;
Step 6-3: scoring the distances between the sub-characters inside a character
1) horizontal distance between the bounding rectangles: the horizontal distance between the bounding rectangles of the sub-characters, where the areas enclosed by the rectangles are allowed to overlap;
2) Euclidean distance: the Euclidean distance between two pixels with coordinates (x_1, y_1) and (x_2, y_2) is defined as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$; the Euclidean distance between sub-character A and sub-character B is defined as the minimum of the Euclidean distances between all black pixels of A and all black pixels of B;
3) average run distance: the mean length of all the horizontal white runs between the two sub-characters is taken as the average run distance;
Based on these three distances, the inner distance of (s_k, s_{k+1}, ..., s_{k+n_k-1}) is defined as $d_{in} = \sum_{i=k}^{k+n_k-1} d^k_{i,i+1}$, where the distance $d_{i,j}$ between sub-characters s_i and s_j is $d_{i,j} = \sum_n d^n_{ij}$, i.e. the sum over the three kinds of distance above;
The inner-distance score is defined as
$$S(d_{in}) = \begin{cases} 0 & \text{if } d_{in} < 0 \\ 100 & \text{if } d_{in} > \dfrac{w_k}{4} \text{ or } d_{in} > \dfrac{\bar{w}_c}{2} \\ \dfrac{400\, d_{in}}{w_k} & \text{otherwise} \end{cases}$$
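A direct transcription of the inner-distance score, assuming the three distances for every adjacent sub-character pair have already been computed and following the summation form of the formula above.

```python
def inner_distance_score(pair_distances, w_k, mean_char_width):
    """pair_distances: one entry per adjacent sub-character pair inside the
    merged character; each entry holds the three distances (bounding-box gap,
    minimum Euclidean distance, mean white-run length)."""
    d_in = sum(sum(triple) for triple in pair_distances)
    if d_in < 0:
        return 0.0
    if d_in > w_k / 4.0 or d_in > mean_char_width / 2.0:
        return 100.0
    return 400.0 * d_in / w_k
```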
Step 6-4: scoring the spacing before and after the character, denoted S(D)
Suppose the distances between the character (s_k, s_{k+1}, ..., s_{k+n_k-1}) and the preceding and following sub-characters are D_L and D_R respectively:
D_L --- the horizontal distance between the bounding rectangle of sub-character s_k and the bounding rectangle of sub-character s_{k-1}, defined as in step 5;
D_R --- the horizontal distance between the bounding rectangle of sub-character s_{k+n_k-1} and the bounding rectangle of sub-character s_{k+n_k}, defined as in step 5;
If k = 1 then D_L = ∞; if k + n_k − 1 = N then D_R = ∞; finally take D = min(D_L, D_R);
The score for the character spacing is S(D):
$$S(D) = \begin{cases} 100 & \text{if } D < -\bar{w}_c \\[1ex] \dfrac{25\bar{w}_c + 100\bar{D} - 75D}{\bar{D} + \bar{w}_c} & \text{if } -\bar{w}_c \le D \le \bar{D} \\[1ex] \dfrac{25(D_{\max} - D)}{D_{\max} - \bar{D}} & \text{if } \bar{D} \le D \le D_{\max} \\[1ex] 0 & \text{if } D > D_{\max} \end{cases}$$
where $\bar{w}_c$ is the average character width of the text line, and $\bar{D}$ and $D_{\max}$ are respectively the mean and the maximum of the horizontal distances between the bounding rectangles of the sub-characters in the text line;
Step 6-5: scoring the connectivity, denoted S(C)
Define the connectivity C_{ij} of sub-character s_i and sub-character s_j by formula C2005100121950011C1;
The connectivity of the character (s_k, s_{k+1}, ..., s_{k+n_k-1}) is described by three quantities:
C_I --- the internal connectivity;
C_L --- the left connectivity;
C_R --- the right connectivity;
The internal connectivity is the degree of connection among the sub-characters inside the character; the left and right connectivity are the connectivity between sub-characters s_k and s_{k-1} and between sub-characters s_{k+n_k-1} and s_{k+n_k}, respectively;
$$C_I = \begin{cases} \dfrac{1}{n_k - 1} \sum_{i=k}^{k+n_k-2} C_{i,i+1} & n_k > 1 \\ 1 & n_k = 1 \end{cases}$$
$$C_L = \begin{cases} C_{k,k-1} & k > 1 \\ 1 & k = 1 \end{cases}$$
$$C_R = \begin{cases} C_{k+n_k-1,\,k+n_k} & k + n_k - 1 < N \\ 1 & k + n_k - 1 = N \end{cases}$$
The connectivity score is $S(C) = 100 \times \left[1 - \frac{1}{2}\left(1 + C_I - \frac{C_R + C_L}{2}\right)\right]$;
Step 6-6: computing the overall score, which serves as the geometric cost
The overall score is the weighted mean of the five scores above, $S = \frac{\sum_i \alpha_i S_i}{\sum_i \alpha_i}$, where α_1 = 5, α_2 = 2, α_3 = 3, α_4 = 5, α_5 = 2;
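The partial scores of steps 6-1, 6-2 and 6-6 are simple enough to state directly; the sketch below assumes the remaining scores S(d_in), S(D) and S(C) have been computed elsewhere and are passed in.

```python
def width_score(w_k, mean_w, a=100.0, b=400.0):
    """Step 6-1: penalize deviation of w_k from the mean character width."""
    x = w_k / mean_w - 1.0
    return (a if x > 0 else b) * x * x

def ratio_score(w_k, h_k, mean_w, mean_h, c=100.0):
    """Step 6-2: penalize deviation of the aspect ratio from the line average."""
    r, r_bar = w_k / h_k, mean_w / mean_h
    return min(c * (r / r_bar - 1.0) ** 2, 100.0)

def geometric_cost(s_width, s_ratio, s_inner, s_spacing, s_connect,
                   weights=(5, 2, 3, 5, 2)):
    """Step 6-6: weighted mean of the five partial scores with
    alpha_1..alpha_5 = 5, 2, 3, 5, 2."""
    scores = (s_width, s_ratio, s_inner, s_spacing, s_connect)
    return sum(a * s for a, s in zip(weights, scores)) / sum(weights)
```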
Step 7: computing the K segmentation schemes of optimal geometric cost from the estimated geometric costs
After step 6 the line image has been cut into a sequence of sub-character images, and the geometric cost of merging sub-character blocks has been given; however, the sub-character images still have to be merged into character images in order to complete the segmentation of the characters; below, the block-merging costs obtained in step 6 are used to generate a number of possible sub-character merging schemes, each of which corresponds to one segmentation result of the line image; these schemes are evaluated in step 8, where the optimal scheme is given;
Step 7-1: building the segmentation graph
For the N segmented sub-character images s_1, s_2, ..., s_N, a directed graph (V, E) is built whose number of nodes is N + 1, labelled Node_1, Node_2, Node_3, ..., Node_{N+1}, i.e. V = {Node_1, Node_2, Node_3, ..., Node_{N+1}}; for any Node_i there exist directed edges from Node_i to Node_{i+1}, Node_{i+2}, Node_{i+3}, ...; the edge Node_i → Node_{i+j} corresponds to merging the blocks i, i+1, ..., i+j−1, that is, to merging the sub-character images s_i, s_{i+1}, ..., s_{i+j−1}, and the cost of the edge is the geometric cost of that merge; every path in the segmentation graph from the start node Node_1 to the end node Node_{N+1} corresponds to one way of merging the sub-character images s_1, s_2, ..., s_N, i.e. to one segmentation of the line image, so we call it a segmentation path;
Step 7-2: generating the K segmentation paths of lowest geometric cost:
We now describe how to compute, in the segmentation graph (V, E), the first K paths from the start node Node_1 to the end node Node_{N+1} ordered by ascending cost; each of these paths corresponds to one segmentation of the line image, i.e. to one merging scheme of the sub-character images s_1, s_2, ..., s_N; merging the sub-characters s_1, s_2, ..., s_N according to this scheme completes the segmentation of the line image;
The detailed procedure of the algorithm is as follows:
Given the graph (V, E), define:
N_Node --- the number of nodes of the graph, N_Node = N + 1, where N is the number of segmented sub-character images;
N_Edge --- the number of edges;
Start --- the start node, Node_1;
End --- the end node, Node_{N+1};
K --- the number of optimal paths to be computed;
π_k(v) --- the k-th path from the start node Start to node v when the paths are ordered by ascending total cost, where v ranges over the node set {Node_1, Node_2, Node_3, ..., Node_{N+1}}; thus π_1(v) is the shortest path from the start node Start to node v, and taking v = Node_{N+1}, π_1(Node_{N+1}) is the shortest path from the start node to the end node;
Γ^{-1}(v) --- the set of predecessor nodes of v, i.e. the set of nodes that may be connected to v; for any u ∈ Γ^{-1}(v) there exists a path u → v;
a ∘ b --- the concatenation of two paths, where the end point of path a is the start point of path b; the concatenated path a ∘ b has the start point of a as its start point and the end point of b as its end point;
C[v] --- the candidate path set of node v;
The computation then proceeds as follows:
Step 7-2-1: for every v ∈ V, compute π_1(v), i.e. the shortest path from the start node Start to each node;
Step 7-2-2: for each v ∈ V, suppose π_1(v), π_2(v), ..., π_{k-1}(v) have already been computed; we now describe how to use them to compute π_k(v), where 2 ≤ k ≤ K;
If k = 2, initialize the candidate path set C[v]: for each element u of the predecessor set Γ^{-1}(v) of v, take the shortest path from the start node Start to node u and extend it to v, forming the new path π_1(u) ∘ v, and add it to the candidate path set of v, i.e. C[v] ← {π_1(u) ∘ v | u ∈ Γ^{-1}(v)};
If k > 2, consider the path π_{k-1}(v) and let u_0 be its predecessor node of v, i.e. π_{k-1}(v) reaches v through node u_0; there must exist an integer l with 1 ≤ l ≤ k − 1 such that the portion of π_{k-1}(v) from the start node Start to u_0 coincides with π_l(u_0), that is, π_{k-1}(v) is exactly the path π_l(u_0) extended from its end point u_0 to node v, i.e. π_{k-1}(v) = π_l(u_0) ∘ v; for this integer l we then compute π_{l+1}(u_0);
We then remove the path π_l(u_0) ∘ v from the candidate path set C[v] and add π_{l+1}(u_0) ∘ v to it, i.e. C[v] ← C[v] − {π_l(u_0) ∘ v} ∪ {π_{l+1}(u_0) ∘ v}; the shortest path in C[v] is then π_k(v);
Applying the above algorithm recursively yields the first K segmentation paths;
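The recursive procedure above can be summarized, for the special structure of the segmentation graph (a DAG whose edges only run forward), by the simpler dynamic program sketched below; it returns the same K lowest-cost start-to-end paths but is not a line-by-line rendering of steps 7-2-1 and 7-2-2. `edge_cost` is a placeholder for the geometric cost of step 6.

```python
import heapq

def k_best_paths(num_subchars, edge_cost, K):
    """K lowest-cost paths from Node_1 to Node_{N+1} in the segmentation graph.
    edge_cost(i, j) is the geometric cost of merging sub-characters s_i..s_{j-1}
    (1-based, i < j <= num_subchars + 1).  Returns a list of (cost, node path)."""
    n_nodes = num_subchars + 1
    best = {1: [(0.0, [1])]}                         # up to K (cost, path) per node
    for v in range(2, n_nodes + 1):
        candidates = []
        for u in range(1, v):                        # every predecessor of v
            c_uv = edge_cost(u, v)
            for cost, path in best.get(u, []):
                candidates.append((cost + c_uv, path + [v]))
        best[v] = heapq.nsmallest(K, candidates, key=lambda t: t[0])
    return best[n_nodes]

# Toy usage: 3 sub-characters, merging j - i blocks costs 10*(j - i - 1).
# k_best_paths(3, lambda i, j: 10.0 * (j - i - 1), K=3)
```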
Step 7-3: character recognition
CharCandidatesSet --- the image recognition candidate set; each of its elements contains N_Cand recognition candidates and the N_Cand corresponding recognition confidences;
CharCandidatesSetNum --- the number of elements in the image recognition candidate set;
Given the sub-characters s_1 s_2 ... s_N, we build an (N+1) × (N+1) lookup table LookupTable; all elements of LookupTable are first set to −1, the image recognition candidate set CharCandidatesSet is emptied, and CharCandidatesSetNum is set to 0;
For 1 ≤ k ≤ K:
for the k-th candidate segmentation path, i.e. the candidate segmentation path ranked k-th in ascending order of geometric cost,
for every merge of the form s_p s_{p+1} ... s_q (1 ≤ p ≤ q ≤ N) on that path, query the element LookupTable(p, q+1), i.e. the element in row p, column q+1 of the lookup table LookupTable;
if LookupTable(p, q+1) = −1, this merge has not yet been recognized; the image obtained from this combination is recognized, yielding N_Cand candidates, and the confidence of each recognition candidate is estimated (see step 8-1); the candidates together with their corresponding confidences are then added as a single element to the image recognition candidate set CharCandidatesSet, after which we set LookupTable(p, q+1) = CharCandidatesSetNum and increase CharCandidatesSetNum by 1, i.e. CharCandidatesSetNum = CharCandidatesSetNum + 1;
if LookupTable(p, q+1) ≠ −1, this merge has already been considered and needs no further processing;
End For
The benefit of this procedure is that it avoids repeated work by the recognition kernel and saves a great deal of time; whenever a later step needs the recognition result of the merged block s_p s_{p+1} ... s_q, it suffices to query the element in row p, column q+1 of LookupTable to obtain the index of that result in CharCandidatesSet, and thus to find the recognition result and the corresponding confidences of the merged block s_p s_{p+1} ... s_q recorded at the corresponding position of CharCandidatesSet;
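A sketch of the memoization idea of step 7-3, using a dictionary in place of the (N+1)×(N+1) table initialized to −1; `recognize` and `merge_images` are placeholder callables standing in for the recognition kernel and the image merging.

```python
def recognize_paths(paths, subchar_images, recognize, merge_images, n_cand):
    """Recognize every block s_p..s_q occurring on any candidate segmentation
    path, calling the recognition kernel at most once per (p, q) combination.
    paths: lists of node indices, e.g. [1, 3, 4] = blocks s_1 s_2 and s_3.
    recognize(image, n_cand) -> list of (label, confidence) pairs."""
    lookup = {}                        # (p, q+1) -> index into candidates_set
    candidates_set = []
    for path in paths:
        for p, q_plus_1 in zip(path, path[1:]):
            key = (p, q_plus_1)
            if key in lookup:          # this merge was already recognized
                continue
            image = merge_images(subchar_images[p - 1:q_plus_1 - 1])
            candidates_set.append(recognize(image, n_cand))
            lookup[key] = len(candidates_set) - 1
    return lookup, candidates_set
```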
Step 8: given the K candidate segmentation schemes that are optimal in the geometric sense, compute the semantic-recognition cost corresponding to each sub-character merging scheme using the bigram language model:
Step 8-1: estimating the character confidence
x --- the character image to be recognized;
c_j(x) --- the j-th candidate character for image x given by the recognition kernel; the kernel orders the recognition candidates by ascending recognition distance, so c_1(x) is the first-choice recognition candidate for image x;
d_j(x) --- the recognition distance corresponding to the j-th candidate character c_j(x) of image x given by the recognition kernel; thus d_1(x) is the recognition distance corresponding to the first-choice recognition candidate of image x;
N_Cand --- the number of recognition candidates returned by the recognition kernel for a single character image; as described in step 2-2, this value is constant and depends only on the recognition kernel itself;
P(c_j(x) | x) --- the confidence that image x is recognized as c_j(x); this is the quantity we need to estimate;
For all the candidates obtained in step 7-3, the candidate confidences are estimated with a confidence estimation method based on a Gaussian model:
$P(c_j(x) \mid x) = \frac{\exp(-d_j(x)/\theta)}{\sum_{k=1}^{N_{\mathrm{Cand}}} \exp(-d_k(x)/\theta)}$, where θ is the value computed earlier in step 2-3;
The estimated confidences are then corrected with the previously computed confusion matrix (calculated in step 2-4); the corrected confidence is obtained as $P(c_j(x) \mid x) = \sum_{k=1}^{N_{\mathrm{Cand}}} P(c_k(x) \mid x)\, P(c_j(x) \mid c_k(x))$;
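A minimal sketch of step 8-1, assuming the recognition distances, the scale θ and the confusion matrix are available; `confusion[a, b]` is taken to hold P(char_b | char_a) as defined in step 2-2-2.

```python
import numpy as np

def candidate_confidences(distances, theta):
    """Step 8-1: softmax of the negative recognition distances scaled by theta."""
    w = np.exp(-np.asarray(distances, dtype=float) / theta)
    return w / w.sum()

def corrected_confidences(labels, confidences, confusion):
    """Correct each candidate's confidence with the confusion matrix:
    P'(c_j | x) = sum_k P(c_k | x) * P(c_j | c_k)."""
    labels = np.asarray(labels)
    p = np.asarray(confidences, dtype=float)
    sub = confusion[np.ix_(labels, labels)]          # P(c_j | c_k) submatrix
    return p @ sub                                   # entry j = sum_k p_k * sub[k, j]
```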
Step 8-2: computing the semantic-recognition confidence cost based on the bigram
For each of the K segmentation paths of the line image that have been generated, the corresponding average semantic-recognition confidence cost is computed by the following method; we first state the following symbols and their meanings:
n_k --- merging the sub-character images according to the k-th segmentation path yields n_k merged character images in total;
image_{k,t} --- the t-th character image obtained by merging the sub-character images according to the k-th segmentation path, where 1 ≤ k ≤ K and 1 ≤ t ≤ n_k;
c_j(image_{k,t}) --- the j-th recognition candidate given by the recognition kernel for the character image image_{k,t}, where 1 ≤ j ≤ N_Cand, 1 ≤ k ≤ K, 1 ≤ t ≤ n_k, with P(c_j(image_{k,t}) | image_{k,t}) the corresponding recognition confidence;
Since the recognition of the character images and the estimation of the confidences were completed in step 7-3, in this step the required recognition results and confidences are obtained from CharCandidatesSet by querying LookupTable;
For the k-th candidate segmentation path, 1 ≤ k ≤ K, the semantic-recognition cost is computed with the following Viterbi method:
Let Q[n_k][N_Cand] be a two-dimensional array, where Q[t][j] stores the logarithm of the probability of the most likely selection of candidates from some candidate of the first character image up to the candidate node c_j(image_{k,t}); in addition, a two-dimensional pointer array Path[n_k][N_Cand] is used to record the computation process;
Initialization: t = 1, for 1 ≤ j ≤ N_Cand:
Path[1][j] = NULL,
Q[1][j] = log P(c_j(image_{k,1})) + log P(c_j(image_{k,1}) | image_{k,1}),
Recursion: for 2 ≤ t ≤ n_k and 1 ≤ j ≤ N_Cand compute Q[t][j]:
$$Q[t][j] = \max_{1 \le l \le N_{\mathrm{Cand}}} \left\{ Q[t-1][l] + \log P\big(c_j(\mathrm{image}_{k,t}) \mid c_l(\mathrm{image}_{k,t-1})\big) \right\} + \log P\big(c_j(\mathrm{image}_{k,t}) \mid \mathrm{image}_{k,t}\big)$$
In addition, find the l that maximizes Q[t−1][l] + log P(c_j(image_{k,t}) | c_l(image_{k,t-1})), denoted l*, i.e.
$$l^* = \arg\max_{1 \le l \le N_{\mathrm{Cand}}} \left\{ Q[t-1][l] + \log P\big(c_j(\mathrm{image}_{k,t}) \mid c_l(\mathrm{image}_{k,t-1})\big) \right\},$$
then let Path[t][j] point to the candidate node c_{l*}(image_{k,t-1}), i.e. the parent node of the candidate node c_j(image_{k,t}) is c_{l*}(image_{k,t-1});
Termination: t = n_k;
Finally find $j^* = \arg\max_{1 \le j \le N_{\mathrm{Cand}}} Q[n_k][j]$, backtrack along the path indicated by Path[n_k][j*], and output each candidate node on the path; the resulting character string is our character recognition result; while obtaining the optimal character string we have also obtained the log-probability of the most likely path, Q[n_k][j*]; this value is taken as the semantic-recognition cost of this segmentation path and, divided by n_k, gives the average semantic-recognition cost $H_k = \frac{Q[n_k][j^*]}{n_k}$;
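A compact sketch of the Viterbi recursion of step 8-2, assuming per-image candidate lists of (label, confidence) pairs and placeholder functions `unigram` and `bigram` for the language-model probabilities (all assumed strictly positive so the logarithms are defined).

```python
import math

def semantic_cost(char_candidates, bigram, unigram):
    """Viterbi over one segmentation path.  char_candidates[t] is the list of
    (label, confidence) pairs for image_{k,t+1}; bigram(prev, cur) and
    unigram(cur) return language-model probabilities.
    Returns (recognized labels, average semantic-recognition cost H_k)."""
    n_k = len(char_candidates)
    # Initialization: Q[1][j] = log P(c_j) + log P(c_j | image_{k,1})
    q = [math.log(unigram(lab)) + math.log(conf)
         for lab, conf in char_candidates[0]]
    back = [[None] * len(char_candidates[0])]
    for t in range(1, n_k):
        q_new, back_t = [], []
        for lab, conf in char_candidates[t]:
            scores = [q[l] + math.log(bigram(prev_lab, lab))
                      for l, (prev_lab, _) in enumerate(char_candidates[t - 1])]
            l_star = max(range(len(scores)), key=scores.__getitem__)
            q_new.append(scores[l_star] + math.log(conf))
            back_t.append(l_star)
        q, back = q_new, back + [back_t]
    j_star = max(range(len(q)), key=q.__getitem__)
    labels, j = [], j_star
    for t in range(n_k - 1, -1, -1):                 # backtrack the best path
        labels.append(char_candidates[t][j][0])
        if back[t][j] is not None:
            j = back[t][j]
    labels.reverse()
    return labels, q[j_star] / n_k
```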
Step 9: fusing the geometric cost and the semantic cost to give the final result
Step 9-1: estimating the fusion parameter λ of the geometric cost and the semantic-recognition cost
We adopt the following notation:
N_L --- the number of line images for which the sub-characters and the correct sub-character merging are given, i.e. the number of training samples;
n_{i,k} --- the number of characters in the k-th candidate segmentation of the i-th training sample;
n_{i,0} --- the number of characters obtained from the correct segmentation of the i-th training sample;
g_{i,k} --- the geometric cost of the k-th candidate segmentation path of the i-th training sample; thus g_{i,1} is the minimum geometric cost among all segmentations of the i-th training sample;
G_{i,k} --- the average geometric cost of the k-th candidate segmentation of the i-th training sample, taken after normalization;
H_{i,k} --- the average semantic-recognition cost of the k-th candidate segmentation of the i-th training sample;
g_{i,0} --- the geometric cost of the completely correct segmentation of the i-th training sample;
G_{i,0} --- the average geometric cost of the completely correct segmentation of the i-th training sample, taken after normalization;
H_{i,0} --- the average semantic-recognition cost of the completely correct segmentation of the i-th training sample;
We select N_L training line images and process each of them in the order of step 3 to step 8, thereby obtaining n_{i,k}, g_{i,k}, H_{i,k} for 1 ≤ i ≤ N_L and 1 ≤ k ≤ K; the geometric costs estimated in step 6 are normalized and averaged as $G_{i,k} = \frac{1}{n_{i,k}} \log\left(\lambda e^{-\lambda (g_{i,k}/g_{i,1} - 1)}\right)$ for 1 ≤ i ≤ N_L, 1 ≤ k ≤ K; in the same way we obtain the normalized geometric score G_{i,0}, 1 ≤ i ≤ N_L, and the average semantic-recognition cost H_{i,0}, 1 ≤ i ≤ N_L, of the correct segmentations; write $T_i^k(\lambda) = H_{i,k} + G_{i,k}$ for 1 ≤ k ≤ K, and let $T_i^0(\lambda)$ be the T value of the completely correct segmentation of the i-th sample, i.e. $T_i^0(\lambda) = H_{i,0} + G_{i,0}$; minimizing $N(\lambda) = \sum_{i=1}^{N_L} \#\{T_i^k(\lambda) > T_i^0(\lambda) \mid 1 \le k \le K\}$ yields the estimate of the weighting coefficient λ;
Here $\#\{T_i^k(\lambda) > T_i^0(\lambda) \mid 1 \le k \le K\}$ denotes, for a given λ, the number of candidate segmentation paths among the K candidates of the i-th sample image whose T value is larger than the T value of the correct segmentation; the minimization again uses the trial-and-error search used to minimize θ;
Step 9-2: computing the optimal segmentation-recognition path with the fusion parameter λ
For a general line image to be segmented, the K candidate segmentation paths are computed according to steps 3 to 8, and for every path the average semantic-recognition cost H_k, 1 ≤ k ≤ K, and the normalized average geometric cost $G_k = \frac{1}{n_k} \log\left(\lambda e^{-\lambda (g_k/g_1 - 1)}\right)$, 1 ≤ k ≤ K, are computed, where g_k, 1 ≤ k ≤ K, is the geometric cost of the k-th segmentation path obtained in step 6; the combined cost is then $T_k = H_k + G_k$, 1 ≤ k ≤ K; taking $k^* = \arg\max_{1 \le k \le K} T_k$, the segmentation given by the k*-th candidate is output as the optimal segmentation.
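A sketch of the fusion of step 9, assuming the per-path quantities g_k, n_k and H_k have been computed as above; `lam_grid` stands in for the trial-and-error search over λ mentioned in step 9-1.

```python
import math

def normalized_geometric_cost(g_k, g_1, n_k, lam):
    """G_k = (1/n_k) * log(lambda * exp(-lambda * (g_k / g_1 - 1)))."""
    return (math.log(lam) - lam * (g_k / g_1 - 1.0)) / n_k

def best_segmentation(candidates, lam):
    """candidates: list of (g_k, n_k, H_k) per candidate path.  Returns the
    index k* of the path maximizing T_k = H_k + G_k."""
    g_1 = min(g for g, _, _ in candidates)
    totals = [h + normalized_geometric_cost(g, g_1, n, lam)
              for g, n, h in candidates]
    return max(range(len(totals)), key=totals.__getitem__)

def estimate_lambda(training, lam_grid):
    """Step 9-1: each training sample is (correct, candidates) with
    correct = (g_0, n_0, H_0).  Pick the lambda that minimizes the number of
    candidate paths scoring above the correct segmentation."""
    def errors(lam):
        total = 0
        for (g0, n0, h0), cands in training:
            g1 = min(g for g, _, _ in cands)
            t0 = h0 + normalized_geometric_cost(g0, g1, n0, lam)
            total += sum(h + normalized_geometric_cost(g, g1, n, lam) > t0
                         for g, n, h in cands)
        return total
    return min(lam_grid, key=errors)
```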
CNB2005100121952A 2005-07-15 2005-07-15 Off-line hand writing Chinese character segmentation method with compromised geomotric cast and sematic discrimination cost Expired - Fee Related CN100347723C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100121952A CN100347723C (en) 2005-07-15 2005-07-15 Off-line hand writing Chinese character segmentation method with compromised geomotric cast and sematic discrimination cost

Publications (2)

Publication Number Publication Date
CN1719454A CN1719454A (en) 2006-01-11
CN100347723C true CN100347723C (en) 2007-11-07

Family

ID=35931285

Country Status (1)

Country Link
CN (1) CN100347723C (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1369877A (en) * 2000-10-04 2002-09-18 微软公司 Method and system for identifying property of new word in non-divided text
JP2003150902A (en) * 2001-09-27 2003-05-23 Canon Inc Method and device for dividing image into character image lines, character image recognizing method and device
CN1482571A (en) * 2003-04-11 2004-03-17 清华大学 Statistic handwriting identification and verification method based on separate character

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yao Zhengbin, et al., "An online Chinese character segmentation algorithm based on stroke merging and dynamic programming", Journal of Tsinghua University (Science and Technology), Vol. 44, No. 10, 2004 *


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20071107

Termination date: 20130715