CN1010513B - Pattern recognition system - Google Patents
Pattern recognition system
- Publication number
- CN1010513B CN87104862A
- Authority
- CN
- China
- Prior art keywords
- character
- pattern
- mentioned
- storehouse
- threshold value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/469—Contour-based spatial representations, e.g. vector-coding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
Matching efficiency is improved by deciding, from a comparison between a partial matching distance and a threshold value set for each character type, whether detailed matching down to the lower dimensions is carried out. In a matching part 24, a matching distance d between the feature vector Fkn of the dictionary pattern of character type k and the feature vector Yn of an unknown pattern is calculated over the dimensions from the highest down to the Nth. The distance d is then compared with the threshold value Thk of character type k registered in a threshold value table part 30. If d > Thk, the unknown pattern cannot be of character type k, so matching with that dictionary pattern is discontinued. If d <= Thk, the unknown pattern may well be of character type k, so detailed matching is carried out.
Description
The present invention relates to a pattern recognition system and method for recognizing patterns defined on a two-dimensional plane, and in particular to a system and method that recognize patterns such as character patterns or speech patterns using feature vectors associated with the patterns. More specifically, the present invention relates to a pattern recognition method and system particularly suited to recognizing handwritten characters.
Japanese patent applications 59-202822 and 58-202825 disclose a pattern recognition method that uses a multilayer direction histogram. Both applications are assigned to the assignee of the present application and are incorporated herein by reference. In this method, a plurality of predetermined direction codes are first assigned to the peripheral pixels along the contour of a pattern (for example, a character pattern). The direction-coded pattern is then scanned from each side of the frame surrounding the character pattern toward the opposite side, detecting each direction code that appears after one or more white (background) pixels; each detected direction code is assigned to one of a plurality of predetermined layers according to the order in which it is encountered along the scan line. Then, for each subdivided region of the pattern frame, a histogram of the direction codes is formed for each predetermined layer. The vector whose components are these histogram values (the count for each feature) is used as the feature vector of the pattern.
For example, eight different predetermined direction codes may be defined, and the frame of a pattern may be subdivided into 4 × 4 mesh regions. In this case, if direction codes are assigned to a first and a second layer, the dimension of the feature vector is 256 (= 4 × 4 × 2 × 8).
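The dimension count above can be checked with a few lines of Python. This is only an illustrative sketch; the function name and parameter names are hypothetical and do not appear in the patent.

```python
def feature_dimension(grid_rows, grid_cols, layers, direction_codes):
    """Number of histogram bins: one per (mesh region, layer, direction code)."""
    return grid_rows * grid_cols * layers * direction_codes


# 4 x 4 mesh regions, 2 layers, 8 direction codes
print(feature_dimension(4, 4, 2, 8))  # 256
```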
As for the dictionary (that is, the set of feature vectors of known patterns), the same kind of feature vector is extracted from each of a plurality of patterns, and their mean is registered in the dictionary as the feature vector of the dictionary pattern, or reference pattern.
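Forming a dictionary (reference) pattern as the mean of several sample feature vectors might be sketched as follows. The function name `mean_vector` and the use of plain Python lists are assumptions made for illustration, not part of the patent.

```python
def mean_vector(vectors):
    """Component-wise mean of equal-length feature vectors."""
    if not vectors:
        raise ValueError("at least one feature vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must have the same dimension")
    count = len(vectors)
    # Average each dimension over all sample patterns of one character type.
    return [sum(v[i] for v in vectors) / count for i in range(dim)]
```

For instance, `mean_vector([[1, 2], [3, 6]])` averages two 2-dimensional sample vectors into a single reference vector.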
Note that the method of subdividing the region of a pattern frame is not limited to that disclosed in the above applications. For example, as in the pattern recognition methods disclosed there, a method may be used in which the region of a pattern is subdivided into mesh regions so that the direction codes are evenly distributed, and the subdivided mesh regions are made to overlap partially according to a predetermined parameter, so that the number of subdivisions is minimized. A pattern recognition method using a multilayer direction histogram with this subdivision method has been proposed by the present applicant.
In a pattern recognition system using such a multilayer direction histogram, an unknown input pattern is recognized by calculating the matching distance between the feature vector extracted from the unknown pattern and the feature vectors of the dictionary (reference) patterns. If the number of dimensions of the feature vector is large, the amount of distance computation also becomes very large, and the matching computation takes a long time.
The primary object of the present invention is to avoid the above shortcomings of the prior art and to provide an improved pattern recognition method and system.
Another object of the present invention is to provide an improved method and system for recognizing a pattern (for example, a character pattern) using a multilayer direction histogram.
A further object of the present invention is to provide an improved method and system for recognizing a pattern at high speed and with a high recognition rate.
A further object of the present invention is to provide an improved pattern recognition method and system capable of performing the matching computation efficiently.
A further object of the present invention is to provide an improved pattern recognition method and system suited to machine recognition of handwritten characters, letters and symbols.
A further object of the present invention is to provide an improved pattern recognition method and system capable of reducing the capacity of the dictionary required to store the reference data for known characters.
Other advantages and novel features of the present invention will become apparent from the following detailed description taken in conjunction with the accompanying drawings.
Fig. 1 is a block diagram of a pattern recognition system according to one embodiment of the present invention;
Figs. 2a to 2c are flowcharts showing the sequence of steps of the dictionary formation process when the system of Fig. 1 operates in the registration mode;
Fig. 3 is a flowchart showing the sequence of steps of the pattern recognition process when the system of Fig. 1 operates in the recognition mode;
Fig. 4 is a flowchart showing the sequence of steps of a pattern recognition process according to another embodiment of the present invention;
Fig. 5 is a flowchart showing the sequence of steps of a pattern recognition process according to still another embodiment of the present invention;
Fig. 6 is a diagram illustrating the properties of the feature vectors used in the multilayer direction histogram method;
Fig. 7 is a block diagram of a pattern recognition system according to another embodiment of the present invention;
Fig. 8 is a flowchart showing the sequence of steps of the matching process carried out by the system of Fig. 7;
Fig. 9 is a flowchart showing the sequence of steps of a matching process according to another embodiment of the present invention;
Figs. 10a and 10b are diagrams illustrating the rearrangement of a feature vector;
Fig. 11 is a flowchart showing the sequence of steps of a character recognition method according to one embodiment of the present invention;
Fig. 12 is a functional block diagram of an optical character recognition system according to a further embodiment of the present invention;
Fig. 13 illustrates a plurality of predetermined direction codes suitable for use in the pattern recognition method and system of the present invention;
Fig. 14 shows an example of a character pattern;
Fig. 15 shows the result of assigning the direction codes of Fig. 13 to the peripheral pixels along the contour of the character pattern of Fig. 14;
Fig. 16 illustrates a method of determining the periphery of a character pattern by optically scanning the character pattern of Fig. 14 in the vertical direction;
Figs. 17a to 17c show the peripheries obtained by scanning the character pattern of Fig. 15;
Fig. 18 shows how Figs. 18a(1) and 18a(2) are connected;
Figs. 18a(1), 18a(2) and 18b together form a flowchart showing the sequence of steps of the periphery determination process;
Figs. 19a to 19c are histograms of the direction codes of the significant layered pixels obtained for the mesh regions of the character pattern of Fig. 15;
Fig. 20 shows how Figs. 20a and 20b are connected;
Figs. 20a and 20b, connected as shown in Fig. 20, form a flowchart showing the sequence of steps of a histogram formation process;
Fig. 21 is a flowchart showing the sequence of steps of subdivision method #1;
Figs. 22a and 22b show the results of subdivision using subdivision method #1;
Figs. 23a and 23b illustrate the problem encountered when mesh subdivision is carried out using fixed dividing points;
Fig. 24 is a flowchart showing the sequence of steps of subdivision method #2;
Figs. 25a and 25b compare the results obtained with subdivision method #1 and subdivision method #2;
Fig. 26 is a flowchart showing the sequence of steps of subdivision method #3; and
Figs. 27a and 27b compare the results obtained with subdivision method #2 and subdivision method #3.
According to one aspect of the present invention, when a pattern is read optically, a multilayer direction histogram is formed and feature quantities are extracted. Then, before the feature vector is stored in the dictionary as a reference pattern, or before it is compared with a reference pattern stored in the dictionary, its components are rearranged. In the multilayer direction histogram method, some components of the feature vector are very useful for pattern recognition, while others have little effect on it. To illustrate this in more detail, consider a two-dimensional feature vector. When dictionary (reference) data are formed by a two-dimensional multilayer direction histogram method, the feature vectors of the four kanji that mean "character recognition" in Japanese can be represented by g1, g2, g3 and g4 in Fig. 6. In this example, as the figure shows, the first component A of these feature vectors, in the horizontal direction, has a larger variance or standard deviation than the second component B, in the vertical direction. In other words, when used to identify an unknown pattern, component A of the feature vector has higher discriminating power than component B.
Given this property of the feature vector, if the matching distance is calculated starting from the components with the highest discriminating power, reference patterns that cannot become candidates can be identified once the matching distance has been calculated over only these relatively high-order dimensions, and can be excluded from further consideration. When an unknown input pattern is identified, the number of reference patterns that may become candidates can thus be reduced significantly at an early stage of the recognition operation, so the matching distance need not be calculated over every dimension of the feature vector for all reference patterns. Importantly, even when the order of the components is rearranged, the feature vector obtained by the multilayer direction histogram method still retains the features of the pattern.
This aspect of the invention focuses on the property described above. First, a provisional dictionary-pattern feature vector is obtained for every pattern category by the multilayer direction histogram method. The components of each feature vector are then rearranged in order of the standard deviation or variance of each dimension to define new vectors, and these new vectors are stored as the redefined dictionary feature vectors, that is, the reference patterns in the dictionary. In pattern recognition according to this aspect of the invention, a feature vector is extracted from the unknown input pattern by the multilayer direction histogram method, and its components are rearranged in order of standard deviation or variance to form a redefined feature vector. This redefined feature vector is matched against the dictionary feature vector of each pattern category stored in the dictionary: the matching distance is calculated from the highest dimension down to the Nth dimension and compared with the threshold value of the pattern category concerned. Depending on the result of this comparison, either matching against that pattern category is terminated or further matching is carried out using the lower dimensions.
The threshold value that is compared with the calculated matching distance is determined as follows. For each pattern category, feature vectors are extracted from each pattern by the multilayer direction histogram method, and redefined feature vectors are produced whose components are rearranged in order of the standard deviation or variance of each dimension. Then the standard deviation or variance of the matching distance, calculated from the highest dimension down to the Nth dimension, between the redefined feature vectors and the dictionary (reference) vector of the same pattern category stored in the dictionary is computed. The threshold value of each pattern category is determined from the calculated standard deviation or variance.
In the pattern recognition method and system according to this aspect of the invention, during matching between the unknown input pattern and the dictionary (reference) patterns, matching with a dictionary pattern that cannot become a candidate for the unknown input pattern is terminated at an early stage, which improves matching efficiency and speeds up pattern recognition. Moreover, because the threshold value used to detect termination is set separately for each pattern category, terminating the match with a dictionary pattern does not affect the emergence of candidate patterns. The essential advantages of pattern recognition by the multilayer direction histogram method are therefore retained, and a high recognition rate is obtained.
Referring now to Fig. 1, there is shown a functional block diagram of a pattern recognition system according to one embodiment of the present invention. It is particularly suited to machine recognition of handwritten characters, such as handwritten kanji or Chinese characters, as the patterns to be recognized. As shown, the system includes a reading part 10, which optically reads character patterns from a document and supplies the character pattern information to a preprocessing part 12. In the preprocessing part 12, for example, the character patterns are segmented individually and each segmented character pattern is normalized. The preprocessed character patterns are sent one at a time from the preprocessing part 12 to a feature extraction part 14, where each character pattern is processed by the multilayer direction histogram method to extract a feature vector.
The character recognition system of Fig. 1 has two modes of operation: (1) a registration mode, in which the dictionary, or set of reference patterns, is formed, and (2) a recognition mode, in which an unknown input pattern is recognized by comparison with the known reference patterns stored in the dictionary. The registration mode, in which the dictionary patterns to be stored in the dictionary are formed, is described first with reference to the flowcharts of Figs. 2a to 2c.
For a particular character type, a plurality (M) of character patterns are read optically, one by one, by the reading part 10 (step 50). Each character pattern read is preprocessed by the preprocessing part 12 (step 52) and then fed to the feature extraction part 14, where a feature vector (for example, a 256-dimensional vector) is extracted by the multilayer direction histogram method (step 54). The extracted feature vector passes through the rearrangement part 16 to the dictionary formation part 20, which computes the mean of the feature vectors extracted from the M input character patterns (step 56). This mean vector is stored provisionally in the dictionary 22 as the feature vector of the provisional dictionary pattern for that character type (step 58). The dictionary formation part 20 then checks whether the last character type has been processed (step 60); if a character type remains, the process returns to step 50 and handles the next character type in the same way.
Once all character types have been processed, the rearrangement table formation part 26 calculates the standard deviation or variance of each dimension over the feature vectors of all the provisional dictionary patterns stored in the dictionary 22 (step 62). It then forms a rearrangement table that reorders the dimensions of the original feature vector by decreasing standard deviation or variance, and stores this table in the rearrangement table part 18 (step 64). With the rearrangement table thus defined, the actual dictionary formation process begins.
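Steps 62 and 64 can be sketched in Python as shown below. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the table is represented as a list of original dimension indices, and the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say which estimator is used.

```python
from statistics import pstdev


def build_rearrangement_table(provisional_vectors):
    """Return dimension indices ordered by decreasing standard deviation.

    provisional_vectors holds one provisional dictionary vector per
    character type (step 62 computes the spread of every dimension).
    """
    dim = len(provisional_vectors[0])
    spreads = [pstdev([v[i] for v in provisional_vectors]) for i in range(dim)]
    # Step 64: dimensions with the largest spread come first.
    return sorted(range(dim), key=lambda i: spreads[i], reverse=True)


def rearrange(vector, table):
    """Reorder the components of a feature vector according to the table."""
    return [vector[i] for i in table]
```

The same `rearrange` call would be applied both to the sample vectors used for dictionary formation and, later, to the feature vector of an unknown pattern, so that component positions stay comparable.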
For a particular character type K, a plurality (M) of character patterns are read optically in sequence by the reading part 10 (step 66). The character patterns read are preprocessed by the preprocessing part 12 (step 68) and sent to the feature extraction part 14, where feature vectors are extracted by the multilayer direction histogram method (step 70). The rearrangement part 16 reorders the components of each extracted feature vector according to the rearrangement table stored in the rearrangement table part 18, that is, in decreasing order of the standard deviation or variance of each dimension (step 72). The M rearranged feature vectors are supplied to the dictionary formation part 20, and their mean vector is stored in the dictionary 22 as the final feature vector of the dictionary pattern, or reference pattern, of character type K (step 74). The dictionary formation part 20 checks whether dictionary formation has been completed for all character types; if a character type remains, the same processing is carried out for it starting from step 66. When all character types have been processed, dictionary formation itself is complete, and the threshold value determination steps follow.
In the threshold value determination process, the same plurality (M) of character patterns used to form the dictionary are input for each character type (step 78), preprocessed (step 80), and their feature vectors extracted by the multilayer direction histogram method (step 82). The rearrangement part 16 reorders the components of the M feature vectors according to the rearrangement table (step 84) and sends them to the matching part 24. Threshold value determination is then carried out jointly by the matching part 24 and the threshold value determination part 28. That is, the matching distance between each rearranged feature vector and the feature vector of the dictionary pattern of character type K is calculated from the highest dimension down to the Nth dimension, and the threshold value determination part 28 calculates the standard deviation of these matching distances (step 86). The calculated standard deviation is stored in the threshold value table part 30 as the threshold value for character type K (step 88). The threshold values of the other character types are determined and stored in the threshold value table part 30 in the same way. When the threshold values of all character types have been determined and stored (step 80), operation in the registration mode ends.
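Steps 86 and 88 might look like the following sketch. The text does not specify the distance measure, so a city-block (sum of absolute differences) distance is assumed here; the function names and the use of the population standard deviation are likewise assumptions made for illustration.

```python
from statistics import pstdev


def partial_distance(a, b, n):
    """Assumed city-block distance over the first n (highest-order) dimensions."""
    return sum(abs(x - y) for x, y in zip(a[:n], b[:n]))


def category_threshold(samples, dictionary_vector, n):
    """Threshold for one character type: spread of the partial distances.

    samples           -- rearranged feature vectors of the M training patterns
    dictionary_vector -- rearranged dictionary vector of the same type
    """
    distances = [partial_distance(s, dictionary_vector, n) for s in samples]
    return pstdev(distances)  # step 86; the result is stored at step 88
```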
The operation of the pattern recognition system of Fig. 1 in the recognition mode is described next. Fig. 3 shows the sequence of steps in this mode. As shown in Fig. 3, an unknown character pattern to be recognized is input through the reading part 10 (step 100) and preprocessed by the preprocessing part 12 (step 102), and a feature vector (for example, a 256-dimensional vector) is extracted by the feature extraction part 14 using the multilayer direction histogram method (step 104). The rearrangement part 16 then reorders the components of this feature vector according to the rearrangement table, and the rearranged feature vector Yn is sent to the matching part 24 (step 106). In the matching part 24, the matching distance d between the feature vector FKn of the dictionary pattern of character type K and the feature vector Yn of the unknown pattern is calculated from the highest dimension down to the Nth dimension (step 108). The calculated matching distance d is compared with the threshold value ThK of character type K stored in the threshold value table part 30 (step 110).
If the comparison shows that d is greater than ThK, the unknown pattern cannot be of character type K, so matching with that dictionary pattern is terminated immediately and the process moves to step 116. Conversely, if d is less than or equal to ThK, the unknown character pattern may be of character type K, so further matching is carried out: the complete matching distance D over all dimensions is calculated (step 112). D is compared with the distance of the current candidate, and whichever has the smaller distance is retained as the new candidate (step 114); the process then proceeds to step 116. At step 116, if a character type remains to be matched, the matching process from step 108 onward is carried out for the next character type. When all character types have been matched, the loop ends at step 116 and the character code of the last remaining candidate character type is output as the recognition result. This completes the recognition of one unknown character pattern.
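The loop of steps 108 to 116 can be sketched as follows. The distance measure is not given in the text, so a city-block distance is assumed; the function names and the dictionary-of-lists data layout are hypothetical.

```python
def city_block(a, b, n):
    """City-block distance over the first n components (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a[:n], b[:n]))


def recognize(unknown, dictionary, thresholds, n):
    """Return the best-matching character type, or None if all are rejected.

    unknown    -- rearranged feature vector Yn of the unknown pattern
    dictionary -- {character type: rearranged dictionary vector FKn}
    thresholds -- {character type: threshold value ThK}
    n          -- number of leading dimensions used for the partial distance d
    """
    best_type, best_distance = None, float("inf")
    for char_type, reference in dictionary.items():
        d = city_block(unknown, reference, n)                  # step 108
        if d > thresholds[char_type]:                          # step 110
            continue                                           # abandon this type
        full = city_block(unknown, reference, len(reference))  # step 112
        if full < best_distance:                               # step 114
            best_type, best_distance = char_type, full
    return best_type
```

Only character types that pass the cheap partial test pay for the full-dimension distance, which is where the saving in computation comes from.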
Handwritten characters may be distorted. To handle this, one may prepare various likely variants of each character and store reference patterns formed from them in the dictionary. Even so, the matching rate between handwritten character patterns and the dictionary patterns varies with the user of the pattern recognition system. Moreover, even for the same user, the matching rate with the reference patterns in the dictionary improves as the user uses the character recognition system repeatedly. It is therefore desirable that the threshold value used to decide whether further matching is carried out can be varied. That is, when a user has a good matching rate with the dictionary, or a skilled user uses the system, the decision threshold value is preferably set lower to reduce the number of further matching operations, improving recognition speed without lowering the recognition rate.
Another embodiment of this aspect of the invention, in which such a threshold value can be modified, is described below. Because this embodiment is functionally similar to the embodiment described above, its block diagram is omitted and the block diagram of Fig. 1 is used to describe it. In this embodiment, the registration (dictionary formation) mode is similar to that of the previous embodiment: a rearrangement table and a dictionary are formed. Unlike the previous embodiment, however, a plurality of threshold values are defined for each character type for deciding whether further matching is carried out.
This is explained with reference to the flowcharts of Fig. 2 for the previous embodiment. If the threshold value determined at step 88 is denoted ThK, then in the present embodiment the threshold value determination part 28 takes the values ThK/1, ThK/2, ..., ThK/L (L being a positive integer) as the threshold values ThK(1), ThK(2), ..., ThK(L) of character type K and stores them in the threshold value table part 30. The flowchart of Fig. 4 shows the sequence of steps in the recognition mode. The operation of the present embodiment in the recognition mode is described below with reference to Fig. 4.
In this embodiment, a user level UL (= 1, 2, 3, ..., L), a parameter for selecting the threshold value, is set in the initial step 200. The user level can be specified by the user of the pattern recognition system. Steps 202 to 210 are the same as steps 100 to 108 of Fig. 3, and steps 214 to 220 are the same as steps 112 to 118 of Fig. 3. Step 212, which corresponds to step 110 of Fig. 3, decides from the matching distance calculated from the highest dimension down to the Nth dimension whether further matching is carried out. In this embodiment, the threshold value ThK(UL) corresponding to the user level UL is selected as the decision threshold value from the threshold values ThK(1) to ThK(L) of character type K.
In other words, when the user is familiar with the pattern recognition system of this embodiment and the user level is set high (at most L), a low threshold value is selected for the decision (ThK(L) being the minimum). As a result, further matching (step 214) is carried out less often, so that for a user with a high matching rate the character recognition speed improves. Conversely, when the user is unfamiliar with the system or has a low matching rate with the dictionary, a low user level (at least 1) is selected, so a larger threshold value is selected for the decision (ThK(1) being the maximum). Further processing is then carried out more often, so the recognition speed falls but the recognition rate rises.
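The per-level threshold values ThK/1, ThK/2, ..., ThK/L and their selection by user level might be sketched as follows. The function names are hypothetical, and the 1-based user level follows the range UL = 1, 2, ..., L given in the text.

```python
def threshold_levels(base_threshold, levels):
    """Return [ThK/1, ThK/2, ..., ThK/L] for one character type."""
    return [base_threshold / level for level in range(1, levels + 1)]


def select_threshold(level_thresholds, user_level):
    """Pick ThK(UL); user_level runs from 1 (novice) to L (experienced)."""
    if not 1 <= user_level <= len(level_thresholds):
        raise ValueError("user level out of range")
    return level_thresholds[user_level - 1]
```

With a base threshold value of 12.0 and L = 3, the levels are 12.0, 6.0 and 4.0, so a user at level 3 is matched against the tightest threshold value and fewer candidates reach the full-dimension comparison.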
As described above, the recognition rate differs not only from user to user but also from character type to character type. This is more pronounced when the patterns used to form the dictionary have a particular bias. Checking for this during dictionary formation is difficult for Japanese kanji and Chinese characters, which have many character types. A further embodiment of this aspect of the invention avoids the problem of the recognition rate fluctuating with the character type. This further embodiment has the same functional structure as the first embodiment. In addition, the dictionary formation processing is the same as in the second embodiment, so a plurality of threshold values ThK(1) to ThK(L) are determined for each character type.
Fig. 5 is a flowchart showing the sequence of steps in the recognition mode of the third embodiment. Steps 300 to 308 of Fig. 5 are the same as steps 202 to 210 of Fig. 4, and steps 314 to 320 of Fig. 5 are the same as steps 214 to 220 of Fig. 4. At step 310, a level LV set for each character type is used as the parameter V for selecting the threshold value. For example, for a particular character type K, the level LV(K) is given by the rejection/recognition ratio PK accumulated while the pattern recognition system is operating, with the value of PK divided by a constant. Step 312 corresponds to step 212 of Fig. 4: the single threshold value ThK(V) corresponding to the parameter V is selected from the threshold values ThK(1) to ThK(L) of character type K and compared with the distance d to decide whether further processing (step 314) is carried out.
Note that a fourth embodiment of this aspect of the invention can be obtained by combining the second and third embodiments; it is described with reference to the flowchart of Fig. 5. In the fourth embodiment, a step for setting the user level (corresponding to step 200 in Fig. 4) is added before step 300. In addition, in step 310, the sum of the level LV(K) of character kind K and the user level UL is used as parameter V. In step 312, the threshold ThK(V) corresponding to parameter V is used for the check. The remainder is the same as in the embodiments described above. The fourth embodiment avoids fluctuations caused by the user and/or the character kind and uses the most suitable threshold to determine whether further matching is performed, so it combines the advantages of the second and third embodiments.
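The threshold selection of the third and fourth embodiments can be sketched as follows. This is only an illustration: the function names, the rounding of LV(K) down to an integer, the clamping of V to the range 1..L, and all numeric values are assumptions of the sketch and do not come from the patent text.

```python
def character_level(pk, constant):
    """Level LV(K): the accumulated rejection/recognition ratio PK divided by a constant.

    Rounding down to an integer is an assumption of this sketch.
    """
    return int(pk / constant)


def select_threshold(thresholds_k, lv_k, user_level=0):
    """Pick ThK(V) from ThK(1)..ThK(L), with V = LV(K) + UL (fourth embodiment).

    With user_level=0 this reduces to the third embodiment (V = LV(K)).
    Clamping V into 1..L is an assumption of this sketch.
    """
    count = len(thresholds_k)
    v = max(1, min(count, lv_k + user_level))
    return thresholds_k[v - 1]  # thresholds are numbered from 1 in the text


# Hypothetical values, for illustration only.
thresholds_for_k = [40.0, 35.0, 30.0, 25.0, 20.0]  # ThK(1)..ThK(5)
lv = character_level(pk=0.25, constant=0.1)
th = select_threshold(thresholds_for_k, lv, user_level=1)
```

The selected threshold is then compared with the distance d, as in step 312.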
Another aspect of the invention is now described. Matching is a comparison between an unknown pattern to be recognized and the library patterns: the distance or similarity between the components of corresponding dimensions of the unknown pattern's feature vector and each library pattern is calculated, and the library pattern with the minimum total distance or the maximum total similarity is taken as the recognition result. According to the second aspect of the invention, when a library is formed, a standard deviation or variance is calculated for each dimension of the feature vectors of the library patterns, the components of each library pattern's feature vector are rearranged in order from the larger standard deviation or variance to the smaller, and the rearranged feature vectors are stored in the library.
Suppose that the feature vector of a library pattern is formed by the multilayer direction histogram method and has the components shown in Figure 10a, that a standard deviation or variance has been calculated for each dimension, and that, over the complete set of library patterns, the dimensions rank from high to low in the order X4, X1, X7, X5, X2, X8, X6, and so on. The components of dimensions X1, X2, X3, and so on of each feature vector are then rearranged as shown in Figure 10b and stored in the library. After the rearrangement, the X4-dimension component of the original feature vector becomes the highest-ranked component, dimension Y1, of the new feature vector, and the X1-dimension component of the original feature vector becomes the second-ranked component, dimension Y2. The components of the feature vector extracted from the unknown pattern to be recognized are rearranged in the same order as the components of the library patterns' feature vectors. Then, in matching the unknown pattern against a library pattern, the distance or similarity to the components of the corresponding dimensions of the library pattern's feature vector is calculated preferentially for the higher-ranked dimensions. According to the second aspect of the invention, matching between the library patterns and the unknown pattern in this way makes preferential use of the feature vector components with high discriminating power, and the unknown pattern is thereby recognized efficiently.
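The rearrangement described above can be sketched in a few lines. The function names and the sample data are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption; the text only requires some standard deviation or variance per dimension.

```python
from statistics import pstdev


def build_rearrangement_table(library):
    """Return dimension indices ordered from largest to smallest standard deviation."""
    n_dims = len(library[0])
    spreads = [pstdev([vec[n] for vec in library]) for n in range(n_dims)]
    # sorted() is stable, so dimensions with equal spread keep their original order.
    return sorted(range(n_dims), key=lambda n: spreads[n], reverse=True)


def rearrange(vector, table):
    """Reorder one feature vector according to the rearrangement table."""
    return [vector[n] for n in table]


# Three hypothetical library feature vectors with three dimensions each.
library = [
    [1.0, 5.0, 2.0],
    [1.2, 9.0, 4.0],
    [0.9, 1.0, 3.0],
]
table = build_rearrangement_table(library)
stored = [rearrange(vec, table) for vec in library]
unknown = rearrange([7, 8, 9], table)
```

The same table is applied to the library vectors when they are stored and to the unknown pattern's vector before matching, so corresponding dimensions stay aligned.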
Fig. 7 is a block diagram showing in outline the structure of a character pattern recognition system according to one embodiment of the invention. The system shown in Fig. 7 includes a library 410, which stores a set of feature vectors of reference character patterns formed by the multilayer direction histogram method; as described above, the components of each feature vector are rearranged in order from the dimension with the larger standard deviation to the dimension with the smaller standard deviation. Here, the standard deviation σn of the n-th-dimension component gkn of library pattern k, taken over the plurality of reference patterns (K patterns) stored in the library, can be calculated by the following formula.
where gn is the mean of the n-th dimension.
Of course, the variance may be used in place of the standard deviation; the variance is the square of the standard deviation given by the above formula, and it can equally be used to rearrange the components of a feature vector in order from the dimension with the larger variance to the dimension with the smaller variance.
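The standard-deviation formula referred to above does not appear in this text. One conventional form is given here as an assumption rather than as the patent's own formula: the population standard deviation over the K reference patterns. The 1/K normalization and the symbol \bar{g}_n for the mean are choices made for this sketch.

```latex
\bar{g}_n = \frac{1}{K} \sum_{k=1}^{K} g_{kn}, \qquad
\sigma_n = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left( g_{kn} - \bar{g}_n \right)^2}
```

Under this form the variance of the n-th dimension is \sigma_n^2, and ordering the dimensions by variance gives the same ranking as ordering them by standard deviation.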
The system of Fig. 7 also includes a rearrangement table 412, which is created when the library is formed. The rearrangement table 412 stores the correspondence between the feature vector components of a library pattern before and after the rearrangement described above. The system also includes a pattern reading unit 414, which optically reads character patterns on an original and supplies the character pattern information to a preprocessing unit 416. In the preprocessing unit 416, the character patterns supplied from the reading unit 414 are separated into individual character patterns and then normalized, and the preprocessed character patterns are sent one character at a time to a feature extraction unit 418. In the feature extraction unit 418, the multilayer direction histogram method is applied to an input character pattern (hereinafter also called the unknown pattern), and a feature vector is extracted from the unknown pattern. The extracted feature vector is then sent to a rearrangement unit 420.
In the rearrangement unit 420, the components of the feature vector extracted from the unknown pattern are rearranged, with reference to the rearrangement table 412, into the same dimension order as the components of the library patterns' feature vectors, and the rearranged feature vector is input to a matching unit 422. In the matching unit 422, the distance between the feature vector of a library pattern and the rearranged feature vector of the unknown pattern is calculated preferentially for the higher-ranked dimensions; the unknown pattern is matched against each library pattern in this way and thereby identified. The system also has an output unit 424, which outputs the recognition result determined in the matching unit 422.
The processing in the matching unit 422 is described with reference to the flowchart of Fig. 8. When the rearranged feature vector of the unknown pattern is input to the matching unit 422, the library pattern number K is set (step 500) and matching between the library pattern and the unknown pattern is carried out. First, the sum dk of the distances between the components fkn (n: dimension) of the library pattern's feature vector and the components Yn of the unknown pattern's feature vector is calculated from the highest-ranked dimension down to the N-th dimension (step 502). This distance sum dk is then compared with a threshold Th (step 504). If the comparison shows that dk is greater than Th, the matching is aborted at that point because the current library pattern has no chance of becoming a candidate pattern; the library pattern number K is incremented (step 506) and processing returns to step 502.
If dk is found to be equal to or less than Th in step 504, a further matching operation is carried out, because the current library pattern may become a candidate pattern for recognition. The further matching operation calculates the distance Dk between the library pattern and the unknown pattern over the dimensions from the one immediately following the N-th dimension down to the lowest-ranked dimension (here, the 256th dimension) (step 508). The distance thus obtained is then compared with the distances between the unknown pattern and each of the existing candidate patterns, and a predetermined number of candidate patterns are kept in order of increasing distance (step 510). Processing then returns to step 502 through step 506 until matching against the last library pattern is finished; when K is found to be equal to or greater than MAX in step 507, the code of the candidate pattern (character) is output (step 512), which completes the matching operation for the unknown pattern.
In the method of this embodiment, the distance is first calculated over the higher-ranked dimensions, which have high discriminating power, and this distance is compared with the threshold to exclude library patterns that cannot become candidates for the unknown pattern; the distance over all dimensions is therefore calculated only for those library patterns that may become candidates. In other words, the library patterns are screened by matching on the dimensions from the highest-ranked down to the N-th, so that library patterns with no chance of becoming candidates are excluded and unnecessary distance calculations are eliminated. With this structure, the total number of distance calculations is effectively reduced, and both recognition speed and recognition rate are markedly improved.
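The screening of Fig. 8 can be sketched as follows. The function name, the city-block distance, the sample data, and the choice to report the total distance over all dimensions are assumptions of this sketch; the text does not fix the distance measure.

```python
def match(unknown, library, n_head, threshold, max_candidates=3):
    """Return up to max_candidates (distance, code) pairs, nearest first.

    unknown: rearranged feature vector of the unknown pattern.
    library: list of (code, vector) pairs, rearranged in the same way.
    """
    candidates = []
    for code, ref in library:
        # Step 502: partial distance over the n_head highest-ranked dimensions.
        dk = sum(abs(f - y) for f, y in zip(ref[:n_head], unknown[:n_head]))
        if dk > threshold:      # step 504: this pattern cannot become a candidate
            continue            # step 506: go on to the next library pattern
        # Step 508: extend the distance over the remaining dimensions.
        total = dk + sum(abs(f - y) for f, y in zip(ref[n_head:], unknown[n_head:]))
        # Step 510: keep a fixed number of candidates in order of increasing distance.
        candidates.append((total, code))
        candidates.sort()
        del candidates[max_candidates:]
    return candidates


# Hypothetical four-dimensional vectors.
library = [("A", [1, 1, 1, 1]), ("B", [9, 9, 1, 1]), ("C", [1, 2, 3, 1])]
result = match([1, 1, 2, 1], library, n_head=2, threshold=3)
```

Pattern "B" is rejected after two dimensions, so its remaining dimensions are never visited.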
A second embodiment of the second aspect of the invention is now described. The overall functional structure of this embodiment is the same as that of the first embodiment described above; only part of the matching process differs. Therefore, only the operation of the matching unit 422 is described, with reference to the flowchart of Fig. 9. When the rearranged feature vector of the unknown pattern is sent to the matching unit 422, the library pattern number K is set to "1" (step 600) and the matching operation between the library pattern and the unknown pattern is carried out. First, the distance sum dk1 between the components fkn of the library pattern's feature vector and the components Yn of the unknown pattern's feature vector is calculated from the highest-ranked dimension down to the N1-th dimension (step 602). This distance sum dk1 is then compared with a threshold Th1 (step 604). If the comparison shows that dk1 is greater than Th1, the matching operation is aborted immediately because the current library pattern cannot become a candidate pattern (character); the library pattern number K is incremented in step 606 and processing returns to step 602. That is, the library patterns are screened on the basis of the calculation from the highest-ranked dimension down to the N1-th dimension, and no further matching is performed for library patterns excluded by this screening.
If dk1 is found to be equal to or less than Th1 in step 604, the distance sum dk2 is calculated down to the N2-th dimension (where N2 is larger than N1) (step 608). dk2 is compared with a threshold Th2 (step 610); if dk2 is greater than Th2, the current library pattern is eliminated from the candidates and processing returns to step 602 through step 606. That is, the intermediate screening of the library patterns is based on the distance calculated down to the N2-th dimension, and no further matching is performed for library patterns excluded from the candidates. For library patterns that remain, a further distance sum dk3 is calculated from the highest-ranked dimension down to the N3-th dimension (where N3 is larger than N2) (step 612), and the calculated distance sum dk3 is compared with a threshold Th3 (step 614). If the comparison shows that dk3 is greater than Th3, the current library pattern is excluded because it cannot become a candidate pattern, and processing returns to step 602 through step 606.
The library patterns that remain after these screening operations may become candidate patterns, so for each of them the sum Dk of the distances over all dimensions is calculated (step 616), and processing proceeds to step 618, which is the same as step 510 in Fig. 8. After the matching operation for the last library pattern is finished, the end condition is detected in step 607 and the code of the candidate pattern (character) is output, which completes the matching operation for the unknown pattern.
According to this embodiment, the number of dimensions used for the distance calculation is increased step by step, the candidates for the unknown pattern are narrowed down in three stages, and library patterns that cannot become candidates are excluded at each stage. The complete matching operation is carried out only for the library patterns that remain after the three narrowing stages. The amount of distance calculation is therefore greatly reduced, so recognition speed improves markedly, and the recognition rate also improves. Note that the matching operation between the unknown pattern and the library patterns uses distance here, but similarity may be used instead.
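The staged narrowing of Fig. 9 generalizes the single-threshold screening to a list of (dimension count, threshold) pairs. The sketch below is an illustration under stated assumptions: the function name, the city-block distance, and the sample data are hypothetical, and the running sum is extended from stage to stage instead of being recomputed from the first dimension, which gives the same dk1, dk2, and dk3.

```python
def cascade_match(unknown, library, stages, max_candidates=3):
    """stages: (n_dims, threshold) pairs with increasing n_dims, e.g. (N1, Th1), (N2, Th2), (N3, Th3)."""
    candidates = []
    for code, ref in library:
        dist, done, rejected = 0, 0, False
        for n_dims, threshold in stages:
            # Extend the running distance sum from dimension `done` up to n_dims.
            dist += sum(abs(f - y) for f, y in zip(ref[done:n_dims], unknown[done:n_dims]))
            done = n_dims
            if dist > threshold:    # steps 604, 610, 614: excluded at this stage
                rejected = True
                break
        if rejected:
            continue
        # Step 616: distance over all dimensions for the surviving patterns.
        dist += sum(abs(f - y) for f, y in zip(ref[done:], unknown[done:]))
        candidates.append((dist, code))
    candidates.sort()               # step 618: order by increasing distance
    return candidates[:max_candidates]


# Hypothetical six-dimensional vectors and stage settings.
library = [
    ("A", [1, 0, 0, 0, 0, 0]),
    ("B", [0, 0, 5, 0, 0, 0]),
    ("C", [0, 1, 0, 1, 0, 9]),
    ("D", [4, 0, 0, 0, 0, 0]),
]
result = cascade_match([0, 0, 0, 0, 0, 0], library, stages=[(1, 2), (3, 3), (5, 4)])
```

Here "D" is dropped at the first stage and "B" at the second, so only "A" and "C" receive the full distance calculation.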
Another aspect of the invention is now described. In character recognition generally, the recognition rate falls as the number of character kinds to be handled increases. To address this, the prior art proposes increasing the number of dimensions of the feature vector and using complex algorithms, but this has the drawbacks of larger storage requirements and lower recognition speed. In addition, the recognition rate obtained for handwritten characters is still not satisfactory.
One type of character recognition method includes a step of assigning predetermined features to particular portions of a character image according to a predetermined algorithm, and a step of collecting the features for each mesh region defined by subdividing the character image. Subdivision methods for defining the mesh regions are disclosed in "OKI ELECTRICS RESEARCH AND DEVELOPMENT", 121, Vol. 50, No. 3, pp. 77-82, December 1983. That reference describes (1) a method with fixed subdivision points and (2) a method with variable subdivision points that uses the center of gravity of the peripheral distribution of each character image. For recognizing characters with large deformation, such as handwritten characters, method (2) has the advantage of stable features. However, because it uses the center of gravity of the peripheral distribution, the amount of calculation needed to determine the subdivision mesh points is large, which lowers processing speed. Method (1), on the other hand, needs little calculation, but for deformed characters such as handwritten characters the subdivision becomes inappropriate, so the method cannot extract the features of the character properly. This problem exists not only for complex characters with large deformation, such as handwritten kanji, but also for handwritten kana. For example, the key to distinguishing the kana character "Wu" from the kana character "Wa" is whether there is an extra vertical stroke at the top, and it is important to collect the information about this extra stroke. If the fixed-point subdivision method is used in this case, the extra stroke sometimes falls within a particular mesh region and sometimes, because of slight deformation, does not; as a result the collected information becomes unstable, which can lead to recognition errors.
This aspect of the invention is directed in particular at solving the problems above, and provides a character recognition method and system capable of recognizing any character at high speed and with a high recognition rate. To this end, the invention provides a pattern recognition method comprising the following steps:
(a): assigning direction codes to the black contour pixels of a character pattern;
(b): scanning the character pattern from each side in the direction perpendicular to that side, thereby detecting the N-th peripheral pixels, each being the N-th change point from white to black;
(c): forming, for each subdivided region of the character pattern, a histogram of the direction codes of the N-th peripheral pixels;
(d): carrying out a matching operation against a library, using the histograms of the direction codes of the N-th peripheral pixels as features.
According to this aspect of the invention, there is also provided a pattern recognition method comprising the following steps:
(a): scanning a character image from each side of the frame surrounding the character image in the direction perpendicular to that side, and classifying the significant pixels at which the feature information changes into one of a plurality of layers according to the order in which they appear along the scan line;
(b): counting the significant pixels while raster-scanning the character image with a subdivision direction as the sub-scanning direction, determining subdivision points from the count, and subdividing the character image into mesh regions using the subdivision points thus determined;
(c): forming, for each mesh region, a histogram of the significant pixels classified into the plurality of layers;
(d): carrying out a matching operation against a library using the histograms.
Here, the term "feature" denotes a parameter that is closely related to the character pattern itself and that is assigned to each pixel or each region (a collection of adjacent pixels), for example the gray level of the character pattern (such as white and black in the binary case). The term "feature information change point" denotes a point at which a gray level (feature) changes, for example on the boundary between the background portion of the character pattern (the white portion in the binary case) and the character portion (the black portion in the binary case), or a point at which the direction code (feature) assigned to the character pattern changes.
Figure 12 is a block diagram of an optical character recognition system constructed according to one embodiment of the third aspect of the invention. As shown, a scanner 721 optically reads character images on an original to produce an image signal. The image signal is sent to a character extraction unit 722, which separates individual character image signals from the input image signal. Each separated character image is sent to a noise removal unit 723, which removes noise components from the character image. The character image is then normalized by a normalization unit 724 and stored in a character memory 725. The system of Figure 12 also has a feature extraction unit 726, which processes the input character image to extract features from it, and a working memory 727, which is accessed by the feature extraction unit 726. A matching unit 728 carries out a matching operation between the features of the input character image extracted by the feature extraction unit 726 and the features of each reference character stored in a library memory 729. An output unit 730 outputs the result of the matching operation carried out by the matching unit 728.
The processing in the feature extraction unit 726 is now described with particular reference to the flowchart of Figure 11. The feature extraction unit 726 reads a character image from the character memory 725 and assigns to it a plurality of predetermined direction codes, as shown in Figure 13. For the example character image of a kanji (Chinese character) shown in Figure 14, direction codes are assigned to the black pixels lying along the contour of the character image, as shown in Figure 15. Then, as indicated by the arrows in Figure 16, the feature extraction unit 726 raster-scans the character image four times, each scan starting from one of the four sides of the frame bounding the character and proceeding toward the opposite side in the direction perpendicular to the starting side. In the illustrated embodiment, the first raster scan is carried out with the main scanning direction from top to bottom, the second from left to right, the third from bottom to top, and the fourth from right to left.
Along each scan line of a raster scan, the N-th peripheral pixel is detected as the N-th change point from white to black, where N is a positive integer. That is, along a scan line of the first raster scan, the first black pixel encountered defines the first peripheral pixel; the next black pixel that appears after one or more white pixels following the first black pixel defines the second peripheral pixel; and the next black pixel that appears after one or more white pixels following the second black pixel defines the third peripheral pixel. In other words, in this embodiment the N-th peripheral pixels, that is, the contour pixels at which the image changes from white to black along the scan line, are used as the significant pixels at which the feature information changes. As described below, the feature used for a significant pixel is the direction code assigned to it. On detecting the N-th peripheral pixels in this way, the feature extraction unit 726 writes "1" at the corresponding pixel position of the N-th periphery table provided in the working memory 727. Figures 17a to 17c show the result of applying this processing to the character image of Figure 14; they show the first, second, and third periphery tables, respectively.
Figures 18a(1), 18a(2), and 18b show flowcharts of this periphery determination processing. Each step is described below. First, the raster scan number JJ is reset (step 801). JJ is incremented by "1" (step 802) and checked (step 803). If JJ is 1 or 2, that is, for the first raster scan from top to bottom or the second raster scan from left to right, processing goes to step 804, where ISTART, IEND, JSTART, and JEND are loaded into registers IS, IE, JS, and JE, respectively, and register STEP is set to "1" (the upper right of Figure 18a(1) shows the I and J addresses of the corners of the rectangular frame image). If JJ is 3 or 4, that is, for the third raster scan from bottom to top or the fourth raster scan from right to left, processing goes to step 805, where ISTART, IEND, JEND, and JSTART are loaded into IS, IE, JS, and JE, respectively, and STEP is set to "-1" so that J runs in the reverse direction. If JJ is greater than 4, the fourth raster scan has finished and the processing ends.
After step 804 or step 805, the value obtained by subtracting "1" from IS is loaded into address counter I (step 806), and the raster scan starts. I is incremented by "1" (step 807), and a check is made as to whether I has reached IE+1 (step 808). If so, processing returns to step 802. If not, the flag IFLG and the counter ICOUNT are reset (step 809). The value obtained by subtracting the value of STEP from the value of JS is then loaded into address counter J (step 810). The value of STEP is then added to J (step 811), and a check is made as to whether the value of J equals JE+STEP (step 812); if so, processing returns to step 807, and if not, it goes to step 813.
In step 813, if JJ is judged to be 1 or 3, the value of I is loaded into address register I1 and the value of J into address register J1 (step 814). If JJ is judged to be 2 or 4, the value of J is loaded into I1 and the value of I into J1 (step 815). The pixel data IDATA(I1, J1) at address (I1, J1) of the character image is then read from the character memory 725 and checked to see whether it equals "0" (white) (step 816). If the data equals "0" (white), the flag IFLG is reset (step 817) and processing returns to step 811. If the data equals "1" (black), a check is made as to whether IFLG equals "1" (step 818). If IFLG equals "1", the pixel is a black pixel immediately following another black pixel rather than a peripheral pixel, so nothing is written to the periphery table and processing returns to step 811. If IFLG equals "0", the pixel is a peripheral pixel; in step 819, IFLG is set and ICOUNT (the counter indicating the periphery order N mentioned above) is incremented by "1". Then "1" is written at address (I1, J1) of the periphery table whose order is indicated by the value of ICOUNT (step 820). Processing then returns to step 811.
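The periphery determination described above amounts to counting white-to-black transitions along each scan line from each of the four sides. The sketch below is a compact illustration, not the flowchart of Figures 18a and 18b itself: the function name, the cap at `max_order`, and the interleaved order of the four scans are assumptions, and the image is taken to be a list of rows of 0 (white) and 1 (black).

```python
def periphery_tables(image, max_order):
    """Return tables[order - 1][row][col], with 1 at each peripheral pixel of that order."""
    rows, cols = len(image), len(image[0])
    tables = [[[0] * cols for _ in range(rows)] for _ in range(max_order)]

    def scan(line):
        """line: pixel addresses (row, col) in scan order along one scan line."""
        prev_black = False   # plays the role of IFLG
        order = 0            # plays the role of ICOUNT
        for r, c in line:
            if image[r][c] == 0:
                prev_black = False
            elif not prev_black:          # white-to-black change point
                prev_black = True
                order += 1
                if order <= max_order:
                    tables[order - 1][r][c] = 1

    for c in range(cols):
        scan([(r, c) for r in range(rows)])             # top to bottom
        scan([(r, c) for r in reversed(range(rows))])   # bottom to top
    for r in range(rows):
        scan([(r, c) for c in range(cols)])             # left to right
        scan([(r, c) for c in reversed(range(cols))])   # right to left
    return tables


# A hypothetical 5 x 5 ring-shaped pattern.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
tables = periphery_tables(image, max_order=2)
```

For this ring every black pixel is a first-order peripheral pixel from some side, and the four pixels facing the hole are also reached as second-order peripheral pixels.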
As shown in Figure 11, once the periphery tables above are complete, the feature extraction unit 726 subdivides the character image into mesh regions and forms, for each mesh region and for each periphery order, a histogram of the direction codes of the peripheral pixels (the peripheral pixels being the significant pixels classified into a plurality of layers according to periphery order). Mesh subdivision is described in detail below; here it is assumed that the character image is subdivided as shown in Figures 17a to 17c into 3 × 3 mesh regions of 8 × 8 pixels each, and that the mesh regions do not overlap one another.
The feature extraction unit 726 raster-scans each periphery table formed in the working memory 727 and reads the direction code of the pixel at each address where "1" is written (the character image with direction codes assigned is stored in the working memory 727); in this way a histogram as a function of direction code is formed for each periphery order. For the character pattern shown in Figure 14, Figure 19a shows the histogram formed for the first-order periphery in the upper-left mesh region (1, 1). Figures 19b and 19c show the corresponding histograms for the second-order and third-order peripheries, respectively.
The method of forming these histograms is now described with particular reference to the flowcharts of Figures 20a and 20b. First, the order N of the periphery for which a histogram is to be formed is loaded into register ICOUNT (step 901). The value obtained by subtracting "1" from address ISTART (see Figure 18a) is then loaded into address register I (step 902). I is incremented by "1" (step 903), and a check is made in step 904 as to whether the value of I is greater than address IEND (see the upper right of Figure 18a). If so, the histogram for the periphery of the current order is complete. If not, the value obtained by subtracting "1" from address JSTART (see the upper right of Figure 18a) is loaded into address register J (step 905). J is then incremented by "1" (step 906), and a check is made as to whether the value of J is greater than address JEND (see the upper right of Figure 18a). If so, the scan of one line is finished and processing returns to step 903. If not, the data at address (I, J) of the periphery table whose order is indicated by the value of ICOUNT is read and loaded into register IP, and the data at address (I, J) of the character image with direction codes assigned is read and loaded into register IV (step 908). A check is then made as to whether IP equals "1" (step 909); if not, processing returns to step 906. If so, a check is made as to whether the value of IV lies between "1" and "8", that is, whether it is a direction code (step 910). If not, processing returns to step 906; if so, the mesh region (IMES, JMES) in which the pixel at address (I, J) lies is determined (step 911). The counter IHIST(IMES, JMES, IV), which corresponds to direction code IV in mesh region (IMES, JMES), is then incremented by "1", and processing returns to step 906.
A histogram is obtained in counter IHIST in this way. Once the histogram for the periphery of a particular order has been formed, ICOUNT is updated and the same histogram formation is repeated for the periphery of the next order. The histograms obtained in this way are used as the features of the input character image.
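The histogram formation for one periphery order can be sketched as follows. The function name and the sample data are hypothetical, and the mesh lookup assumes equal-size, non-overlapping mesh regions as in the 3 × 3 example above; with the variable subdivision described later, the lookup in step 911 would use the detected subdivision points instead.

```python
def direction_histograms(table, codes, mesh_rows, mesh_cols, n_codes=8):
    """Return hist[im][jm][code - 1]: counts of direction codes per mesh region.

    table: periphery table of one order (1 at peripheral pixels).
    codes: character image with direction codes 1..n_codes assigned, 0 elsewhere.
    """
    rows, cols = len(table), len(table[0])
    hist = [[[0] * n_codes for _ in range(mesh_cols)] for _ in range(mesh_rows)]
    for i in range(rows):
        for j in range(cols):
            if table[i][j] != 1:          # step 909: not a peripheral pixel of this order
                continue
            code = codes[i][j]
            if not 1 <= code <= n_codes:  # step 910: not a direction code
                continue
            im = i * mesh_rows // rows    # step 911: mesh region containing (i, j)
            jm = j * mesh_cols // cols
            hist[im][jm][code - 1] += 1   # the IHIST(IMES, JMES, IV) counter
    return hist


# Hypothetical 4 x 4 periphery table and direction-code image, 2 x 2 mesh.
table = [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
codes = [[3, 0, 0, 5], [0, 0, 0, 0], [0, 0, 0, 0], [2, 0, 0, 9]]
hist = direction_histograms(table, codes, mesh_rows=2, mesh_cols=2)
```

Calling the function once per periphery table gives one set of histograms per order; together they form the feature of the input character image.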
When a library is being formed, the feature extraction unit 726 produces histograms for a plurality of character patterns, and these histograms are written into the library memory 729, for example after being averaged. When character recognition is being carried out, the histograms obtained in this way are supplied to the matching unit 728, where the specified matching operation against the library is carried out. In the matching unit 728, the library stored in the library memory 729 is compared with the histograms formed by the feature extraction unit 726 (for example by Euclidean distance, principal components, and so on), and the character with the minimum distance is determined. Several matching methods are available; they are described below.
Matching method 1:
The character recognition rate varies with the periphery orders used for recognition, depending on the complexity of the character. Therefore, the maximum periphery order to be used for matching is determined in advance for each character, and the matching operation is carried out using the maximum order selected for each character.
Matching method 2:
A distance is calculated for each of a plurality of periphery orders, and the class with the minimum total distance is chosen as the recognition result. For example, when the peripheries of the first to fourth orders are used in combination, and d1, d2, d3, and d4 denote the distances between the direction code histogram of each periphery order and the library, d = d1 + d2 + d3 + d4 is obtained for each class, and the class with the smallest d is chosen as the recognition result.
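Matching method 2 can be sketched as a sum of per-order distances. The function names and the sample histograms are hypothetical; the Euclidean distance is used because the text names it as one option, and each periphery order's histograms are assumed to be flattened into a single list.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two flattened histograms of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_by_total_distance(unknown_layers, library):
    """unknown_layers: one flattened histogram per periphery order.

    library: dict mapping class label -> list of reference histograms, same layout.
    Returns the label with the smallest d = d1 + d2 + ... and all totals.
    """
    totals = {
        label: sum(euclidean(u, r) for u, r in zip(unknown_layers, ref_layers))
        for label, ref_layers in library.items()
    }
    return min(totals, key=totals.get), totals


# Two hypothetical classes with two periphery orders each.
library = {"A": [[0, 0], [0, 0]], "B": [[3, 4], [6, 8]]}
label, totals = classify_by_total_distance([[3, 4], [0, 0]], library)
```

Class "B" matches the first-order histogram exactly, but its second-order distance outweighs that, so "A" has the smaller total.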
Matching method 3:
The matching operation is carried out as in matching method 2 above, except that the histograms of all peripheries of order n and above are treated together as the direction code histogram of a single periphery. For example, when the peripheries of the first to fourth orders are used as in matching method 2, the histograms of the fifth, sixth, seventh, and higher-order peripheries are included in the histogram of the fourth-order periphery when character recognition is carried out.
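The merging step of matching method 3 can be sketched as an element-wise sum. The function name and the sample data are hypothetical, and the sketch assumes that at least n periphery orders are available and that all histograms have the same length.

```python
def merge_higher_orders(layer_histograms, n):
    """Fold the histograms of orders n, n+1, ... into a single histogram of order n."""
    kept = [list(h) for h in layer_histograms[: n - 1]]
    # Element-wise sum of the histograms of order n and above.
    merged = [sum(values) for values in zip(*layer_histograms[n - 1:])]
    return kept + [merged]


# Five hypothetical periphery orders, two histogram bins each.
layers = [[1, 0], [2, 1], [0, 3], [1, 1], [0, 2]]
merged_layers = merge_higher_orders(layers, n=4)
```

The merged list can then be passed to the same summed-distance matching used in matching method 2.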
With these histogram-based matching processes, handwritten kanji characters can be recognized at a higher recognition rate using a library of relatively small capacity.
Several mesh subdivision methods are described below. In this embodiment, the subdivision into mesh regions can be carried out by any of the following methods.
Divided method 1:
For a character image to which direction codes have been assigned, subdivision points along the x direction are determined by the processing flowchart shown in Figure 21, and the character image is subdivided into N regions along the x direction. Subdivision points along the y direction are determined by the same processing, and the character image is likewise subdivided into N regions along the y direction. As a result, the character image is subdivided into N × N mesh regions.
The subdivision along the x direction is explained first. The total number PE of meaningful pixels is determined by scanning the character image (step 1021). (In the mesh-region subdivision processing of this embodiment, all black contour pixels to which direction codes have been assigned are treated as meaningful pixels.) In preparation for the scan that detects the subdivision points, the subdivision number counter n is initialized and the counter x, which indicates the address along the x direction, is reset (step 1022). From this point until n reaches the subdivision number N ("3" in this embodiment), the character image is raster-scanned with the y direction as the main scanning direction each time the x address is incremented by 1, and the subdivision points are thereby detected. In other words, x is incremented by 1 (step 1024), the line indicated by x is scanned along the y direction, and the number Px of meaningful pixels from the first line to the end of the current line is counted (step 1025). It is then checked whether Px is equal to or greater than (PE/N) × n (step 1026); if not, processing returns to step 1024 and the next line is scanned. If the result of step 1026 is affirmative, processing proceeds to step 1027. In step 1027, the current value of x is detected as the x address of the end point of the n-th mesh region (the right-hand subdivision point of that region). In addition, the value obtained by subtracting T (the amount of overlap between adjacent subdivided regions) from x is detected as the x address of the start point of the next mesh region (its left-hand subdivision point).
The start point of the first mesh region corresponds to the left edge of the character frame image (x=1), and the end point of the last mesh region corresponds to the right edge of the character frame image (x=32, because the character image is assumed to have 32 × 32 pixels in this embodiment). Then n is incremented by 1 (step 1028) and the processing is repeated from step 1023. This processing is repeated until n=N is reached and the condition of step 1023 is satisfied.
The subdivision points along the y direction are detected in the same way. However, because the total number PE of meaningful pixels has already been obtained, that step is not repeated. The scan for detecting subdivision points is carried out along the x direction each time the y address is incremented by 1. The y address at which the meaningful pixel count Py (corresponding to Px above) reaches (PE/N) × n is detected as the y address of the end of the n-th mesh region (the lower subdivision point), and the address obtained by subtracting T from this y address is detected as the y address of the start of the next mesh region (the upper subdivision point). The start point of the first mesh region corresponds to the top edge of the character frame image (y=1), and the end point of the last mesh region corresponds to the bottom edge of the character frame image (y=32).
Using the subdivision points obtained in this way, the character image is subdivided into N regions along both the x and y directions, and hence into N × N mesh regions. With this mesh subdivision method, the character image of the Japanese kana character "Wu" can be subdivided into the mesh regions shown in Figures 22a and 22b. In both cases the overlap T is assumed to be "0". In both Figure 22a and Figure 22b, the upper-left mesh region of the character image contains the information of the upper-middle vertical stroke.
On the other hand, if fixed subdivision points are used, the character images of Figures 22a and 22b are subdivided into, for example, the mesh regions shown in Figures 23a and 23b, respectively. In Figure 23b the upper-left mesh region contains the information of the upper vertical stroke, whereas in Figure 23a it does not. Thus, with fixed subdivision points, the upper-middle vertical stroke of the kana character "Wu", which is a critical part of this character, sometimes appears in a particular mesh region and sometimes does not. The resulting data are unstable, so features cannot be extracted stably, which lowers the recognition rate and increases misrecognition. This problem is resolved by using divided method 1 described above.
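The equal-count subdivision of divided method 1 along one axis can be sketched as follows. This is a simplified reading of the flowchart of Figure 21, not the patented implementation: the function name `split_points`, the per-line count list and the clamping of the start address to 1 are assumptions, and addresses are 1-based and inclusive as in the text.

```python
def split_points(line_counts, n_regions, overlap=0):
    """Subdivide one axis so that each region holds roughly PE / N meaningful pixels.

    line_counts[k] is the number of meaningful pixels on line k + 1.
    Returns a list of inclusive, 1-based (start, end) addresses.
    """
    total = sum(line_counts)                 # PE
    size = len(line_counts)
    regions = []
    start = 1                                # first region starts at the frame edge
    running = 0                              # Px: pixels counted up to the current line
    pos = 0                                  # x address, incremented before each scan
    for n in range(1, n_regions):
        while pos < size:
            pos += 1
            running += line_counts[pos - 1]
            # Px >= (PE / N) * n, written without division to stay in integers.
            if running * n_regions >= total * n:
                break
        regions.append((start, pos))
        # Next region starts at x - T; with T = 0 neighbours share the boundary line.
        start = max(1, pos - overlap)
    regions.append((start, size))            # last region ends at the frame edge
    return regions

# Made-up column counts for a 12-line image with PE = 12 and N = 3.
counts = [0, 2, 2, 0, 1, 1, 0, 3, 3, 0, 0, 0]
print(split_points(counts, 3))
print(split_points(counts, 3, overlap=1))
```

Applying the same function to the row counts gives the y-direction subdivision points, and the pairs of x and y ranges together define the N × N mesh regions.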
Divided method 2:
The character image is subdivided into N regions along the y direction in the same way as in divided method 1 above. The subdivision along the x direction is then carried out; this processing is described below with reference to the flowchart shown in Figure 24.
First, "1" is set in the subdivision number counter n, and the counter x, which indicates the address along the x direction, is cleared to zero (step 1131). In step 1132 it is checked whether n is equal to or less than N (N=3 in this embodiment); if not, the subdivision processing is complete. If the result of step 1132 is affirmative, the following operations are carried out in steps 1133, 1134 and 1135, respectively: the x address is incremented by 1; a scan is carried out along the y direction and the meaningful pixels in each y-direction subdivided region i are counted; and the pixel count Pix accumulated over the region x × (yis-yie) is checked to determine whether Pix is equal to or greater than (PE/N^2) × n. Here yis and yie denote the addresses of the start point and end point of y-direction subdivided region i, respectively.
If the result of step 1135 is affirmative, the current value of x (the x address) is detected as the end point of the n-th x-direction subdivided region, and x-T (T: amount of region overlap) is detected as the start point of the next subdivided region (step 1136). Then n is incremented by 1 (step 1137) and processing returns to step 1132. Using the x-direction subdivision points obtained in this way, the character image is subdivided into N regions along the x direction, and is therefore subdivided into a total of N × N mesh regions.
Among Japanese kana characters, "nu" and "Su" are very similar in shape. The only difference between these two kana characters is whether there is a central oblique stroke running from the upper left to the lower right, so the information of this stroke is very important in distinguishing "nu" from "Su". When the character image of "nu" is subdivided into 3 × 3 mesh regions by divided method 1, the result shown in Figure 25a is obtained. In the left-middle mesh region (1,2) of the subdivided character image, very little information of this central oblique stroke appears. On the other hand, when the same character image of "nu" is subdivided by divided method 2, the result shown in Figure 25b is obtained. Note that, for brevity, Figure 25b shows only the subdivision of the central region in the y direction. The condition T=0 is also assumed. In subdivided region (1,2), sufficient information about the central oblique stroke appears. In this way, divided method 2 overcomes the shortcoming of divided method 1.
In the explanation above, the y-direction subdivision is carried out by divided method 1. However, the subdivision processing may instead consist of the following steps: the subdivision along the x direction is carried out first, and then each region subdivided along the x direction is subdivided along the y direction by the same processing as shown in the flowchart of Figure 24.
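A possible sketch of divided method 2 is given below: the image is first cut into y-direction bands by the equal-count rule, and each band is then cut along x wherever its own running count reaches (PE/N^2) × n. The function names, the 0/1 image representation and the small 6 × 6 test image are assumptions for illustration; `Fraction` is used only to keep the threshold comparison exact.

```python
from fractions import Fraction

def split_points(line_counts, n_regions, target_total, overlap=0):
    """Region n ends at the first line where the running count reaches
    target_total * n / n_regions. Returns inclusive 1-based (start, end) pairs."""
    size = len(line_counts)
    regions, start, running, pos = [], 1, 0, 0
    for n in range(1, n_regions):
        threshold = Fraction(target_total) * n / n_regions
        while pos < size:
            pos += 1
            running += line_counts[pos - 1]
            if running >= threshold:
                break
        regions.append((start, pos))
        start = max(1, pos - overlap)        # x - T, clamped to the frame edge
    regions.append((start, size))
    return regions

def divided_method_2(image, n_regions, overlap=0):
    """image: rows of 0/1 values, 1 = meaningful pixel.
    Returns mesh regions as (y_start, y_end, x_start, x_end)."""
    total = sum(map(sum, image))                                   # PE
    row_counts = [sum(row) for row in image]
    bands = split_points(row_counts, n_regions, total, overlap)    # as in divided method 1
    mesh = []
    for y_start, y_end in bands:
        rows = image[y_start - 1:y_end]
        col_counts = [sum(column) for column in zip(*rows)]
        # (PE / N^2) * n is the same as (PE / N) * n / N.
        per_band = Fraction(total, n_regions)
        for x_start, x_end in split_points(col_counts, n_regions, per_band, overlap):
            mesh.append((y_start, y_end, x_start, x_end))
    return mesh

image = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
print(divided_method_2(image, 2))
```

In this toy image the two bands receive different x-direction subdivision points, which is the property that lets each band follow the local stroke distribution.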
Divided method 3:
When T is not equal to 0 in divided method 2, or when meaningful pixels are dense at a subdivision point along the x or y direction, the subdivision may become insufficiently accurate. To solve this problem, in this divided method the subdivision along the x (or y) direction is first carried out by divided method 1, and then the following processing is carried out: the number PEi of meaningful pixels in each subdivided region i is counted, and each subdivided region i is subdivided along the y (or x) direction using PEi/N in place of the PE/N^2 of divided method 2.
The subdivision along the x direction by divided method 1, and the subsequent further subdivision of each subdivided region along the y direction, are described below with reference to the flowchart shown in Figure 26. First, the number PEi of meaningful pixels in each x-direction subdivided region i is determined by scanning the character image (step 1241). Then "1" is set in the counter n, and the y-address counter y is cleared to zero (step 1242). It is checked whether n is equal to or less than N (N=5 here); if not, the processing ends (step 1243). If the result is affirmative, the following operations are carried out in steps 1244, 1245 and 1246, respectively: the y address is incremented by 1; a scan is carried out along the x direction and the meaningful pixels in each x-direction subdivided region i are counted; and the pixel count Piy accumulated over the region y × (Xis-Xie) is checked to determine whether Piy is equal to or greater than (PEi/N) × n. Here Xis and Xie are the x addresses of the start point and end point of x-direction subdivided region i, respectively.
If the result of step 1246 is affirmative, the current value of y (the y address) is detected as the end point of the n-th y-direction subdivided region within the x-direction subdivided region, and y-T (T: amount of region overlap) is detected as the start point of the next subdivided region (step 1247). Then n is incremented by 1 (step 1248) and processing returns to step 1243. Using the y-direction subdivision points obtained in this way, each x-direction subdivided region is subdivided into N regions along the y direction, and the whole character image is therefore subdivided into a total of N × N mesh regions.
When the character image of the kana character "nu" is subdivided by divided method 2 with T=1, the result shown in Figure 27a is obtained. When the same character image is subdivided by divided method 3 with T=1, the result shown in Figure 27b is obtained. In both cases the y-direction subdivision is carried out first, followed by the x-direction subdivision. In these two figures, for brevity, only the result of the x-direction subdivision of the central y-direction region is shown. Comparing mesh region (1,2) in Figures 27a and 27b, the information of the central oblique stroke can be seen to appear properly. In this way, divided method 3 always gives an appropriate subdivision, even when T is set to a value other than 0 or when meaningful pixels are dense at a subdivision point.
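Divided method 3 differs from divided method 2 only in the target count used inside each strip: the strip's own pixel count PEi replaces the nominal PE/N. A minimal sketch under the same assumptions as before (hypothetical function names, 0/1 image, inclusive 1-based addresses, x direction subdivided first as in Figure 26) might look like this:

```python
from fractions import Fraction

def split_points(line_counts, n_regions, target_total, overlap=0):
    """Region n ends at the first line where the running count reaches
    target_total * n / n_regions. Returns inclusive 1-based (start, end) pairs."""
    size = len(line_counts)
    regions, start, running, pos = [], 1, 0, 0
    for n in range(1, n_regions):
        threshold = Fraction(target_total) * n / n_regions
        while pos < size:
            pos += 1
            running += line_counts[pos - 1]
            if running >= threshold:
                break
        regions.append((start, pos))
        start = max(1, pos - overlap)        # y - T, clamped to the frame edge
    regions.append((start, size))
    return regions

def divided_method_3(image, n_regions, overlap=0):
    """image: rows of 0/1 values, 1 = meaningful pixel.
    Returns mesh regions as (x_start, x_end, y_start, y_end)."""
    total = sum(map(sum, image))                                    # PE
    col_counts = [sum(column) for column in zip(*image)]
    strips = split_points(col_counts, n_regions, total, overlap)    # divided method 1 along x
    mesh = []
    for x_start, x_end in strips:
        row_counts = [sum(row[x_start - 1:x_end]) for row in image]
        strip_total = sum(row_counts)                               # PEi, counted per strip
        for y_start, y_end in split_points(row_counts, n_regions, strip_total, overlap):
            mesh.append((x_start, x_end, y_start, y_end))
    return mesh

image = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
print(divided_method_3(image, 2))
```

Because the boundary column is shared when strips overlap or touch, the second strip of this toy image holds 7 meaningful pixels rather than the nominal 6; using PEi keeps the split inside that strip balanced against what the strip actually contains.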
As described above, according to this aspect of the invention, the subdivision points used for mesh subdivision are adjusted appropriately according to the distortion of the character image. Therefore, even characters with distorted shapes, such as handwritten kanji and kana characters, can be recognized properly and at high speed. In addition, this aspect of the invention involves no complex computation for determining the subdivision points, so the overall processing is fast.
Although the above description provides a full and complete disclosure of the preferred embodiments of the invention, various modifications, alternative designs and equivalents may be employed without departing from the spirit and scope of the invention. Therefore, the above description and the drawings should not be construed as limiting the scope of the invention, which is defined by the appended claims.
Claims (11)
1. A pattern recognition method, comprising the steps of:
reading a plurality of selected pattern classes to provide a plurality of assumed patterns, each assumed pattern having a feature vector composed of a plurality of components;
rearranging the components of the feature vector of each assumed pattern according to a predetermined order of dimensions, thereby forming and storing reference patterns, each reference pattern having a feature vector with rearranged components;
determining a threshold value for each of the selected pattern classes;
reading a pattern to be recognized;
extracting, from the pattern to be recognized, a feature vector composed of a plurality of components;
rearranging the components of the extracted feature vector according to the predetermined order of dimensions;
matching, for a predetermined number of the higher-order dimensions, the rearranged feature vector of the pattern to be recognized against the rearranged feature vector of each stored reference pattern, so as to provide a matching distance;
comparing the matching distance so provided with the threshold value of the corresponding pattern class; and
determining, according to the result of this comparison, whether to carry out further matching against the corresponding pattern class for the remaining dimensions.
2. The method according to claim 1, wherein, in the step of determining a threshold value, a plurality of threshold values are determined for each of the pattern classes, and one of the plurality of threshold values is selected for determining whether to carry out further matching.
3. The method according to claim 2, wherein the threshold value used for determining whether to carry out further matching is selected by a parameter common to all of the selected pattern classes.
4. The method according to claim 2, wherein the threshold value used for determining whether to carry out further matching is selected by a parameter of each of the selected pattern classes.
5. The method according to claim 2, wherein the threshold value used for determining whether to carry out further matching is selected by a parameter common to all of the selected pattern classes and a parameter of each of the selected pattern classes.
6. The method according to claim 1, wherein the predetermined order of dimensions is determined by the magnitude of the standard deviation or variance of the components of the feature vector.
7. A pattern recognition method, comprising the steps of:
obtaining a character pattern of an unknown character, in which the character is represented by one binary value and the background by the other binary value;
assigning a predetermined direction code to each pixel defining the contour of the character;
subdividing the character into a plurality of mesh regions;
scanning the character from an edge in a direction perpendicular to that edge, thereby detecting N-th peripheral pixels, each such pixel being the N-th point of change from background to character;
forming, for each of the mesh regions, a histogram of the direction codes of the N-th peripheral pixels; and
comparing the histogram with the histogram of each known character to identify the unknown character.
8. The pattern recognition method according to claim 7, wherein the character is subdivided such that each of the mesh regions has an approximately equal number of direction codes.
9. The method according to claim 12 or 13, wherein each of the mesh regions has an overlapping portion.
10. A pattern recognition method, comprising the steps of:
obtaining a character pattern of an unknown character, in which the character is represented by one binary value and the background by the other binary value;
assigning a predetermined direction code to each pixel defining the contour of the character;
subdividing the character into a plurality of mesh regions such that the number of assigned direction codes in each mesh region is substantially equal;
forming a histogram of the direction codes for each of the mesh regions; and
comparing this histogram with the histogram of each known character to identify the unknown character.
11. The method according to claim 10, wherein each mesh region has an overlapping portion.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP61144486A JPH0711819B2 (en) | 1986-06-20 | 1986-06-20 | Pattern recognition method |
JP144486/86 | 1986-06-20 | ||
JP61144488A JPH0740288B2 (en) | 1986-06-20 | 1986-06-20 | Pattern recognition method |
JP144488/86 | 1986-06-20 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN87104862A CN87104862A (en) | 1988-01-27 |
CN1010513B true CN1010513B (en) | 1990-11-21 |
Family
ID=26475883
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN87104862A Expired CN1010513B (en) | 1986-06-20 | 1987-06-20 | Pattern recongnition system |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR910000786B1 (en) |
CN (1) | CN1010513B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3649537B2 (en) * | 1996-11-27 | 2005-05-18 | 日本アイ・ビー・エム株式会社 | Data hiding method and data extracting method |
JP2007233873A (en) * | 2006-03-02 | 2007-09-13 | Toshiba Corp | Pattern recognition device and method therefor |
KR101167765B1 (en) | 2009-11-06 | 2012-07-24 | 삼성전자주식회사 | Apparatus and method for playing handwriting message using handwriting data |
-
1987
- 1987-06-19 KR KR1019870006252A patent/KR910000786B1/en not_active IP Right Cessation
- 1987-06-20 CN CN87104862A patent/CN1010513B/en not_active Expired
Also Published As
Publication number | Publication date |
---|---|
KR910000786B1 (en) | 1991-02-08 |
CN87104862A (en) | 1988-01-27 |
KR880000876A (en) | 1988-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1162803C (en) | Bill distinguishing device and method and recording medium for recording the method | |
CN1111818C (en) | The Apparatus and method for of 2 d code identification | |
US6687401B2 (en) | Pattern recognizing apparatus and method | |
CN1151464C (en) | Method of reading characters and method of reading postal addresses | |
CN1677430A (en) | Boundary extracting method, program, and device using the same | |
CN102782703B (en) | Page layout determination of an image undergoing optical character recognition | |
CN1258894A (en) | Apparatus and method for identifying character | |
EP2364011B1 (en) | Fine-grained visual document fingerprinting for accurate document comparison and retrieval | |
CN1492377A (en) | Form processing system and method | |
CN1240024C (en) | Image processor, image processing method and recording medium recording the same | |
US20070168382A1 (en) | Document analysis system for integration of paper records into a searchable electronic database | |
CN1010512B (en) | Character recognition method | |
CN1226696C (en) | Explanatory and search for handwriting sloppy Chinese characters based on shape of radicals | |
CN1760860A (en) | Device part assembly drawing image search apparatus | |
CN1141666C (en) | Online character recognition system for recognizing input characters using standard strokes | |
CN1711559A (en) | Characteristic region extraction device, characteristic region extraction method, and characteristic region extraction program | |
CN1916940A (en) | Template optimized character recognition method and system | |
CN1367460A (en) | Character string identification device, character string identification method and storage medium thereof | |
CN111968115A (en) | Method and system for detecting orthopedic consumables based on rasterization image processing method | |
CN113723410B (en) | Digital identification method and device for nixie tube | |
CN1010513B (en) | Pattern recongnition system | |
Kuo et al. | Automatic pattern recognition and color separation of embroidery fabrics | |
CN1217292C (en) | Bill image face identification method | |
CN1110018C (en) | Eigenvalue extracting method and equipment, storage medium for memory image parser | |
CN1107280C (en) | Chinese and English table recognition system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C13 | Decision | ||
GR02 | Examined patent application | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C15 | Extension of patent right duration from 15 to 20 years for appl. with date before 31.12.1992 and still valid on 11.12.2001 (patent law change 1993) | ||
OR01 | Other related matters | ||
C17 | Cessation of patent right | ||
CX01 | Expiry of patent term |