CN100535931C - Multi-resolution degraded character adaptive recognition system and method - Google Patents

Publication number
CN100535931C
Authority
CN
China
Prior art keywords
character
image
quality
classification
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006101128864A
Other languages
Chinese (zh)
Other versions
CN101140625A (en)
Inventor
刘春梅
王春恒
戴汝为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CNB2006101128864A
Publication of CN101140625A
Application granted
Publication of CN100535931C
Legal status: Expired - Fee Related

Landscapes

  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to the technical field of image processing and pattern recognition, and discloses an adaptive recognition system and method for multi-resolution degraded characters, addressing the problem of recognizing degraded characters at multiple resolutions. The system comprises a multi-resolution image quality discrimination device and a multi-resolution degraded character recognition device. The method comprises: discriminating the quality of an input character image according to the quality of the character images of different resolutions in the character library; and performing adaptive recognition of multi-resolution degraded characters according to the resulting image quality discrimination data. The invention incorporates character image quality information into the multi-resolution degraded character recognition process and adaptively selects the sub-classification system according to feed-forward information about the resolution quality of the input character image, thereby improving the performance of multi-resolution degraded character recognition.

Description

A multi-resolution degraded character adaptive recognition system and method
Technical field
The present invention relates to the technical field of image processing and pattern recognition, and in particular to a method for recognizing degraded characters at multiple resolutions.
Background art
Character recognition is an important branch of pattern recognition and has made great progress in recent years. As character recognition moves toward practical use, however, many problems have been encountered. Chief among them is the recognition of degraded characters in low-resolution, low-quality images, which is both the bottleneck restricting the practical application of character recognition and a focus and difficulty of character recognition research.
In recent years many researchers at home and abroad have begun to study degraded character recognition. Their methods fall into two classes: traditional character recognition methods that extract features from binary images, and character recognition methods that extract features from grayscale images.
1. Recognition methods that extract features from binary images mainly restore the degraded image by means of a degradation model and then binarize the character image with a binarization algorithm, with the aim of removing the degradation and obtaining an ideal binary image. Such restoration and binarization algorithms cannot handle degraded character recognition well, because restoration and binarization inevitably cause information loss, broken strokes, touching strokes and similar phenomena, all of which are serious obstacles for character recognition methods that extract features from binary images.
Binarizing a character image in order to segment out the actual text (the foreground) is a field that has long received attention. Commonly used methods at present include the global threshold method, the dynamic threshold method and the local threshold method. The global threshold method suits cases where the color (or gray-level) difference between foreground and background is large. It applies the same threshold to the whole image, so the computation is small and fast, but its limitations are considerable; an example is Otsu's method [N. Otsu. A thresholding selection method from gray-level histogram. IEEE Trans. System, Man, and Cybernetics. Vol 9(1): 62-66, 1978]. The dynamic threshold method adds a supplementary threshold on top of the global threshold. It can eliminate the hollowing of thick strokes but may also introduce extra noise, and it likewise requires a large color difference between the foreground and background of the image, without large overlap. The local threshold method has no global threshold; instead it determines the binarization threshold dynamically from the color information of each pixel and its surrounding pixels, which makes it suitable for document images with varying brightness. An example is Niblack's method [W. Niblack. An Introduction to Digital Image Processing. pp: 115-116, Prentice Hall, 1986].
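As an illustration of the global threshold method mentioned above, the following is a minimal sketch of Otsu-style thresholding. It is not part of the patent; the function names `otsu_threshold` and `binarize` are hypothetical, and the input is assumed to be an 8-bit grayscale NumPy array.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level that maximises the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of the "dark" class
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean of the "dark" class
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[~np.isfinite(sigma_b)] = 0.0      # thresholds that leave one class empty
    return int(np.argmax(sigma_b))

def binarize(gray, threshold):
    """Pixels brighter than the threshold become 1, all others 0."""
    return (gray > threshold).astype(np.uint8)
```

A single threshold is applied to the whole image, which is why this kind of method is fast but struggles when foreground and background gray levels overlap.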
Image degradation and restoration is a classical problem in image processing, and research on the degradation of document images has concentrated mainly on scanned images. Kanungo [T. Kanungo, R.M. Haralick, I. Phillips. Global and Local Document Degradation Models. In Proc. of Second International Conference on Document Analysis and Recognition. pp: 730-734, 1993] proposed an optical projection distortion model for scanned images to address the page curl that occurs when books are scanned. He also studied the degradation of writing in scanned images and proposed a morphology-based degradation model for it. However, degradation models designed for scanned images do not apply to images from digital photographic equipment. Digital cameras bring many new problems that scanners do not have, and the research community currently pays relatively little attention to them. Taylor [M.J. Taylor, C.R. Dance. Enhancement of document images from cameras. In SPIE Conference on Document Recognition V. pp: 230-241, 1998] studied the degradation of camera-captured document images and proposed a deblurring algorithm based on Tikhonov regularization. Clark restored skewed and distorted text in document images on the basis of region edge information. In general, many researchers have studied particular degradation problems of document images, but systematic research specifically on restoring static text images taken with digital cameras is very scarce.
2. Methods that extract features from grayscale images extract the features directly from the grayscale image. A grayscale image contains richer information than a binary image, so features extracted directly from it are also richer and more distinctive than features extracted from a binary image, and existing research results have confirmed this. According to the features extracted, current degraded character recognition methods based on grayscale features fall into two classes: those that extract structural features from the grayscale image and those that extract frequency-domain features from the grayscale image.
The features used in degraded character recognition methods based on grayscale structural features include directional features, skeleton features, topological features and so on. Grayscale structural features can describe the structure of a character accurately and can be used successfully for character recognition. For low-resolution degraded character images, however, such structural features are difficult to extract accurately and are not robust to interference, which in turn lowers the recognition rate.
Grayscale frequency-domain features can recognize low-resolution degraded character images effectively. They mainly use filtering to extract the directional features of the character image, for example FFT, DCT, wavelet transforms, Gabor filters and Sobel filters. These methods have a certain robustness to interference and can effectively recognize character images of low quality, so they have practical value.
In 1996, Hamamoto et al. [Hamamoto, Y., S. Uchimura, M. Watanabe, T. Yasuda, Y. Mitani, S. Tomita. A Gabor filter-based method for recognizing handwritten numerals. Pattern Recognition, 31(4): 395-400, 1998] proposed applying Gabor filters directly to grayscale character images and using the output of the Gabor filter bank as the recognition feature. They studied an offline handwritten character recognition method based on Gabor filter features, and in experiments on a test set of 7000 characters reached an error rate of 2.34%, showing the validity and practicality of this feature.
In 2000, Yoshimura et al. [Hiroshi Yoshimura, Minoru Etoh, Kenji Kondo, Naokazu Yokoya. Gray-Scale Character Recognition by Gabor Jets Projection. Proceedings of ICPR '00, 2000] addressed the difficult problem of recognizing low-resolution characters against complex backgrounds in video images. They proposed extracting grayscale features directly from the video image and forming recognition features by locally projecting and accumulating the Gabor output. In tests on a set of 744 characters from video, the method achieved a recognition accuracy of 85% to 90%, and it is robust to different fonts and to inaccurate character bounding boxes caused by segmentation. The studies above have achieved some success in noise robustness and in recognizing low-quality grayscale character images.
Wang Xuewen et al. [Xuewen Wang, Xiaoqing Ding, Changsong Liu: Gabor filters-based feature extraction for character recognition. Pattern Recognition, 38(3): 369-379, 2005] proposed a method for recognizing degraded Chinese characters based on the Gabor transform. It uses Gabor filters to extract the most important and most stable stroke direction information in local regions of the character image, and proposes an effective method for optimizing the Gabor filter bank parameters according to the statistics of the character images. A nonlinear transform is applied to the output of the Gabor filter bank so that it adapts to grayscale character images of varying brightness and low quality. Experiments show that this feature extraction method greatly strengthens the ability of a recognition system to resist image noise, interference, brightness variation, blurred strokes, broken strokes and character deformation, and that it achieves better recognition performance than other algorithms when applied to printed character images of various low-quality binary and grayscale kinds.
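To make the idea of a directional Gabor filter bank concrete, here is a small illustrative sketch. It does not reproduce any of the cited methods; the kernel size, wavelength, sigma, number of orientations and all function names are assumptions chosen for the example.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real (cosine-phase) Gabor kernel oriented at angle theta, with zero mean."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = x * np.cos(theta) + y * np.sin(theta)     # coordinate along the wave
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + y_r ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()                   # remove the DC response

def gabor_bank(size=9, wavelength=4.0, sigma=2.0, n_orient=4):
    """Kernels at n_orient evenly spaced orientations in [0, pi)."""
    return [gabor_kernel(size, wavelength, k * np.pi / n_orient, sigma)
            for k in range(n_orient)]

def orientation_responses(patch, kernels):
    """Absolute filter response of one kernel-sized patch for each orientation."""
    return np.array([abs(float(np.sum(patch * k))) for k in kernels])
```

A patch whose intensity varies along the horizontal axis at the kernel's wavelength responds most strongly to the kernel with theta = 0, which is the sense in which such a bank captures stroke direction.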
Summary of the invention
In view of the poor performance of multi-resolution degraded character recognition, the object of the present invention is to propose an adaptive recognition system and method for multi-resolution degraded characters.
To achieve this object, a first aspect of the present invention provides a multi-resolution degraded character adaptive recognition system, comprising:
A multi-resolution image quality discrimination device, used to discriminate the image quality of characters of different resolutions; and a multi-resolution degraded character recognition device, which adaptively recognizes multi-resolution degraded characters according to the image quality of the characters of different resolutions.
According to embodiments of the invention, the multi-resolution image quality discrimination device comprises:
A pre-processing unit, used to convert the character image information into a unified image format and to apply contrast enhancement to the character image;
A feature extraction unit, used to extract from the character image information a feature vector of low dimension that reflects the resolution quality of the image;
A classification and recognition unit, used to classify the character image into one of n classes and thereby determine its resolution quality level.
According to embodiments of the invention, the multi-resolution degraded character recognition device comprises:
A first-level classifier, used to narrow the recognition range for the quality level of the input character image and to build character image classifiers for the required pattern classes;
A second-level classifier, used to adaptively select the sub-classification system for recognition according to the character image of the pattern class and the image resolution quality discrimination result.
According to embodiments of the invention, in the first-level classifier and the second-level classifier each classification process is set up as an independent recognition subsystem.
To achieve the above object, a second aspect of the present invention provides a multi-resolution degraded character adaptive recognition method, with the following steps:
Discriminating the quality of the input character image according to the quality of the character images of different resolutions in the character library;
and
Building corresponding character recognition subsystems according to the quality of the character images of different resolutions in the character library, adaptively selecting the sub-classification system according to the multi-resolution character image quality discrimination data, and adaptively recognizing the multi-resolution degraded characters.
According to embodiments of the invention, the image quality discrimination step further comprises:
Step P.1-step.1: pre-processing the character image, comprising:
A normalization process, which converts the character image into a unified image format for character image feature extraction;
A gray-level adjustment process, which applies contrast enhancement to the character image. It normalizes the gray values, converting any uneven gray-level distribution in the original character image into a unified distribution and yielding an enhanced character image;
Step P.1-step.2: on the basis of the grayscale image, extracting gray distribution features from character images of different resolutions, used to assess the quality of the degraded character image and to generate quality level information;
Step P.1-step.3: using a statistical decision method or a syntactic analysis method to assign the degraded character image under recognition to a class in the character image feature space.
According to embodiments of the invention, the classification in step P.1-step.3 uses a neural network classifier to recognize the character image quality level information and divide it into N classes, determining the level of the character image resolution quality.
According to embodiments of the invention, the multi-resolution degraded character recognition step further comprises:
First-level classification, which is a pre-classification that narrows the recognition range for the quality level of the degraded character image to a few pattern classes; and
Second-level classification, which performs character recognition by adaptive classification according to the pre-classification and the image resolution quality discrimination result.
According to embodiments of the invention, the first-level classification further comprises:
P.2-Rec.1.1 Pre-processing: normalize the original character image and adjust its gray-level histogram;
P.2-Rec.1.2 Feature extraction: compute the gray-level histogram of the enhanced image obtained after pre-processing, that is, count the number of pixels in the character image having each gray value, which gives an estimate of the probability of occurrence of each gray value, and compute the feature vector f;
P.2-Rec.1.3 Classifier recognition: feed the feature vector f into the classifier and compute the classification result, obtaining 10 candidate characters that take part in the second-level classification.
According to embodiments of the invention, the second-level classification further comprises:
P.2-Rec.2.1 Each sub-classifier performs the same three steps as the first-level classification process (pre-processing, feature processing, and classification and recognition) to obtain a recognition result;
P.2-Rec.2.2 The adaptive classification weight coefficients are set from the image resolution quality discrimination result for the input character, so that sub-classifier n plays the leading role in the final recognition result. The top 10 characters from the first-level classifier form the character candidate set, and the second-level classifier holds the templates corresponding to m characters, so that the number of character classes is reduced from the original c classes to m classes. The second-level classifier assigns the sample to be recognized to one of the m possible classes, using the learned template vector of each class. Under a classification decision rule based on a distance metric, the sample is compared with the templates. Each comparison against the templates trained from the training samples of one image quality level completes the classification process of one classifier, and the component that each classifier contributes to the final classification decision is determined by the adaptive classification weight coefficients.
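The two-level scheme described above can be sketched roughly as follows. This is only an illustration under several assumptions not stated in the source: Euclidean distance as the distance metric, a nearest-template first-level classifier, all sub-classifiers sharing the same feature vector, and fusion by a weighted sum of distances. The function and parameter names are hypothetical.

```python
import numpy as np

def first_stage_candidates(f, templates, k=10):
    """Indices of the k classes whose templates lie nearest to feature vector f.

    templates: array of shape (c, d), one template per character class.
    """
    dists = np.linalg.norm(templates - f, axis=1)
    return np.argsort(dists)[:k]

def second_stage(f, candidates, sub_templates, weights):
    """Pick the best candidate by a weighted sum of sub-classifier distances.

    sub_templates: array of shape (n_sub, c, d), one template set per
                   sub-classifier (each trained on one image quality level).
    weights:       length n_sub, the adaptive classification weight coefficients.
    """
    # Distance from f to every candidate template in every sub-classifier: (n_sub, m)
    d = np.linalg.norm(sub_templates[:, candidates, :] - f, axis=2)
    fused = np.asarray(weights, dtype=float) @ d    # shape (m,)
    return int(candidates[np.argmin(fused)])
```

Restricting the second level to the candidate set is what reduces the decision from c classes to m classes; the weights decide how much each quality-specific sub-classifier contributes.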
The present invention incorporates character image quality information into the multi-resolution degraded character recognition process and adaptively selects the sub-classification system according to feed-forward information about the resolution quality of the input character image, thereby improving the performance of multi-resolution degraded character recognition.
Description of drawings
Fig. 1 is a block diagram of the multi-resolution degraded character adaptive recognition system of the present invention.
Fig. 2 is a schematic diagram of the multi-resolution degraded character adaptive recognition method of the present invention.
Fig. 3 is a schematic diagram of the resolution quality levels of degraded character images in the method of the present invention.
Fig. 4 shows the gray-level distributions of multi-resolution degraded character images in the method of the present invention.
Fig. 5 is a schematic diagram of the sample training process in the method of the present invention.
Fig. 6 is a schematic diagram of the character recognition process in the method of the present invention.
Embodiments
The preferred embodiments of the present invention are described below. This part merely illustrates the invention and does not limit the invention, its application or its uses. Other embodiments derived from the present invention likewise fall within its scope of technical innovation. The parameter settings given in the scheme are examples only and do not imply that only these values can be used.
As shown in Fig. 1, the multi-resolution degraded character adaptive recognition system is divided into two main parts:
The multi-resolution image quality discrimination device P.1;
The multi-resolution degraded character recognition device P.2.
The multi-resolution character image quality discrimination device P.1 comprises:
A pre-processing unit, used to convert the character image information into a unified image format and to apply contrast enhancement to the character image;
A feature extraction unit, used to extract from the character image information a feature vector of low dimension that reflects the resolution quality of the image;
A classification and recognition unit, used to classify the character image into one of n classes and thereby determine its resolution quality level.
The multi-resolution degraded character recognition device P.2 comprises:
A first-level classifier, used to narrow the recognition range for the quality level of the input character image and to build character image classifiers for the required pattern classes. The first-level classifier uses one group of sub-classification systems.
A second-level classifier, used to adaptively select the sub-classification system for recognition according to the character image of the pattern class and the image resolution quality discrimination result. The second-level classifier uses three groups of sub-classification systems.
In the first-level classifier and the second-level classifier, each classification process is set up as an independent recognition subsystem. The second-level classifier comprises three sub-classifiers: sub-classifier i, sub-classifier j and sub-classifier n.
For multi-resolution character image quality evaluation, the objects processed by the present invention are degraded character images at multiple resolutions. The degraded character images are divided by resolution image quality into n levels, from good to poor, as illustrated in Fig. 3, the schematic diagram of the resolution quality levels of degraded character images in the method of the present invention.
The goal of the evaluation performed by the multi-resolution image quality discrimination device P.1 is to determine the resolution level of a character image by dividing image quality into levels. That is, for a given input image, the image resolution quality discrimination method of the present invention determines the level of its image resolution and thereby judges its image quality. The problem of evaluating the resolution quality of a degraded character image is thus turned into the problem of determining its image quality level. As shown in Fig. 2, the schematic diagram of the multi-resolution degraded character adaptive recognition method of the present invention, the method used by the multi-resolution character image quality discrimination device P.1 mainly comprises the following three steps:
P.1-step.1 Pre-processing. The purpose of pre-processing is to enhance useful information; here it mainly comprises normalization and gray-level histogram adjustment. The normalization process converts the character image into a unified image format to facilitate feature extraction and later processing. The gray-level adjustment process applies contrast enhancement to the character image. It amounts to a normalization of the gray values: any uneven gray-level distribution in the original character image is converted into a unified distribution, which makes recognition easier and the recognition result more accurate.
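The pre-processing step can be sketched as below. This is a minimal illustration rather than the patent's exact procedure: nearest-neighbour resampling and a linear stretch of the gray range to [0, 255] are assumptions, the default size of 64 follows the 64 × 64 normalization used later in the embodiment, and the function names are hypothetical.

```python
import numpy as np

def normalize_size(gray, size=64):
    """Resample a 2-D grayscale image to size x size (nearest neighbour)."""
    h, w = gray.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return gray[rows[:, None], cols]

def stretch_gray(gray):
    """Linearly map the gray range [s_min, s_max] of the image onto [0, 255]."""
    g = gray.astype(float)
    s_min, s_max = g.min(), g.max()
    if s_max == s_min:                 # constant image: nothing to stretch
        return np.zeros(gray.shape, dtype=np.uint8)
    out = (g - s_min) * 255.0 / (s_max - s_min)
    return np.round(out).astype(np.uint8)

def preprocess(gray, size=64):
    """Size normalization followed by gray-level adjustment."""
    return stretch_gray(normalize_size(gray, size))
```

After this step every character image has the same dimensions and spans the full gray range, so later features are comparable across inputs.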
P.1-step.2 Feature extraction. After pre-processing, the present invention extracts from the image information a feature vector of low dimension that reflects the resolution quality of the image. A stable and representative set of features is the core of a recognition algorithm. Here a gray distribution feature is proposed: the resolution quality of a character image is evaluated by extracting its gray distribution feature.
The gray distribution feature is computed on the grayscale image, and character images of different resolutions have different gray value distributions. Fig. 4 shows the gray-level distributions of multi-resolution degraded character images in the method of the present invention: sub-figures (a), (b) and (c) are character images at three different resolutions, and (d), (e) and (f) are the gray-level histograms corresponding to (a), (b) and (c). As Fig. 4 shows, the gray values of a high-resolution character image are sparsely distributed; those of a medium-resolution character image are more densely distributed than those of the high-resolution image; and those of a low-resolution character image are more densely distributed still. This is because the lower the resolution, the more gray values the image occupies, and so the denser its gray value distribution. On the basis of this property the gray distribution feature is proposed and used to discriminate the quality of multi-resolution degraded character images.
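One possible reading of the gray distribution feature is a normalized gray-level histogram, pooled into a few bins to keep the dimension low. The sketch below follows that reading; the bin count of 32 and the function names are assumptions, and the input is assumed to be an 8-bit grayscale array.

```python
import numpy as np

def gray_distribution_feature(gray, bins=32):
    """Normalized gray-level histogram pooled into `bins` bins (bins must divide 256)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()           # estimated probability of each gray value
    return prob.reshape(bins, 256 // bins).sum(axis=1)

def occupied_levels(gray):
    """Number of distinct gray values that actually occur in the image."""
    return int(np.count_nonzero(np.bincount(gray.ravel(), minlength=256)))
```

A crisp two-tone image occupies only two gray levels, whereas a smoothly varying (blurred) image occupies many, which mirrors the sparse versus dense histograms described for Fig. 4.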
P.1-step.3 Classification and recognition. Recognition means assigning the object under recognition to a class in the feature space using a statistical decision method or a syntactic analysis method. After extracting the gray distribution feature, the present invention uses a neural network classifier to classify the character image into one of n classes, which determines the level of the character image resolution quality.
P.2 Adaptive recognition of multi-resolution degraded characters:
The multi-resolution degraded character recognition method of the present invention, as shown in Fig. 1, uses two-level classification. The first-level classification is a pre-classification whose purpose is to narrow the recognition range to a few pattern classes. The second-level classification then performs character recognition by an adaptive classification method, according to the discrimination result of the first part (P.1) for the resolution image quality of the input character image. In both the first-level and the second-level classification, each classification process is equivalent to an independent recognition subsystem that can select suitable features and classification algorithms according to the quality of the image.
P.2-step.1 Pre-processing: before features are extracted, the original character image goes through a pre-processing procedure consisting of two steps, normalization and gray-level histogram adjustment.
P.2-step.2 Feature extraction: the character recognition subsystem for each resolution can use features suited to the resolution image quality of its characters. For example, for high-resolution character images a method that extracts features from the binary image can be used, because a good binarized image can be obtained from a high-resolution image; for low-resolution characters a method that extracts features from the grayscale image can be used, because richer information can be extracted in this way.
P.2-step.3 Classification and recognition: the classifier of each character recognition subsystem can be chosen to suit the resolution image quality of its characters, and the component that each sub-classification system contributes to the final classification decision is determined by the adaptive classification weight coefficients.
Adaptive classification weight coefficients: in order to select recognition subsystems adaptively according to the quality information of the input character image, adaptive classification weight coefficients are proposed. They determine the weight with which each recognition subsystem takes part in classification according to the resolution image quality information of the input character image. The process is in effect a feed-forward process, in which the feed-forward information is passed on in the form of the adaptive classification weight coefficients. The detailed procedure is as follows:
1. For the input character image, first apply the multi-resolution character image quality discrimination method to obtain the resolution image quality level of the input image;
2. Set the adaptive classification weight coefficients: for image quality discrimination with n levels, each image quality level is assigned its own adaptive weight coefficients $w_i$ for the corresponding sub-classifiers, with $i \le n$. The adaptive classification weight coefficients of the levels can be set as $w_1 = (w_{11}, \ldots, w_{1n})$, ..., $w_j = (w_{j1}, \ldots, w_{jn})$, ..., $w_n = (w_{nn}, \ldots)$ in the same pattern, that is, $w_n = (w_{n1}, \ldots, w_{nn})$, corresponding to the n image quality levels respectively;
3. If the image resolution quality discrimination system judges the quality level of the input character image to be level 1, its adaptive weight coefficients are automatically chosen as $w_1$ (where $w_{11} > w_{1i}$, $i \ne 1$), so that the recognition subsystem trained on training samples of image quality level 1 plays the leading role in the final recognition process. Likewise, if the quality level of the input character image is judged to be level j, its adaptive weight coefficients are automatically chosen as $w_j$ (where $w_{jj} > w_{ji}$, $i \ne j$).
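The weight selection in steps 2 and 3 can be illustrated with a small sketch. The source only requires that the weight of the sub-classifier matching the detected level be the largest; the concrete values below (a dominant weight of 0.6, the remainder shared equally, weights summing to 1) and the function name are assumptions made for the example.

```python
def adaptive_weights(level, n, dominant=0.6):
    """Weight vector w_level for a detected quality level (1-based) out of n levels.

    The matching sub-classifier receives `dominant`; the remaining weight is
    shared equally among the others, so that w[level] is the largest entry
    whenever dominant > 1 / n.
    """
    if not 1 <= level <= n:
        raise ValueError("level must lie between 1 and n")
    if n == 1:
        return [1.0]
    rest = (1.0 - dominant) / (n - 1)
    return [dominant if i == level else rest for i in range(1, n + 1)]
```

For example, with n = 3 quality levels, a detected level of 2 yields weights in which the second sub-classifier dominates and the other two contribute equally.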
A specific embodiment is described as follows:
This embodiment applies the whole method of the present invention to the process of the multi-resolution degraded character recognition device. The character library of the embodiment is the library of printed character images for the 3755 classes of the national standard GB2312-80, so the number of classes is c = 3755. For each character class, images at three resolutions are selected to take part in training the image resolution quality discrimination system, representing the three image quality levels {good, medium, poor}, that is, n = 3. The multi-resolution degraded character recognition method is carried out in two stages. The first is the training stage, in which training samples are used to train the classifiers so that they acquire the ability to classify and recognize. The second is the recognition stage, that is, the process of recognizing an input character image. The concrete steps are as follows:
I. Training stage:
In the multi-resolution degraded character recognition method, the purpose of training is to obtain the required classifiers by learning from training samples. Two kinds of classifier need to be trained on training samples here: the classifier in the multi-resolution character image quality discrimination method, and the classifiers in the multi-resolution character adaptive recognition method. The concrete training process is as follows:
Multi-resolution character image quality discrimination P.1:
In the multi-resolution degraded character image quality discrimination method, character images at three resolutions are selected to take part in training the image resolution quality discrimination system. They represent the three image quality levels {good, medium, poor}, so n = 3, which is also the number of classes the classifier must distinguish. The average sizes of the character images at these three resolutions are {50 × 50, 20 × 20, 12 × 12} pixels respectively. From each class, 1000 character images are chosen at random as training samples, so the number of training samples is 3 × 1000 = 3000 character images.
As shown in Figure 5, be in the multiple distinguishabilitys retrogress character self-adapting recognition methods of the present invention all training samples to be carried out the feature extraction synoptic diagram, concrete training sample Train step is as follows:
P.1-Train.1 pre-service: normalization and grey level histogram adjustment
P.1-Train.1.1 normalization is handled; For training sample I t, we are 64 * 64 pixel-matrixs with image normalization size;
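The size normalization step can be sketched as follows. The source only states the target size of 64 × 64 pixels; the nearest-neighbour sampling used here, the function name `normalize_size`, and the representation of an image as a list of pixel rows are all assumptions made for illustration.

```python
def normalize_size(image, size=64):
    """Resize a gray image (list of equal-length rows) to size x size.

    Assumption: nearest-neighbour sampling; the source does not name
    the interpolation method.
    """
    rows, cols = len(image), len(image[0])
    return [
        [image[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]
```

A 12 × 12 "poor" image and a 50 × 50 "good" image both come out as 64 × 64 matrices, so the later feature extraction always sees the same input shape.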
P.1-Train.1.2 Gray-level histogram adjustment: after normalization, the gray-value range $[S_{\min}, S_{\max}]$ of the normalized image is determined, and each gray value $k$ is mapped by formula (1) to the adjusted gray value $t_k$. This maps the gray-value range from the original $[S_{\min}, S_{\max}]$ to [0, 255] and yields the enhanced image $I_s$.
$$t_k = \frac{255 - 0}{S_{\max} - S_{\min}}\,(k - S_{\min}) \tag{1}$$
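Formula (1) is a linear contrast stretch. A minimal sketch is given below, assuming the image is a list of rows of integer gray values. The rounding to the nearest integer and the handling of a completely flat image (where $S_{\max} = S_{\min}$ and the formula is undefined) are assumptions that the source does not specify.

```python
def stretch_gray(image):
    """Map gray values from [S_min, S_max] onto [0, 255] as in formula (1)."""
    flat = [p for row in image for p in row]
    s_min, s_max = min(flat), max(flat)
    if s_max == s_min:
        # Assumption: a flat image has no contrast to stretch, so map it to 0.
        return [[0 for _ in row] for row in image]
    span = s_max - s_min
    # t_k = (255 - 0) / (S_max - S_min) * (k - S_min), rounded to an integer.
    return [[round(255 * (p - s_min) / span) for p in row] for row in image]
```

For an image whose gray values lie in [100, 200], the value 100 maps to 0, 200 maps to 255, and intermediate values are spread linearly in between.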
P.1-Train.2 Feature extraction: gray distribution feature
After preprocessing, the character image becomes the enhanced image $I_s$. Compute the gray-level histogram of $I_s$: $h(i) = n_i$, $i = 0, 1, \ldots, 255$, where $n_i$ is the number of pixels in $I_s$ that have gray value $i$. $h(i)$ gives an estimate of the probability of occurrence of gray value $i$, so the histogram shows how the gray values are distributed in the image. The gray distribution feature of the image is computed by formula (2):
$$h_d(i) = \begin{cases} 1, & h(i) > T \\ 0, & h(i) \le T \end{cases} \tag{2}$$
In the formula, $h_d$ is the gray distribution feature and $T$ is a preset threshold. When the number of pixels $n_i$ with gray value $i$ in $I_s$ is greater than the threshold $T$, a 1 is recorded; when it is less than or equal to $T$, a 0 is recorded.
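The gray distribution feature of formula (2) can be sketched as a thresholded 256-bin histogram. The function name is hypothetical, and the sketch assumes integer gray values in 0..255 stored as a list of rows; the value of the threshold T is not given in the source and is left as a parameter.

```python
def gray_distribution_feature(image, threshold):
    """Return the 256-element binary vector h_d of formula (2)."""
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1  # h(i) = n_i, the count of pixels with gray value i
    # h_d(i) = 1 if h(i) > T, otherwise 0.
    return [1 if count > threshold else 0 for count in hist]
```

The resulting vector records which gray levels are well populated. The text uses this as the input to the quality classifier.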
P.1-Train.3 Classifier training
Features are extracted from each training sample according to steps P.1-Train.1 and P.1-Train.2, so that every training sample yields a feature vector $hd_i$ and its quality level $cq_i$ ($0 \le cq_i \le n$, where n is the number of classes of the classifier). We train the classifier on these feature vectors and quality levels; the classifier used here is a neural network. As shown in Figure 5, the training input of the classifier is $(hd_1, cq_1), \ldots, (hd_{3000}, cq_{3000})$. This completes the classifier training process of the multi-resolution character image quality discrimination method.
P.2 Multi-resolution degraded character adaptive recognition:
In the multi-resolution degraded character adaptive recognition method, the character set is large, so we adopt two-level classification to improve recognition speed. The first level is a pre-classification whose purpose is to narrow the recognition range to a few pattern classes. The second level uses an adaptive classification method to recognize the character according to the quality of the input character image. In both the first and the second level, each classification process is equivalent to an independent recognition subsystem, which can select suitable features and a suitable classification algorithm according to the image quality. In the example, the training character library is the printed character image library of the 3755 character classes of the national standard GB2312-80, the number of classes is c = 3755, and each character class has images at 15 resolutions. All recognition subsystems of the example use the Gabor feature and a minimum distance classifier. The training steps of each recognition subsystem are as follows:
P.2-Train.1 Preprocessing: normalization and gray-level histogram adjustment, as in training step P.1-Train.1;
P.2-Train.2 Feature extraction: the Gabor principal-direction feature is adopted as the character recognition feature [Peifeng Hu, Yannan Zhao, Zehong Yang, Jiaqin Wang. Recognition of gray character using Gabor filters. Proceedings of FUSION 2002, 2002];
P.2-Train.3 Classifier: a minimum distance classifier is adopted here. Pattern recognition by template matching based on a distance metric can be described as follows: the goal is to assign the sample to be recognized, $x = [x_1, x_2, \ldots, x_d]^T$, to one of c possible classes $\omega_i$ ($i = 1, \ldots, c$), where the learned template vectors of the classes are $T = (T_1, \ldots, T_i, \ldots, T_c)$, $i = 1, \ldots, c$. The classification decision rule based on the distance metric is:
$$\text{if } \mathrm{dis}(x, T_i) = \min_{k = 1, \ldots, c} \mathrm{dis}(x, T_k), \text{ then } x \to \omega_i \tag{3}$$
Here $x \to \omega_i$ means that sample $x$ belongs to class $\omega_i$, and $\mathrm{dis}(x, T_i)$ is the distance from sample $x$ to template $T_i$; the class whose template minimizes this distance is selected.
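The decision rule of formula (3) can be sketched in a few lines. The source does not name the distance dis; Euclidean distance is assumed here, and the function name `classify_min_distance` is hypothetical.

```python
import math


def classify_min_distance(x, templates):
    """Return the 0-based index i of the template T_i nearest to x.

    Assumption: dis(x, T_i) is the Euclidean distance.
    """
    return min(range(len(templates)), key=lambda k: math.dist(x, templates[k]))
```

With templates `[[0, 0], [10, 10], [0, 10]]` and sample `[1, 9]`, the third template is nearest, so the function returns index 2.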
Features are extracted from the training samples of all image resolution quality levels (15 resolutions per character class), giving the classifier training input $(f_1, c_1), \ldots, (f_{3755 \times 15}, c_{3755 \times 15})$. Training yields a template T for every Chinese character class; T is the template learned from training samples covering all image resolution quality levels. Features are extracted from the training samples of the "good" image quality level (5 resolutions per character class, with a central-resolution character size of 50 × 50 pixels), giving the training input $(f_1, c_1), \ldots, (f_{3755 \times 5}, c_{3755 \times 5})$; training yields a template $T_1$ for every Chinese character class, i.e. the template learned from training samples whose character image quality is "good". Features are extracted from the training samples of the "medium" image quality level (5 resolutions per character class, with a central-resolution character size of 20 × 20 pixels), giving the training input $(f_1, c_1), \ldots, (f_{3755 \times 5}, c_{3755 \times 5})$; training yields a template $T_2$ for every Chinese character class, i.e. the template learned from training samples whose character image quality is "medium". Features are extracted from the training samples of the "poor" image quality level (5 resolutions per character class, with a central-resolution character size of 12 × 12 pixels), giving the training input $(f_1, c_1), \ldots, (f_{3755 \times 5}, c_{3755 \times 5})$; training yields a template $T_3$ for every Chinese character class, i.e. the template learned from training samples whose character image quality is "poor". The adaptive classification weight coefficients selected here are $w_1 = (1, 0, 0)$, $w_2 = (0, 1, 0)$, $w_3 = (0, 0, 1)$, corresponding to the three levels {good, medium, poor} respectively.
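The source does not state how a template is computed from the training samples. One common choice for a minimum distance classifier, assumed here purely for illustration, is the per-class mean feature vector. The function name `learn_templates` and the `(feature_vector, class_label)` sample layout are likewise hypothetical.

```python
def learn_templates(samples):
    """Return {class_label: mean feature vector} for (f, c) training pairs.

    Assumption: a class template is the mean of that class's feature vectors.
    """
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}
```

Under this assumption, T would come from calling `learn_templates` on all 3755 × 15 samples, and each of $T_1$, $T_2$, $T_3$ from the 3755 × 5 samples of one quality level.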
II. Recognition stage:
The input image is a gray-level image I. As shown in Figure 6, the schematic diagram of the character recognition process in the multi-resolution degraded character adaptive recognition method of the present invention, the specific recognition (Rec) steps are as follows:
P.1 Character image resolution quality discrimination:
P.1-Rec.1 Preprocessing: normalization and gray-level histogram adjustment, as in training step P.1-Train.1;
P.1-Rec.2 Feature extraction: the gray distribution feature, as in training step P.1-Train.2; compute the feature vector f;
P.1-Rec.3 Classification: the feature vector f is fed to the classifier, which computes the classification result. As shown in Figure 4, the resolution quality level of the input character image I is judged to be "poor", so the adaptive classification weight coefficient is $w_3 = (0, 0, 1)$.
P.2 Multi-resolution degraded character adaptive recognition:
P.2-Rec.1 First-level classification:
P.2-Rec.1.1 Preprocessing: normalization and gray-level histogram adjustment, as in training step P.1-Train.1;
P.2-Rec.1.2 Feature extraction: as in training step P.2-Train.2; compute the feature vector f;
P.2-Rec.1.3 Classification: the feature vector f is fed to the classifier, which computes the classification result and outputs 10 candidate characters. These take part in the second-level classification, i.e. in the second classification pass the input character is matched only against these 10 characters;
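The first-level pre-classification can be sketched as a nearest-template search that keeps the best m candidates instead of a single winner. Euclidean distance, the function name `top_candidates`, and the dictionary layout of the templates are assumptions for illustration; the embodiment uses m = 10.

```python
import heapq
import math


def top_candidates(x, templates, m=10):
    """Return the labels of the m templates nearest to x, nearest first.

    templates maps a character label to its template vector.
    Assumption: distance is Euclidean.
    """
    return heapq.nsmallest(
        m, templates, key=lambda label: math.dist(x, templates[label])
    )
```

Only the returned labels are passed on, so the second level compares the sample against m templates per quality level rather than all c = 3755.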
P.2-Rec.2 Second-level classification:
P.2-Rec.2.1 As in the first-level classification, each sub-classifier carries out the three steps of preprocessing, feature extraction and classification to obtain its recognition result, as shown in Figure 6;
P.2-Rec.2.2 From the result of step P.1-Rec.3, the adaptive classification weight coefficient of the input character is $w_3 = (0, 0, 1)$, so sub-classifier n plays the leading role in the final recognition result. From the first-level classifier, the top m = 10 characters are taken as the character candidate set. The second-level classifier considers only the templates of these m characters, so the number of character classes shrinks from the original c to m. The recognition problem of the second-level classifier is to assign the sample to be recognized, $x = [x_1, x_2, \ldots, x_d]^T$, to one of m possible classes $\omega_j$ ($j = 1, \ldots, m$), where the learned template vectors are $T_i = (T_{i1}, \ldots, T_{ij}, \ldots, T_{im})$, $i = 1, \ldots, s$, $j = 1, \ldots, m$. The classification decision rule based on the distance metric is:
$$\text{if } \sum_{i=1}^{3} w_i\,\mathrm{dis}(x, T_{ij}) = \min_{k = 1, \ldots, m} \sum_{i=1}^{3} w_i\,\mathrm{dis}(x, T_{ik}), \text{ then } x \to \omega_j$$
Here $x \to \omega_j$ means that sample $x$ belongs to class $\omega_j$; $\mathrm{dis}(x, T_{ij})$ is the distance from sample $x$ to $T_{ij}$; $T_{ij}$ is the template of character class j at image quality level i; $w_i$ is the adaptive classification weight coefficient applied to the level-i templates; and the class that minimizes the weighted distance $\sum_{i} w_i\,\mathrm{dis}(x, T_{ik})$ is selected. Comparing the sample with the templates trained from the samples of each image quality level is in each case equivalent to one classifier's classification process; the adaptive classification weight coefficients only determine the share that each classifier contributes to the final classification decision.
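The second-level decision can be sketched as a weighted sum of per-level distances over the candidate set. This is an illustration under stated assumptions, not the patented implementation: Euclidean distance is assumed, and the function name and the data layout (one label-to-template dictionary per quality level) are hypothetical.

```python
import math


def classify_second_level(x, level_templates, weights, candidates):
    """Pick the candidate label with the smallest weighted template distance.

    level_templates[i][label] is the template of class `label` at quality
    level i; weights[i] is the adaptive weight w_i of that level.
    Assumption: distance is Euclidean.
    """
    def weighted_distance(label):
        return sum(
            w * math.dist(x, templates[label])
            for w, templates in zip(weights, level_templates)
            if w != 0  # levels with zero weight do not take part
        )

    return min(candidates, key=weighted_distance)
```

With one-hot weights such as (0, 0, 1), the rule reduces to a minimum distance classifier over the templates of the selected quality level only.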

Claims (10)

1. A multi-resolution degraded character adaptive recognition system, characterized in that it comprises:
A multi-resolution image quality discrimination device, which extracts from the image information a feature vector reflecting the image resolution quality, evaluates the resolution quality of a character image by extracting its gray distribution feature, and is used to discriminate the image quality of characters of different resolutions;
A multi-resolution degraded character recognition device, which adopts n groups of sub-classification systems and decides the weight with which each recognition subsystem participates in classification according to the resolution quality information of the input character image; this process is in effect a feed-forward process, in which the feed-forward information is passed on in the form of adaptive classification weight coefficients, so that multi-resolution degraded characters are recognized adaptively.
2. The degraded character adaptive recognition system according to claim 1, characterized in that the multi-resolution image quality discrimination device comprises:
A preprocessing unit, used to convert the character image information into a unified image format and to apply contrast enhancement to the character image;
A feature extraction unit, used to extract from the character image information a feature vector of lower dimension that reflects the image resolution quality;
A classification unit, used to classify the character image resolution quality into n classes and determine the resolution quality level of the character image.
3. The degraded character adaptive recognition system according to claim 2, characterized in that the feature extraction unit takes the enhanced image $I_s$ obtained from the character image after preprocessing and computes the gray-level histogram of $I_s$: $h(i) = n_i$, $i = 0, 1, \ldots, 255$, where $n_i$ is the number of pixels in the character image $I_s$ that have gray value $i$; $h(i)$ gives an estimate of the probability of occurrence of gray value $i$, so the histogram shows how the gray values are distributed in the image.
4. The degraded character adaptive recognition system according to claim 1, characterized in that the multi-resolution degraded character recognition device comprises:
A first-level classifier, used to narrow the recognition range of the input character image, constituting a character image classifier for the required pattern classes;
A second-level classifier, used to adaptively select the sub-classification systems for recognition according to the character image of the said pattern classes and the image resolution quality discrimination result; the second-level classifier adopts n groups of sub-classification systems.
5. The degraded character adaptive recognition system according to claim 4, characterized in that, in the first-level classifier and the second-level classifier, each classification process is configured as an independent recognition subsystem.
6. A multi-resolution degraded character adaptive recognition method, characterized in that the steps are as follows:
According to the image quality of the character images of different resolutions in the character library, discriminating the image quality of the input character image;
and
According to the image quality of the character images of different resolutions in the character library, building the corresponding character recognition subsystems, adaptively selecting the sub-classification systems according to the multi-resolution character image quality discrimination data, and adaptively recognizing the multi-resolution degraded characters;
The image quality discrimination further comprises:
Step P.1-step.1: preprocessing the character image, comprising:
A normalization process, which converts the character image into a unified image format for character image feature extraction;
A gray-level adjustment process, which applies contrast enhancement to the character image in order to normalize the gray values, converting the uneven gray distribution of the original character image into a unified distribution and yielding an enhanced character image;
Step P.1-step.2: on the basis of the gray-level image, extracting the gray distribution feature from character images of different resolutions, which is used to assess the quality of the degraded character image and generate quality level information;
Step P.1-step.3: using a statistical decision method or a syntactic analysis method to assign the degraded character image being recognized to a class in the character image feature space.
7. The degraded character adaptive recognition method according to claim 6, characterized in that the classification of step P.1-step.3 uses a neural network classifier to recognize the character image quality level information and divide it into n classes, determining the resolution quality level of the character image.
8. The degraded character adaptive recognition method according to claim 5, characterized in that the multi-resolution degraded character recognition further comprises:
A first-level classification, which is a pre-classification that narrows the recognition range of the degraded character image to a few pattern classes; and
A second-level classification, which adopts adaptive classification to recognize the character according to the pre-classification result and the image resolution quality discrimination result;
The classifier of each character recognition subsystem can be a classifier suited to the image resolution quality of the character, and the share that each sub-classification system contributes to the final classification decision is determined by the adaptive classification weight coefficients.
9. The degraded character adaptive recognition method according to claim 8, characterized in that the specific steps of the first-level classification further comprise:
P.2-Rec.1.1 Preprocessing: applying normalization and gray-level histogram adjustment to the original character image;
P.2-Rec.1.2 Feature extraction: the character recognition subsystem of each resolution adopts features suited to the image resolution quality of the character; for high-resolution character images, a method that extracts features from the binary image is adopted, since a good binarized image can be obtained from a high-resolution image; for low-resolution characters, a method that extracts features from the gray-level image is adopted, in order to extract richer information; the feature vector f is computed;
P.2-Rec.1.3 Classification: the feature vector f is fed to the classifier, which computes the classification result and outputs m candidate characters that take part in the second-level classification.
10. The multi-resolution degraded character adaptive recognition method according to claims 8 and 9, characterized in that the specific steps of the second-level classification further comprise:
P.2-Rec.2.1 As in the first-level classification, each sub-classifier carries out the three steps of preprocessing, feature extraction and classification to obtain its recognition result;
P.2-Rec.2.2 Using the adaptive classification weight coefficient of the input character obtained from the image resolution quality discrimination, so that sub-classifier n plays the leading role in the final recognition result; taking the top m characters from the first-level classifier as the character candidate set, so that the second-level classifier holds only the templates of these m characters and the number of character classes shrinks from the original c to m; the recognition of the second-level classifier assigns the sample to be recognized to one of the m possible classes, using the learned template vectors of the classes; under the classification decision rule based on the distance metric, the sample is compared with the templates trained from the training samples of each image quality level, each such comparison completing one classifier's classification process, and the share that each classifier contributes to the final classification decision is determined by the adaptive classification weight coefficients.
CNB2006101128864A 2006-09-06 2006-09-06 Multiple distinguishabilitys retrogress character self-adapting recognition system and method Expired - Fee Related CN100535931C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006101128864A CN100535931C (en) 2006-09-06 2006-09-06 Multiple distinguishabilitys retrogress character self-adapting recognition system and method

Publications (2)

Publication Number Publication Date
CN101140625A CN101140625A (en) 2008-03-12
CN100535931C true CN100535931C (en) 2009-09-02

Family

ID=39192572

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006101128864A Expired - Fee Related CN100535931C (en) 2006-09-06 2006-09-06 Multiple distinguishabilitys retrogress character self-adapting recognition system and method

Country Status (1)

Country Link
CN (1) CN100535931C (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1137657A (en) * 1995-03-08 1996-12-11 佳能株式会社 Image processing method and in image processing apparatus
CN1221927A (en) * 1997-12-19 1999-07-07 松下电器产业株式会社 Character recognizor and its method, and recording medium for computer reading out
US6038343A (en) * 1996-02-06 2000-03-14 Hewlett-Parkard Company Character recognition method and apparatus using writer-specific reference vectors generated during character-recognition processing
CN1607542A (en) * 2003-08-25 2005-04-20 佳能株式会社 Image processing apparatus, image processing method, program and storage medium
CN1920855A (en) * 2005-08-26 2007-02-28 富士通株式会社 Character identification apparatus and method for literal line regression

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090902

Termination date: 20170906