CN102262731A - Character recognizing method based on sparse coding - Google Patents
- Publication number: CN102262731A
- Application number: CN201110192198A
- Authority: CN (China)
- Prior art keywords: font, square, basis function, image, method based
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention provides a font recognition method based on sparse coding. The method takes grayscale images as input and operates in two phases. In the training phase, the training images of each font class are randomly divided into a number of square blocks; the number of blocks is jointly determined by the image size and the block size. For example, a 512×512 grayscale image can be divided into 4096 blocks of size 8×8. For each font class, the blocks are used as input to train, by independent component analysis, a set of basis functions that can sparsely represent any block, and these basis functions serve as the model of that font. The method can recognize Chinese fonts, fonts of other languages, and fonts across different languages. It can be applied to automatic document analysis, art design and the like.
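The block count in the example follows from the two sizes: a 512×512 image holds (512/8)² = 4096 non-overlapping 8×8 blocks. A minimal NumPy sketch of such a tiling is given here. The function name `tile_blocks` is hypothetical, and the regular non-overlapping grid is an assumption made only to reproduce the arithmetic; the method itself divides images randomly.

```python
import numpy as np

def tile_blocks(image, d):
    """Split an H x W gray image into non-overlapping d x d blocks."""
    h, w = image.shape
    rows, cols = h // d, w // d
    trimmed = image[:rows * d, :cols * d]          # drop any incomplete border
    # Axes become (block row, pixel row, block column, pixel column);
    # swapping the middle two groups the pixels of each block together.
    blocks = trimmed.reshape(rows, d, cols, d).swapaxes(1, 2)
    return blocks.reshape(rows * cols, d, d)

image = np.zeros((512, 512), dtype=np.uint8)       # placeholder gray image
blocks = tile_blocks(image, 8)
print(blocks.shape)
```

With other image or block sizes the count is simply `(H // d) * (W // d)`.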
Description
Technical field
The present invention relates to a font recognition method based on sparse coding and belongs to the technical field of image processing and pattern recognition.
Background technology
With the rise of networked offices and the spread of digital library services, people rely increasingly on electronic documents to obtain and exchange information. How to convert information traditionally recorded in books into electronic documents has therefore become a basic problem that automatic document analysis must solve. Font recognition identifies the type of font used in a text image and is one of the important research topics in automatic document analysis and processing. Over the past 20-odd years, OCR (optical character recognition) has developed rapidly. Printed character recognition is now largely mature, and its recognition rate meets the requirements of commercial use. However, existing OCR systems mainly address "reading characters", that is, recognizing the textual content of an image, while font recognition is still far from practical. This remains an open problem for recovering the layout information of the text in an image and for subsequent re-editing of the recognized content. Moreover, character recognition on a single font is more accurate than character recognition across multiple fonts. If the font of a document image can be identified accurately, multi-font character recognition can be reduced to single-font character recognition, which improves the accuracy of character recognition on document images.
Research on and applications of font recognition are significant, but the topic has not yet attracted enough attention, and relatively little work has been done on it. Existing techniques mainly include: 1) font recognition using local features of the font, such as serifs and stroke weight; 2) font recognition based on local or global layout features; 3) font recognition based on texture analysis. These methods work by extracting image features and achieve a certain recognition performance. However, because the differences between some fonts are not pronounced at the level of image features, recognition methods that rely on image features cannot correctly distinguish similar fonts.
Summary of the invention
The object of the invention is to solve the problem that, because the differences between some fonts are not pronounced at the level of image features, font recognition methods that rely on image features cannot correctly distinguish similar fonts. To this end, a font recognition method based on sparse coding is provided.
The object of the invention is achieved through the following technical solution:
Grayscale images are taken as input and the method operates in two stages. In the training stage, the training images of each font class are divided into sub-blocks, and a set of basis functions that can sparsely and linearly represent each sub-block is trained with an independent component analysis (ICA) model and used as the model of that font class. In the test stage, the image to be tested is divided into sub-blocks, and the coefficients with which each sub-block is linearly represented under the basis function model of each font class are computed. The sparsity of the coefficients is judged by the kurtosis of their distribution, and the font in the image to be tested is recognized as the font class that yields the sparsest representation.
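To illustrate why kurtosis can serve as a sparsity measure, the following sketch compares a sparse (Laplace-distributed) coefficient sample with a Gaussian one. The helper name `excess_kurtosis`, the excess-kurtosis convention (fourth standardized moment minus 3), and the synthetic distributions are assumptions for illustration and are not taken from the patent.

```python
import numpy as np

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (close to zero for Gaussian data)."""
    centered = values - values.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

rng = np.random.default_rng(0)
sparse_coeffs = rng.laplace(size=100_000)   # sharply peaked at zero, heavy tails
dense_coeffs = rng.normal(size=100_000)     # Gaussian reference

k_sparse = excess_kurtosis(sparse_coeffs)
k_dense = excess_kurtosis(dense_coeffs)
print(k_sparse, k_dense)
```

A coefficient distribution in which most values are near zero and a few are large has a high kurtosis, which is why a larger kurtosis is read as a sparser representation.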
The invention approaches the problem from the perspective of sparse coding. By imitating the efficient coding of visual information in the human visual system, it takes the raw grayscale values as input and trains, for each font, a set of basis functions that can effectively represent images of that font. The responses (coefficients) of the trained basis functions to the input are sparse, which results in an efficient code. In the test stage, the font class of a test document image is identified by checking whether the response of the test image under the basis functions of each font is sparse.
The invention can distinguish different fonts of the same language as well as fonts across different languages. It can be applied to automatic document analysis (automatically recovering the text and typesetting format of a recognized document image) and to art design (recognizing artistic fonts of interest).
Embodiment
The specific steps of the sparse-coding-based font recognition method of this patent are:
Step 1: For each font class in the training library, collect a number of grayscale images as training images, for example 10 images. Randomly divide each image into a number of square blocks; for example, a 512 × 512 grayscale image can be divided into 1000 blocks. Let the block size be d × d.
Step 2: For the k-th font class, convert every block obtained by dividing the training images into a column vector of dimension B = d × d. Taking the column vectors of all blocks of all training images as training samples, use independent component analysis (ICA) to train a basis function matrix

$$A^k = [a_1^k, \ldots, a_j^k, \ldots, a_B^k]$$

and the corresponding filter matrix

$$W^k = [w_1^k, \ldots, w_j^k, \ldots, w_B^k]^T,$$

where $a_j^k$ denotes the j-th basis function of the k-th font class, $w_j^k$ denotes the j-th filter of the k-th font class, and T denotes matrix transposition;
Step 3: Using a strategy similar to that of the training stage, divide the document image to be tested into N blocks of size d × d. Express each block as a column vector of dimension B = d × d; the N blocks together form the matrix $X = \{x_1, \ldots, x_i, \ldots, x_N\}$, where $x_i$ is the column vector of the i-th block;
Step 4: Compute the coefficient of the j-th basis function when the i-th sub-block $x_i$ is linearly represented by all basis functions of the k-th font class:

$$s_{ij}^k = (w_j^k)^T x_i \qquad (1)$$

where $w_j^k$ is the j-th filter of the k-th font class and T denotes matrix transposition.
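Step 5: Compute the kurtosis of the distribution of the coefficient of the j-th basis function over all blocks:

$$\mathrm{kurt}_j^k = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(s_{ij}^k - \bar{s}_j^k\right)^4}{\left(\frac{1}{N}\sum_{i=1}^{N}\left(s_{ij}^k - \bar{s}_j^k\right)^2\right)^2} - 3 \qquad (2)$$

where $s_{ij}^k$ is the coefficient of the j-th basis function of the k-th font class for the i-th block and $\bar{s}_j^k = \frac{1}{N}\sum_{i=1}^{N} s_{ij}^k$ is its mean over the N blocks.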
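Step 6: Compute the mean of the coefficient-distribution kurtosis over all B basis functions of the k-th font class:

$$\overline{\mathrm{kurt}}^k = \frac{1}{B}\sum_{j=1}^{B} \mathrm{kurt}_j^k \qquad (3)$$

where $\mathrm{kurt}_j^k$ is the kurtosis of the coefficients of the j-th basis function obtained in Step 5.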
Step 7: Find the maximum of the mean kurtosis over all font classes, and recognize the font of the document image to be tested as the font class $k^*$ that attains this maximum:

$$k^* = \arg\max_k \overline{\mathrm{kurt}}^k \qquad (4)$$

where $\overline{\mathrm{kurt}}^k$ is the mean kurtosis of the k-th font class obtained in Step 6.
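Steps 3 to 7 can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the coefficient of a block is taken to be its inner product with each filter, the kurtosis uses the excess-kurtosis convention, the function names `kurtosis_per_filter` and `classify_font` are hypothetical, and the two "font models" are random orthogonal matrices standing in for trained ICA filters.

```python
import numpy as np

def kurtosis_per_filter(S):
    """Excess kurtosis of each row of S, a (B, N) coefficient matrix."""
    centered = S - S.mean(axis=1, keepdims=True)
    variance = np.mean(centered ** 2, axis=1)
    return np.mean(centered ** 4, axis=1) / variance ** 2 - 3.0

def classify_font(X, filters):
    """X: (B, N) matrix whose columns are vectorized test blocks.
    filters: list of (B, B) filter matrices, one per font class.
    Returns (k_star, scores); scores[k] is the mean kurtosis for class k."""
    scores = []
    for W in filters:
        S = W @ X                                     # Step 4: coefficients of every block
        scores.append(kurtosis_per_filter(S).mean())  # Steps 5 and 6
    scores = np.array(scores)
    return int(np.argmax(scores)), scores             # Step 7

# Toy data (assumption): blocks generated from sparse sources mixed by an orthogonal basis.
rng = np.random.default_rng(0)
B, N = 16, 5000
basis_0, _ = np.linalg.qr(rng.normal(size=(B, B)))    # stand-in model of class 0
basis_1, _ = np.linalg.qr(rng.normal(size=(B, B)))    # stand-in model of class 1
X = basis_0 @ rng.laplace(size=(B, N))                # test blocks drawn from class 0

filters = [basis_0.T, basis_1.T]                      # inverse of an orthogonal matrix is its transpose
k_star, scores = classify_font(X, filters)
print(k_star, scores)
```

Only the matching filters undo the mixing and recover the sparse sources; the other class's filters produce mixtures of many sources, whose distribution is closer to Gaussian and therefore has lower kurtosis.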
Unlike traditional font recognition methods based on image features, the invention takes grayscale images as input and comprises two stages. In the training stage, the training images of each font class are divided into square blocks, and all blocks are used to train, with an independent component analysis model, a set of basis functions that can sparsely and linearly represent each block; these basis functions serve as the model of that font class. In the recognition stage, the image to be tested is divided into blocks in the same way, and the coefficients with which each block is linearly represented under the basis function model of each font class are computed. The sparsity of the coefficients is judged by the kurtosis of their distribution, and the font in the document image to be tested is recognized as the font class that yields the sparsest representation.
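The training stage can be sketched with a compact symmetric FastICA written in plain NumPy. The patent names only "independent component analysis" and does not say which ICA algorithm is used, so FastICA with a tanh nonlinearity is an assumption, as are the function names `sym_decorrelate` and `train_ica` and the synthetic training data.

```python
import numpy as np

def sym_decorrelate(U):
    """Make the rows of U orthonormal: U <- (U U^T)^(-1/2) U."""
    s, E = np.linalg.eigh(U @ U.T)
    return (E / np.sqrt(s)) @ E.T @ U

def train_ica(X, n_iter=200, tol=1e-6, seed=0):
    """X: (B, M) matrix whose columns are vectorized training blocks.
    Returns (A, W): basis matrix (columns are basis functions) and filter
    matrix, so that W @ (x - mean) gives the coefficients of a block x."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=1, keepdims=True)
    # Whiten the data with the inverse square root of its covariance.
    eigvals, eigvecs = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    eigvals = np.maximum(eigvals, 1e-12)              # guard against a singular covariance
    V = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    Z = V @ Xc
    B, M = Z.shape
    U = sym_decorrelate(rng.normal(size=(B, B)))
    for _ in range(n_iter):
        G = np.tanh(U @ Z)
        # FastICA fixed-point update: E[z g(u^T z)] - E[g'(u^T z)] u for each row u.
        U_new = G @ Z.T / M - (1.0 - G ** 2).mean(axis=1)[:, None] * U
        U_new = sym_decorrelate(U_new)
        converged = np.max(np.abs(np.abs(np.sum(U_new * U, axis=1)) - 1.0)) < tol
        U = U_new
        if converged:
            break
    W = U @ V                                         # filters in the original pixel space
    A = np.linalg.inv(W)                              # basis functions are the columns of A
    return A, W

# Synthetic training data (assumption): sparse sources mixed by a random matrix.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(4, 4)) @ rng.laplace(size=(4, 20000))
A, W = train_ica(X_train)
coeffs = W @ (X_train - X_train.mean(axis=1, keepdims=True))
print(A.shape, W.shape)
```

In practice one such `(A, W)` pair would be trained and stored per font class, with real image blocks of dimension B = d × d in place of the synthetic data.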
These steps are then implemented as a computer program; any programming language may be used.
1. For each font class, read the training pictures; any 24-bit color picture is converted to an 8-bit grayscale picture. Randomly select a set of sub-blocks of a given size, for example d × d, from the converted pictures. Using independent component analysis with the extracted sub-blocks as training samples, train a set of basis functions and the corresponding filters. For each font class, store the trained basis functions and filters as model parameters;
2. Read the image to be tested; a 24-bit color image is converted to an 8-bit grayscale image. Randomly select from the converted image a set of sub-blocks of the same size as in the training stage, and compute by formula (1) the coefficients with which each sub-block is linearly represented by the basis functions of each font class;
3. According to formula (2), compute the kurtosis of each basis function coefficient of every font class;
4. According to formula (3), compute the mean of the kurtosis of all basis function coefficients of every font class;
5. According to formula (4), find the maximum of the mean kurtosis over all font classes, and recognize the font of the test document image as the font class $k^*$ corresponding to that maximum.
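The image preparation in items 1 and 2 (grayscale conversion and random sub-block selection) can be sketched as follows. The function names `to_gray8` and `sample_blocks` are hypothetical, and the ITU-R BT.601 luma weights are an assumption, since the text does not say how color pictures are converted.

```python
import numpy as np

def to_gray8(rgb):
    """Convert an H x W x 3 uint8 (24-bit) image to an H x W uint8 (8-bit) gray image.
    The BT.601 weights below are an assumption; the source names no formula."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def sample_blocks(gray, d, count, rng):
    """Pick `count` random d x d blocks (overlap allowed) and return them
    as the columns of a (d*d, count) float matrix."""
    h, w = gray.shape
    rows = rng.integers(0, h - d + 1, size=count)   # top-left corners, upper bound exclusive
    cols = rng.integers(0, w - d + 1, size=count)
    blocks = [gray[r:r + d, c:c + d].reshape(-1) for r, c in zip(rows, cols)]
    return np.stack(blocks, axis=1).astype(np.float64)

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)   # placeholder picture
gray = to_gray8(rgb)
X = sample_blocks(gray, d=8, count=1000, rng=rng)
print(gray.shape, X.shape)
```

The same `sample_blocks` call with the same `d` would be used for both training pictures and the image to be tested, so that the block vectors match the dimension B = d × d of the stored filters.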
The above is only a preferred embodiment of the invention. Such embodiments are all different implementations under the general concept of the invention, and the scope of protection of the invention is not limited to them. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be defined by the claims.
Claims (7)
1. A font recognition method based on sparse coding, characterized in that grayscale images are taken as input and the operations of the following two stages are carried out: in the training stage, the training images of each font class are divided into square blocks, and a set of basis functions, trained with an independent component analysis model and able to sparsely and linearly represent each block, is used as the model of that font class; in the test stage, the image to be tested is divided into square blocks, the coefficients with which each block is linearly represented under the basis function model of each font class are computed, the sparsity of the coefficients is judged by the kurtosis of their distribution, and the font in the image to be tested is recognized as the font class that produces a sparse representation.
2. The font recognition method based on sparse coding according to claim 1, characterized in that, in the training stage, a number of training images are collected for each font class in the training library; each image is randomly divided into a number of square blocks of size d × d, and blocks are allowed to overlap; for the k-th font class, all of the resulting blocks are used to train, by independent component analysis, a set of basis functions

$$A^k = [a_1^k, \ldots, a_j^k, \ldots, a_B^k]$$

and the corresponding filters

$$W^k = [w_1^k, \ldots, w_j^k, \ldots, w_B^k]^T,$$

where B = d × d is the number of basis functions.
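3. The font recognition method based on sparse coding according to claim 2, characterized in that, in the test stage, the test image is divided into blocks of the same size, $X = \{x_1, \ldots, x_i, \ldots, x_N\}$, where $x_i$ is the column vector of the i-th block, and the following steps are carried out:
Compute the coefficient of the j-th basis function when the i-th sub-block $x_i$ is linearly represented by all basis functions of the k-th font class:

$$s_{ij}^k = (w_j^k)^T x_i \qquad (1)$$

where $w_j^k$ is the j-th filter of the k-th font class and T denotes matrix transposition. Compute the kurtosis of the distribution of the coefficient of the j-th basis function over all blocks:

$$\mathrm{kurt}_j^k = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(s_{ij}^k - \bar{s}_j^k\right)^4}{\left(\frac{1}{N}\sum_{i=1}^{N}\left(s_{ij}^k - \bar{s}_j^k\right)^2\right)^2} - 3 \qquad (2)$$

where $\bar{s}_j^k = \frac{1}{N}\sum_{i=1}^{N} s_{ij}^k$. Compute the mean kurtosis over all B basis functions of the k-th font class:

$$\overline{\mathrm{kurt}}^k = \frac{1}{B}\sum_{j=1}^{B} \mathrm{kurt}_j^k \qquad (3)$$

The font of the text contained in the test image can then be identified as font class $k^*$:

$$k^* = \arg\max_k \overline{\mathrm{kurt}}^k \qquad (4)$$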
5. The font recognition method based on sparse coding according to claim 3, characterized in that the kurtosis of each basis function coefficient of every font class is computed according to formula (2).
6. The font recognition method based on sparse coding according to claim 3, characterized in that the mean of the kurtosis of all basis function coefficients of every font class is computed according to formula (3).
7. The font recognition method based on sparse coding according to claim 3, characterized in that, according to formula (4), the maximum of the mean kurtosis over all font classes is computed, and the font of the test document image is recognized as the font class corresponding to that maximum.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110192198A CN102262731B (en) | 2011-07-11 | 2011-07-11 | Character recognizing method based on sparse coding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110192198A CN102262731B (en) | 2011-07-11 | 2011-07-11 | Character recognizing method based on sparse coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102262731A true CN102262731A (en) | 2011-11-30 |
CN102262731B CN102262731B (en) | 2012-10-10 |
Family
ID=45009352
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201110192198A Expired - Fee Related CN102262731B (en) | 2011-07-11 | 2011-07-11 | Character recognizing method based on sparse coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102262731B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH1185905A (en) * | 1997-07-15 | 1999-03-30 | Ricoh Co Ltd | Device and method for discriminating font and information recording medium |
CN1664820A (en) * | 2005-04-21 | 2005-09-07 | 哈尔滨工业大学 | Image hierarchy classification method |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102930573A (en) * | 2012-11-02 | 2013-02-13 | 北京工业大学 | Image reconstruction method based on two-dimensional analysis sparse model and training dictionaries of two-dimensional analysis sparse model |
CN103870791A (en) * | 2012-12-10 | 2014-06-18 | 山东财经大学 | Method for automatically detecting inside and outside of asymmetric patterned tire |
CN104318269A (en) * | 2014-11-19 | 2015-01-28 | 四川大学 | Authentic work identification method based on subspace learning and sparse coding |
CN106156794A (en) * | 2016-07-01 | 2016-11-23 | 北京旷视科技有限公司 | Character recognition method based on writing style identification and device |
CN106156794B (en) * | 2016-07-01 | 2020-12-25 | 北京旷视科技有限公司 | Character recognition method and device based on character style recognition |
CN111339803A (en) * | 2018-12-19 | 2020-06-26 | 北大方正集团有限公司 | Font identification method, apparatus, device and computer readable storage medium |
CN111339803B (en) * | 2018-12-19 | 2023-10-24 | 新方正控股发展有限责任公司 | Font identification method, apparatus, device and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN102262731B (en) | 2012-10-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109657221B (en) | Document paragraph sorting method, sorting device, electronic equipment and storage medium | |
US10896357B1 (en) | Automatic key/value pair extraction from document images using deep learning | |
CN106156766A (en) | The generation method and device of line of text grader | |
CN106599940B (en) | Picture character recognition method and device | |
US20120099792A1 (en) | Adaptive optical character recognition on a document with distorted characters | |
CN102262731B (en) | Character recognizing method based on sparse coding | |
CN105139041A (en) | Method and device for recognizing languages based on image | |
CN102193946A (en) | Method and system for adding tags into media file | |
CN109784330B (en) | Signboard content identification method, device and equipment | |
CN110135530A (en) | Convert method and system, computer equipment and the medium of Chinese character style in image | |
US9159147B2 (en) | Method and apparatus for personalized handwriting avatar | |
CN105117740A (en) | Font identification method and device | |
Lehal et al. | Recognition of nastalique urdu ligatures | |
Chtourou et al. | ALTID: Arabic/Latin text images database for recognition research | |
CN103186777B (en) | Based on the human body detecting method of Non-negative Matrix Factorization | |
Alyafeai et al. | Calliar: an online handwritten dataset for Arabic calligraphy | |
CN109086327B (en) | Method and device for rapidly generating webpage visual structure graph | |
Rimas et al. | Optical character recognition for Sinhala language | |
CN113936186A (en) | Content identification method and device, electronic equipment and readable storage medium | |
CN113657279A (en) | Bill image layout analysis method and device | |
CN113516041A (en) | Tibetan ancient book document image layout segmentation and identification method and system | |
KR20160099127A (en) | Method and apparatus for selecting feature used to classify multi-label | |
CN112733735B (en) | Method for classifying and identifying drawing layout by adopting machine learning | |
Odunayo et al. | Rescuing historical climate observations to support hydrological research: a case study of solar radiation data | |
Gamati et al. | Arabic Language Character Recognition using Walsh-Hadamard Transform (WHT) vs. Discrete Fourier Transform (DFT) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20121010 Termination date: 20130711 |