CN102262731A - Character recognizing method based on sparse coding - Google Patents

Publication number
CN102262731A
Authority
CN
China
Prior art keywords
font
square
basis function
image
method based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 201110192198
Other languages
Chinese (zh)
Other versions
CN102262731B (en)
Inventor
姚鸿勋
张盛平
孙鑫
卢修生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN201110192198A priority Critical patent/CN102262731B/en
Publication of CN102262731A publication Critical patent/CN102262731A/en
Application granted granted Critical
Publication of CN102262731B publication Critical patent/CN102262731B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a font recognition method based on sparse coding. The method takes a grayscale image as input and operates in two phases, training and testing. In the training phase, the training images of each font are randomly divided into a number of squares; the number of squares is determined jointly by the image size and the square size. For example, a 512 × 512 grayscale image can be divided into 4096 squares of size 8 × 8. For each font class, the squares are used as input to independent component analysis, which trains a set of basis functions capable of sparsely representing any square, and these basis functions serve as the model of that font. The method can recognize Chinese fonts, fonts of other languages, and fonts across different languages. It can be applied to automatic document analysis, art design, and similar tasks.

Description

A font recognition method based on sparse coding
Technical field
The present invention relates to a font recognition method based on sparse coding, and belongs to the technical field of image processing and pattern recognition.
Background art
With the rise of networked office work and the spread of digital libraries, people increasingly rely on electronic documents to obtain and exchange information. Converting information traditionally recorded in books into electronic documents has therefore become a basic problem for automatic document analysis by computer. Font recognition, which identifies the typeface used in a text image, is one of the important research topics in automatic document analysis and processing. Over the past twenty-odd years, OCR (optical character recognition) has developed rapidly. Printed character recognition is now largely mature, and its recognition rate meets the requirements of commercial applications. However, existing OCR systems mainly address character content, that is, identifying what the text in an image says, while recognition of the font is still far from practical. This remains an open problem for recovering the layout structure of the text in an image and for subsequent re-editing of the text content. Moreover, character recognition on a single font is more accurate than character recognition on multiple fonts. If the font information of a document image can be identified accurately, multi-font character recognition can be reduced to single-font character recognition, which improves the accuracy of text recognition in document images.
Research on font recognition and its applications is significant, but it has not yet attracted sufficient attention, and relatively little work exists. Existing techniques mainly include: 1) font recognition using local features of the font, such as serifs and font weight; 2) font recognition based on local or global layout features; 3) font recognition based on texture analysis. These methods work by extracting image features and achieve a certain recognition performance. However, because the differences between some fonts are not pronounced at the level of image features, methods that rely on image features cannot correctly distinguish similar fonts.
Summary of the invention
The object of the invention is to solve the problem that, because the differences between some fonts are not pronounced at the level of image features, font recognition methods that rely on image features cannot correctly distinguish similar fonts, and accordingly to provide a font recognition method based on sparse coding.
The object of the invention is achieved through the following technical solution:
A grayscale image is taken as input, and the following two phases are carried out. In the training phase, the training images of each font class are divided into sub-blocks, and basis functions that can sparsely and linearly represent each sub-block are trained with an independent component analysis model and used as the model of that font class. In the test phase, the image under test is divided into sub-blocks, and the coefficients of the linear representation of each sub-block under the basis function model of each font class are computed. The sparsity of the coefficients is judged from the kurtosis of the coefficient distribution, and the font contained in the image under test is recognized as the font class that yields a sparse representation.
The invention approaches the problem from the perspective of sparse coding. By emulating the efficient coding of visual information in the human visual system, it takes raw grayscale values as input and trains, for each font, a set of basis functions that effectively represent images of that font. The response coefficients of the trained basis functions to the input are sparse, which results in an efficient code. In the test phase, the font class of a test document image is identified by checking whether the response of the test image under the basis functions of each font is sparse.
The invention can recognize different fonts of the same language as well as fonts across different languages. It can be applied to automatic document analysis (for example, automatically restoring the text and typesetting format of a recognized document image) and to art design (for example, recognizing artistic fonts of interest).
Embodiment
The concrete steps of the font recognition method based on sparse coding are as follows:
Step 1: For each font class in the training library, collect a number of grayscale images as training images, for example 10 images. Randomly divide each image into a number of squares; for example, a 512 × 512 grayscale image can be divided into 1000 squares. Let the size of each square be d × d.
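The random division in Step 1 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patent's own implementation; the function name `extract_random_squares`, the nested-list image layout, and the toy image are assumptions made for the example. Squares are allowed to overlap, as claim 2 permits.

```python
import random


def extract_random_squares(image, d, count, rng=None):
    """Cut `count` random d x d squares out of a grayscale image.

    `image` is a list of rows of gray values (0-255). Squares may overlap.
    Each square is flattened row by row into a list of B = d * d values,
    which plays the role of the column vector used in the later steps.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if d > height or d > width:
        raise ValueError("square size d exceeds the image size")
    squares = []
    for _ in range(count):
        top = rng.randint(0, height - d)   # randint bounds are inclusive
        left = rng.randint(0, width - d)
        square = [image[r][c]
                  for r in range(top, top + d)
                  for c in range(left, left + d)]
        squares.append(square)
    return squares


# Hypothetical 16 x 16 toy image whose gray value encodes the pixel position.
toy_image = [[r * 16 + c for c in range(16)] for r in range(16)]
squares = extract_random_squares(toy_image, d=4, count=10, rng=random.Random(0))
```

Passing a seeded `random.Random` makes the sampling reproducible, which helps when the same block positions need to be compared across runs.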
Step 2: For the k-th font class, every square obtained by dividing the training images is converted into a column vector of dimension B = d × d. Taking the column vectors of all squares from all training images as training samples, independent component analysis (ICA) is used to train a basis function matrix
[formula image BSA00000534661500031: basis function matrix of the k-th font class]
and the corresponding filter matrix
[formula image BSA00000534661500032: filter matrix of the k-th font class]
where
[formula image BSA00000534661500033]
denotes the j-th basis function of the k-th font class, $w_j^k$ denotes the j-th filter of the k-th font class, and T denotes matrix transposition.
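For orientation, the relationship between basis functions and filters in a standard noise-free ICA model can be sketched as below. The symbols $\Phi^k$, $\phi_j^k$ and $a_i^k$ are assumed notation introduced here for illustration, and the inverse relation $W^k = (\Phi^k)^{-1}$ is the usual convention for a square ICA model (B basis functions for B-dimensional squares), not something the text states explicitly. Preprocessing such as centering or whitening is omitted.

```latex
x_i = \Phi^k a_i^k = \sum_{j=1}^{B} a_{ij}^k \, \phi_j^k,
\qquad
a_i^k = W^k x_i,
\qquad
W^k = \left( \Phi^k \right)^{-1},
\qquad
W^k = \left[ w_1^k, \ldots, w_B^k \right]^T
```

Under this reading, row j of $W^k$ is the filter $w_j^k$, so each coefficient of a square is the inner product of that square with one filter.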
Step 3: Using the same strategy as in the training phase, divide the document image under test into N squares of size d × d. Represent each square as a column vector of dimension B = d × d; the N squares form the matrix $X = \{x_1, \ldots, x_i, \ldots, x_N\}$, where $x_i$ is the column vector of the i-th square.
Step 4: Compute the coefficient $a_{ij}^k$ of the j-th basis function when the i-th sub-block $x_i$ is linearly represented by all basis functions of the k-th font class:
$$a_{ij}^k = \left(w_j^k\right)^T x_i \qquad (5)$$
where T denotes matrix transposition.
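Formula (5) is an inner product between a filter and a square. A minimal sketch in plain Python follows; the function name `coefficients` and the two-dimensional toy filters and squares are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def coefficients(filters, squares):
    """Return a[i][j] = (w_j)^T x_i for every square x_i and every filter w_j.

    `filters` is the list of filters w_1..w_B of one font class and
    `squares` is the list of flattened squares x_1..x_N; all are lists of
    numbers with the same length.
    """
    return [[sum(w * x for w, x in zip(w_j, x_i)) for w_j in filters]
            for x_i in squares]


# Hypothetical toy data: two filters and three "squares" of dimension 2.
filters_k = [[1.0, 0.0], [0.5, -0.5]]
toy_squares = [[2.0, 4.0], [1.0, 1.0], [0.0, 6.0]]
a_k = coefficients(filters_k, toy_squares)
```

For the first square, the first filter gives 1.0·2.0 + 0.0·4.0 = 2.0 and the second gives 0.5·2.0 − 0.5·4.0 = −1.0.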
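Step 5: Compute the kurtosis $f_j^k$ of the distribution of the coefficients of the j-th basis function over all squares:
$$f_j^k = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(a_{ij}^k - \bar{a}_j^k\right)^4}{\left(\frac{1}{N}\sum_{i=1}^{N}\left(a_{ij}^k - \bar{a}_j^k\right)^2\right)^2} \qquad (6)$$
where $\bar{a}_j^k = \frac{1}{N}\sum_{i=1}^{N} a_{ij}^k$ is the mean of the coefficients.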
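The kurtosis in formula (6) is the fourth central moment divided by the squared second central moment, with the bar denoting the sample mean of the coefficients. A minimal sketch follows; the function name `kurtosis` and the two toy coefficient lists are illustrative assumptions, not data from the patent.

```python
def kurtosis(values):
    """Kurtosis as in formula (6): m4 / m2**2 using central moments.

    This is the non-excess form, so a Gaussian distribution gives about 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant coefficients")
    return m4 / (m2 ** 2)


# Hypothetical coefficient lists: one spread evenly, one mostly zero.
dense = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
sparse = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0]
k_dense = kurtosis(dense)
k_sparse = kurtosis(sparse)
```

The mostly-zero list has the higher kurtosis (301/49, about 6.14, against 1.0), which is why a large kurtosis is used as the indicator of a sparse response.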
Step 6: Compute the mean $\bar{f}^k$ of the kurtosis values of the coefficient distributions of all B basis functions of the k-th font class:
$$\bar{f}^k = \frac{1}{B}\sum_{j=1}^{B} f_j^k \qquad (7)$$
Step 7: Find the maximum of the mean kurtosis values over all font classes, and recognize the font of the document image under test as the font class $k^*$ that attains the maximum:
$$k^* = \arg\max_k \bar{f}^k \qquad (8)$$
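Steps 6 and 7 reduce to averaging the per-basis-function kurtosis values of each font class and picking the class with the largest average. A minimal sketch, assuming the kurtosis values have already been computed; the function name `recognize_font`, the font labels, and the numbers are made up purely for illustration.

```python
def recognize_font(kurtosis_by_font):
    """Average the kurtosis values per font class and return the arg max.

    `kurtosis_by_font` maps a font label to the list of its B kurtosis
    values f_1..f_B. Returns (best_label, mean_kurtosis_per_font).
    """
    mean_kurtosis = {font: sum(values) / len(values)
                     for font, values in kurtosis_by_font.items()}
    best = max(mean_kurtosis, key=mean_kurtosis.get)
    return best, mean_kurtosis


# Hypothetical kurtosis values for three font classes with B = 3.
example = {
    "font_a": [3.1, 2.9, 3.4],
    "font_b": [7.5, 6.0, 9.0],
    "font_c": [4.0, 4.2, 3.8],
}
best_font, means = recognize_font(example)
```

With these toy values the averages are roughly 3.13, 7.5 and 4.0, so `font_b` is selected.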
Unlike traditional font recognition methods based on image features, the invention takes a grayscale image as input and comprises two phases. In the training phase, the training images of each font class are divided into squares, and all squares are used to train, with an independent component analysis model, basis functions that can sparsely and linearly represent each square; these basis functions serve as the model of that font class. In the recognition phase, the image under test is divided into squares in the same way, and the coefficients of the linear representation of each square under the basis function model of each font class are computed. The sparsity of the coefficients is judged from the kurtosis of the coefficient distribution, and the font contained in the document image under test is recognized as the font class that yields a sparse representation.
The above steps are then implemented as a computer program; any programming language may be used.
1. For each font class, read the training images; 24-bit color images are converted to 8-bit grayscale images. Randomly select a set of sub-blocks of a given size, for example d × d, from the converted images. Using the extracted sub-blocks as training samples, train a set of basis functions and the corresponding filters by independent component analysis. For each font class, store the trained basis functions and filters as model parameters.
2. Input the image under test; a 24-bit color image is converted to an 8-bit grayscale image. From the converted image, randomly select a set of sub-blocks of the same size as in the training phase, and compute, according to formula (1), the coefficients $a_{ij}^k$ of each sub-block when it is linearly represented by the basis functions of each font class.
3. According to formula (2), compute the kurtosis $f_j^k$ of the coefficients of each basis function of each font class.
4. According to formula (3), compute the mean of the kurtosis values of all basis function coefficients of each font class.
5. According to formula (4), find the maximum of the mean kurtosis values over all font classes, and recognize the font of the test document image as the font class $k^*$ that attains the maximum.
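Items 1 and 2 above both begin by converting 24-bit color images to 8-bit grayscale. The text does not say which conversion is used; the sketch below assumes the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114), and the function name `rgb_to_gray` and the tuple-based image layout are likewise assumptions for illustration.

```python
def rgb_to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples, 8 bits per channel, to gray values.

    Assumption: ITU-R BT.601 luma weights; the patent does not specify
    the conversion. The result is rounded and clamped to the range 0-255.
    """
    return [[min(255, int(round(0.299 * r + 0.587 * g + 0.114 * b)))
             for (r, g, b) in row]
            for row in rgb_image]


# Hypothetical 2 x 2 image: white, black, pure red, pure blue.
rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
gray = rgb_to_gray(rgb)
```

Any other fixed conversion would serve equally well, provided the same one is applied to training and test images so that the trained basis functions see comparable input.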
The above are merely preferred embodiments of the invention, all of which are different implementations under the general concept of the invention, and the scope of protection of the invention is not limited to them. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be defined by the claims.

Claims (7)

1. A font recognition method based on sparse coding, characterized in that a grayscale image is taken as input and the following two phases are carried out: in the training phase, the training images of each font class are divided into squares, and basis functions that are trained with an independent component analysis model and can sparsely and linearly represent each square are used as the model of that font class; in the test phase, the image under test is divided into squares, the coefficients of the linear representation of each square under the basis function model of each font class are computed, the sparsity of the coefficients is judged from the kurtosis of the coefficient distribution, and the font contained in the image under test is recognized as the font class that yields a sparse representation.
2. The font recognition method based on sparse coding according to claim 1, characterized in that, in the training phase, a number of training images are collected for each font class in the training library; each image is randomly divided into a number of squares of size d × d, where squares are allowed to overlap; and for the k-th font class, all of the resulting squares are used to train, by independent component analysis, a set of basis functions
[formula image FSA00000534661400011: basis functions of the k-th font class]
and the corresponding filters
[formula image FSA00000534661400012: filters of the k-th font class]
where B = d × d is the number of basis functions.
3. The font recognition method based on sparse coding according to claim 2, characterized in that, in the test phase, the test image is divided into squares of the same size, $X = \{x_1, \ldots, x_i, \ldots, x_N\}$, where $x_i$ is the column vector of the i-th square, and the following steps are carried out:
Compute the coefficient of the j-th basis function when the i-th sub-block $x_i$ is linearly represented by all basis functions of the k-th font class:
$$a_{ij}^k = \left(w_j^k\right)^T x_i \qquad (1)$$
where T denotes matrix transposition.
Compute the kurtosis $f_j^k$ of the distribution of the coefficients of the j-th basis function over all squares:
$$f_j^k = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(a_{ij}^k - \bar{a}_j^k\right)^4}{\left(\frac{1}{N}\sum_{i=1}^{N}\left(a_{ij}^k - \bar{a}_j^k\right)^2\right)^2} \qquad (2)$$
where $\bar{a}_j^k = \frac{1}{N}\sum_{i=1}^{N} a_{ij}^k$.
Compute the mean $\bar{f}^k$ of the kurtosis values of the coefficient distributions of all B basis functions of the k-th font class:
$$\bar{f}^k = \frac{1}{B}\sum_{j=1}^{B} f_j^k \qquad (3)$$
The font of the text contained in the test image is then recognized as font class $k^*$:
$$k^* = \arg\max_k \bar{f}^k \qquad (4)$$
4. The font recognition method based on sparse coding according to claim 3, characterized in that the coefficients $a_{ij}^k$ of each square, when it is linearly represented by the basis functions of each font class, are computed programmatically according to formula (1).
5. The font recognition method based on sparse coding according to claim 3, characterized in that the kurtosis of the coefficients of each basis function of each font class is computed according to formula (2).
6. The font recognition method based on sparse coding according to claim 3, characterized in that the mean of the kurtosis values of all basis function coefficients of each font class is computed according to formula (3).
7. The font recognition method based on sparse coding according to claim 3, characterized in that, according to formula (4), the maximum of the mean kurtosis values over all font classes is found, and the font of the test document image is recognized as the font class corresponding to the maximum.
CN201110192198A 2011-07-11 2011-07-11 Character recognizing method based on sparse coding Expired - Fee Related CN102262731B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110192198A CN102262731B (en) 2011-07-11 2011-07-11 Character recognizing method based on sparse coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110192198A CN102262731B (en) 2011-07-11 2011-07-11 Character recognizing method based on sparse coding

Publications (2)

Publication Number Publication Date
CN102262731A (en) 2011-11-30
CN102262731B (en) 2012-10-10

Family ID: 45009352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110192198A Expired - Fee Related CN102262731B (en) 2011-07-11 2011-07-11 Character recognizing method based on sparse coding

Country Status (1)

Country Link
CN (1) CN102262731B (en)

Also Published As

Publication number Publication date
CN102262731B (en) 2012-10-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121010

Termination date: 20130711