CN101751565B - Method for character identification through fusing binary image and gray level image - Google Patents
- Publication number
- CN101751565B (application CN200810239331A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention relates to a character recognition method that fuses a binary image with a grayscale image. The fused image of the character's binary and grayscale images is processed to perform character recognition: the binary image of the character is fused with its grayscale image to obtain a fused image; the size and position of the fused image are normalized; gradient-histogram features are extracted from the normalized image; a feature-dimensionality-reduction transformation matrix is obtained with principal component analysis and linear discriminant analysis; and a character feature template library is built for character recognition. The invention overcomes the drawback that traditional character recognition, based on either the binary image or the grayscale image of a character alone, cannot handle both degraded character images and character images with complex backgrounds.
Description
Technical Field
The invention belongs to the field of character recognition (OCR) and relates to a character recognition method that fuses a binary image with a grayscale image.
Background Art
Traditional character recognition operates on either the binary image or the grayscale image of a character. When binary-image-based recognition is applied to low-quality images, such as degraded character images in video, ID card images, license plates, characters in natural scenes and other low-resolution images, the binarized character images are of poor quality and recognition accuracy suffers. When grayscale-image-based recognition is applied to character images with complex backgrounds, such as characters in video, the non-uniform background likewise degrades recognition.
Summary of the Invention
To solve the problems of the prior art, the object of the present invention is to provide a method for character recognition that fuses the binary image and the grayscale image of a character.

To achieve this object, the character recognition method provided by the present invention, which fuses a binary image with a grayscale image, processes the fused image of the binary and grayscale images to perform character recognition and comprises the following steps:
Step 1: Let the binary image of a single character obtained after preprocessing be B0 = [b0(x, y)], where the pixel in row x and column y has value b0(x, y), b0(x, y) is 0 or 1, and the image size is W1 × H1; let the grayscale image of the character be Gc = [gc(x, y)], where the pixel in row x and column y has value gc(x, y), 0 ≤ gc(x, y) ≤ 255; fuse the binary image B0 with the grayscale image Gc to obtain the fused image G = [g(x, y)], where the pixel in row x and column y has value g(x, y), 0 ≤ g(x, y) ≤ 255;

Step 2: Before extracting features of the fused image G = [g(x, y)], normalize its position and size. The input to the normalization is G = [g(x, y)] and the normalized output is F = [f(x', y')], with sizes W1 × H1 and W2 × H2 respectively. The pixel of the input image G = [g(x, y)] in row x and column y is mapped to the pixel of F = [f(x', y')] in row x' and column y'; the normalization is realized through a one-dimensional coordinate mapping between the coordinates of the input image and those of the output image;
Step 3: Extract gradient-histogram features from the normalized image;

Step 4: Reduce the dimensionality of the gradient-histogram features of the normalized image with principal component analysis and linear discriminant analysis, obtaining the feature-dimensionality-reduction transformation matrix;

Step 5: Build the character feature template library, read the feature-dimensionality-reduction transformation matrix, and recognize the characters.
Beneficial effects of the invention: the invention is characterized by processing the fused image of a character's binary and grayscale images to perform character recognition, comprising the following steps: (1) fusion of the binary image and the grayscale image; (2) image normalization; (3) feature extraction based on the gradient histogram; (4) feature dimensionality reduction; (5) classifier design and character recognition. The invention overcomes the drawback that traditional character recognition, based on either the binary image or the grayscale image of a character alone, cannot handle both degraded character images and character images with complex backgrounds. Its fields of application include character recognition in video and character recognition in ID card images, license plates and natural scene images.
Brief Description of the Drawings
Fig. 1 is a flow chart of the character recognition system of the present invention;
Fig. 2 is a schematic diagram of the fusion of the binary image and the grayscale image in the present invention;
Fig. 3 is a schematic diagram of the image normalization in the present invention;
Fig. 4 is a schematic diagram of the gradient-histogram-based feature extraction in the present invention;
Fig. 5 is a schematic diagram of computing the feature-dimensionality-reduction transformation matrix in the present invention;
Fig. 6 is a schematic diagram of the classifier design and character recognition in the present invention;
Fig. 7 shows the Sobel gradient operator templates;
Fig. 8 shows examples of L standard directions, with L = 4 on the left and L = 8 on the right;
Fig. 9 shows an example of gradient decomposition;
Fig. 10 shows an example of computing the horizontal and vertical distances between a pixel and the centers of the rectangular regions.
Detailed Description of the Embodiments

The details involved in the technical solution of the present invention are described below with reference to the accompanying drawings. It should be noted that the described embodiments are intended only to aid understanding of the present invention and do not limit it in any way.

As shown in Fig. 1, the flow chart of the character recognition system of the present invention, the recognition algorithm consists of two parts: a training system and a recognition system. For each character training sample, the training system fuses the sample's binary image with its grayscale image, normalizes the size and position of the fused image, and extracts gradient-histogram features; using the features extracted from the training samples, it solves for the feature-dimensionality-reduction transformation matrix and thereby obtains the character recognition library. In the recognition system, the binary image and the grayscale image of the character to be recognized are fused, the fused image is normalized in size and position, gradient-histogram features are extracted and reduced in dimensionality with the transformation matrix obtained by the training system, and the result is fed to the recognizer to obtain the recognition result.
Implementing a character recognition system that fuses the character's binary image with its grayscale image requires consideration of the following aspects:

1) implementation of the training system;

2) implementation of the recognition system.

These two aspects are described in detail below.
1 Implementation of the Training System

1.1 See Fig. 2 for the framework of the fusion of the binary image and the grayscale image.
Let the binary image of the single character obtained after preprocessing be B0 = [b0(x, y)], where the pixel in row x and column y has value b0(x, y) and b0(x, y) is 0 or 1. Let the grayscale image of the character be Gc = [gc(x, y)], where the pixel in row x and column y has value gc(x, y), 0 ≤ gc(x, y) ≤ 255. The binary image B0 and the grayscale image Gc both have size W1 × H1. The binary image B0 of the character is fused with its grayscale image Gc by the following procedure to obtain the fused image G = [g(x, y)], where the pixel in row x and column y has value g(x, y) and 0 ≤ g(x, y) ≤ 255.

First compute a global threshold th; for example, a threshold thotsu can be obtained from the grayscale image gc(x, y) with the classical maximum between-class variance method (Otsu's method), and th = a × thotsu, where a is a constant. Thresholding the grayscale image gc(x, y) with the global threshold th gives the binary image Bg = [bg(x, y)], in which bg(x, y) indicates whether the pixel falls on the character side of the threshold.

Next apply conditional dilation to the binary image B0 with binary morphology. Let D be a 3 × 3 structuring element each of whose pixels has value 1. The conditional dilation of the binary image Bi is Bi+1 = (Bi ⊕ D) ∩ Bg. The conditional dilation is repeated according to this formula until Bi+1 = Bi or the maximum number of iterations is reached; let the final binary image be B = [b(x, y)].

The fused image G = [g(x, y)] used for single-character recognition is then obtained, with g(x, y) defined in terms of the dilated mask B = [b(x, y)] and the gray values gc(x, y).
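A minimal NumPy sketch of this fusion step follows, under stated assumptions: strokes are assumed darker than the background (so foreground pixels of Bg are those below the threshold), and the fused image is assumed to store the inverted gray value (stroke intensity) inside the dilated character mask and zero elsewhere. The constant a, the iteration limit and the fusion rule itself are illustrative choices, since the defining formulas are not reproduced in the text above; the function names are likewise illustrative.

```python
import numpy as np

def otsu_threshold(gray):
    """Global threshold by Otsu's maximum between-class variance criterion."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def dilate3x3(mask):
    """Binary dilation with the all-ones 3x3 structuring element D."""
    padded = np.pad(mask, 1)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            out |= padded[1 + dx:1 + dx + h, 1 + dy:1 + dy + w]
    return out

def fuse(b0, gc, a=1.0, max_iter=10):
    """Fuse the binary image b0 (0/1) of a character with its gray image gc (uint8)."""
    th = a * otsu_threshold(gc)
    bg = (gc < th).astype(np.uint8)        # assumed polarity: dark strokes are foreground
    b = b0.astype(np.uint8)
    for _ in range(max_iter):              # conditional dilation: B_{i+1} = (B_i (+) D) AND B_g
        b_next = dilate3x3(b) & bg
        if np.array_equal(b_next, b):
            break
        b = b_next
    # Assumed fusion rule: stroke intensity (inverted gray) inside the mask, zero elsewhere.
    return np.where(b == 1, 255 - gc.astype(int), 0).astype(np.uint8)
```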
1.2 See Fig. 3 for the framework of the image normalization.

Before extracting features of the fused image, the position and size of the character image are normalized. The input to the image normalization is G = [g(x, y)] and the normalized output is F = [f(x', y')], with sizes W1 × H1 and W2 × H2 respectively. The pixel of the input image G = [g(x, y)] in row x and column y is mapped to the pixel of F = [f(x', y')] in row x' and column y'; the normalization is realized through a one-dimensional coordinate mapping between the coordinates of the input image and those of the output image.

Compute the centroid (xc, yc) of the fused image G = [g(x, y)], xc = Σx x·gx(x) / Σx gx(x) and yc = Σy y·gy(y) / Σy gy(y), and shift it to the center (W2/2, H2/2) of the normalized image F = [f(x', y')], where gx(x) and gy(y) are the pixel densities of the fused image G = [g(x, y)] in the vertical and horizontal directions respectively.

From the centroid position (xc, yc), compute the one-sided second-order moments of the image G = [g(x, y)] in the x and y directions, and set the bounding box of the input image from these moments.

According to the coordinate mapping function, determine the coordinate mapping between the input image G = [g(x, y)] and the normalized image F = [f(x', y')], and obtain the values of the normalized image F = [f(x', y')] by bilinear interpolation of the gray values of the input image.
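A sketch of this normalization step, assuming the fused image from the previous section (zero background, larger values on strokes) so that its gray values can serve directly as pixel densities. It uses symmetric second moments rather than the one-sided moments named above, and the margin factor alpha, the output size and the helper name normalize_image are illustrative, not values from the patent.

```python
import numpy as np

def normalize_image(g, out_h=64, out_w=64, alpha=2.5):
    """Centroid / second-moment normalization of the fused image g, with bilinear sampling."""
    g = g.astype(float)
    h, w = g.shape                                   # rows are x, columns are y, as in the text
    gx = g.sum(axis=1)                               # pixel density along x (row sums)
    gy = g.sum(axis=0)                               # pixel density along y (column sums)
    xs, ys = np.arange(h), np.arange(w)
    xc = (xs * gx).sum() / gx.sum()                  # centroid row
    yc = (ys * gy).sum() / gy.sum()                  # centroid column
    dx = np.sqrt(((xs - xc) ** 2 * gx).sum() / gx.sum())
    dy = np.sqrt(((ys - yc) ** 2 * gy).sum() / gy.sum())
    x0, x1 = xc - alpha * dx, xc + alpha * dx        # bounding box around the centroid
    y0, y1 = yc - alpha * dy, yc + alpha * dy
    out = np.zeros((out_h, out_w))
    for xp in range(out_h):                          # inverse mapping + bilinear interpolation
        for yp in range(out_w):
            sx = min(max(x0 + (x1 - x0) * xp / (out_h - 1), 0.0), h - 1.0)
            sy = min(max(y0 + (y1 - y0) * yp / (out_w - 1), 0.0), w - 1.0)
            ix, iy = int(sx), int(sy)
            fx, fy = sx - ix, sy - iy
            ix2, iy2 = min(ix + 1, h - 1), min(iy + 1, w - 1)
            out[xp, yp] = ((1 - fx) * (1 - fy) * g[ix, iy] + fx * (1 - fy) * g[ix2, iy]
                           + (1 - fx) * fy * g[ix, iy2] + fx * fy * g[ix2, iy2])
    return out
```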
1.3 See Fig. 4 for the framework of the gradient-histogram-based feature extraction.

The gradient at every position of the image F = [f(x, y)] is computed with the two 3 × 3 templates of the Sobel operator shown in Fig. 7. For the image F = [f(x, y)], the first-derivative components along the x axis and the y axis are obtained as:

gx(x, y) = f(x+1, y-1) + 2f(x+1, y) + f(x+1, y+1) - f(x-1, y-1) - 2f(x-1, y) - f(x-1, y+1),

gy(x, y) = f(x-1, y+1) + 2f(x, y+1) + f(x+1, y+1) - f(x-1, y-1) - 2f(x, y-1) - f(x+1, y-1),

x = 0, ..., W2-1, y = 1, ..., H2-1;
The gradient magnitude mag(x, y) and direction angle θ(x, y) at position (x, y) of the image F = [f(x, y)] are computed as mag(x, y) = sqrt(gx(x, y)² + gy(x, y)²) and θ(x, y) = arctan(gy(x, y)/gx(x, y)).
Define L standard directions; the cases L = 4 and L = 8 are shown in Fig. 8. Each gradient is decomposed along its two nearest standard directions using the parallelogram rule, as shown in Fig. 9. The normalized image F = [f(x, y)] of size W2 × H2 is partitioned into R × R disjoint rectangular regions, and an L-dimensional gradient-direction histogram is built for each region. The gradient of every pixel in F = [f(x, y)] contributes to the gradient-direction histograms of the 4 rectangular regions nearest to that pixel. Fig. 10 shows a pixel and its 4 nearest rectangular regions (numbered 1, 2, 3 and 4 from top to bottom and left to right), where each small box denotes one pixel and 4 × 4 small boxes make up one large rectangular region. In the horizontal direction the distances from the pixel to the region centers are dhl and dhr; in the vertical direction they are dvt and dvb. Let gl be the magnitude of the component of the pixel's gradient along direction l. The contributions of this pixel's gradient to the l-th dimension of the gradient-direction histograms of regions 1, 2, 3 and 4 are gl×dhr×dvb/((dhl+dhr)×(dvt+dvb)), gl×dhl×dvb/((dhl+dhr)×(dvt+dvb)), gl×dhr×dvt/((dhl+dhr)×(dvt+dvb)) and gl×dhl×dvt/((dhl+dhr)×(dvt+dvb)) respectively. Computing in this way the contribution of every pixel's gradient to the gradient-direction histograms of its neighboring regions yields the histogram of each rectangular region and, finally, the R × R × L-dimensional feature of the character image.
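A sketch of the gradient-direction-histogram feature, assuming L equally spaced standard directions and the cell-center convention suggested by Fig. 10; the border handling, the default values of R and L, and the helper name gradient_histogram_features are illustrative.

```python
import numpy as np

def gradient_histogram_features(f, R=8, L=8):
    """Sketch: R x R x L gradient-direction-histogram feature of a normalized image f."""
    f = f.astype(float)
    H, W = f.shape
    gx = np.zeros((H, W))
    gy = np.zeros((H, W))
    # Sobel first-derivative components (image border left at zero for brevity)
    gx[1:-1, 1:-1] = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]
                      - f[:-2, :-2] - 2 * f[:-2, 1:-1] - f[:-2, 2:])
    gy[1:-1, 1:-1] = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]
                      - f[:-2, :-2] - 2 * f[1:-1, :-2] - f[2:, :-2])
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)

    hist = np.zeros((R, R, L))
    delta = 2 * np.pi / L                  # angular spacing of the L standard directions
    cell_h, cell_w = H / R, W / R
    for x in range(H):
        for y in range(W):
            if mag[x, y] == 0:
                continue
            # parallelogram-rule split onto the two nearest standard directions
            k = ang[x, y] / delta
            l0 = int(np.floor(k)) % L
            l1 = (l0 + 1) % L
            t = k - np.floor(k)
            comp = {l0: mag[x, y] * np.sin((1 - t) * delta) / np.sin(delta),
                    l1: mag[x, y] * np.sin(t * delta) / np.sin(delta)}
            # bilinear spatial weights over the 4 nearest rectangular-region centres
            cx = x / cell_h - 0.5
            cy = y / cell_w - 0.5
            i0, j0 = int(np.floor(cx)), int(np.floor(cy))
            wx, wy = cx - i0, cy - j0
            for i, wi in ((i0, 1 - wx), (i0 + 1, wx)):
                for j, wj in ((j0, 1 - wy), (j0 + 1, wy)):
                    if 0 <= i < R and 0 <= j < R:
                        for l, v in comp.items():
                            hist[i, j, l] += wi * wj * v
    return hist.ravel()                    # R*R*L-dimensional feature vector
```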
1.4 See Fig. 5 for the framework of computing the feature-dimensionality-reduction transformation matrix.

1.4.1 Principal component analysis (PCA)

High-dimensional feature vectors contain correlated components and are costly to process, so principal component analysis (PCA) is applied to them to solve for the PCA dimensionality-reduction matrix PPCA. Let the character features extracted from the n training samples be xi, i = 1, ..., n, with dimensionality m = R × R × L. The scatter matrix Σ of the training-sample character features is then computed.
Its eigenvalue decomposition is

Σ = UΛUT,

where U = [u1, u2, ..., um] is an orthogonal matrix, Λ = diag(λ1, λ2, ..., λm) is a diagonal matrix, and λ1 ≥ λ2 ≥ ... ≥ λm are the eigenvalues. If PCA is to preserve r% of the energy after dimensionality reduction, the number l of principal directions retained is the smallest l for which (λ1 + ... + λl)/(λ1 + ... + λm) ≥ r%.
The transformation matrix obtained by principal component analysis is PPCA = [u1, u2, ..., ul]. Reducing the dimensionality of the character feature xi gives the l-dimensional character feature zi = (PPCA)Txi, i = 1, ..., n, where (PPCA)T denotes the transpose of PPCA.
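A sketch of the PCA step; whether the scatter matrix is mean-centered is an assumption, since its formula is not reproduced above, and r is treated here as a fraction rather than a percentage. The function name pca_projection is illustrative.

```python
import numpy as np

def pca_projection(X, r=0.95):
    """X: n x m matrix of training features (one row per sample).
    Returns (P_PCA, mean), keeping the smallest l with cumulative energy >= r."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / X.shape[0]               # scatter (covariance) matrix of the features
    eigval, eigvec = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(eigval)[::-1]           # sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    energy = np.cumsum(eigval) / eigval.sum()
    l = int(np.searchsorted(energy, r) + 1)    # smallest l whose cumulative energy reaches r
    return eigvec[:, :l], mean

# usage sketch: P_pca, mean = pca_projection(X); Z = (X - mean) @ P_pca   # n x l reduced features
```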
1.4.2 Linear discriminant analysis (LDA) is applied to the dimension-reduced character features of the training samples to solve for the transformation matrix W:

Let the number of character classes to be recognized by the system be C, with class i containing ni training samples. Compute the feature mean μi of the class-i character samples and the mean μ of all samples, μi = (1/ni) Σ z over the samples z of class i, and μ = (1/n) Σ z over all n samples.

Compute the between-class scatter matrix Sb and the within-class scatter matrix Sw, Sb = Σi=1..C ni(μi - μ)(μi - μ)T and Sw = Σi=1..C Σz in class i (z - μi)(z - μi)T.
Linear discriminant analysis seeks a transformation matrix W such that, after the transformation, the between-class scatter is as large as possible while the within-class scatter is as small as possible, which is expressed by maximizing the criterion J(W) = |WTSbW| / |WTSwW|. LDA can be solved as a generalized eigenvector problem:

Sbw = λSww.

Let the vectors w1, ..., wd, ..., wl be the solutions of this generalized eigenvector problem, with corresponding generalized eigenvalues λ1 ≥ ... ≥ λd ≥ ... ≥ λl. The eigenvectors of the first d solutions form W, i.e. W = [w1, ..., wd].
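A sketch of the LDA step on the PCA-reduced features. The scatter-matrix formulas, whose images are missing above, are filled in with the standard definitions; the small ridge added to Sw is an implementation convenience, not part of the patent, and lda_projection is an illustrative name.

```python
import numpy as np
from scipy.linalg import eigh

def lda_projection(Z, labels, d):
    """Z: n x l PCA-reduced features, labels: length-n class ids. Returns W (l x d)."""
    labels = np.asarray(labels)
    l = Z.shape[1]
    mu = Z.mean(axis=0)
    Sb = np.zeros((l, l))
    Sw = np.zeros((l, l))
    for c in np.unique(labels):
        Zc = Z[labels == c]
        mu_c = Zc.mean(axis=0)
        diff = (mu_c - mu)[:, None]
        Sb += Zc.shape[0] * (diff @ diff.T)          # between-class scatter
        Sw += (Zc - mu_c).T @ (Zc - mu_c)            # within-class scatter
    Sw += 1e-6 * np.trace(Sw) / l * np.eye(l)        # small ridge so S_w stays invertible (illustrative)
    eigval, eigvec = eigh(Sb, Sw)                    # generalized problem S_b w = lambda S_w w, ascending
    return eigvec[:, ::-1][:, :d]                    # d directions with the largest eigenvalues
```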
1.5 See Fig. 6 for the classifier design and character recognition framework.

The transformation matrix W is used to reduce the dimensionality of the feature mean μi of the i-th character class, and the reduced feature is normalized to unit length, μi ← WTμi / ||WTμi||.

The transformation matrix P = PPCAW, the code of each character class and its corresponding feature μi are saved in the recognition library file.
2 Implementation of the Recognition System

The transformation matrix P, the code of each character class and its corresponding feature μi are read from the character recognition library file. For each character to be recognized, its binary image and grayscale image are fused, the fused image is normalized, and feature extraction gives the multidimensional feature a of the character image. The transformation matrix P reduces the dimensionality of the character feature a to b = PTa, where PT is the transpose of the transformation matrix P. The reduced feature is normalized to y = b / ||b||.

The cosine distances {di}1≤i≤C between y and the normalized center vectors {μi}1≤i≤C of the character classes are computed in turn:

di = 1 - yTμi.

The class with the smallest distance is the recognition result for the character image.
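A sketch of template building and cosine-distance recognition, assuming the combined matrix P = PPCA·W and omitting feature mean-centering for brevity; build_templates and recognize are illustrative helper names.

```python
import numpy as np

def build_templates(X, labels, P):
    """X: n x m raw gradient-histogram features, labels: class ids, P: m x d (P_PCA @ W).
    Returns a unit-length template vector per character class."""
    labels = np.asarray(labels)
    templates = {}
    for c in np.unique(labels):
        mu = P.T @ X[labels == c].mean(axis=0)     # project the class mean to d dimensions
        templates[c] = mu / np.linalg.norm(mu)     # normalize to unit length
    return templates

def recognize(a, P, templates):
    """a: raw feature of one character to recognize; the smallest cosine distance wins."""
    b = P.T @ a
    y = b / np.linalg.norm(b)
    # d_i = 1 - y^T mu_i
    return min(templates, key=lambda c: 1.0 - float(y @ templates[c]))
```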
The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any transformation or substitution that can readily be conceived by a person skilled in the art within the technical scope disclosed by the present invention shall be covered by the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200810239331 CN101751565B (en) | 2008-12-10 | 2008-12-10 | Method for character identification through fusing binary image and gray level image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200810239331 CN101751565B (en) | 2008-12-10 | 2008-12-10 | Method for character identification through fusing binary image and gray level image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101751565A CN101751565A (en) | 2010-06-23 |
CN101751565B true CN101751565B (en) | 2013-01-02 |
Family
ID=42478527
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 200810239331 Expired - Fee Related CN101751565B (en) | 2008-12-10 | 2008-12-10 | Method for character identification through fusing binary image and gray level image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101751565B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102011075275A1 (en) * | 2011-05-04 | 2012-11-08 | Bundesdruckerei Gmbh | Method and device for recognizing a character |
CN102750530B (en) * | 2012-05-31 | 2014-11-26 | 贺江涛 | Character recognition method and device |
CN103854020B (en) * | 2012-11-29 | 2018-11-30 | 捷讯平和(北京)科技发展有限公司 | Character recognition method and device |
CN103679208A (en) * | 2013-11-27 | 2014-03-26 | 北京中科模识科技有限公司 | Broadcast and television caption recognition based automatic training data generation and deep learning method |
CN106257495A (en) * | 2015-06-19 | 2016-12-28 | 阿里巴巴集团控股有限公司 | A kind of digit recognition method and device |
CN106203434B (en) * | 2016-07-08 | 2019-07-19 | 中国科学院自动化研究所 | Document Image Binarization Method Based on Symmetry of Stroke Structure |
CN108319958A (en) * | 2018-03-16 | 2018-07-24 | 福州大学 | A kind of matched driving license of feature based fusion detects and recognition methods |
CN108830138B (en) * | 2018-04-26 | 2021-05-07 | 平安科技(深圳)有限公司 | Livestock identification method, device and storage medium |
CN109520706B (en) * | 2018-11-21 | 2020-10-09 | 云南师范大学 | Screw hole coordinate extraction method of automobile fuse box |
CN109919253A (en) * | 2019-03-27 | 2019-06-21 | 北京爱数智慧科技有限公司 | Character identifying method, device, equipment and computer-readable medium |
CN111583217A (en) * | 2020-04-30 | 2020-08-25 | 深圳开立生物医疗科技股份有限公司 | Tumor ablation curative effect prediction method, device, equipment and computer medium |
CN112200247B (en) * | 2020-10-12 | 2021-07-02 | 西安泽塔云科技股份有限公司 | Image processing system and method based on multi-dimensional image mapping |
Also Published As
Publication number | Publication date |
---|---|
CN101751565A (en) | 2010-06-23 |
Legal Events

Code | Title | Description
---|---|---
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
C14 | Grant of patent or utility model |
GR01 | Patent grant |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20130102