CN101751565B - Method for character identification through fusing binary image and gray level image - Google Patents

Method for character identification through fusing binary image and gray level image

Info

Publication number
CN101751565B
CN101751565B CN200810239331A
Authority
CN
China
Prior art keywords
image
character
sigma
feature
binary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200810239331
Other languages
Chinese (zh)
Other versions
CN101751565A (en)
Inventor
张树武
杨武夷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN 200810239331 priority Critical patent/CN101751565B/en
Publication of CN101751565A publication Critical patent/CN101751565A/en
Application granted granted Critical
Publication of CN101751565B publication Critical patent/CN101751565B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Character Discrimination (AREA)

Abstract

The invention relates to a character recognition method that fuses a binary image and a grayscale image. The fused image of the binary image and the grayscale image of a character image is processed for character recognition: the binary image of the character image is fused with the grayscale image to obtain a fused image; the size and position of the fused image are normalized; gradient-histogram features are extracted from the normalized image; a feature dimensionality-reduction transformation matrix is obtained by principal component analysis and linear discriminant analysis; and a character feature template library is built for character recognition. The invention overcomes the drawback of traditional character recognition techniques based on either the binary image or the grayscale image of a character, which cannot simultaneously recognize degraded character images and character images with complex backgrounds.

Description

Method of Character Recognition by Fusion of a Binary Image and a Grayscale Image

Technical Field

The invention belongs to the field of optical character recognition (OCR) and relates to a character recognition method that fuses a binary image and a grayscale image.

Background Art

Traditional character recognition techniques operate either on the binary image of a character or on its grayscale image. When recognition based on character binary images is applied to low-quality images, such as degraded character images in video, ID card images, car license plates, or character images in natural scenes, recognition performance is poor because the binarized character images are of low quality. When recognition based on character grayscale images is applied to character images with complex backgrounds, such as character images in video, recognition performance degrades because the character images contain non-uniform backgrounds.

Summary of the Invention

To solve the problems of the prior art, the object of the present invention is to provide a method for character recognition that fuses the binary image and the grayscale image of a character.

To achieve this object, the character recognition method provided by the present invention, which fuses a binary image and a grayscale image, processes the fused image of the binary image and the grayscale image for character recognition and comprises the following steps:

Step 1: Let the binary image of a single character image obtained after preprocessing be B0 = [b0(x, y)], where the pixel in row x and column y has value b0(x, y), b0(x, y) is 0 or 1, and the image size is W1 × H1. The grayscale image of the character is Gc = [gc(x, y)], where the pixel in row x and column y has value gc(x, y), 0 ≤ gc(x, y) ≤ 255. The binary image B0 of the character image is fused with the grayscale image Gc to obtain the fused image G = [g(x, y)], where the pixel in row x and column y has value g(x, y), 0 ≤ g(x, y) ≤ 255.

Step 2: Before extracting features from the fused image G = [g(x, y)], normalize its position and size. The input image of the normalization is G = [g(x, y)] and the normalized output image is F = [f(x', y')], with sizes W1 × H1 and W2 × H2 respectively. The pixel of the input image G = [g(x, y)] in row x and column y is mapped to the pixel of F = [f(x', y')] in row x' and column y'; normalization is implemented by a coordinate mapping between the input image and the output image:

$$x' = x'(x, y), \quad y' = y'(x, y)$$

The one-dimensional coordinate mapping is:

$$x' = x'(x), \quad y' = y'(y);$$

Step 3: Extract gradient-histogram features from the normalized image.

Step 4: Apply principal component analysis and linear discriminant analysis to reduce the dimensionality of the gradient-histogram features of the normalized image, obtaining the feature dimensionality-reduction transformation matrix.

Step 5: Build a character feature template library, read the feature dimensionality-reduction transformation matrix, and recognize the characters.

Beneficial effects of the invention: the invention is characterized by processing the fused image of the binary image and the grayscale image of a character image to perform character recognition, and comprises the following steps: (1) fusion of the binary image and the grayscale image; (2) image normalization; (3) feature extraction based on gradient histograms; (4) feature dimensionality reduction; (5) classifier design and character recognition. The invention overcomes the drawback of traditional character recognition techniques based on either the binary image or the grayscale image of a character, which cannot simultaneously recognize degraded character images and character images with complex backgrounds. Fields of application of the invention include character recognition in video and character recognition in ID card images, car license plates, and natural scene images.

Description of the Drawings

Fig. 1 is a flow chart of the character recognition system of the present invention;

Fig. 2 is a schematic diagram of the fusion of the binary image and the grayscale image of the present invention;

Fig. 3 is a schematic diagram of the image normalization of the present invention;

Fig. 4 is a schematic diagram of the gradient-histogram-based feature extraction of the present invention;

Fig. 5 is a schematic diagram of computing the feature dimensionality-reduction transformation matrix of the present invention;

Fig. 6 is a schematic diagram of the classifier design and character recognition of the present invention;

Fig. 7 shows the Sobel gradient operator templates;

Fig. 8 shows examples of L standard directions, with L = 4 on the left and L = 8 on the right;

Fig. 9 shows an example of gradient decomposition;

Fig. 10 shows an example of computing the horizontal and vertical distances between a pixel and the centers of rectangular regions.

Detailed Description of the Embodiments

The details of the technical solution of the present invention are described below with reference to the accompanying drawings. It should be noted that the described embodiments are only intended to facilitate understanding of the invention and do not limit it in any way.

As shown in Fig. 1, the flow chart of the character recognition system of the present invention, the recognition algorithm can be divided into two parts: a training system and a recognition system. For each character training sample, the training system fuses its binary image and grayscale image, normalizes the size and position of the fused image, and extracts gradient-histogram features; using the features extracted from the training samples, it solves for the feature dimensionality-reduction transformation matrix and builds the character recognition library. In the recognition system, the binary image and the grayscale image of the character to be recognized are fused, the size and position of the fused image are normalized, gradient-histogram features are extracted, the features are reduced in dimensionality with the transformation matrix obtained by the training system, and the result is fed to the recognizer to obtain the recognition result.

The implementation of a character recognition system that fuses the character binary image and the character grayscale image needs to consider the following aspects:

1) implementation of the training system;

2) implementation of the recognition system.

These two aspects are described in detail below.

1 Implementation of the Training System

1.1 See Fig. 2 for the framework of the fusion of the binary image and the grayscale image.

Let the binary image of a single character image obtained after preprocessing be B0 = [b0(x, y)], where the pixel in row x and column y has value b0(x, y) and b0(x, y) is 0 or 1. The grayscale image of the character is Gc = [gc(x, y)], where the pixel in row x and column y has value gc(x, y), 0 ≤ gc(x, y) ≤ 255. The binary image B0 and the grayscale image Gc both have size W1 × H1. The binary image B0 of the character and the grayscale image Gc of the character are fused by the following procedure to obtain the fused image G = [g(x, y)], where the pixel in row x and column y has value g(x, y), 0 ≤ g(x, y) ≤ 255:

Compute a global threshold th: for example, the threshold th_otsu can be obtained from the grayscale image gc(x, y) by the classical maximum between-class variance method (Otsu's method), and th = a × th_otsu, where a is a constant. Threshold the grayscale image gc(x, y) with the global threshold th to obtain the binary image Bg = [bg(x, y)], where bg(x, y) is defined as:

$$b_g(x,y)=\begin{cases}1 & g_c(x,y)>th\\ 0 & g_c(x,y)\le th\end{cases},\qquad x=0,\dots,W_1-1,\ y=0,\dots,H_1-1;$$

Conditionally dilate the binary image B0 using binary morphology. Let D be a 3 × 3 structuring element whose every pixel has value 1. The conditional dilation of the binary image B0 is:

$$B_{i+1}=(B_i\oplus D)\cap B_g,\qquad i=1,2,\dots,N,\qquad B_1=B_0$$

According to the above formula, the binary image Bi is conditionally dilated repeatedly until B_{i+1} = B_i or the maximum number of iterations is reached; let the resulting binary image be B = [b(x, y)].

The fused image G = [g(x, y)] used for single-character recognition is then obtained, where g(x, y) is defined as:

$$g(x,y)=\begin{cases}g_c(x,y) & b(x,y)=1\\ 0 & b(x,y)=0\end{cases},\qquad x=0,\dots,W_1-1,\ y=0,\dots,H_1-1.$$
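To make the fusion procedure concrete, a minimal NumPy/OpenCV sketch is given below. It illustrates the steps above rather than the patented implementation; the constant a, the iteration cap, and the function name are assumptions introduced for this example.

```python
import cv2
import numpy as np

def fuse_binary_and_gray(B0, Gc, a=1.0, max_iter=20):
    """Fuse a character binary image B0 (values 0/1) with its grayscale image Gc.

    Sketch of the procedure above: Otsu threshold on Gc scaled by a constant a,
    conditional dilation of B0 inside the thresholded mask Bg, then masking of
    Gc by the dilated binary image.
    """
    th_otsu, _ = cv2.threshold(Gc, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    th = a * th_otsu
    Bg = (Gc > th).astype(np.uint8)              # b_g(x, y)

    D = np.ones((3, 3), np.uint8)                # 3x3 structuring element, all ones
    B = B0.astype(np.uint8)
    for _ in range(max_iter):
        B_next = cv2.dilate(B, D) & Bg           # conditional dilation: (B dilated by D) ∩ B_g
        if np.array_equal(B_next, B):            # stop when B_{i+1} == B_i
            break
        B = B_next

    return np.where(B == 1, Gc, 0).astype(np.uint8)   # g = g_c where b == 1, else 0
```

For example, `G = fuse_binary_and_gray(B0, Gc, a=0.9)` would keep the grayscale values only inside the (conditionally dilated) character strokes.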

1.2 See Fig. 3 for the framework of image normalization.

Before extracting features from the fused image, the position and size of the character image are normalized. The input image of the normalization is G = [g(x, y)] and the normalized output image is F = [f(x', y')], with sizes W1 × H1 and W2 × H2 respectively. The pixel of the input image G = [g(x, y)] in row x and column y is mapped to the pixel of F = [f(x', y')] in row x' and column y'; normalization is implemented by a coordinate mapping between the input image and the output image:

$$x' = x'(x, y), \quad y' = y'(x, y)$$

The one-dimensional coordinate mapping is

$$x' = x'(x), \quad y' = y'(y);$$

Compute the centroid (xc, yc) of the fused image G = [g(x, y)] and map it to the center (W2/2, H2/2) of the normalized image F = [f(x', y')]:

$$g_x(x)=\sum_{y=0}^{H_1-1}g(x,y)\Big/\sum_{x=0}^{W_1-1}\sum_{y=0}^{H_1-1}g(x,y),\qquad x=0,\dots,W_1-1,$$

$$g_y(y)=\sum_{x=0}^{W_1-1}g(x,y)\Big/\sum_{x=0}^{W_1-1}\sum_{y=0}^{H_1-1}g(x,y),\qquad y=0,\dots,H_1-1,$$

$$x_c=\sum_{x=0}^{W_1-1}x\,g_x(x),\qquad y_c=\sum_{y=0}^{H_1-1}y\,g_y(y),$$

where gx(x) and gy(y) are the pixel densities of the fused image G = [g(x, y)] in the vertical and horizontal directions, respectively;

According to the centroid position (xc, yc), compute the one-sided second-order moments μx+, μx−, μy+ and μy− of the image G = [g(x, y)]:

$$\mu_x^+=\sum_{x>x_c}(x-x_c)^2\,g_x(x),\qquad \mu_x^-=\sum_{x<x_c}(x-x_c)^2\,g_x(x),$$

$$\mu_y^+=\sum_{y>y_c}(y-y_c)^2\,g_y(y),\qquad \mu_y^-=\sum_{y<y_c}(y-y_c)^2\,g_y(y);$$

According to the computed one-sided second-order moments, the bounding box of the input image is set to $[x_c-2\sqrt{\mu_x^-},\,x_c+2\sqrt{\mu_x^+}]$ and $[y_c-2\sqrt{\mu_y^-},\,y_c+2\sqrt{\mu_y^+}]$. For the x axis, solve for the quadratic function u(x) = ax² + bx + c that maps the three points $x_c-2\sqrt{\mu_x^-}$, $x_c$, $x_c+2\sqrt{\mu_x^+}$ on the x axis to 0, 0.5 and 1 respectively; similarly obtain the quadratic function u(y) for the y axis that maps the three points $y_c-2\sqrt{\mu_y^-}$, $y_c$, $y_c+2\sqrt{\mu_y^+}$ to 0, 0.5 and 1. This gives the coordinate mapping function between the pixel of the input image G = [g(x, y)] in row x and column y and the pixel of the output image F = [f(x', y')] in row x' and column y':

$$x'=W_2\,u(x),\qquad y'=H_2\,u(y);$$

The coordinate mapping function determines the correspondence between the input image G = [g(x, y)] and the normalized image F = [f(x', y')]; the values of the normalized image F = [f(x', y')] are obtained from the gray values of the input image by bilinear interpolation.
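The centroid/second-moment normalization can be sketched as follows. This is a simplified reading of the steps above; in particular, the square-root scaling of the one-sided moments and the numerical inversion of the quadratic mapping u are assumptions made for this illustration, and the function name is hypothetical.

```python
import numpy as np

def normalize_image(G, W2=64, H2=64):
    """Size/position normalization of a fused character image G (H1 x W1),
    following the mapping x' = W2*u(x), y' = H2*u(y) described above."""
    H1, W1 = G.shape
    total = G.sum() + 1e-12
    gx = G.sum(axis=0) / total                       # pixel density along x
    gy = G.sum(axis=1) / total                       # pixel density along y
    xs, ys = np.arange(W1), np.arange(H1)
    xc, yc = (xs * gx).sum(), (ys * gy).sum()        # centroid

    def one_sided(vals, dens, c):                    # one-sided second-order moments
        plus = ((vals[vals > c] - c) ** 2 * dens[vals > c]).sum()
        minus = ((vals[vals < c] - c) ** 2 * dens[vals < c]).sum()
        return minus, plus

    mxm, mxp = one_sided(xs, gx, xc)
    mym, myp = one_sided(ys, gy, yc)

    def u_curve(c, m_minus, m_plus, n):
        # quadratic u through (c - 2*sqrt(m-), c, c + 2*sqrt(m+)) -> (0, 0.5, 1)
        pts = [c - 2 * np.sqrt(m_minus), c, c + 2 * np.sqrt(m_plus)]
        coeff = np.polyfit(pts, [0.0, 0.5, 1.0], 2)
        return np.clip(np.polyval(coeff, np.arange(n)), 0.0, 1.0)

    ux, uy = u_curve(xc, mxm, mxp, W1), u_curve(yc, mym, myp, H1)
    # invert the (assumed monotone) mapping numerically, then sample G bilinearly
    src_x = np.interp((np.arange(W2) + 0.5) / W2, ux, np.arange(W1))
    src_y = np.interp((np.arange(H2) + 0.5) / H2, uy, np.arange(H1))
    F = np.empty((H2, W2))
    for j, sy in enumerate(src_y):
        y0 = int(sy); y1 = min(y0 + 1, H1 - 1); fy = sy - y0
        for i, sx in enumerate(src_x):
            x0 = int(sx); x1 = min(x0 + 1, W1 - 1); fx = sx - x0
            F[j, i] = ((1 - fx) * (1 - fy) * G[y0, x0] + fx * (1 - fy) * G[y0, x1]
                       + (1 - fx) * fy * G[y1, x0] + fx * fy * G[y1, x1])
    return F.astype(np.uint8)
```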

1.3 See Fig. 4 for the framework of gradient-histogram-based feature extraction.

The gradient at each position of the image F = [f(x, y)] is computed with the two 3 × 3 templates of the Sobel operator shown in Fig. 7. For the image F = [f(x, y)], the first-order derivative components along the x axis and the y axis are obtained as:

$$g_x(x,y)=f(x+1,y-1)+2f(x+1,y)+f(x+1,y+1)-f(x-1,y-1)-2f(x-1,y)-f(x-1,y+1),$$

$$g_y(x,y)=f(x-1,y+1)+2f(x,y+1)+f(x+1,y+1)-f(x-1,y-1)-2f(x,y-1)-f(x+1,y-1),$$

$$x=0,\dots,W_2-1,\qquad y=0,\dots,H_2-1;$$

The gradient magnitude mag(x, y) and direction angle φ(x, y) at position (x, y) of the image F = [f(x, y)] are:

$$\mathrm{mag}(x,y)=\left[g_x^2(x,y)+g_y^2(x,y)\right]^{1/2},\qquad \varphi(x,y)=\arctan\frac{g_y(x,y)}{g_x(x,y)}.$$

L standard directions are defined; the cases L = 4 and L = 8 are shown in Fig. 8. Each gradient is decomposed by the parallelogram rule into components along the two standard directions nearest to it, as shown in Fig. 9. The normalized image F = [f(x, y)] of size W2 × H2 is divided into R × R disjoint rectangular regions, and an L-dimensional gradient direction histogram is built for each rectangular region. The gradient of each pixel in the image F = [f(x, y)] contributes to the gradient direction histograms of the four rectangular regions nearest to that pixel. Fig. 10 shows a pixel and its four nearest rectangular regions (numbered 1, 2, 3 and 4 from top to bottom and left to right), where each small box represents a pixel and 4 × 4 small boxes form one large rectangular region. In the horizontal direction, the distances from the pixel to the region centers are dhl and dhr; in the vertical direction, the distances are dvt and dvb. Let the magnitude of the component of the pixel gradient along direction l be gl; then the contributions of this pixel's gradient to the l-th bin of the gradient direction histograms of regions 1, 2, 3 and 4 are gl×dhr×dvb/((dhl+dhr)×(dvt+dvb)), gl×dhl×dvb/((dhl+dhr)×(dvt+dvb)), gl×dhr×dvt/((dhl+dhr)×(dvt+dvb)) and gl×dhl×dvt/((dhl+dhr)×(dvt+dvb)), respectively. The contribution of each pixel's gradient to the gradient direction histograms of its neighbouring rectangular regions is computed in this way, the gradient direction histogram of each rectangular region is obtained, and finally the R × R × L-dimensional feature of the character image is obtained.
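The following sketch illustrates this R × R × L gradient-histogram feature. Array layout, border handling and the exact cell geometry are simplifications of the description above, and the function name is hypothetical.

```python
import numpy as np

def gradient_histogram_features(F, R=8, L=8):
    """R x R x L gradient-direction histogram of a normalized character image F:
    Sobel gradients, parallelogram decomposition of each gradient onto the two
    nearest of L standard directions, and bilinear weighting over the four
    nearest rectangular regions."""
    f = F.astype(np.float64)
    H2, W2 = f.shape
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[1:-1, 1:-1] = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]
                      - f[:-2, :-2] - 2 * f[1:-1, :-2] - f[2:, :-2])
    gy[1:-1, 1:-1] = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]
                      - f[:-2, :-2] - 2 * f[:-2, 1:-1] - f[:-2, 2:])
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)

    hist = np.zeros((R, R, L))
    cell_w, cell_h = W2 / R, H2 / R
    delta = 2 * np.pi / L                        # angular width of one sector
    for y in range(H2):
        for x in range(W2):
            if mag[y, x] == 0:
                continue
            # parallelogram decomposition onto the two nearest standard directions
            l0 = int(ang[y, x] // delta) % L
            theta = ang[y, x] - l0 * delta
            g0 = mag[y, x] * np.sin(delta - theta) / np.sin(delta)
            g1 = mag[y, x] * np.sin(theta) / np.sin(delta)
            # bilinear weights over the four nearest cells
            cx, cy = x / cell_w - 0.5, y / cell_h - 0.5
            i0, j0 = int(np.floor(cx)), int(np.floor(cy))
            wx, wy = cx - i0, cy - j0
            for i, j, w in [(i0, j0, (1 - wx) * (1 - wy)), (i0 + 1, j0, wx * (1 - wy)),
                            (i0, j0 + 1, (1 - wx) * wy), (i0 + 1, j0 + 1, wx * wy)]:
                if 0 <= i < R and 0 <= j < R:
                    hist[j, i, l0] += w * g0
                    hist[j, i, (l0 + 1) % L] += w * g1
    return hist.ravel()                          # R*R*L-dimensional feature vector
```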

1.4 See Fig. 5 for the framework of computing the feature dimensionality-reduction transformation matrix:

1.4.1 Principal Component Analysis (PCA)

High-dimensional feature vectors contain correlated components and are expensive to process, so principal component analysis (PCA) is applied to the high-dimensional feature vectors to solve for the PCA dimensionality-reduction matrix P_PCA. Let the character features extracted from n training samples be x_i, i = 1, ..., n, where the dimension of x_i is m = R × R × L; the scatter matrix of the training-sample character features is:

$$\Sigma=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^T(x_i-\bar{x}),\qquad \bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$$

The eigenvalue decomposition of the scatter matrix is:

$$\Sigma=U\Lambda U^T$$

where U = [u1, u2, ..., um] is an orthogonal matrix, Λ = diag(λ1, λ2, ..., λm) is a diagonal matrix, and λ1 ≥ λ2 ≥ ... ≥ λm are the eigenvalues. If r% of the energy is to be retained after PCA dimensionality reduction, the number of principal directions l retained by PCA is

$$l=\arg\min_{k}\left(\frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{m}\lambda_i}\ge r\right)$$

The transformation matrix obtained by principal component analysis is P_PCA = [u1, u2, ..., ul]; the character feature x_i is reduced to the l-dimensional character feature z_i = (P_PCA)^T x_i, i = 1, ..., n, where (P_PCA)^T denotes the transpose of P_PCA.
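A compact sketch of this PCA step is given below. It assumes the energy fraction r is supplied as a number in (0, 1] and, following the text, projects the uncentered features; the function name is an assumption.

```python
import numpy as np

def pca_reduction(X, r=0.95):
    """PCA dimensionality reduction of training features.
    X: n x m matrix, one R*R*L character feature per row.
    Returns the m x l matrix P_PCA and the reduced features z_i = P_PCA^T x_i."""
    n = X.shape[0]
    x_bar = X.mean(axis=0)
    S = (X - x_bar).T @ (X - x_bar) / n            # scatter matrix
    eigval, eigvec = np.linalg.eigh(S)             # ascending eigenvalues
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1] # sort descending
    energy = np.cumsum(eigval) / eigval.sum()
    l = int(np.searchsorted(energy, r)) + 1        # smallest k keeping >= r of the energy
    P_pca = eigvec[:, :l]
    Z = X @ P_pca                                  # reduced l-dimensional features
    return P_pca, Z
```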

1.4.2 Linear discriminant analysis (LDA) is applied to the dimensionality-reduced character features of the training samples to solve for the transformation matrix W:

Let the number of character classes to be recognized by the recognition system be C, with the i-th class containing n_i training samples. Compute the feature mean μ_i of the i-th character class and the feature mean μ of all samples:

$$\mu_i=\frac{1}{n_i}\sum_{k=1}^{n_i}z_k^i,\qquad \mu=\frac{1}{n}\sum_{i=1}^{C}\sum_{k=1}^{n_i}z_k^i,\qquad n=\sum_{i=1}^{C}n_i$$

Compute the between-class scatter matrix S_b and the within-class scatter matrix S_w:

$$S_b=\sum_{i=1}^{C}\frac{n_i}{n}(\mu_i-\mu)(\mu_i-\mu)^T$$

$$S_w=\sum_{i=1}^{C}\left(\frac{n_i}{n}\sum_{k=1}^{n_i}(z_k^i-\mu_i)(z_k^i-\mu_i)^T\right)$$

Linear discriminant analysis seeks a transformation matrix W such that, after the transformation, the between-class scatter is as large as possible while the within-class scatter is as small as possible, expressed by maximizing the criterion

$$J=\frac{\operatorname{tr}(W^T S_b W)}{\operatorname{tr}(W^T S_w W)}.$$

LDA can be solved via the generalized eigenvector problem:

$$S_b w=\lambda S_w w$$

Let the vectors w1, ..., wd, ..., wl be solutions of the generalized eigenvector problem with corresponding generalized eigenvalues λ1 ≥ ... ≥ λd ≥ ... ≥ λl; the eigenvectors corresponding to the first d solutions are selected to form W, i.e. W = [w1, ..., wd].
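The LDA step can be sketched as below. The small ridge added to S_w and the use of SciPy's generalized symmetric eigensolver are implementation choices of this illustration, not part of the method description; the function name is hypothetical.

```python
import numpy as np
from scipy.linalg import eigh

def lda_transform(Z, labels, d):
    """Solve S_b w = lambda * S_w w on the PCA-reduced features.
    Z: n x l matrix of reduced features, labels: class index per row,
    d: number of discriminant directions kept.  Returns the l x d matrix W."""
    n, l = Z.shape
    mu = Z.mean(axis=0)
    Sb = np.zeros((l, l))
    Sw = np.zeros((l, l))
    for c in np.unique(labels):
        Zc = Z[labels == c]
        mu_c = Zc.mean(axis=0)
        Sb += (len(Zc) / n) * np.outer(mu_c - mu, mu_c - mu)
        Sw += (len(Zc) / n) * (Zc - mu_c).T @ (Zc - mu_c)
    # generalized symmetric eigenproblem; the ridge keeps S_w positive definite
    eigval, eigvec = eigh(Sb, Sw + 1e-6 * np.eye(l))
    order = np.argsort(eigval)[::-1]
    return eigvec[:, order[:d]]
```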

1.5 See Fig. 6 for the classifier design and character recognition architecture:

The transformation matrix W is used to reduce the dimensionality of the feature mean μ_i of the i-th character class, and the reduced feature is normalized:

$$\mu_i^{*}=W^T\mu_i,\qquad \bar{\mu}_i=\mu_i^{*}\Big/\sqrt{(\mu_i^{*})^T\mu_i^{*}}$$

The transformation matrix P = W P_PCA, the code of each character class and its corresponding normalized feature $\bar{\mu}_i$ are saved in a file of the recognition library.


2 Implementation of the Recognition System

The transformation matrix P and the code of each character class with its corresponding normalized feature $\bar{\mu}_i$ are read from the file in the character recognition library. For each character to be recognized, its binary image and grayscale image are fused, the fused image is normalized, and feature extraction yields the multi-dimensional feature a of the character image. The transformation matrix P is used to reduce the dimensionality of a, giving the reduced feature b = P^T a, where P^T is the transpose of P. The reduced feature is normalized to obtain

$$\bar{b}=b\Big/\sqrt{b^T b}.$$

The cosine distances {di}1≤i≤C between $\bar{b}$ and the normalized center vectors {$\bar{\mu}_i$}1≤i≤C of the character classes are computed in turn:

$$d_i=1-\bar{b}^T\bar{\mu}_i$$

The class with the smallest distance is the recognition result for the character image.
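Putting the recognition stage together, a minimal sketch of the cosine-distance classifier described above follows; the template dictionary and function name are hypothetical.

```python
import numpy as np

def recognize_character(a, P, templates):
    """Classify one character from its raw R*R*L feature vector a.
    P: saved transformation matrix (P = W P_PCA in the notation above).
    templates: dict mapping character code -> normalized class mean vector mu_i."""
    b = P.T @ a                            # feature dimensionality reduction
    b = b / np.sqrt(b @ b)                 # normalize to unit length
    # cosine distance d_i = 1 - b^T mu_i; the smallest distance wins
    distances = {code: 1.0 - float(b @ mu) for code, mu in templates.items()}
    return min(distances, key=distances.get)
```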

The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art can conceive within the technical scope disclosed by the present invention shall fall within the scope of the present invention; therefore, the scope of protection of the present invention shall be determined by the claims.

Claims (5)

1. A character recognition method fusing a binary image and a grayscale image, characterized in that the fused image of the binary image and the grayscale image is processed for character recognition, comprising the following steps:

Step 1: let the binary image of a single character image obtained after preprocessing be B0 = [b0(x, y)], where the pixel in row x and column y has value b0(x, y), b0(x, y) is 0 or 1, and the image size is W1 × H1; the grayscale image of the character is Gc = [gc(x, y)], where the pixel in row x and column y has value gc(x, y), 0 ≤ gc(x, y) ≤ 255; the binary image B0 of the character image is fused with the grayscale image Gc to obtain the fused image G = [g(x, y)], where the pixel in row x and column y has value g(x, y), 0 ≤ g(x, y) ≤ 255;

Step 2: before extracting features from the fused image G = [g(x, y)], normalize the position and size of the fused image G = [g(x, y)]; the input image of the normalization is G = [g(x, y)] and the normalized output image is F = [f(x', y')], with sizes W1 × H1 and W2 × H2 respectively; the pixel of the input image G = [g(x, y)] in row x and column y is mapped to the pixel of F = [f(x', y')] in row x' and column y', and normalization is implemented by a coordinate mapping between the input image and the output image:

$$x'=x'(x,y),\quad y'=y'(x,y)$$

the one-dimensional coordinate mapping being:

$$x'=x'(x),\quad y'=y'(y);$$

Step 3: extract gradient-histogram features from the normalized image;

Step 4: apply principal component analysis and linear discriminant analysis to reduce the dimensionality of the gradient-histogram features of the normalized image, obtaining the feature dimensionality-reduction transformation matrix;

Step 5: build a character feature template library, read the feature dimensionality-reduction transformation matrix and recognize the characters;

the fusion of the binary image and the grayscale image comprising:

Step 11: apply the classical maximum between-class variance method to the pixel values gc(x, y) of the grayscale image Gc = [gc(x, y)] to obtain the threshold th_otsu, and set a global threshold th = a × th_otsu, where a is a constant; threshold the pixel values gc(x, y) of the grayscale image Gc = [gc(x, y)] with the global threshold th to obtain the binary image Bg = [bg(x, y)], the pixel values bg(x, y) of the binary image being defined as:

$$b_g(x,y)=\begin{cases}1 & g_c(x,y)>th\\ 0 & g_c(x,y)\le th\end{cases},\qquad x=0,\dots,W_1-1,\ y=0,\dots,H_1-1;$$

Step 12: conditionally dilate the binary image B0 using binary morphology; let D be a 3 × 3 image whose every pixel has value 1; the conditional dilation of the binary image B0 is:

$$B_{i+1}=(B_i\oplus D)\cap B_g,\qquad i=1,2,\dots,N,\qquad B_1=B_0$$

the binary image Bi is conditionally dilated repeatedly according to the above formula until B_{i+1} = B_i or the maximum number of iterations is reached, the resulting binary image being B = [b(x, y)];

Step 13: obtain the fused image G = [g(x, y)] used for single-character recognition, g(x, y) being defined as:

$$g(x,y)=\begin{cases}g_c(x,y) & b(x,y)=1\\ 0 & b(x,y)=0\end{cases},\qquad x=0,\dots,W_1-1,\ y=0,\dots,H_1-1.$$

2. The character recognition method fusing a binary image and a grayscale image according to claim 1, characterized in that the image normalization comprises:

Step 21: compute the centroid (xc, yc) of the fused image G = [g(x, y)] and map the centroid to the center (W2/2, H2/2) of the normalized image F = [f(x', y')]:

$$g_x(x)=\sum_{y=0}^{H_1-1}g(x,y)\Big/\sum_{x=0}^{W_1-1}\sum_{y=0}^{H_1-1}g(x,y),\qquad x=0,\dots,W_1-1,$$

$$g_y(y)=\sum_{x=0}^{W_1-1}g(x,y)\Big/\sum_{x=0}^{W_1-1}\sum_{y=0}^{H_1-1}g(x,y),\qquad y=0,\dots,H_1-1,$$

$$x_c=\sum_{x=0}^{W_1-1}x\,g_x(x),\qquad y_c=\sum_{y=0}^{H_1-1}y\,g_y(y),$$

where gx(x) and gy(y) are the pixel densities of the fused image G = [g(x, y)] in the vertical and horizontal directions respectively;

Step 22: according to the centroid position (xc, yc), compute the one-sided second-order moments μx+, μx−, μy+ and μy− of the image G = [g(x, y)]:

$$\mu_x^+=\sum_{x>x_c}(x-x_c)^2\,g_x(x),\qquad \mu_x^-=\sum_{x<x_c}(x-x_c)^2\,g_x(x),$$

$$\mu_y^+=\sum_{y>y_c}(y-y_c)^2\,g_y(y),\qquad \mu_y^-=\sum_{y<y_c}(y-y_c)^2\,g_y(y);$$

Step 23: according to the computed one-sided second-order moments, set the bounding box of the input image to

$$[x_c-2\sqrt{\mu_x^-},\,x_c+2\sqrt{\mu_x^+}]\quad\text{and}\quad [y_c-2\sqrt{\mu_y^-},\,y_c+2\sqrt{\mu_y^+}];$$

for the x axis, solve for the quadratic function u(x) = ax² + bx + c that maps the three points $x_c-2\sqrt{\mu_x^-}$, $x_c$, $x_c+2\sqrt{\mu_x^+}$ on the x axis to 0, 0.5 and 1 respectively, and similarly obtain the quadratic function u(y) for the y axis that maps the three points $y_c-2\sqrt{\mu_y^-}$, $y_c$, $y_c+2\sqrt{\mu_y^+}$ to 0, 0.5 and 1; this gives the coordinate mapping function between the pixel of the input image G = [g(x, y)] in row x and column y and the pixel of the output image F = [f(x', y')] in row x' and column y':

$$x'=W_2\,u(x),\qquad y'=H_2\,u(y),$$

where W2 and H2 are the width and height of the output image F = [f(x', y')] respectively;

Step 24: obtain the values of the normalized image F = [f(x', y')] by bilinear interpolation.
3. The character recognition method fusing a binary image and a grayscale image according to claim 1, characterized in that the step of extracting gradient-histogram features from the normalized image comprises:

Step 31: compute the gradient at each position of the image F = [f(x, y)] with the two 3 × 3 templates of the Sobel operator; for the image F = [f(x, y)], the first-order derivative components along the x axis and the y axis are obtained as:

$$g_x(x,y)=f(x+1,y-1)+2f(x+1,y)+f(x+1,y+1)-f(x-1,y-1)-2f(x-1,y)-f(x-1,y+1),$$

$$g_y(x,y)=f(x-1,y+1)+2f(x,y+1)+f(x+1,y+1)-f(x-1,y-1)-2f(x,y-1)-f(x+1,y-1),$$

$$x=0,\dots,W_2-1,\qquad y=0,\dots,H_2-1;$$

Step 32: the gradient magnitude mag(x, y) and direction angle φ(x, y) at position (x, y) of the image F = [f(x, y)] are:

$$\mathrm{mag}(x,y)=\left[g_x^2(x,y)+g_y^2(x,y)\right]^{1/2},\qquad \varphi(x,y)=\arctan\frac{g_y(x,y)}{g_x(x,y)};$$

Step 33: define L standard directions, decompose each gradient by the parallelogram rule into components along the two standard directions nearest to it, divide the normalized image F = [f(x, y)] of size W2 × H2 into R × R disjoint rectangular regions, and build an L-dimensional gradient direction histogram for each rectangular region; the gradient of each pixel in the image F = [f(x, y)] contributes to the gradient direction histograms of the four rectangular regions nearest to that pixel; compute the contribution of each pixel's gradient to the gradient direction histograms of its neighbouring rectangular regions, obtain the gradient direction histogram of each rectangular region, and finally obtain the R × R × L-dimensional feature of the character image.
4. The character recognition method fusing a binary image and a grayscale image according to claim 1, characterized in that the step of reducing the dimensionality of the gradient-histogram features of the normalized image by principal component analysis and linear discriminant analysis comprises:

Step 41: apply principal component analysis (PCA) to the high-dimensional feature vectors and solve for the PCA dimensionality-reduction matrix P_PCA:

high-dimensional feature vectors contain correlated components and are expensive to process, so principal component analysis (PCA) is applied to the high-dimensional feature vectors to solve for the PCA dimensionality-reduction matrix P_PCA; let the character features extracted from n training samples be x_i, i = 1, ..., n, the dimension of x_i being m = R × R × L; the scatter matrix of the training-sample character features is:

$$\Sigma=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^T(x_i-\bar{x}),\qquad \bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i,$$

the eigenvalue decomposition of the scatter matrix being:

$$\Sigma=U\Lambda U^T$$

where U = [u1, u2, ..., um] is an orthogonal matrix, Λ = diag(λ1, λ2, ..., λm) is a diagonal matrix and λ1 ≥ λ2 ≥ ... ≥ λm are the eigenvalues; if r% of the energy is to be retained after PCA dimensionality reduction, the number of principal directions l retained by PCA is:

$$l=\arg\min_{k}\left(\frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{m}\lambda_i}\ge r\right)$$

the transformation matrix obtained by principal component analysis is P_PCA = [u1, u2, ..., ul]; the character feature x_i is reduced to the l-dimensional character feature z_i = (P_PCA)^T x_i, i = 1, ..., n, where (P_PCA)^T denotes the transpose of P_PCA;

Step 42: apply linear discriminant analysis to the dimensionality-reduced character features of the training samples and solve for the transformation matrix W:

let the number of character classes to be recognized by the recognition system be C, the i-th class containing n_i training samples; compute the feature mean μ_i of the i-th character class and the feature mean μ of all samples:

$$\mu_i=\frac{1}{n_i}\sum_{k=1}^{n_i}z_k^i,\qquad \mu=\frac{1}{n}\sum_{i=1}^{C}\sum_{k=1}^{n_i}z_k^i,\qquad n=\sum_{i=1}^{C}n_i$$

compute the between-class scatter matrix S_b and the within-class scatter matrix S_w:

$$S_b=\sum_{i=1}^{C}\frac{n_i}{n}(\mu_i-\mu)(\mu_i-\mu)^T$$

$$S_w=\sum_{i=1}^{C}\left(\frac{n_i}{n}\sum_{k=1}^{n_i}(z_k^i-\mu_i)(z_k^i-\mu_i)^T\right)$$

linear discriminant analysis seeks a transformation matrix W such that, after the transformation, the between-class scatter is as large as possible while the within-class scatter is as small as possible, expressed by the maximization criterion

$$J=\frac{\operatorname{tr}(W^T S_b W)}{\operatorname{tr}(W^T S_w W)};$$

linear discriminant analysis is solved via the generalized eigenvector problem:

$$S_b w=\lambda S_w w$$

let the vectors w1, ..., wd, ..., wl be solutions of the generalized eigenvector problem with corresponding generalized eigenvalues λ1 ≥ ... ≥ λd ≥ ... ≥ λl; the eigenvectors corresponding to the first d solutions are selected to form W, i.e. W = [w1, ..., wd].

5. The character recognition method fusing a binary image and a grayscale image according to claim 4, characterized in that the character recognition comprises:

Step 51: design the classifier

the transformation matrix W is used to reduce the dimensionality of the feature mean μ_i of the i-th character class, and the reduced feature is normalized:

$$\mu_i^{*}=W^T\mu_i,\qquad \bar{\mu}_i=\mu_i^{*}\Big/\sqrt{(\mu_i^{*})^T\mu_i^{*}};$$

the transformation matrix P = W P_PCA, the code of each character class and its corresponding character-class feature $\bar{\mu}_i$ are saved in a file of the character recognition library;
Step 52: character recognition

the transformation matrix P, the code of each character class and its corresponding character-class feature $\bar{\mu}_i$ are read from the file in the character recognition library; the binary image and the grayscale image of each character to be recognized are fused, the fused image is normalized, and feature extraction yields the multi-dimensional feature a of the character image; the transformation matrix P is used to reduce the dimensionality of the multi-dimensional feature a of the character image, giving the reduced feature b = P^T a, where P^T is the transpose of the transformation matrix P;

the reduced feature b is normalized to obtain

$$\bar{b}=b\Big/\sqrt{b^T b};$$

the cosine distances {di}1≤i≤C between the normalized character feature $\bar{b}$ and the normalized center vector $\bar{\mu}_i$ of each character class are computed in turn:

$$d_i=1-\bar{b}^T\bar{\mu}_i$$

the class with the smallest distance is the recognition result for the character image.
CN 200810239331 2008-12-10 2008-12-10 Method for character identification through fusing binary image and gray level image Expired - Fee Related CN101751565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810239331 CN101751565B (en) 2008-12-10 2008-12-10 Method for character identification through fusing binary image and gray level image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200810239331 CN101751565B (en) 2008-12-10 2008-12-10 Method for character identification through fusing binary image and gray level image

Publications (2)

Publication Number Publication Date
CN101751565A CN101751565A (en) 2010-06-23
CN101751565B true CN101751565B (en) 2013-01-02

Family

ID=42478527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810239331 Expired - Fee Related CN101751565B (en) 2008-12-10 2008-12-10 Method for character identification through fusing binary image and gray level image

Country Status (1)

Country Link
CN (1) CN101751565B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102011075275A1 (en) * 2011-05-04 2012-11-08 Bundesdruckerei Gmbh Method and device for recognizing a character
CN102750530B (en) * 2012-05-31 2014-11-26 贺江涛 Character recognition method and device
CN103854020B (en) * 2012-11-29 2018-11-30 捷讯平和(北京)科技发展有限公司 Character recognition method and device
CN103679208A (en) * 2013-11-27 2014-03-26 北京中科模识科技有限公司 Broadcast and television caption recognition based automatic training data generation and deep learning method
CN106257495A (en) * 2015-06-19 2016-12-28 阿里巴巴集团控股有限公司 A kind of digit recognition method and device
CN106203434B (en) * 2016-07-08 2019-07-19 中国科学院自动化研究所 Document Image Binarization Method Based on Symmetry of Stroke Structure
CN108319958A (en) * 2018-03-16 2018-07-24 福州大学 A kind of matched driving license of feature based fusion detects and recognition methods
CN108830138B (en) * 2018-04-26 2021-05-07 平安科技(深圳)有限公司 Livestock identification method, device and storage medium
CN109520706B (en) * 2018-11-21 2020-10-09 云南师范大学 Screw hole coordinate extraction method of automobile fuse box
CN109919253A (en) * 2019-03-27 2019-06-21 北京爱数智慧科技有限公司 Character identifying method, device, equipment and computer-readable medium
CN111583217A (en) * 2020-04-30 2020-08-25 深圳开立生物医疗科技股份有限公司 Tumor ablation curative effect prediction method, device, equipment and computer medium
CN112200247B (en) * 2020-10-12 2021-07-02 西安泽塔云科技股份有限公司 Image processing system and method based on multi-dimensional image mapping

Also Published As

Publication number Publication date
CN101751565A (en) 2010-06-23

Similar Documents

Publication Publication Date Title
CN101751565B (en) Method for character identification through fusing binary image and gray level image
Boiman et al. In defense of nearest-neighbor based image classification
CN100426314C (en) Feature classification based multiple classifiers combined people face recognition method
Banerji et al. New image descriptors based on color, texture, shape, and wavelets for object and scene image classification
CN101739555B (en) Method and system for detecting false face, and method and system for training false face model
CN104318252B (en) Hyperspectral image classification method based on stratified probability model
CN104392241B (en) A kind of head pose estimation method returned based on mixing
US20120093420A1 (en) Method and device for classifying image
CN106778586A (en) Offline handwriting signature verification method and system
CN102663413A (en) Multi-gesture and cross-age oriented face image authentication method
Ameur et al. Fusing Gabor and LBP feature sets for KNN and SRC-based face recognition
CN101968813A (en) Method for detecting counterfeit webpage
CN102682287A (en) Pedestrian detection method based on saliency information
CN113239839B (en) Expression recognition method based on DCA face feature fusion
CN106169073A (en) A kind of expression recognition method and system
Cai et al. Traffic sign recognition algorithm based on shape signature and dual-tree complex wavelet transform
Sinha et al. New color GPHOG descriptors for object and scene image classification
Nanni et al. Ensemble of texture descriptors for face recognition obtained by varying feature transforms and preprocessing approaches
Han et al. Multilinear supervised neighborhood embedding of a local descriptor tensor for scene/object recognition
Chen et al. Unconstrained face verification using fisher vectors computed from frontalized faces
Aly et al. A multi-modal feature fusion framework for kinect-based facial expression recognition using dual kernel discriminant analysis (DKDA)
CN107578005A (en) A LBP Face Recognition Method in Complex Wavelet Transform Domain
CN103942572A (en) Method and device for extracting facial expression features based on bidirectional compressed data space dimension reduction
CN109376680A (en) A fast face recognition method based on the efficient fusion of Hog and Gabor features based on near-infrared face images
Alrashed et al. Facial gender recognition using eyes images

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130102

CF01 Termination of patent right due to non-payment of annual fee