CN102509091B - Airplane tail number recognition method - Google Patents
Airplane tail number recognition method
- Publication number: CN102509091B (application CN201110388239A)
- Authority: CN (China)
- Legal status: Expired - Fee Related (the status is an assumption by Google Patents, not a legal conclusion)
Abstract
The invention discloses an aircraft tail number recognition method. First, the tail number image is preprocessed with Otsu dynamic threshold binarization, separating the tail number from the background; the connected-region method then segments the image into individual characters, and projective-transformation marker points are obtained and used as the reference for an inverse projective transformation of the tail number image. For character recognition, an optimal-parameter support vector machine classifier is studied: character features are extracted with the grid centroid (center-of-gravity) method, the SVM uses an RBF kernel whose optimal parameters are found by a two-pass grid search, and a one-versus-one multi-class scheme yields the recognized tail number. The invention achieves a high recognition rate and is applicable to airport surface environments with diverse lighting conditions.
Description
Technical Field
The invention belongs to the technical field of tail number recognition, and in particular relates to a method for recognizing aircraft tail numbers.
Background Art
Aircraft tail number character recognition, the core of a tail number recognition system, is essentially text recognition, an important branch of pattern recognition. The concept of Optical Character Recognition (OCR) was first proposed by the German scientist Tausheck in 1929. With the emergence and development of computers, OCR has been studied extensively worldwide, and after nearly a century of development it has become one of the most active research topics in pattern recognition. It draws on digital image processing, computer graphics, and artificial intelligence, and is widely applied in those fields. OCR methods can generally be divided into three categories: statistical-feature character recognition, structural character recognition, and recognition based on artificial neural networks. Statistical-feature methods select, as feature vectors, statistical features that are shared within a character class, relatively stable, and discriminative. Commonly used statistical features include the position of the character in the two-dimensional plane, histograms of its horizontal or vertical projections, moment features, and features obtained from frequency-domain or other transforms. Replacing the two-dimensional image with a one-dimensional projection reduces computation and removes the effect of character offset along the projection direction, but it cannot cope with rotation or deformation of the characters.
Structure-based text recognition maps characters into a structural space composed of primitives. After primitive extraction, the recognition process analyzes the character structure using formal language and automata theory, through lexical analysis, tree matching, graph matching, and knowledge reasoning. J. Park analyzed the drawbacks of traditional structural methods and proposed Active Character Recognition, which dynamically selects structural features according to the input image, saving resources and accelerating recognition. Compared with statistical methods, structural recognition is better at distinguishing characters with large shape variation and characters with similar shapes; however, describing and comparing structural features consumes large amounts of storage and computation, so such algorithms are relatively complex to implement and slow to recognize.
Character recognition based on artificial neural networks attempts to achieve efficient recognition by simulating the function and structure of the human brain, and after rapid development in recent years it has been widely applied to character recognition. In an OCR system the neural network mainly acts as a classifier: its input is the character's feature vector and its output is the classification, i.e., the recognition result. Through repeated training, the network optimizes the feature vector, removes redundant and contradictory information, and strengthens between-class differences. Because a neural network has a distributed structure, it is inherently parallelizable, which speeds up the solution of large-scale problems. Krezyak and Le Cun studied the application of BP (Back-Propagation) networks to text recognition; to address the slow learning and weak generalization of BP networks, competitive supervised learning strategies were developed on their basis.
In researching aircraft tail number recognition methods, the tail number must be accurately located, segmented, and recognized. The main difficulties are:
1. Airport surface images are disturbed by environmental factors, so image quality is hard to guarantee;
2. The tail number image may be partially occluded and deformed;
3. Airport surface surveillance images have complex backgrounds, and one image may contain several tail numbers;
4. Lighting and other environmental factors introduce heavy noise into the tail number image, leaving the characters to be segmented blurred, stuck to adjacent characters, or even incomplete.
Summary of the Invention
The purpose of the invention is to address the deficiencies of the prior art by proposing a high-resolution aircraft tail number recognition method that works under different viewing angles, improving recognition performance to meet the requirements of actual domestic systems. The invention studies aircraft tail number recognition, adapting and improving existing pattern recognition and text recognition techniques for this field, providing aircraft and vehicle detection and identification support for China's A-SMGCS research and implementation, and thereby supplying theoretical and technical support for the transition of Chinese airport surface surveillance toward the A-SMGCS system.
To achieve the above object, the technical solution of the invention is an aircraft tail number recognition method comprising the following specific steps:
Step 1: Image capture. Take multiple images from different viewing angles and find among them the images containing the aircraft tail number.
Step 2: Tail number location preprocessing. First judge the tail number region by its regional characteristics, and find its position within the full airport surface surveillance image.
Step 3: Tail number location based on the DCT domain and edge detection. Because the tail number region contains abundant edge information, a location technique based on DCT-domain and edge features is proposed.
Step 4: Segmentation preprocessing with Otsu dynamic threshold binarization. Apply Otsu dynamic threshold binarization to the located tail number region to obtain a clearer target region.
Step 5: Connected-region tail number segmentation. Use the connected-region segmentation method to split the tail number region into individual character regions.
Step 6: Tail number recognition. A recognition algorithm based on an optimal-parameter support vector machine is proposed.
Step 7: Post-processing of the recognition result.
The specific method of step 4, segmentation-image preprocessing with Otsu dynamic threshold binarization, is as follows.

Let the gray levels in the image range over 0 to L-1, and let n_i be the number of pixels with gray level i. The total number of pixels is

N = n_0 + n_1 + ... + n_{L-1} (2)

Normalizing the histogram gives

p_i = n_i / N, with Σ_{i=0}^{L-1} p_i = 1 (3)

A threshold t divides the gray levels into two classes:

C_1 = {1, 2, ..., t}, C_2 = {t+1, t+2, ..., L-1} (4)

From the gray histogram, the occurrence probabilities of C_1 and C_2 are

P_1 = Σ_{i=1}^{t} p_i, P_2 = Σ_{i=t+1}^{L-1} p_i = 1 - P_1 (5)

The means of C_1 and C_2 are

μ_1 = (1/P_1) Σ_{i=1}^{t} i·p_i, μ_2 = (1/P_2) Σ_{i=t+1}^{L-1} i·p_i (6)

and their variances are

σ_1² = (1/P_1) Σ_{i=1}^{t} (i - μ_1)²·p_i, σ_2² = (1/P_2) Σ_{i=t+1}^{L-1} (i - μ_2)²·p_i (7)

The between-class variance of C_1 and C_2 is defined as

σ_B² = P_1·(μ_1 - μ)² + P_2·(μ_2 - μ)², where μ = P_1·μ_1 + P_2·μ_2 (8)

and the within-class variance as

σ_W² = P_1·σ_1² + P_2·σ_2² (9)

The criterion function of the Otsu dynamic binarization algorithm is then

η = σ_B² / σ_W² (10)

Sweep t from the minimum gray value to the maximum; the value T at which η is maximal is the optimal Otsu segmentation threshold. The algorithm is simple to implement: traverse all candidate values of t and keep the best threshold T. The method separates the tail number target from the background and binarizes the image effectively.
The specific character segmentation steps of the connected-region method of step 5, which splits the tail number region into individual character regions, are as follows:
Step A1: Scan the image and find a pixel that does not yet belong to any region, i.e., a new starting point for region growing.
Step A2: Compare this pixel's gray value with those of its 4-neighborhood pixels (up, down, left, right; an 8-neighborhood could also be used, but this system uses the 4-neighborhood to reduce the effect of character adhesion) that do not yet belong to any region; if a given judgment criterion is met, merge them into the same region.
Step A3: Repeat step A2 for the newly merged pixels.
Step A4: Repeat steps A2 and A3 until the region can no longer grow.
Step A5: Return to step A1 and look for a pixel that can serve as the starting point of a new region.
Compared with the prior art, the invention has the following advantages:
1. The Otsu-based segmentation preprocessing makes the between-class variance of the two thresholded classes large and the within-class variance small, i.e., their ratio maximal, yielding a dynamic optimal threshold; the method therefore works in airport surface environments with diverse lighting.
2. The inverse projective transformation based on the nationality mark of the tail number removes the projective distortion introduced by the surveillance camera during image acquisition, which effectively improves the recognition rate of the tail number recognition system.
Brief Description of the Drawings
Fig. 1 shows the implementation process of the aircraft tail number recognition method of the invention.
Fig. 2 is a schematic diagram of the grayscale transformation used in location preprocessing; the transformation stretches the gray range within the interval of interest (r0-r1), enhancing contrast.
Fig. 3 shows the location results of the DCT-domain and edge-feature technique: (a) the grayscale image converted from RGB; (b) the image divided into 8×8 blocks, each transformed by a two-dimensional DCT to obtain its DCT coefficient matrix; (c) the edge map of the tail number region obtained with the Canny operator; (d) the final tail number location result.
Fig. 4 shows some experimental results of segmentation preprocessing with Otsu dynamic threshold binarization.
Fig. 5 shows experimental results of character segmentation with the connected-region method.
Fig. 6 shows the process of obtaining control points for the inverse projective transformation of the tail number image.
Fig. 7 shows experimental results of the inverse projective transformation of the tail number image.
Fig. 8 shows the optimal separating hyperplane of a support vector machine for a two-dimensional, two-class, linearly separable sample set.
Fig. 9 shows character features of the tail number extracted with the grid centroid method.
Fig. 10 shows the multi-class classifier scheme.
Detailed Description of the Embodiments
For a better understanding of the technical solution, specific embodiments of the invention are further described below with reference to the accompanying drawings.
Fig. 1 shows the implementation process of the aircraft tail number recognition method of the invention. The implementation details of each step are as follows.
Step 1: Image capture. Take multiple images from different viewing angles and find among them the images containing the aircraft tail number.
In traditional road traffic violation capture, a dedicated camera shoots on a specific trigger, so the captured license plate generally appears in a fixed area of the image. Since the focal length and viewing angle of the camera are fixed, the size and angle of the plate are also relatively fixed, as are its character count and font. This approach does not carry over to airport tail number recognition, for two main reasons. First, tail number segmentation is difficult: because the aircraft's parking position and angle relative to the camera vary and the fuselage is a smooth curved surface, the position and size of the tail number in the image are not fixed and the characters suffer perspective distortion; nor is there any frame separating the tail number from the rest of the image. Second, character diversity: fonts (upright, italic), colors (black on white, white on dark), and lengths (character counts) vary across fuselages, and the inherent similarity between characters further reduces recognition accuracy.
To address these two difficulties, multiple images are taken from different viewing angles, and the images containing the tail number are selected from them.
Step 2: Tail number location preprocessing.
The first preprocessing step for locating the tail number in the surveillance image is grayscale conversion. In the RGB model, if R = G = B the color is a gray, and the common value is called the gray value, denoted g. Converting from color to gray is called grayscale conversion. Because color images occupy considerable storage, they are usually converted to grayscale before recognition to speed up subsequent processing. R, G, and B range over 0-255, so there are 256 gray levels.
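The grayscale conversion described above can be sketched as follows. The patent does not specify the exact weighting of R, G, and B, so the standard BT.601 luma coefficients are assumed here:

```cpp
#include <cstdint>
#include <vector>

// Convert an interleaved RGB image (8 bits per channel) to a grayscale
// image with 256 levels. The 0.299/0.587/0.114 weights are the common
// BT.601 luma coefficients (an assumption; the patent only says g is
// derived from R, G, B, with R = G = B mapping to gray value g).
std::vector<uint8_t> rgbToGray(const std::vector<uint8_t>& rgb,
                               int width, int height) {
    std::vector<uint8_t> gray(width * height);
    for (int i = 0; i < width * height; ++i) {
        double r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
        // Round to the nearest gray level.
        gray[i] = static_cast<uint8_t>(0.299 * r + 0.587 * g + 0.114 * b + 0.5);
    }
    return gray;
}
```

Note that any pixel with R = G = B maps to that common value unchanged, matching the definition of the gray value g in the text.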
The main purpose of the grayscale transformation is to improve image contrast, i.e., to enhance the difference between parts of the original image. If an image is captured in light that is too dark, or underexposed, the whole image is dark (e.g., grays ranging from 0 to 63); if the light is too bright, or the image overexposed, the image is bright (e.g., grays from 200 to 255). Either way the contrast is low: the gray levels are crowded together rather than spread out. A grayscale transformation can be used to enhance the contrast.
The so-called grayscale transformation maps one gray interval onto another. With breakpoints (r_0, s_0) and (r_1, s_1), a piecewise linear grayscale transformation can be defined as (the original equation was lost in extraction; this is the standard three-segment form consistent with Fig. 2):

s = (s_0 / r_0)·r, for 0 ≤ r < r_0
s = ((s_1 - s_0) / (r_1 - r_0))·(r - r_0) + s_0, for r_0 ≤ r < r_1 (1)
s = ((L-1 - s_1) / (L-1 - r_1))·(r - r_1) + s_1, for r_1 ≤ r ≤ L-1

Its principle is shown in Fig. 2, where the abscissa r is the gray level before the transformation and the ordinate s the corresponding gray level after it. The thick broken line shows the mapping from input gray to output gray; the thin dashed straight line shows the identity, i.e., no transformation. Comparing the two, the transformation stretches the gray range within the interval of interest (r_0-r_1), thus enhancing contrast.
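A direct implementation of this piecewise linear stretch might look as follows; the breakpoints r0, r1, s0, s1 are free parameters of the transformation, and the values used in the test are illustrative, not taken from the patent:

```cpp
#include <algorithm>
#include <cstdint>

// Piecewise linear grayscale stretch over 256 levels: expands the input
// interval [r0, r1] of interest onto [s0, s1], compressing the ranges
// outside it, as sketched in Fig. 2 of the patent.
uint8_t stretchGray(uint8_t r, int r0, int r1, int s0, int s1) {
    double s;
    if (r < r0)
        s = static_cast<double>(s0) / r0 * r;                       // [0, r0)
    else if (r <= r1)
        s = static_cast<double>(s1 - s0) / (r1 - r0) * (r - r0) + s0; // [r0, r1]
    else
        s = static_cast<double>(255 - s1) / (255 - r1) * (r - r1) + s1; // (r1, 255]
    // Round and clamp to a valid gray level.
    return static_cast<uint8_t>(std::clamp(s + 0.5, 0.0, 255.0));
}
```

Because the middle segment has slope (s1 - s0)/(r1 - r0) > 1 when the output interval is wider than the input interval, gray levels inside [r0, r1] are pulled apart, which is exactly the contrast enhancement described in the text.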
Step 3: Tail number location based on the DCT domain and edge detection.
First the RGB surveillance image is scaled to 240×320 pixels by pixel interpolation, converted to YUV space, and its luminance component is extracted as a grayscale image, as shown in Fig. 3(a). The grayscale image is divided into 8×8 blocks, a two-dimensional DCT is applied to each block, a suitable DCT-domain feature is selected from the resulting coefficient matrix, and a feature value is computed for each block, as shown in Fig. 3(b). An appropriate classification method then divides the blocks into tail number blocks and background blocks, giving a preliminary location. Because the tail number region is rich in edge information, the Canny operator is applied to it to obtain an edge map, as shown in Fig. 3(c); this map is then scanned horizontally and vertically, the edge density of each block is computed, and a threshold is used to reject misclassified blocks, refining the region to a reliable text area. The final location result of the DCT-domain and edge-feature technique is shown in Fig. 3(d).
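A minimal sketch of the per-block DCT computation follows. The patent does not name the exact DCT-domain feature it selects, so the AC-coefficient energy used here is an assumption chosen because it is large for text-rich blocks and small for flat background:

```cpp
#include <array>
#include <cmath>

using Block = std::array<std::array<double, 8>, 8>;

// Direct 2-D DCT-II of an 8x8 image block, as used to compute a
// DCT-domain feature per block. O(N^4) per block -- adequate for a
// sketch; practical systems use a fast separable DCT.
Block dct8x8(const Block& f) {
    const double pi = 3.14159265358979323846;
    Block F{};
    for (int u = 0; u < 8; ++u) {
        for (int v = 0; v < 8; ++v) {
            double cu = (u == 0) ? std::sqrt(1.0 / 8) : std::sqrt(2.0 / 8);
            double cv = (v == 0) ? std::sqrt(1.0 / 8) : std::sqrt(2.0 / 8);
            double sum = 0.0;
            for (int x = 0; x < 8; ++x)
                for (int y = 0; y < 8; ++y)
                    sum += f[x][y]
                         * std::cos((2 * x + 1) * u * pi / 16.0)
                         * std::cos((2 * y + 1) * v * pi / 16.0);
            F[u][v] = cu * cv * sum;
        }
    }
    return F;
}

// One possible block feature (hypothetical -- the patent only says "a
// suitable DCT-domain feature"): the sum of absolute AC coefficients.
double acEnergy(const Block& F) {
    double e = 0.0;
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v)
            if (u || v) e += std::abs(F[u][v]);
    return e;
}
```

A flat background block has all its energy in the DC coefficient F[0][0] and near-zero AC energy, while a block crossing character strokes has strong AC coefficients, so thresholding this feature separates candidate text blocks from background.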
Step 4: Segmentation preprocessing with Otsu dynamic threshold binarization.
A global dynamic binarization algorithm seeks an optimal threshold from the pixel distribution of the whole grayscale image. The classic Otsu algorithm works as follows: take a threshold t and divide the pixels by gray value into two classes, those greater than or equal to t and those less than t; compute the between-class variance of the two classes and the within-class variance (the weighted variances of the two classes themselves). The threshold t that maximizes the ratio η of the two variances is the optimal binarization threshold. This method gives satisfactory results whether or not the histogram has an obvious bimodal shape, which makes it a strong choice for automatic threshold selection.
The specific steps of the Otsu dynamic binarization algorithm are as follows. Let the gray levels in the image range over 0 to L-1, and let n_i be the number of pixels with gray level i. The total number of pixels is

N = n_0 + n_1 + ... + n_{L-1} (2)

Normalizing the histogram gives

p_i = n_i / N, with Σ_{i=0}^{L-1} p_i = 1 (3)

A threshold t divides the gray levels into two classes:

C_1 = {1, 2, ..., t}, C_2 = {t+1, t+2, ..., L-1} (4)

From the gray histogram, the occurrence probabilities of C_1 and C_2 are

P_1 = Σ_{i=1}^{t} p_i, P_2 = Σ_{i=t+1}^{L-1} p_i = 1 - P_1 (5)

The means of C_1 and C_2 are

μ_1 = (1/P_1) Σ_{i=1}^{t} i·p_i, μ_2 = (1/P_2) Σ_{i=t+1}^{L-1} i·p_i (6)

and their variances are

σ_1² = (1/P_1) Σ_{i=1}^{t} (i - μ_1)²·p_i, σ_2² = (1/P_2) Σ_{i=t+1}^{L-1} (i - μ_2)²·p_i (7)

The between-class variance of C_1 and C_2 is defined as

σ_B² = P_1·(μ_1 - μ)² + P_2·(μ_2 - μ)², where μ = P_1·μ_1 + P_2·μ_2 (8)

and the within-class variance as

σ_W² = P_1·σ_1² + P_2·σ_2² (9)

The criterion function of the Otsu dynamic binarization algorithm is then

η = σ_B² / σ_W² (10)

Sweep t from the minimum gray value to the maximum; the value T at which η is maximal is the optimal Otsu segmentation threshold. The ratio of between-class to within-class variance reflects how the two classes are distributed in pattern space: the larger the between-class variance and the smaller the within-class variance, the larger the distance between the classes and the more similar the pixels within each class, i.e., the better the threshold segmentation.
The Otsu dynamic threshold binarization algorithm is simple to implement: traverse the full range of candidate t and keep the best threshold T. The algorithm is applied in the actual tail number recognition system of this project; experiments show that it separates the tail number target from the background and binarizes the image well. Some experimental results of preprocessing the character segmentation images with it are shown in Fig. 4.
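The threshold sweep described above can be sketched as follows. It maximizes the between-class variance σ_B², which for a fixed histogram is equivalent to maximizing the ratio η, since the total variance σ_B² + σ_W² is constant:

```cpp
#include <cstdint>
#include <vector>

// Otsu dynamic threshold: sweep every candidate t and keep the one
// maximizing the between-class variance (equivalent to maximizing the
// criterion eta of eq. (10), because sigma_B^2 + sigma_W^2 is fixed).
int otsuThreshold(const std::vector<uint8_t>& gray) {
    const int L = 256;
    std::vector<double> p(L, 0.0);
    for (uint8_t g : gray) p[g] += 1.0;
    for (double& pi : p) pi /= gray.size();   // normalized histogram, eq. (3)

    double muTotal = 0.0;                     // global mean gray level
    for (int i = 0; i < L; ++i) muTotal += i * p[i];

    int best = 0;
    double bestVar = -1.0, P1 = 0.0, mu1Sum = 0.0;
    for (int t = 0; t < L - 1; ++t) {
        P1 += p[t];                           // class probability, eq. (5)
        mu1Sum += t * p[t];
        double P2 = 1.0 - P1;
        if (P1 <= 0.0 || P2 <= 0.0) continue; // one class empty: skip
        double mu1 = mu1Sum / P1;             // class means, eq. (6)
        double mu2 = (muTotal - mu1Sum) / P2;
        // Between-class variance, eq. (8) in its simplified form
        // sigma_B^2 = P1*P2*(mu1 - mu2)^2.
        double varB = P1 * P2 * (mu1 - mu2) * (mu1 - mu2);
        if (varB > bestVar) { bestVar = varB; best = t; }
    }
    return best;
}
```

Pixels with gray value at or below the returned threshold form one class and the rest the other; for a tail number image this separates the character strokes from the fuselage background.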
Step 5: Connected-region tail number segmentation.
Ideally, each character of the tail number forms an independent connected region. The connected-region segmentation method divides the image into small regions with uniform properties (the smallest unit being a pixel), examines the properties of neighboring regions, and merges similar regions according to a judgment criterion. Once the minimum bounding rectangle of each connected region is obtained, the position of each tail number character is known. The specific character segmentation steps are as follows:
Step A1: Scan the image and find a pixel that does not yet belong to any region, i.e., a new starting point for region growing.
Step A2: Compare this pixel's gray value with those of its 4-neighborhood pixels (up, down, left, right; an 8-neighborhood could also be used, but this system uses the 4-neighborhood to reduce the effect of character adhesion) that do not yet belong to any region; if a given judgment criterion is met, merge them into the same region.
Step A3: Repeat step A2 for the newly merged pixels.
Step A4: Repeat steps A2 and A3 until the region can no longer grow.
Step A5: Return to step A1 and look for a pixel that can serve as the starting point of a new region.
In the invention, region growing is performed on the threshold-binarized image, which contains only the pixel values 0 and 255, so both finding a seed and the growth criterion are simple. In the experiments, the image is scanned from the top-left corner, left to right and top to bottom, until the first black pixel (gray value 0) is found; this serves as the first seed, and the region grows in the 4-neighborhood, i.e., searching up, down, left, and right for the next qualifying pixel. The criterion is that the difference between the neighboring pixel and the center pixel is 0, i.e., black pixels are merged into the same region. In the program, a queue stores the pixels of the region. If every black pixel encountered were simply enqueued, a pixel could be enqueued several times, filling the queue with much redundant information. A "mark filling" scheme is therefore used: a pixel is marked the first time it is enqueued, distinguishing it from the original 0 and 255 pixels of the image, and already-marked pixels are skipped in subsequent growth. This reduces queue space and speeds up processing. The system's connected-domain segmentation algorithm for the binarized image is thus:
Step B1: scan the binary image from the upper-left corner and find the first black pixel; take this pixel as the seed of the first connected region, mark it, push its coordinates into the queue std::queue regionStack, and increment the counter nRegionSize recording the size of the current region (nRegionSize++).
Step B2: pop a pixel from the queue and perform a 4-neighborhood search around it; mark each unmarked black pixel found, append it to the tail of the queue, merge it into the current connected region, and increment nRegionSize.
Step B3: repeat step B2 for the newly merged pixels in the queue.
Step B4: repeat steps B2 and B3 until the region can no longer grow and the queue is empty.
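Steps B1–B4 above describe a queue-based region growing (a breadth-first flood fill) with first-enqueue marking. The patent's implementation uses a C++ std::queue; the following Python sketch of the same procedure is illustrative, with made-up function and variable names:

```python
from collections import deque

def connected_regions(img):
    """Label the 4-connected black-pixel (value 0) regions of a binary image.

    img is a list of rows containing only 0 (black) and 255 (white).
    Returns a list of regions, each a list of (row, col) coordinates,
    sorted largest first. Pixels are marked on first enqueue, so no
    pixel enters the queue twice (the "mark filling" method).
    """
    rows, cols = len(img), len(img[0])
    marked = [[False] * cols for _ in range(rows)]
    regions = []
    for r0 in range(rows):              # scan top-to-bottom, left-to-right
        for c0 in range(cols):
            if img[r0][c0] != 0 or marked[r0][c0]:
                continue                # not black, or already labeled
            marked[r0][c0] = True       # step B1: mark the seed on enqueue
            queue = deque([(r0, c0)])
            region = []
            while queue:                # steps B2-B4: grow until queue empty
                r, c = queue.popleft()
                region.append((r, c))
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # 4-neighborhood
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and img[nr][nc] == 0 and not marked[nr][nc]):
                        marked[nr][nc] = True   # mark before enqueue
                        queue.append((nr, nc))
            regions.append(region)
    # sort regions by size, largest first, so the top 6 can be kept as the
    # tail number components and the remainder discarded as noise
    regions.sort(key=len, reverse=True)
    return regions
```

Keeping `regions[:6]` and re-sorting those six by leftmost column would then recover the nationality mark, the "-" separator, and the four registration digits in reading order, as described below.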
The above operations yield one connected domain of the aircraft tail number image; repeating them yields all connected regions. According to the standard format for aircraft registration numbers defined by the International Civil Aviation Organization, a normal tail number consists of 6 connected regions. The connected regions are therefore sorted by size, and those ranked after the sixth are discarded as noise. Taken from left to right, the top six regions are the nationality mark of the tail number, the "-" separator, and the four digits of the registration mark. Figure 5 shows the experimental result of tail number character segmentation using the connected-region method.
Step 6: aircraft tail number recognition.
Step C1: recognition preprocessing by inverse projective transformation.
To perform the inverse projective transformation of the tail number image, the coordinates of four "beacon" points in the tail number image and their corresponding coordinates in the original image must be obtained; from these the projective transformation matrix T can be computed. Through the tail number localization and segmentation applied to the airport surface surveillance images in the preceding sections, the individual characters of the tail number have already been obtained, including the nationality mark and the registration mark. Starting from the letters of the nationality mark in the tail number image, the present invention obtains the "beacon" control points of the projective transformation and applies the inverse projective transformation to the tail number image.
First, consider the nationality mark obtained by tail number localization and segmentation (i.e., the letter "B" after projective distortion; the inverse projective transformation method proposed here is currently limited to tail numbers whose nationality mark indicates registration in China). The points where the nationality mark is tangent to straight lines of slope k1 = 1 and k2 = -1 are taken as two control points of the projective transformation. The slope k of the line through these two control points is analyzed to decide whether an inverse projective transformation is needed; in the present invention it is performed when the angle between this line and the y-axis exceeds 10°. The tangent points between a straight line of slope k and the upper and lower parts of the right side of the nationality mark are then taken as the other two control points. The control-point selection process is shown in Figure 6. With the four control points of the nationality mark obtained by this algorithm, the projective transformation matrix T in the projective transformation equation above can be determined, after which the inverse projective transformation can be applied to the entire tail number image.
In practical applications, preprocessing the tail number image with this inverse projective transformation works well and effectively removes the projective distortion present in images captured by the airport surface surveillance cameras. The result is shown in Figure 7.
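Given the four control points and their target positions, the 3×3 projective matrix T has 8 degrees of freedom and can be recovered from the four correspondences by solving a linear system; warping the image through the inverse of T then removes the distortion. A sketch with NumPy, using the standard direct linear formulation (the point values in the usage note are made up for illustration):

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 projective matrix T mapping src points to dst points.

    src, dst: sequences of four (x, y) point pairs. Each correspondence
    contributes two rows of the linear system; T is fixed by setting its
    bottom-right entry to 1, leaving 8 unknowns for 8 equations.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(T, pt):
    """Map a point through T in homogeneous coordinates."""
    x, y, w = T @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)
```

For example, mapping the unit square corners (0,0), (1,0), (1,1), (0,1) to (0,0), (2,0), (2,2), (0,2) recovers a pure scaling matrix; inverting T with `np.linalg.inv(T)` and resampling the image through the inverse map implements the inverse projective transformation of the tail number image.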
Step C2: feature extraction of aircraft tail number characters for the support vector machine.
Statistical learning theory, built on the principle of structural risk minimization, is a theory developed specifically for small-sample statistics. It establishes a sound theoretical framework for statistical pattern recognition with limited samples and, more broadly, for machine learning problems, and from it a new pattern recognition method was developed: the support vector machine (SVM). This is the youngest part of statistical learning theory; its main content was essentially completed between 1992 and 1995, and it is still developing. It is fair to say that the growing attention paid to statistical learning theory since the 1990s is largely due to its development of the SVM as a general-purpose learning method.
The SVM method originates from the optimal separating hyperplane for the linearly separable case. Consider the two-dimensional, two-class, linearly separable sample set of Figure 8, where points marked "+" and "×" represent the training samples of the two classes. H is a line that separates the two classes without error; H1 and H2 are the lines parallel to H passing through the samples of each class closest to H, and the distance between H1 and H2 is called the classification margin. The optimal separating line must not only separate the two classes without error but also maximize this margin. In terms of the preceding discussion, error-free separation makes the empirical risk minimal (zero), while maximizing the margin minimizes the confidence interval of the generalization bound and hence the true risk. Extending this model to a high-dimensional space, the optimal separating line becomes the optimal separating hyperplane.
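In the usual notation, with training samples (x_i, y_i) and labels y_i ∈ {+1, -1}, the optimal hyperplane just described is the solution of the standard hard-margin problem (a textbook formulation, stated here for reference rather than quoted from the patent):

```latex
% maximize the margin 2/\lVert w\rVert by minimizing \lVert w\rVert^2
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\qquad \text{s.t.} \qquad y_i\,(w^{\top}x_i + b) \ge 1,\quad i = 1,\dots,n
```

The constraints enforce error-free separation (zero empirical risk), and minimizing ||w|| maximizes the margin 2/||w|| between H1 and H2.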
The SVM-based feature extraction of aircraft tail number characters is as follows.
Feeding the character image directly into the classifier would incur a large computational cost. For example, if a single tail number character image is 16×16 pixels and the gray value of each pixel is used as a feature, every input is a 256-dimensional feature vector. So large a feature set is unfavorable both for computational complexity and for classifier performance.
The general aim of character feature extraction is to find feature vectors that are effective from the standpoint of recognition: feature values of different samples from the same class should be very similar, while those of samples from different classes should differ greatly. For character recognition, extracting effective character features is the primary task. Character features fall into two broad categories: structural features and statistical features.
Structural feature extraction focuses on determining structural information expressed in primitives; current structural features are mainly derived from skeletons, contours, and strokes. The skeleton is an abstract conception of a character, and skeleton-based structural features include feature points such as endpoints, intersections, and turning points. Skeleton-based feature extraction depends heavily on the quality of image thinning: existing thinning algorithms all introduce some topological changes, such as Y-shaped bifurcations, burrs, and broken lines, which demands considerable flexibility in the subsequent recognition rules. The contour can also reflect the structure of the character image; a series of structural features is obtained by finding the farthest, nearest, maximum, and minimum points of the contour within a certain range. Compared with the skeleton, the contour gives more precise positions and saves the cost of thinning, but it is easily affected by stroke width and broken lines, making it better suited to environments with good image quality and consistent writing.
Statistical features extract the information in the raw data most relevant to classification, minimizing within-class scatter and maximizing between-class scatter; the features should remain as invariant as possible under deformations within the same character class. Statistical features divide into global and local features. Global features transform the entire character image and use the transform coefficients as features; they mainly include the KL (Karhunen-Loeve) transform, Fourier transform, Hadamard transform, Hough transform, and moment features. Local features transform the image within a window of a specific size at a specific position; they mainly include local grayscale features, projection features, and directional line-element features. The local grayscale feature, also called the coarse grid feature, divides the normalized image into fixed or elastic grid cells and computes the average gray level or the number of target pixels in each cell, yielding a feature vector whose dimension equals the number of cells. The projection feature projects the normalized image onto the X and Y directions to obtain two N-dimensional feature vectors; it is simple to compute and discriminates well in coarse classification. The directional line-element feature divides the character into a grid and, within each cell, classifies the neighboring black pixels of each point by direction.
The structural and statistical approaches each have advantages and drawbacks. For statistical features, once a feature is chosen the extraction algorithm is simple and easy to train, and a relatively high recognition rate can be obtained on a given training set; the biggest drawback is that feature selection is difficult. A main advantage of structural features is that they describe the structure of a character, decomposing the character pattern into sub-patterns such as strokes, stroke segments, and radicals; geometric and structural knowledge can be exploited effectively during recognition, giving highly reliable results, but the computation is heavy, and segments of unclear attribution are hard to represent and prone to erroneous coding.
Experience shows that no single kind of feature can perfectly represent an arbitrary pattern. In practical problems the most important features are not easy to identify, or cannot be measured under the given conditions, which complicates feature selection and extraction and makes it one of the most difficult tasks in building a pattern recognition system. How to combine the advantages of the two approaches organically therefore remains a topic worth further study.
The present invention finally adopts the grid centroid method to extract features of the character image. The steps of centroid feature extraction are as follows:
Step (1): compute the centroid O0 of the image.
Step (2): split the image into four parts at O0 and compute the centroid of each part, obtaining O1, O2, O3, O4.
Step (3): split each part into four sub-parts at O1, O2, O3, O4 respectively, forming 16 sub-parts in total, and take the centroid O5, O6, ..., O20 of each sub-part.
Two centroid features, of dimension 24 and 48 respectively, are used. The 24-dimensional feature takes the normalized X and Y coordinates of O1, O2, O3, O4 together with the pixel counts of the 16 parts into which the image is divided; the 48-dimensional feature takes the X and Y coordinates of O5, O6, ..., O20 together with the pixel counts of the same 16 parts.
This grid centroid feature reflects the distribution of the character within the image well, so it discriminates strongly between character classes and is little affected by noise; because the feature dimension is small, its space and time complexity are both low, making it a good feature extraction method for character images. Let a grayscale tail number image be of size M×N, with f(x, y) the gray value of the pixel in row (x-1), column (y-1). The centroid (x0, y0) of the image is computed by the standard weighted average:

x0 = Σ_{x=1..M} Σ_{y=1..N} x·f(x, y) / Σ_{x=1..M} Σ_{y=1..N} f(x, y)
y0 = Σ_{x=1..M} Σ_{y=1..N} y·f(x, y) / Σ_{x=1..M} Σ_{y=1..N} f(x, y)
The effect of centroid feature extraction is shown in Figure 9.
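Steps (1)–(3) and the centroid formula can be sketched as follows. `grid_centroid_features` returns the 24-dimensional variant on a binary character image (black pixels have value 0); function names, the rounding of centroids to pixel positions, and the fallback for empty cells are illustrative choices, not taken from the patent:

```python
def centroid(img, r0, r1, c0, c1):
    """Center of gravity of the black pixels (value 0) in img[r0:r1][c0:c1]."""
    sr = sc = n = 0
    for r in range(r0, r1):
        for c in range(c0, c1):
            if img[r][c] == 0:        # black (character) pixel
                sr, sc, n = sr + r, sc + c, n + 1
    if n == 0:                        # empty cell: fall back to geometric center
        return ((r0 + r1) / 2, (c0 + c1) / 2)
    return (sr / n, sc / n)

def split4(box, at):
    """Split box = (r0, r1, c0, c1) into four sub-boxes at the point 'at'."""
    r0, r1, c0, c1 = box
    r = min(max(int(round(at[0])), r0), r1)
    c = min(max(int(round(at[1])), c0), c1)
    return [(r0, r, c0, c), (r0, r, c, c1), (r, r1, c0, c), (r, r1, c, c1)]

def grid_centroid_features(img):
    """24-dim grid centroid feature: normalized coordinates of O1..O4
    plus the black-pixel counts of the 16 sub-parts."""
    rows, cols = len(img), len(img[0])
    whole = (0, rows, 0, cols)
    o0 = centroid(img, *whole)                 # step (1): O0
    feats, cells = [], []
    for q in split4(whole, o0):                # step (2): quadrants at O0
        oq = centroid(img, *q)                 # centroids O1..O4
        feats += [oq[0] / rows, oq[1] / cols]  # normalized coordinates
        cells += split4(q, oq)                 # step (3): the 16 sub-parts
    for r0, r1, c0, c1 in cells:               # pixel counts of the 16 parts
        feats.append(sum(img[r][c] == 0
                         for r in range(r0, r1) for c in range(c0, c1)))
    return feats
```

Because the four sub-boxes of each split tile their parent exactly, the 16 counts always sum to the total number of black pixels in the character.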
Step C3: multi-class classifier recognition.
The support vector machine is inherently a two-class discriminant method, yet practical applications often require multi-class classification, which involves converting a multi-class problem into two-class problems. There are two main strategies for implementing a multi-class SVM:
1. Decompose the multi-class problem into multiple two-class problems. This approach redistributes the training samples to fit the new two-class problems and, depending on the algorithm, adopts different strategies to decide the class of a test sample.
2. Merge the solution of the multiple separating surfaces into a single optimization problem. This approach generalizes the two-class formulation, solving one large quadratic program in a single pass that separates all classes simultaneously. Although it appears simpler than the first strategy, its algorithmic complexity is much greater: it requires more variables, takes longer to train, and does not generalize better than the first strategy, so it is unsuited to large-scale data samples.
Several commonly used multi-class classification methods based on the first strategy are introduced below.
1. One-versus-rest. The one-versus-rest (OVR) method is one of the earliest and still most widely used. For n classes it constructs n two-class classifiers, the i-th of which separates class i from all remaining classes; during training, the i-th classifier takes the class-i samples of the training set as positive and all other samples as negative. At decision time, the input is passed through the n classifiers to obtain n outputs fi(x) = sgn(gi(x)). If exactly one output is +1, the corresponding class is the predicted class; if more than one output is +1 (several classes claim the input) or none is +1 (no class claims it), the values gi(x) are compared and the class with the largest value wins. The method is simple to implement: only n two-class SVMs need to be trained, so the number of decision functions is small (n) and recognition is fast. Its drawback is equally clear: each classifier is trained on the full sample set, so n quadratic programs must be solved, and since the training speed of each SVM degrades sharply as the number of training samples grows, the total training time is long.
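The OVR decision rule just described (pick the sole positive output, otherwise break the tie by the largest real-valued output) can be sketched as follows; the toy decision functions in the test are illustrative, not the patent's trained classifiers:

```python
def ovr_predict(x, g_funcs):
    """One-versus-rest prediction.

    g_funcs[i](x) is the real-valued output g_i(x) of the classifier
    separating class i from the rest. If exactly one sign is positive,
    that class wins; otherwise (several or no positive outputs) the
    class with the largest g_i(x) is chosen.
    """
    scores = [g(x) for g in g_funcs]
    positive = [i for i, s in enumerate(scores) if s > 0]
    if len(positive) == 1:
        return positive[0]          # exactly one class claims the input
    return max(range(len(scores)), key=lambda i: scores[i])
```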
2. One-versus-one. The one-versus-one (OVO) method, also called pairwise classification, finds all pairwise combinations of the k classes in the training set T, P = k(k-1)/2 in total, forms a two-class training set T(i, j) from the samples of each pair of classes, and trains P decision functions f(i,j)(x) = sgn(gi,j(x)) with the two-class SVM method. At decision time the input X is passed to the P decision functions; if f(i,j)(x) = 1, X is judged to belong to class i and class i receives one vote, otherwise class j receives one vote. The votes of the k classes over the P decision functions are tallied, and the class with the most votes is the final prediction. For a k-class problem this method uses k(k-1)/2 two-class classifiers, many more than the one-versus-rest method; nevertheless, each classification problem is much smaller and simpler to learn. If k is large, however, the number of classifiers required becomes very large and the method slows down considerably.
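The OVO voting rule used by the patent can be sketched as follows, with the trained pairwise decision functions abstracted as a dict mapping each class pair (i, j), i < j, to a function that returns +1 for class i and -1 for class j (the names and the toy functions in the test are illustrative):

```python
from itertools import combinations

def ovo_predict(x, classes, pairwise):
    """One-versus-one prediction: every class pair (i, j) casts one vote.

    pairwise[(i, j)] is the decision function f_(i,j); a return value of 1
    votes for class i, anything else votes for class j. The class with the
    most votes over all k(k-1)/2 classifiers is the final prediction.
    """
    votes = {c: 0 for c in classes}
    for i, j in combinations(classes, 2):
        if pairwise[(i, j)](x) == 1:
            votes[i] += 1
        else:
            votes[j] += 1
    return max(classes, key=lambda c: votes[c])
```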
The two multi-class classifier schemes above are illustrated in Figure 10.
Step 7: post-processing of the recognition results.
Finally, demonstration software for the aircraft tail number recognition system was designed to showcase the proposed algorithm; the tail number segmentation, tail number recognition, and other modules described in the present invention are all embedded in it. Since the video surveillance approach not only provides accurate detection and position measurement of aircraft and vehicles across the airport but also automatically recognizes the tail numbers of aircraft on taxiways and aprons, the recognition results are integrated into the final display interface, automatically labeling the surveillance output.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110388239 CN102509091B (en) | 2011-11-29 | 2011-11-29 | Airplane tail number recognition method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102509091A CN102509091A (en) | 2012-06-20 |
CN102509091B true CN102509091B (en) | 2013-12-25 |
Family
ID=46221172
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201110388239 Expired - Fee Related CN102509091B (en) | 2011-11-29 | 2011-11-29 | Airplane tail number recognition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102509091B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20131225 Termination date: 20211129 |