CN104751171A - Method of classifying Naive Bayes scanned certificate images based on feature weighting - Google Patents

Info

Publication number
CN104751171A
Authority
CN
China
Prior art keywords
image
certificate
probability
feature
certificate image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510100700.2A
Other languages
Chinese (zh)
Other versions
CN104751171B (en)
Inventor
龙军
祝莉媛
张昊
刘献如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN201510100700.2A
Publication of CN104751171A
Application granted
Publication of CN104751171B
Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a feature-weighted naive Bayes method for classifying scanned certificate images. The Hough transform is applied to a preprocessed certificate image to locate, segment and resize its round seal, and the HSV color feature vector of the seal region and the image aspect ratio are extracted. A certificate image database is built, and every image in it is processed by the same steps to obtain the seal HSV color feature vector and aspect ratio of each scanned certificate image. From these feature vectors, the probabilities with which different data combinations occur in the database are calculated, weighted and saved. The naive Bayes algorithm then uses these probabilities to compute the most likely category of the image to be classified, and the image is assigned to that category if the probability meets a set threshold. The method classifies certificate images simply and quickly and improves the efficiency of certificate image retrieval.

Description

Feature-weighted naive Bayes method for classifying scanned certificate images

Technical field

The invention relates to an image classification method, and in particular to a method for classifying scanned certificate images.

Background

In recent years image retrieval has been a very popular topic, and its retrieval targets range over things that swim in the sea, fly in the sky and walk on the ground. Image classification is a preprocessing step for image retrieval and can effectively improve retrieval accuracy. Although many image classification and retrieval systems exist for different kinds of image datasets, little attention has been paid to the classification and retrieval of scanned certificate images, even though these images are often important supporting material for award applications or company expansion. To ensure that such certificate images are used legitimately and that the same certificate is not used more than once, duplicate checking of scanned images within a dedicated scanned-certificate dataset is very important for some retrieval systems; it is somewhat similar to similarity checking of documents.

The image features used by today's popular content-based image classification and retrieval systems are color, texture, shape and spatial relationships. Scanned certificate images, however, are of low quality, come in many types and have diverse layouts; they contain image marks with specific meanings as well as a concise description of the award. It is therefore difficult to determine, using existing algorithms alone, whether a massive image library contains an image file similar to the certificate under test. The features of scanned images have to be analyzed specifically, and features that better describe certificate images have to be selected. How to use computer technology to perform fast and accurate similarity detection on attached supporting material, namely scanned images, is a problem that the review of national science and technology awards urgently needs to solve.

Summary of the invention

The invention provides a method for classifying scanned certificate images that classifies certificate images quickly and effectively and can significantly improve the accuracy of certificate image retrieval.

To achieve this, the technical solution of the invention is as follows.

A feature-weighted naive Bayes method for classifying scanned certificate images comprises the following steps:

Step 1: Build a likelihood probability index of the different data combinations of scanned certificate images;

Step 2: Read the scanned certificate image to be classified and preprocess it;

Step 3: Locate the round seal in the preprocessed certificate image using the Hough transform, obtain the circumscribed rectangular region of the seal, and extract the HSV color feature vector of the seal region;

Step 4: Weight the salient feature items of the HSV color feature vector;

Step 5: Calculate and record the probabilities with which different data combinations occur in the HSV color feature vector of the extracted seal region;

Step 6: Using the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained during training, apply the naive Bayes algorithm to calculate the classification of the image to be classified, and return the scanned certificate images that meet the set threshold as the classification result.

The beneficial effects of the invention are as follows. The method applies the Hough transform to a preprocessed certificate image to locate, segment and resize its round seal, and extracts the HSV color feature vector of the seal region and the image aspect ratio. A certificate image database is built, and every image in it is processed by the same steps to obtain the seal HSV color feature vector and aspect ratio of each scanned certificate image. From these feature vectors, the probabilities with which different data combinations occur in the database are calculated, weighted and saved. The naive Bayes algorithm then uses these probabilities to compute the most likely category of the image to be classified, and the image is assigned to that category if the probability meets the set threshold. The method classifies certificate images simply and quickly and effectively improves the efficiency of certificate image retrieval.

Description of drawings

FIG. 1 is a flowchart of the image classification method according to an embodiment of the invention.

Detailed description

The invention is further described below with reference to the accompanying drawing and examples.

Referring to FIG. 1, the feature-weighted naive Bayes method for classifying scanned certificate images in this embodiment comprises the following steps:

A: Input the scanned certificate image to be classified and preprocess it;

B: Locate the round seal in the preprocessed certificate image using the Hough transform, obtain the circumscribed rectangular region of the seal, and extract the HSV color feature vector of the seal region;

C: Weight the salient feature items of the HSV color feature vector;

D: Calculate and record the probabilities with which different data combinations occur in the HSV color feature vector of the extracted seal region;

Every certificate image in the certificate image database is processed by steps A to D above. The prior probability of each class of scanned certificate image in the database and the probabilities with which different data combinations occur in the HSV color feature vectors of the extracted seal regions are calculated and recorded, which builds the likelihood probability index of the different data combinations of scanned certificate images;

E: Using the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained during training, apply the naive Bayes algorithm to calculate the classification of the image to be classified, and return the scanned certificate images that meet the set threshold as the classification result;

The naive Bayes algorithm used by this method is as follows:

$$v_{NB} = \arg\max_{v_j} P(v_j) \prod_i P(a_i \mid v_j)$$

$$P(v_j \mid L_k) = P(v_j) \prod_i P(L_i \mid v_j)$$

The goal of this classification method is to obtain the most probable class of a certificate image from the seal feature vector of the image to be classified. P(v_j) is the prior probability; it is obtained simply by calculating the frequency with which each class occurs in the certificate image database. v_NB denotes the target value output by the naive Bayes classifier. In short, the naive Bayes learning method estimates the various P(v_j) and P(a_i|v_j) terms from their frequencies in the training data; these estimates correspond to the hypothesis to be learned, which is then used to classify with the naive Bayes rule. The naive Bayes algorithm differs from other classification algorithms in that it only needs to count how often different data combinations occur in the training examples, and no search is required.

(L_k0, L_k1, ..., L_k16) is the HSV color feature vector of the seal region of the query image together with the image aspect ratio, and (L_i0, L_i1, ..., L_i16) is the HSV color feature vector of the seal region of a scanned certificate image in the database together with its image aspect ratio.
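The frequency-counting view of naive Bayes described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names `train_naive_bayes` and `classify`, the class labels and the tiny two-dimensional feature vectors are assumptions made for the example, and no feature weighting or thresholding is applied here.

```python
from collections import Counter, defaultdict


def train_naive_bayes(samples, labels):
    """Estimate P(v_j) and P(a_i | v_j) by counting frequencies in the training data."""
    class_counts = Counter(labels)
    priors = {label: count / len(labels) for label, count in class_counts.items()}
    value_counts = defaultdict(Counter)  # (class, dimension) -> counts of each value
    for features, label in zip(samples, labels):
        for dim, value in enumerate(features):
            value_counts[(label, dim)][value] += 1
    likelihoods = {
        key: {value: count / class_counts[key[0]] for value, count in counter.items()}
        for key, counter in value_counts.items()
    }
    return priors, likelihoods


def classify(features, priors, likelihoods):
    """Return the class v_j that maximizes P(v_j) * prod_i P(a_i | v_j), plus all scores."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for dim, value in enumerate(features):
            # A value never seen for this class gets probability 0 (no smoothing).
            score *= likelihoods.get((label, dim), {}).get(value, 0.0)
        scores[label] = score
    return max(scores, key=scores.get), scores
```

Training only fills lookup tables, which is why no search is involved; classification is one table lookup per feature dimension and class.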

The preprocessing in step A uses existing noise filtering and skew correction methods.

In step B, an existing round-seal localization method is applied to the preprocessed certificate image, the circumscribed rectangle of the located seal is segmented out to obtain the seal region, and the HSV color feature vector of the seal region is extracted.
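The text relies on an existing seal localization method and does not spell it out. As a rough illustration of how a circle Hough transform can locate a seal and yield its circumscribed rectangle, the sketch below votes for circle centers in pure Python. It assumes the edge pixels have already been extracted and that the seal radius is known in advance; both are simplifications, and the names `locate_circle` and `circumscribed_rect` are hypothetical.

```python
import math


def locate_circle(edge_points, width, height, radius, angle_steps=72):
    """Circle Hough transform for one known radius: every edge pixel votes for
    each center that would place it on a circle of that radius."""
    votes = {}
    for x, y in edge_points:
        candidates = set()
        for step in range(angle_steps):
            theta = 2 * math.pi * step / angle_steps
            cx = round(x - radius * math.cos(theta))
            cy = round(y - radius * math.sin(theta))
            if 0 <= cx < width and 0 <= cy < height:
                candidates.add((cx, cy))
        for center in candidates:  # at most one vote per edge pixel and center
            votes[center] = votes.get(center, 0) + 1
    if not votes:
        return None
    return max(votes, key=votes.get)


def circumscribed_rect(cx, cy, radius, width, height):
    """Bounding rectangle (left, top, right, bottom) of the circle, clipped to the image."""
    return (max(cx - radius, 0), max(cy - radius, 0),
            min(cx + radius, width - 1), min(cy + radius, height - 1))
```

A production system would more likely search over a range of radii and work on an edge map computed from the image; the fixed radius keeps the voting idea visible.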

The specific operation steps are as follows:

1) Using an existing round-seal localization method, segment out the circumscribed rectangle of the located seal to obtain the seal region;

2) Non-uniformly quantize the three components hue H, saturation S and value V into 8, 4 and 4 levels respectively:

$$
H = \begin{cases}
0 & H \in [315, 23] \\
1 & H \in [24, 50] \\
2 & H \in [51, 75] \\
3 & H \in [76, 155] \\
4 & H \in [156, 195] \\
5 & H \in [196, 275] \\
6 & H \in [276, 290] \\
7 & H \in [290, 316]
\end{cases}
\qquad
S = \begin{cases}
0 & S \in [0, 0.08] \\
1 & S \in (0.08, 0.4] \\
2 & S \in (0.4, 0.67] \\
3 & S \in (0.67, 1.0]
\end{cases}
\qquad
V = \begin{cases}
0 & V \in [0, 0.08] \\
1 & V \in (0.08, 0.4] \\
2 & V \in (0.4, 0.67] \\
3 & V \in (0.67, 1.0]
\end{cases}
$$

The HSV space of the seal region is thus divided into L_H + L_S + L_V intervals, where L_H, L_S and L_V are the numbers of quantization levels of H, S and V respectively. This gives a sixteen-dimensional color feature vector; adding the aspect ratio of the scanned image gives the final seventeen-dimensional feature vector;

3) The naive Bayes method counts every data value that occurs and records its frequency. To simplify the calculation, repeated trials showed that reducing every feature value to a single-digit integer gives the best results. The seventeen-dimensional feature used by this method is written (L_k0, L_k1, ..., L_k16), and each component is an integer in [0, 9].
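The quantization and feature extraction steps above might look as follows in Python. This is a sketch under several assumptions that the text does not settle: the 16 color features are taken to be the per-level pixel shares of H, S and V; each share is mapped to a digit by scaling to 0..9; the aspect ratio is mapped to a digit with a hypothetical `aspect_scale` factor; and the hue end points that appear in two intervals (290 and 315 to 316) are assigned as noted in the comments.

```python
def quantize_hue(h):
    """Map a hue in degrees [0, 360) to one of 8 levels."""
    # ASSUMPTION: the listed intervals share the end points 290 and 315-316;
    # here 290 goes to level 6, 315 to level 7 and 316 to level 0.
    if h >= 316 or h < 24:
        return 0
    for level, upper in enumerate((50, 75, 155, 195, 275, 290), start=1):
        if h <= upper:
            return level
    return 7


def quantize_unit(x):
    """Map a saturation or value in [0, 1] to one of 4 levels."""
    if x <= 0.08:
        return 0
    if x <= 0.4:
        return 1
    if x <= 0.67:
        return 2
    return 3


def seal_feature_vector(hsv_pixels, width, height, aspect_scale=5):
    """17 single-digit features: 8 hue + 4 saturation + 4 value levels + aspect ratio."""
    hist = [0] * 16
    for h, s, v in hsv_pixels:
        hist[quantize_hue(h)] += 1
        hist[8 + quantize_unit(s)] += 1
        hist[12 + quantize_unit(v)] += 1
    n = len(hsv_pixels)
    # ASSUMPTION: each level's pixel share is scaled to a digit in 0..9.
    digits = [min(9, count * 10 // n) for count in hist]
    # ASSUMPTION: the aspect ratio (width / height) is scaled and clamped to 0..9.
    digits.append(min(9, int(aspect_scale * width / height)))
    return digits
```

For a mostly red seal, nearly all of the hue mass lands in level 0, so the first digit is large and the other hue digits stay near zero.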

In step C, the salient feature items of the feature vector are weighted.

The distribution of image features has the following property: within one image category, if the statistical distribution of a feature is dense and its dispersion is small, the feature is dominant for that category and is an important feature. Conversely, if the values of a feature are scattered and its dispersion is high, it is an unimportant feature. The standard deviation describes the dispersion of data well, so this method uses the standard deviation to measure feature weights. w_i = {w_k0, w_k1, ..., w_k16} denotes the weights of the feature vector. The standard deviation σ_i of the i-th dimension over the samples of category j is calculated as:

$$\sigma_i = \sqrt{\frac{\sum_{k=1}^{n_j} \left(L_{ki} - \bar{x}_i\right)^2}{n_j - 1}}$$

where n_j is the number of samples of category j, L_ki is the i-th feature value of the k-th sample of category j, and x̄_i is the mean of that feature dimension. The feature importance is denoted e_i, with e_i ∈ [0, 1]; the formula that defines e_i is missing from the source text. The weight of each feature dimension of each sample is then calculated as:

$$w_{ki} = e_i \Big/ \sum_{i=0}^{16} e_i$$
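A sketch of the weighting step follows. The sample standard deviation and the normalization w_i = e_i / Σ e_i come from the text; the mapping from σ_i to the importance e_i does not, because that formula is not available here. The default `importance` function 1 / (1 + σ) is purely a stand-in chosen so that e_i lies in (0, 1] and shrinks as dispersion grows; the function name `feature_weights` is also hypothetical.

```python
from statistics import stdev


def feature_weights(class_samples, importance=lambda sigma: 1.0 / (1.0 + sigma)):
    """Per-dimension weights w_i = e_i / sum(e_i) for the samples of one class."""
    dims = range(len(class_samples[0]))
    # Sample standard deviation (n_j - 1 in the denominator) of each dimension.
    sigmas = [stdev([sample[i] for sample in class_samples]) for i in dims]
    # ASSUMPTION: the real formula for e_i is unknown; `importance` is a placeholder.
    e = [importance(sigma) for sigma in sigmas]
    total = sum(e)
    return [e_i / total for e_i in e]
```

With three samples whose first feature is constant and whose second feature varies widely, the first feature receives most of the weight, which matches the intent described above. At least two samples per class are needed for the standard deviation to be defined.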

The specific steps for calculating and recording the probabilities with which different data combinations occur in the feature vectors of the extracted seal regions are as follows:

1) Count the probability with which each data value occurs in the feature vectors; for example, the probability that the value 4 occurs in the 2nd dimension of class 1 is 30%;

2) Multiply each probability obtained by the weight calculated in step C, and save the result as the probability of that data combination.
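The two steps above amount to building a lookup table keyed by class, dimension and value. The sketch below is one possible reading, assuming the weights are given per class and per dimension; the name `build_weighted_index` and the data layout are illustrative choices rather than anything fixed by the text.

```python
from collections import Counter


def build_weighted_index(samples, labels, weights):
    """Return (priors, index), where index[(label, dim, value)] is the relative
    frequency of `value` in dimension `dim` of class `label`, times its weight."""
    class_counts = Counter(labels)
    priors = {label: count / len(labels) for label, count in class_counts.items()}
    counts = Counter()
    for features, label in zip(samples, labels):
        for dim, value in enumerate(features):
            counts[(label, dim, value)] += 1
    index = {
        (label, dim, value): weights[label][dim] * count / class_counts[label]
        for (label, dim, value), count in counts.items()
    }
    return priors, index
```

For instance, if 3 of 10 class-1 samples have the value 4 in their 2nd dimension (30%) and that dimension carries a weight of 0.1, the stored entry is 0.03.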

The specific steps of the feature-weighted naive Bayes classification of scanned certificate images are as follows:

1) Using the probabilities of the different data combinations obtained in step D and the naive Bayes algorithm, calculate the probability that the certificate image to be classified belongs to each class. For example, suppose image A is assumed to belong to class 1 and the value 4 occurs in its 2nd dimension: the corresponding probability is looked up among the probabilities saved in step D, and the same lookup is performed for every data combination that occurs before the results are combined;

2) Once the probability that the certificate belongs to each class has been obtained, if the maximum is greater than the threshold, the certificate is assigned to the class with the highest probability. The threshold is set to 0.048.
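The decision rule can be sketched as a table lookup followed by a threshold test. The threshold value 0.048 is taken from the text; everything else is an assumption for illustration, including the function name `classify_with_threshold`, the index layout keyed by (class, dimension, value), and the choice to compare the raw product with the threshold (the text does not say whether the scores are normalized first).

```python
def classify_with_threshold(features, priors, index, threshold=0.048):
    """Pick the class with the largest score; accept it only if it exceeds the threshold."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for dim, value in enumerate(features):
            # Combinations never stored for this class contribute 0.
            score *= index.get((label, dim, value), 0.0)
        scores[label] = score
    best = max(scores, key=scores.get)
    return (best if scores[best] > threshold else None), scores
```

Returning `None` when no class clears the threshold gives a natural way to reject images that belong to none of the certificate classes.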

The classification results for scanned certificate images in this embodiment are shown in the following table.

Image class                                            Test images   Correctly classified   Misclassified   Accuracy
Class I software copyright scanned certificate images  10            10                     0               100%
Class II software copyright scanned certificate images 10            10                     0               100%
Patent scanned certificate images                      10            10                     0               100%
Other interfering images                               10            9                      1               90%

Claims (7)

1. A feature-weighted naive Bayes method for classifying scanned certificate images, characterized in that it comprises the following steps:
Step 1: build a likelihood probability index of the different data combinations of scanned certificate images;
Step 2: read the scanned certificate image to be classified and preprocess it;
Step 3: locate the round seal in the preprocessed certificate image using the Hough transform, obtain the circumscribed rectangular region of the seal, and extract the HSV color feature vector of the seal region;
Step 4: weight the salient feature items of the HSV color feature vector;
Step 5: calculate and record the probabilities with which different data combinations occur in the HSV color feature vector of the extracted seal region;
Step 6: using the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained during training, apply the naive Bayes algorithm to calculate the classification of the image to be classified, and return the scanned certificate images that meet the set threshold as the classification result.
2. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the likelihood probability index of the different data combinations of scanned certificate images in step 1 is built by processing every certificate image in the certificate image database according to steps 2 to 5.
3. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the preprocessing in step 2 uses existing noise filtering and skew correction methods.
4. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the specific operation steps of step 3 are as follows:
1) using an existing round-seal localization method, segment out the circumscribed rectangle of the located seal to obtain the seal region;
2) non-uniformly quantize the three components hue H, saturation S and value V into 8, 4 and 4 levels respectively:
$$
H = \begin{cases}
0 & H \in [315, 23] \\
1 & H \in [24, 50] \\
2 & H \in [51, 75] \\
3 & H \in [76, 155] \\
4 & H \in [156, 195] \\
5 & H \in [196, 275] \\
6 & H \in [276, 290] \\
7 & H \in [290, 316]
\end{cases}
\qquad
S = \begin{cases}
0 & S \in [0, 0.08] \\
1 & S \in (0.08, 0.4] \\
2 & S \in (0.4, 0.67] \\
3 & S \in (0.67, 1.0]
\end{cases}
\qquad
V = \begin{cases}
0 & V \in [0, 0.08] \\
1 & V \in (0.08, 0.4] \\
2 & V \in (0.4, 0.67] \\
3 & V \in (0.67, 1.0]
\end{cases}
$$
The HSV space of the seal region is thus divided into L_H + L_S + L_V intervals, where L_H, L_S and L_V are the numbers of quantization levels of H, S and V respectively, giving a sixteen-dimensional color feature vector; adding the aspect ratio of the scanned image gives the final seventeen-dimensional feature vector;
3) the extracted seventeen-dimensional feature is written (L_k0, L_k1, ..., L_k16), and each component is an integer in [0, 9].
5. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the specific operation of weighting the salient feature items of the feature vector in step 4 is: the standard deviation is used to measure image feature weights, w_i = {w_k0, w_k1, ..., w_k16} denotes the weights of the feature vector, and the standard deviation σ_i of the i-th dimension over the samples of category j is calculated as:
$$\sigma_i = \sqrt{\frac{\sum_{k=1}^{n_j} \left(L_{ki} - \bar{x}_i\right)^2}{n_j - 1}}$$
where n_j is the number of samples of category j, L_ki is the i-th feature value of the k-th sample of category j, and x̄_i is the mean of that feature dimension; the feature importance is denoted e_i, with e_i ∈ [0, 1] (the formula that defines e_i is missing from the source text); the weight of each feature dimension of each sample is then calculated as:
$$w_{ki} = e_i \Big/ \sum_{i=0}^{16} e_i$$
6. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the specific operation of step 5, calculating and recording the probabilities with which different data combinations occur in the feature vector of the extracted seal region, is: count the probability with which each data value occurs in the feature vectors; multiply each probability obtained by the weight calculated in step 4, and save the result as the probability of that data combination.
7. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that step 6 is specifically: using the probabilities of the different data combinations obtained in step 5 and the naive Bayes algorithm, calculate the probability that the certificate image to be classified belongs to each class; once the probability that the certificate belongs to each class has been obtained, if the maximum is greater than the threshold, the certificate is assigned to the class with the highest probability, the threshold being set to 0.048.

CN201510100700.2A 2015-03-09 2015-03-09 Feature-weighted naive Bayes method for classifying scanned certificate images Active CN104751171B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510100700.2A CN104751171B (en) 2015-03-09 2015-03-09 Feature-weighted naive Bayes method for classifying scanned certificate images

Publications (2)

Publication Number Publication Date
CN104751171A (en) 2015-07-01
CN104751171B (en) 2016-04-20

Family

ID=53590824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510100700.2A Active CN104751171B (en) 2015-03-09 2015-03-09 Feature-weighted naive Bayes method for classifying scanned certificate images

Country Status (1)

Country Link
CN (1) CN104751171B (en)

Also Published As

Publication number Publication date
CN104751171B (en) 2016-04-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant