CN104751171A - Method of classifying Naive Bayes scanned certificate images based on feature weighting - Google Patents
- Publication number
- CN104751171A
- Authority
- CN
- China
- Prior art keywords
- certificate
- probability
- image
- feature
- scanning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a feature-weighted Naive Bayes method for classifying scanned certificate images. In the method, the round seal in a preprocessed certificate image is located, segmented and resized, and the color feature vector of the round seal region in HSV (Hue, Saturation, Value) space is extracted together with the aspect ratio of the image. A certificate image database is built, and every certificate image in it is processed by the same steps to obtain its round-seal HSV color feature vector and aspect ratio. From these feature vectors, the probability with which each data combination occurs in the certificate image database is calculated and stored after feature weighting. The most probable category of an image to be classified is then calculated with the Naive Bayes algorithm from these probabilities, and the image is assigned to that category when the probability meets a set threshold. The method classifies certificate images simply and quickly and improves the efficiency of certificate image retrieval.
Description
Technical field
The present invention relates to an image classification method, and in particular to a method for classifying scanned certificate images.
Background
Image retrieval has been a popular topic in recent years, with search targets ranging from creatures that swim in the sea to those that fly in the sky and walk on land. Image classification is a preprocessing step of image retrieval and can effectively improve retrieval accuracy. Although many classification and retrieval systems exist for different kinds of image data sets, the classification and retrieval of scanned certificate images has received little attention, even though such images are often important supporting material in applications for awards or for company expansion. To ensure that such certificate images are used legitimately and that the same certificate is not used repeatedly, some retrieval systems need to check the scanned images in a dedicated scanned-certificate data set for duplicates, which is somewhat similar to similarity checking of documents. The image features used by current content-based image classification and retrieval systems are color, texture, shape and spatial relationship. Scanned certificate images, however, are low in quality, numerous in kind and varied in layout, and they contain both logos with a specific meaning and brief descriptions of the award. It is therefore difficult to use existing algorithms alone to search a massive image library for an image file similar to the certificate under test. The characteristics of scanned images must be analyzed specifically, and features that better represent certificate images must be chosen. How to use computer technology to perform similarity detection quickly and accurately on the supporting material attached to applications, namely scanned images, is a problem that national science and technology award evaluation urgently needs to solve.
Summary of the invention
The invention provides a method for classifying scanned certificate images that classifies certificate images quickly and effectively and can significantly improve the accuracy of certificate image retrieval.
To achieve the above object, the technical scheme of the invention is as follows:
A feature-weighted Naive Bayes method for classifying scanned certificate images comprises the following steps:
Step 1: build a likelihood probability index of the different data combinations of the scanned certificate images;
Step 2: read the scanned certificate image to be classified and preprocess it;
Step 3: locate the round seal in the preprocessed certificate image with the Hough transform, obtain the bounding rectangle of the round seal, and extract the HSV color feature vector of the round seal region;
Step 4: weight the salient feature items of the HSV color feature vector;
Step 5: calculate and record the probability with which each data combination occurs in the extracted HSV color feature vectors of the round seal region;
Step 6: from the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained in training, calculate the classification of the image with the Naive Bayes algorithm, and return as the classification result the class that meets the set threshold.
The beneficial effects of the invention are as follows. In the feature-weighted Naive Bayes method for classifying scanned certificate images, the round seal in the preprocessed certificate image is located with the Hough transform, segmented and resized, and the color feature vector of the round seal region in HSV space and the aspect ratio of the image are extracted. A certificate image database is built, and each certificate image in it is processed according to the above steps to obtain its round-seal HSV color feature vector and aspect ratio. The probability with which each data combination occurs in the certificate image database is calculated from these feature vectors and stored after weighting. The most probable category of the image to be classified is calculated with the Naive Bayes algorithm from these probabilities, and when this probability meets the set threshold the category of the image is decided. With this classification method, certificate images can be classified quickly and easily, and the efficiency of certificate image retrieval is effectively improved.
Brief description of the drawings
Fig. 1 is a flow chart of the image classification method of an embodiment of the invention.
Detailed description
The invention is further described below with reference to the drawing and an example.
Referring to Fig. 1, the feature-weighted Naive Bayes method for classifying scanned certificate images of this embodiment comprises the following steps:
A: input the scanned certificate image to be classified and preprocess it;
B: locate the round seal in the preprocessed certificate image with the Hough transform, obtain the bounding rectangle of the round seal, and extract the HSV color feature vector of the round seal region;
C: weight the salient feature items of the HSV color feature vector;
D: calculate and record the probability with which each data combination occurs in the extracted HSV color feature vectors of the round seal region;
Each certificate image in the certificate image database is processed according to steps A to D above. The prior probability of each class of scanned certificate image in the database, and the probability with which each data combination occurs in the extracted HSV color feature vectors of the round seal region, are calculated and recorded; this builds the likelihood probability index of the different data combinations of the scanned certificate images;
E: from the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained in training, calculate the classification of the image with the Naive Bayes algorithm, and return as the classification result the class that meets the set threshold;
The Naive Bayes algorithm used by the method is as follows:
$$v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_i P(a_i \mid v_j)$$
The goal of the classification is to obtain the most probable class of a certificate image given the round-seal feature vector of the image to be classified. $P(v_j)$ is the prior probability, which is obtained simply by counting how often each class occurs in the certificate image database. $v_{NB}$ denotes the target value output by the Naive Bayes classifier. In general, the Naive Bayes learning method estimates the various $P(v_j)$ and $P(a_i \mid v_j)$ terms from their frequencies in the training data; these estimates constitute the learned hypothesis, which is then used with the Naive Bayes rule to classify. Unlike other classification algorithms, the Naive Bayes algorithm used here only needs to count the frequency of the different data combinations in the training samples; no search is required.
$(L_{k0}, L_{k1}, \ldots, L_{k16})$ is the HSV color feature vector of the round seal region together with the aspect ratio of the image to be checked, and $(L_{i0}, L_{i1}, \ldots, L_{i16})$ is the HSV color feature vector of the round seal region together with the aspect ratio of a scanned certificate image in the database.
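As a worked illustration of the decision rule $v_{NB} = \arg\max_{v_j} P(v_j) \prod_i P(a_i \mid v_j)$, consider a hypothetical case with two classes $v_1$, $v_2$ and two feature dimensions. All numbers below are invented for illustration and do not come from the patent.

```latex
\begin{aligned}
v_1:&\quad P(v_1)\,P(a_1 \mid v_1)\,P(a_2 \mid v_1) = 0.5 \times 0.3 \times 0.9 = 0.135 \\
v_2:&\quad P(v_2)\,P(a_1 \mid v_2)\,P(a_2 \mid v_2) = 0.5 \times 0.8 \times 0.2 = 0.080 \\
&\quad v_{NB} = \arg\max\{0.135,\ 0.080\} = v_1
\end{aligned}
```

Only products of stored frequencies are compared, which is why no search over hypotheses is needed.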
In step A, the preprocessing uses existing noise filtering and skew correction methods;
In step B, an existing round seal localization method is applied to the preprocessed certificate image; the bounding rectangle of the located round seal is segmented and extracted to obtain the round seal region, and the HSV color feature vector of the round seal region is extracted;
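The segmentation of the bounding rectangle can be sketched as follows. This is a minimal illustration that assumes the circle centre and radius have already been found by some Hough circle detector; the function names `seal_bounding_box` and `crop`, and the representation of the image as a list of pixel rows, are hypothetical and not taken from the patent.

```python
def seal_bounding_box(cx, cy, r, width, height):
    """Bounding rectangle of a circle, clipped to the image.

    Returns (left, top, right, bottom) with right and bottom exclusive.
    """
    left = max(0, cx - r)
    top = max(0, cy - r)
    right = min(width, cx + r + 1)
    bottom = min(height, cy + r + 1)
    return left, top, right, bottom


def crop(image, box):
    """Cut the rectangle `box` out of an image stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Toy 8x6 image whose "pixels" are their own (x, y) coordinates.
image = [[(x, y) for x in range(8)] for y in range(6)]
# Hypothetical detector output: a seal centred at (6, 2) with radius 2.
box = seal_bounding_box(cx=6, cy=2, r=2, width=8, height=6)
region = crop(image, box)
```

Clipping to the image bounds matters for seals stamped near the page edge, where part of the circle may fall outside the scan.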
The concrete steps are as follows:
1) Using an existing round seal localization method, segment and extract the bounding rectangle of the located round seal to obtain the round seal region;
2) Quantize the three components hue H, saturation S and value V non-uniformly into 8, 4 and 4 levels respectively:
The HSV space of the round seal region is thus divided into $L_h + L_s + L_v$ intervals, where $L_h$, $L_s$ and $L_v$ are the numbers of quantization levels of H, S and V respectively. This gives a 16-dimensional color feature vector; adding the aspect ratio of the scanned image gives the final 17-dimensional feature vector;
3) The Naive Bayes method counts each value that occurs and the frequency with which it occurs. To simplify the calculation, and as found through repeated experiments, the best results are obtained when every feature value is reduced to a one-digit integer. The 17-dimensional feature chosen by the method is denoted $(L_{k0}, L_{k1}, \ldots, L_{k16})$, and each component is an integer in [0, 9].
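The feature extraction in steps 2) and 3) can be sketched as below. Several details are assumptions made only for illustration: the bins are uniform (the patent specifies non-uniform intervals whose boundaries are not given here), each bin is reduced to a digit from its share of pixels, and the aspect ratio is mapped to a digit as short side over long side. The function name `seal_feature_vector` is hypothetical.

```python
import colorsys


def seal_feature_vector(pixels, width, height, h_bins=8, s_bins=4, v_bins=4):
    """Build a 17-dimensional vector of one-digit integers in [0, 9].

    pixels: iterable of (r, g, b) tuples in 0..255 from the seal region.
    width, height: size of the scanned image, used for the aspect ratio.
    """
    counts = [0] * (h_bins + s_bins + v_bins)
    n = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # ASSUMPTION: uniform bins; the source uses non-uniform intervals.
        counts[min(int(h * h_bins), h_bins - 1)] += 1
        counts[h_bins + min(int(s * s_bins), s_bins - 1)] += 1
        counts[h_bins + s_bins + min(int(v * v_bins), v_bins - 1)] += 1
        n += 1
    # Reduce each bin's pixel fraction (0..1) to a single digit 0..9.
    digits = [min(int(10 * c / n), 9) if n else 0 for c in counts]
    # ASSUMPTION: aspect ratio digit = floor(10 * short side / long side).
    short, long_ = min(width, height), max(width, height)
    digits.append(min(int(10 * short / long_), 9))
    return digits


# Four pure-red pixels from a hypothetical seal on a 100x200 scan.
vector = seal_feature_vector([(255, 0, 0)] * 4, width=100, height=200)
```

With one-digit values, each feature dimension has at most ten possible outcomes, so the frequency tables used later stay small.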
In step C, the salient feature items of the feature vector are weighted.
The distribution of image features has the following property: within one image category, if the statistical distribution of a feature is dense and its dispersion is small, the feature is dominant for that category and is an important feature. Conversely, if the statistics of a feature are scattered and its dispersion is high, it is an unimportant feature. The standard deviation describes the dispersion of data well, so the method uses the standard deviation to measure the weight of an image feature. $W_i = \{w_{k0}, w_{k1}, \ldots, w_{k16}\}$ denotes the weights of the feature vector. The standard deviation $\sigma_i$ of the $i$-th dimension is computed over the samples of class $j$ in the sample set, where $N_j$ is the number of samples of class $j$, $L_{ki}$ is the $i$-th feature value of the $k$-th sample of image class $j$, and $\bar{L}_i$ is the mean of that feature dimension. Feature importance is denoted $e_i$, with $e_i \in [0, 1]$, and from it the weight of each feature dimension of each sample is obtained.
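A possible realization of standard-deviation-based weighting is sketched below. The exact formulas for $\sigma_i$, $e_i$ and the weights are not reproduced in this text, so the sketch makes two loudly labeled assumptions: the population standard deviation is used, and importance is taken as $e_i = 1 - \sigma_i / \sigma_{\max}$, which lies in $[0, 1]$ and is largest for the least dispersed feature. The function name `feature_weights` is hypothetical.

```python
from statistics import pstdev


def feature_weights(samples):
    """Weights in [0, 1], one per feature dimension, for ONE class.

    samples: list of equal-length feature vectors of the same class.
    """
    columns = list(zip(*samples))  # one tuple of values per dimension
    # ASSUMPTION: population standard deviation (divide by N_j).
    sigmas = [pstdev(column) for column in columns]
    sigma_max = max(sigmas)
    if sigma_max == 0:
        # Every feature is constant in this class: treat all as important.
        return [1.0] * len(sigmas)
    # ASSUMPTION: e_i = 1 - sigma_i / sigma_max, not the patent's formula.
    return [1.0 - s / sigma_max for s in sigmas]


# Three hypothetical samples of one class, three dimensions each.
class_samples = [(4, 0, 9), (4, 2, 1), (4, 4, 5)]
weights = feature_weights(class_samples)
```

In this toy class the first dimension never varies and receives the full weight, while the most scattered dimension receives weight zero, matching the idea that dense features are the important ones.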
The probability with which each data combination occurs in the extracted feature vectors of the round seal region is calculated and recorded as follows:
1) Count the probability with which each value occurs in each dimension of the feature vectors; for example, the probability that the value 4 occurs in the 2nd dimension for class 1 is 30%;
2) Multiply each probability obtained by the weight calculated in step C, and store the result as the probability of that data combination.
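The two steps above can be sketched as a small frequency table. The data layout (`index[cls][dim][value]`) and the function name `build_index` are assumptions chosen for illustration; the sample data reproduces the 30% example from the text with an invented weight of 0.5.

```python
from collections import Counter


def build_index(samples_by_class, weights_by_class):
    """Return (priors, index).

    priors[cls] is the share of training samples in class cls.
    index[cls][dim][value] is weight * relative frequency of `value`
    in dimension `dim` among the samples of class cls.
    """
    total = sum(len(samples) for samples in samples_by_class.values())
    priors, index = {}, {}
    for cls, samples in samples_by_class.items():
        priors[cls] = len(samples) / total
        index[cls] = []
        for dim, column in enumerate(zip(*samples)):
            counts = Counter(column)
            weight = weights_by_class[cls][dim]
            index[cls].append(
                {value: weight * count / len(samples)
                 for value, count in counts.items()}
            )
    return priors, index


# Class 1: the value 4 occurs in the 2nd dimension in 3 of 10 samples (30%).
samples = {"class1": [(0, 4)] * 3 + [(0, 5)] * 7}
weights = {"class1": [1.0, 0.5]}  # hypothetical weights from step C
priors, index = build_index(samples, weights)
```

Here `index["class1"][1][4]` holds the 30% frequency scaled by the weight 0.5.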
The concrete steps of the feature-weighted Naive Bayes classification of scanned certificate images are as follows:
1) From the probabilities of the data combinations obtained in step D and the Naive Bayes algorithm, calculate the probability that the certificate image to be classified belongs to each class. For example, assuming that image A belongs to class 1 and the digit 4 occurs in its 2nd dimension, the corresponding probability is looked up among the probabilities stored in step D; the probabilities looked up in this way for all the data combinations that occur are used in the calculation;
2) Obtain the probability that the certificate belongs to each class; if the maximum is greater than the threshold, the certificate is judged to belong to the class with the maximum probability. The threshold is set to 0.048.
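The lookup and threshold decision can be sketched as follows. The threshold 0.048 is the value stated in the text; everything else is an assumption for illustration, including the function name `classify`, the table layout, the toy probabilities, and the choice to return `None` when no class exceeds the threshold (the text does not say what happens in that case).

```python
from math import prod

THRESHOLD = 0.048  # value stated in the source


def classify(features, priors, index, threshold=THRESHOLD):
    """Return (label, scores); label is None if no class beats the threshold.

    index[cls][dim] maps a feature value to its stored (weighted) probability.
    Values never seen in training contribute a probability of 0.
    """
    scores = {
        cls: priors[cls] * prod(
            index[cls][dim].get(value, 0.0)
            for dim, value in enumerate(features)
        )
        for cls in priors
    }
    best = max(scores, key=scores.get)
    # ASSUMPTION: rejecting with None is our choice, not the patent's.
    return (best if scores[best] > threshold else None), scores


# Hypothetical stored probabilities for two classes and two dimensions.
priors = {"a": 0.5, "b": 0.5}
index = {
    "a": [{4: 0.3}, {1: 0.9}],
    "b": [{4: 0.8, 5: 0.2}, {1: 0.2, 2: 0.4}],
}
accepted, accepted_scores = classify((4, 1), priors, index)
rejected, rejected_scores = classify((5, 2), priors, index)
```

The first image scores 0.135 for class "a" and is accepted; the second reaches only 0.04 for its best class and falls below the threshold.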
The classification results for scanned certificate images in this embodiment are shown in the following table.
Image category | Test images | Correctly classified | Misclassified | Accuracy |
---|---|---|---|---|
Class 1 software copyright scanned certificate images | 10 | 10 | 0 | 100% |
Class 2 software copyright scanned certificate images | 10 | 10 | 0 | 100% |
Patent scanned certificate images | 10 | 10 | 0 | 100% |
Other interfering images | 10 | 9 | 1 | 90% |
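The per-category figures in the table can be combined into an overall accuracy with a few lines of arithmetic. The counts are copied from the table; the overall figure is derived here and is not stated in the patent, and the short category labels are ours.

```python
# (test images, correctly classified) per category, copied from the table.
results = {
    "Class 1 software copyright": (10, 10),
    "Class 2 software copyright": (10, 10),
    "Patent": (10, 10),
    "Other interfering images": (10, 9),
}

tested = sum(n for n, _ in results.values())
correct = sum(c for _, c in results.values())
overall_accuracy = correct / tested

print(f"{correct}/{tested} = {overall_accuracy:.1%}")  # prints "39/40 = 97.5%"
```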
Claims (7)
1. A feature-weighted Naive Bayes method for classifying scanned certificate images, characterized by comprising the following steps:
Step 1: build a likelihood probability index of the different data combinations of the scanned certificate images;
Step 2: read the scanned certificate image to be classified and preprocess it;
Step 3: locate the round seal in the preprocessed certificate image with the Hough transform, obtain the bounding rectangle of the round seal, and extract the HSV color feature vector of the round seal region;
Step 4: weight the salient feature items of the HSV color feature vector;
Step 5: calculate and record the probability with which each data combination occurs in the extracted HSV color feature vectors of the round seal region;
Step 6: from the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained in training, calculate the classification of the image with the Naive Bayes algorithm, and return as the classification result the class that meets the set threshold.
2. The feature-weighted Naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the likelihood probability index of the different data combinations of the scanned certificate images built in step 1 is obtained by processing each certificate image in the certificate image database according to steps 2 to 5.
3. The feature-weighted Naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the preprocessing in step 2 uses existing noise filtering and skew correction methods.
4. The feature-weighted Naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the concrete steps of step 3 are as follows:
1) Using an existing round seal localization method, segment and extract the bounding rectangle of the located round seal to obtain the round seal region;
2) Quantize the three components hue H, saturation S and value V non-uniformly into 8, 4 and 4 levels respectively:
The HSV space of the round seal region is thus divided into $L_h + L_s + L_v$ intervals, where $L_h$, $L_s$ and $L_v$ are the numbers of quantization levels of H, S and V respectively. This gives a 16-dimensional color feature vector; adding the aspect ratio of the scanned image gives the final 17-dimensional feature vector;
3) The extracted 17-dimensional feature is denoted $(L_{k0}, L_{k1}, \ldots, L_{k16})$, and each component is an integer in [0, 9].
5. The feature-weighted Naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the concrete steps of weighting the salient feature items of the feature vector in step 4 are: the standard deviation is used to measure the weight of an image feature; $W_i = \{w_{k0}, w_{k1}, \ldots, w_{k16}\}$ denotes the weights of the feature vector; the standard deviation $\sigma_i$ of the $i$-th dimension is computed over the samples of class $j$ in the sample set, where $N_j$ is the number of samples of class $j$, $L_{ki}$ is the $i$-th feature value of the $k$-th sample of image class $j$, and $\bar{L}_i$ is the mean of that feature dimension; feature importance is denoted $e_i$, with $e_i \in [0, 1]$, and from it the weight of each feature dimension of each sample is obtained.
6. The feature-weighted Naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the concrete steps of step 5, in which the probability with which each data combination occurs in the extracted feature vectors of the round seal region is calculated and recorded, are: count the probability with which each value occurs in the feature vectors; multiply each probability obtained by the weight calculated in step 4, and store the result as the probability of that data combination.
7. The feature-weighted Naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that step 6 is specifically: from the probabilities of the data combinations obtained in step 5 and the Naive Bayes algorithm, calculate the probability that the certificate image to be classified belongs to each class; obtain the probability that the certificate belongs to each class, and if the maximum is greater than the threshold, judge that the certificate belongs to the class with the maximum probability, the threshold being set to 0.048.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510100700.2A CN104751171B (en) | 2015-03-09 | 2015-03-09 | Method of classifying Naive Bayes scanned certificate images based on feature weighting |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104751171A | 2015-07-01 |
CN104751171B | 2016-04-20 |
Family
ID=53590824
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510100700.2A (Active, CN104751171B) | Method of classifying Naive Bayes scanned certificate images based on feature weighting | 2015-03-09 | 2015-03-09 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104751171B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105117732A (en) * | 2015-07-24 | 2015-12-02 | 中南大学 | Scanned certificate image recognition method based on extreme learning machine |
CN108416316A (en) * | 2018-03-19 | 2018-08-17 | 中南大学 | A kind of detection method and system of black smoke vehicle |
CN108596276A (en) * | 2018-05-10 | 2018-09-28 | 重庆邮电大学 | The naive Bayesian microblog users sorting technique of feature based weighting |
CN110659654A (en) * | 2019-09-24 | 2020-01-07 | 福州大学 | Drawing duplicate checking and plagiarism preventing method based on computer vision |
CN110907909A (en) * | 2019-10-30 | 2020-03-24 | 南京市德赛西威汽车电子有限公司 | Radar target identification method based on probability statistics |
CN112150445A (en) * | 2020-09-27 | 2020-12-29 | 西安工程大学 | Yarn hairiness detection method based on Bayesian threshold |
US11080379B2 (en) | 2019-02-13 | 2021-08-03 | International Business Machines Corporation | User authentication |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103745201A (en) * | 2014-01-06 | 2014-04-23 | Tcl集团股份有限公司 | Method and device for program recognition |
CN104079587A (en) * | 2014-07-21 | 2014-10-01 | 深圳天祥质量技术服务有限公司 | Certificate identification device and certificate check system |
KR101477649B1 (en) * | 2013-10-08 | 2014-12-30 | 재단법인대구경북과학기술원 | Object detection device of using sampling and posterior probability, and the method thereof |
Also Published As
Publication number | Publication date |
---|---|
CN104751171B (en) | 2016-04-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |