CN109816025A

CN109816025A - An Image Retrieval Method Based on Image Classification

Info

Publication number: CN109816025A
Application number: CN201910086700.XA
Authority: CN
Inventors: 田东平; 张莹
Original assignee: Baoji University of Arts and Sciences
Current assignee: Baoji University of Arts and Sciences
Priority date: 2019-01-29
Filing date: 2019-01-29
Publication date: 2019-05-28

Abstract

The invention discloses an image retrieval method based on image classification, belonging to the technical field of image classification. The method comprises the following steps: A: image preprocessing; B: image feature extraction; C: kernel function selection: the RBF kernel is selected as the kernel function, and the RBF kernel parameter δ and the penalty factor C are determined by taking a set of candidate values, computing the classification function for each, adjusting the values according to the prediction accuracy of the classification function and experience, and taking the values with the highest prediction accuracy as the kernel parameters; the grid-search-based parameter selection requires the parameter range (the solution interval) to be given in advance, tests the values in that interval one by one at a fixed step size, and outputs the parameters with the highest fitness; D: image classification; E: a test image is input to the processor to determine the category it belongs to, and once testing is correct, image retrieval can be carried out. The retrieval method is algorithmically simple and has good robustness.

Description

An image retrieval method based on image classification

Technical field

The present invention relates to the technical field of image classification, and more particularly to an image retrieval method based on image classification.

Background art

With the rapid development of networks and multimedia technology, digital information including sound, graphics, images, video and animation has expanded sharply. Images attract attention as a media form that is rich in content and intuitive in presentation. Large numbers of images are generated at every moment in daily life, and how to find the images that meet a user's requirements among them is a problem researchers need to solve. Image classification is a pattern recognition process: a computer analyzes the image quantitatively and assigns each pixel or region in the image to one of several categories, replacing human visual interpretation. Image content is rich and varied, and the content it contains is abstract and complex. Because the current level of image understanding and computer vision is limited, there is a large gap between how people understand an image and how a computer describes it. Different people also understand and describe the same image differently, sometimes very differently. These are all difficulties that image classification must consider and solve. To identify the category of an image quickly, image classification needs to be carried out first, followed by image retrieval.

Summary of the invention

(1) Technical problem to be solved

In view of the deficiencies of the prior art, the present invention provides an image retrieval method based on image classification, which solves the problem that existing retrieval methods are inconvenient.

(2) Technical solution

To achieve the above object, the present invention provides the following technical solution: an image retrieval method based on image classification, the image retrieval method comprising the following steps:

A: Image preprocessing: different pictures are downloaded and input into the processor through a scanner; one part of the pictures is used as training pictures and the other part as test pictures;

B: Image feature extraction: LIBSVM software is selected for image classification; the color and texture features of the training pictures are extracted, and the color features and texture features are used as the basis for classification by the LIBSVM software;

C: Kernel function selection: the RBF kernel is selected as the kernel function, and the values of the RBF kernel parameter δ and the penalty factor C are determined. A set of candidate values is taken and the classification function is computed for each; the values are adjusted according to the prediction accuracy of the classification function and experience, and the values with the highest prediction accuracy are taken as the kernel parameters. The grid-search-based parameter selection method requires the selection range of the parameters (the solution interval) to be given in advance; values in this interval are tested one by one at a fixed step size, and the parameters with the highest fitness are output by the algorithm;

D: Image classification: for an N-class classification problem, N(N-1)/2 two-class SVM classifiers are constructed, the training samples of each classifier being the two classes concerned; the outputs of these two-class classifiers are combined, according to the number of classes, into a class support matrix, and template similarity matching is applied, the class with the highest similarity being the class to which the sample belongs;

E: Test pictures are input into the processor to determine the category each belongs to; once testing is correct, picture retrieval can be carried out.
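As an illustration of the pairwise construction in step D, the following minimal Python sketch enumerates the N(N-1)/2 two-class tasks and selects the training samples for one task. The function names and the class names are hypothetical, and training the SVMs themselves (for example with LIBSVM) is outside the sketch.

```python
from itertools import combinations


def pairwise_tasks(class_labels):
    """Return one (class_a, class_b) task per unordered pair of distinct classes."""
    return list(combinations(sorted(set(class_labels)), 2))


def training_subset(samples, labels, class_a, class_b):
    """Keep only the samples that belong to the two classes of one pairwise task."""
    return [(s, y) for s, y in zip(samples, labels) if y in (class_a, class_b)]


# Hypothetical class names for a 4-class problem: 4 * 3 / 2 = 6 pairwise tasks.
classes = ["beach", "city", "desert", "forest"]
tasks = pairwise_tasks(classes)
for class_a, class_b in tasks:
    print(f"train a two-class SVM on {class_a} vs {class_b}")
```

With N = 4 the loop lists six tasks; each would be trained only on the samples returned by `training_subset` for its two classes.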

Preferably, in step C, the kernel function of the RBF kernel is computed as follows: (formula not reproduced in the source text), where δ is the parameter of the RBF kernel function, x represents the different data, and x - x_i represents the data subspace.
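The kernel formula itself is not reproduced in this text. As an illustration only, the sketch below assumes the common Gaussian form K(x, x_i) = exp(-||x - x_i||² / δ²), chosen because it makes 1/δ² play the role of the `gamma` parameter in LIBSVM's RBF kernel exp(-gamma·||u - v||²). The patent's own formula may differ, for example by a factor of 2 in the denominator. The function names are hypothetical.

```python
import math


def rbf_kernel(x, x_i, delta):
    """Gaussian RBF kernel under the ASSUMED form exp(-||x - x_i||^2 / delta^2)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    if len(x) != len(x_i):
        raise ValueError("x and x_i must have the same dimension")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-squared_distance / delta ** 2)


def libsvm_gamma(delta):
    """Under the assumed form, LIBSVM's gamma corresponds to 1 / delta^2."""
    return 1.0 / delta ** 2


# Identical vectors give the maximum kernel value; the value decays with distance.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], delta=0.5)
apart = rbf_kernel([0.0, 0.0], [1.0, 0.0], delta=1.0)
```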

Preferably, in step C, the grid search for the optimal kernel parameters includes the following steps: the ranges of C and δ are selected, with $C \in \{2^{-5}, 2^{-3}, \ldots, 2^{15}\}$ and $1/\delta^2 \in \{2^{-15}, 2^{-13}, \ldots, 2^{3}\}$. A coarse grid search is carried out first with the search step size set to 1, so that a two-dimensional grid is formed in the (C, δ) coordinate system. Each pair of C and δ values on the grid is a potential solution and represents one set of SVM parameters. The mean prediction accuracy of each parameter pair is computed by K-fold cross-validation and plotted as contour lines, giving a contour map from which the optimal (C, δ) parameter pair is determined.
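The coarse search can be sketched in plain Python as follows. This is a minimal sketch, not the patent's implementation: `score` is a hypothetical callable standing in for the mean K-fold cross-validation accuracy of an SVM trained with a given (C, gamma), where gamma stands for 1/δ², and `toy_score` is an invented function used only so the sketch runs. The step is left as a parameter because the listed sequences (2⁻⁵, 2⁻³, ...) suggest an exponent step of 2 while the text states a step of 1.

```python
import math


def exponent_range(start, stop, step):
    """Exponents start, start + step, ... up to and including stop."""
    count = int(math.floor((stop - start) / step + 1e-9)) + 1
    return [start + k * step for k in range(count)]


def grid_search(score, c_exponents, g_exponents):
    """Evaluate score(C, gamma) on every grid node and return the best node.

    Returns ((accuracy, c_exponent, g_exponent), results), where results maps
    every (c_exponent, g_exponent) node to its score, e.g. for a contour plot.
    """
    best = None
    results = {}
    for c_exp in c_exponents:
        for g_exp in g_exponents:
            accuracy = score(2.0 ** c_exp, 2.0 ** g_exp)
            results[(c_exp, g_exp)] = accuracy
            if best is None or accuracy > best[0]:
                best = (accuracy, c_exp, g_exp)
    return best, results


def toy_score(c, gamma):
    """Invented stand-in for cross-validated accuracy, peaking at C=2^3, gamma=2^-5."""
    return 1.0 / (1.0 + (math.log2(c) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)


coarse_c = exponent_range(-5, 15, 1)   # exponents of C, step 1 as stated in the text
coarse_g = exponent_range(-15, 3, 1)   # exponents of gamma = 1/delta^2
best, results = grid_search(toy_score, coarse_c, coarse_g)
```

The `results` dictionary holds one accuracy per grid node, which is the data a contour plot over the two exponent axes would be drawn from.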

Preferably, after the optimal (C, δ) parameter pair has been determined by the coarse grid search, a fine grid search is carried out: a search region is selected on the existing contour map, namely the region with the highest prediction accuracy, and the search step size is reduced for a second search.

Preferably, the reduced search step size is 0.1.
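A minimal sketch of the fine search is shown below. It assumes the refined region is a square of exponents centred on the coarse optimum; the half-width of 1.0 is an assumption, since the text only says the highest-accuracy region of the contour map is selected. `score` is again a hypothetical stand-in for cross-validated accuracy, and `toy_score` is invented for the example.

```python
import math


def refine_axis(center, half_width, step):
    """Exponents from center - half_width to center + half_width in `step` increments."""
    count = int(round(2 * half_width / step)) + 1
    # Rounding removes floating-point drift such as 3.3000000000000003.
    return [round(center - half_width + k * step, 10) for k in range(count)]


def fine_search(score, best_c_exp, best_g_exp, half_width=1.0, step=0.1):
    """Re-search the neighbourhood of the coarse optimum on a finer exponent grid."""
    best = None
    for c_exp in refine_axis(best_c_exp, half_width, step):
        for g_exp in refine_axis(best_g_exp, half_width, step):
            accuracy = score(2.0 ** c_exp, 2.0 ** g_exp)
            if best is None or accuracy > best[0]:
                best = (accuracy, c_exp, g_exp)
    return best


def toy_score(c, gamma):
    """Invented stand-in for accuracy whose true peak lies between coarse grid nodes."""
    return 1.0 / (1.0 + (math.log2(c) - 3.3) ** 2 + (math.log2(gamma) + 4.8) ** 2)


# Suppose the coarse search returned exponents (3, -5); refine around that node.
fine = fine_search(toy_score, best_c_exp=3, best_g_exp=-5)
```

With a step of 0.1 and a half-width of 1.0, each axis has 21 nodes, so the second search evaluates 441 parameter pairs.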

Preferably, in step D, let there be R classifiers $D = \{D_0, D_1, \ldots, D_{R-1}\}$ and let the problem to be handled have C classes. For an input sample x, the output of classifier $D_i$ is a C-dimensional vector $D_i(x) = [d_{i,0}(x), d_{i,1}(x), \ldots, d_{i,C-1}(x)]$, where $d_{i,j}(x)$ (j = 0, 1, ..., C-1) denotes the support that classifier $D_i$ gives to the judgment that sample x belongs to class j. The outputs of all classifiers are assembled into the class support matrix.
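The assembly of the class support matrix can be sketched as follows. The three classifiers are hypothetical stand-ins that return fixed support vectors; in practice each would be a trained two-class SVM whose output is mapped to a support value for every class.

```python
def support_matrix(classifiers, x):
    """Stack the C-dimensional outputs of R classifiers into an R x C matrix.

    Row i is D_i(x); entry [i][j] is the support classifier i gives to class j.
    """
    rows = [list(classifier(x)) for classifier in classifiers]
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError("every classifier must return one support value per class")
    return rows


# Hypothetical pairwise classifiers for a 3-class problem (they ignore x here).
# A classifier trained on classes a and b gives zero support to the third class.
d_01 = lambda x: [0.7, 0.3, 0.0]
d_02 = lambda x: [0.4, 0.0, 0.6]
d_12 = lambda x: [0.0, 0.2, 0.8]

matrix = support_matrix([d_01, d_02, d_12], x=None)
```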

Preferably, in step D, template similarity matching means comparing the class support matrix of the sample to be classified with the decision templates of all classes; the most similar class is the class of the current sample.
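The text does not state how the decision templates are built or which similarity measure is used. The sketch below makes two assumptions that are common for decision-template combiners: a class template is the element-wise mean of the support matrices of that class's training samples, and similarity is 1 minus the mean squared difference between matrix and template. All names and numbers are hypothetical.

```python
def decision_template(profiles):
    """Element-wise mean of the support matrices of one class's training samples."""
    n = len(profiles)
    rows, cols = len(profiles[0]), len(profiles[0][0])
    return [[sum(p[r][c] for p in profiles) / n for c in range(cols)]
            for r in range(rows)]


def similarity(matrix, template):
    """ASSUMED measure: 1 minus the mean squared difference of the entries."""
    rows, cols = len(matrix), len(matrix[0])
    total = sum((matrix[r][c] - template[r][c]) ** 2
                for r in range(rows) for c in range(cols))
    return 1.0 - total / (rows * cols)


def classify(matrix, templates):
    """Return the label whose decision template is most similar to the matrix."""
    return max(templates, key=lambda label: similarity(matrix, templates[label]))


# Toy example with R = 2 classifiers and C = 2 classes.
templates = {
    "class_0": decision_template([[[0.9, 0.1], [0.8, 0.2]],
                                  [[0.7, 0.3], [0.6, 0.4]]]),
    "class_1": decision_template([[[0.2, 0.8], [0.1, 0.9]]]),
}
query = [[0.75, 0.25], [0.65, 0.35]]
predicted = classify(query, templates)
```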

(3) Beneficial effects

The present invention provides an image retrieval method based on image classification with the following beneficial effects. The invention uses an SVM classifier as the statistical method for image classification and therefore differs from existing statistical methods. It has the following advantages. The final decision function of an SVM classifier is determined by only a small number of support vectors, so the computational complexity depends on the number of support vectors rather than on the dimension of the sample space, which in a sense avoids the "curse of dimensionality". Because a few support vectors determine the final result, the method helps capture the key samples and discard a large number of redundant ones, which makes it not only algorithmically simple but also reasonably robust. Because it is backed by relatively rigorous statistical learning theory, a model built with an SVM classifier generalizes well. An SVM classifier can give a definite upper bound on the generalization ability of the model it builds, which no other current learning method can. When building any data model, the less human intervention there is, the more objective the model. Compared with other methods, building an SVM model requires less prior intervention.
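The point that the decision function depends only on the support vectors can be illustrated with a short sketch. It evaluates the standard kernel SVM decision value f(x) = Σ coef_i · K(sv_i, x) + b, with K written in LIBSVM's form exp(-gamma·||u - v||²). The support vectors, coefficients and bias below are invented toy values, not the result of any training.

```python
import math


def decision_value(x, support_vectors, coefficients, bias, gamma):
    """Kernel SVM decision value; the loop runs once per support vector only.

    coefficients[i] is alpha_i * y_i for support vector i.
    """
    total = bias
    for sv, coef in zip(support_vectors, coefficients):
        squared_distance = sum((a - b) ** 2 for a, b in zip(sv, x))
        total += coef * math.exp(-gamma * squared_distance)
    return total


# Toy model: one support vector per class, symmetric coefficients, zero bias.
svs = [[0.0, 0.0], [2.0, 2.0]]
coefs = [1.0, -1.0]

near_positive = decision_value([0.0, 0.0], svs, coefs, bias=0.0, gamma=1.0)
near_negative = decision_value([2.0, 2.0], svs, coefs, bias=0.0, gamma=1.0)
on_boundary = decision_value([1.0, 1.0], svs, coefs, bias=0.0, gamma=1.0)
```

The sign of the returned value gives the predicted class of a two-class SVM; the cost of one prediction grows with the number of support vectors, whatever the size of the original training set.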

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

An image retrieval method based on image classification, the image retrieval method comprising the following steps:

C: Kernel function selection: the RBF kernel is selected as the kernel function, and the values of the RBF kernel parameter δ and the penalty factor C are determined. The kernel function of the RBF kernel is computed as follows: (formula not reproduced in the source text), where δ is the parameter of the RBF kernel function, x represents the different data, and x - x_i represents the data subspace. A set of candidate values is taken and the classification function is computed for each; the values are adjusted according to the prediction accuracy of the classification function and experience, and the values with the highest prediction accuracy are taken as the kernel parameters. The grid-search-based parameter selection method requires the selection range of the parameters (the solution interval) to be given in advance; values in this interval are tested one by one at a fixed step size, and the parameters with the highest fitness are output by the algorithm. The grid search for the optimal kernel parameters includes the following steps: the ranges of C and δ are selected, with $C \in \{2^{-5}, 2^{-3}, \ldots, 2^{15}\}$ and $1/\delta^2 \in \{2^{-15}, 2^{-13}, \ldots, 2^{3}\}$. A coarse grid search is carried out first with the search step size set to 1, so that a two-dimensional grid is formed in the (C, δ) coordinate system. Each pair of C and δ values on the grid is a potential solution and represents one set of SVM parameters. The mean prediction accuracy of each parameter pair is computed by K-fold cross-validation and plotted as contour lines, giving a contour map from which the optimal (C, δ) parameter pair is determined. After the coarse grid search, a fine grid search is carried out: a search region is selected on the existing contour map, namely the region with the highest prediction accuracy, and the search step size is reduced to 0.1 for a second search.

D: Image classification: for an N-class classification problem, N(N-1)/2 two-class SVM classifiers are constructed, the training samples of each classifier being the two classes concerned, and the outputs of these two-class classifiers are combined, according to the number of classes, into a class support matrix. Let there be R classifiers $D = \{D_0, D_1, \ldots, D_{R-1}\}$ and let the problem to be handled have C classes. For an input sample x, the output of classifier $D_i$ is a C-dimensional vector $D_i(x) = [d_{i,0}(x), d_{i,1}(x), \ldots, d_{i,C-1}(x)]$, where $d_{i,j}(x)$ (j = 0, 1, ..., C-1) denotes the support that classifier $D_i$ gives to the judgment that sample x belongs to class j. The outputs of all classifiers are assembled into the class support matrix, and template similarity matching is applied, the class with the highest similarity being the class to which the sample belongs. Template similarity matching means comparing the class support matrix of the sample to be classified with the decision templates of all classes; the most similar class is the class of the current sample.
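The K-fold cross-validation used in step C to score each (C, δ) pair can be sketched as follows. This is a minimal sketch under stated assumptions: folds are contiguous blocks of indices (real data would normally be shuffled first), and `train_and_score` is a hypothetical callable that trains an SVM on the training indices and returns its accuracy on the held-out indices.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(train_and_score, n_samples, k):
    """Mean accuracy over k folds, each fold held out exactly once."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k


# Dummy scorer used only to exercise the loop: "accuracy" = held-out share.
folds = k_fold_indices(10, 3)
mean_accuracy = cross_val_accuracy(lambda train, test: len(test) / 10, 10, 3)
```

In a grid search, this mean would be computed once per (C, δ) node and compared across nodes.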

The present invention uses an SVM classifier as the statistical method for image classification and therefore differs from existing statistical methods. It has the following advantages. The final decision function of an SVM classifier is determined by only a small number of support vectors, so the computational complexity depends on the number of support vectors rather than on the dimension of the sample space, which in a sense avoids the "curse of dimensionality". Because a few support vectors determine the final result, the method helps capture the key samples and discard a large number of redundant ones, which makes it not only algorithmically simple but also reasonably robust. Because it is backed by relatively rigorous statistical learning theory, a model built with an SVM classifier generalizes well. An SVM classifier can give a definite upper bound on the generalization ability of the model it builds, which no other current learning method can. When building any data model, the less human intervention there is, the more objective the model. Compared with other methods, building an SVM model requires less prior intervention.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.

Although embodiments of the present invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the appended claims.

Claims

1. An image retrieval method based on image classification, characterized in that the image retrieval method comprises the following steps:

A: Image preprocessing: different images are downloaded and input into the processor through a scanner; one part of the images is used as training images and the other part as test images;

B: Image feature extraction: LIBSVM software is selected for image classification; the color and texture features of the training images are extracted, and the color features and texture features are used as the basis for classification by the LIBSVM software;

C: Kernel function selection: the RBF kernel is selected as the kernel function, and the values of the RBF kernel parameter δ and the penalty factor C are determined; a set of candidate values is taken and the classification function is computed for each; the values are adjusted according to the prediction accuracy of the classification function and experience, and the values with the highest prediction accuracy are taken as the kernel parameters; the grid-search-based parameter selection method requires the selection range of the parameters (the solution interval) to be given in advance; values in this interval are tested one by one at a fixed step size, and the parameters with the highest fitness are output by the algorithm;

D: Image classification: for an N-class classification problem, N(N-1)/2 two-class SVM classifiers are constructed, the training samples of each classifier being the two classes concerned; the outputs of these two-class classifiers are combined, according to the number of classes, into a class support matrix, and template similarity matching is applied, the class with the highest similarity being the class to which the sample belongs;

E: Test images are input into the processor to determine the category each belongs to; once testing is correct, image retrieval can be carried out.

2. The image retrieval method based on image classification according to claim 1, characterized in that: in step C, the kernel function of the RBF kernel is computed as follows: (formula not reproduced in the source text), where δ is the parameter of the RBF kernel function, x represents the different data, and x - x_i represents the data subspace.

3. The image retrieval method based on image classification according to claim 2, characterized in that: in step C, the grid search for the optimal kernel parameters comprises the following steps: the ranges of C and δ are selected, with $C \in \{2^{-5}, 2^{-3}, \ldots, 2^{15}\}$ and $1/\delta^2 \in \{2^{-15}, 2^{-13}, \ldots, 2^{3}\}$; a coarse grid search is carried out first with the search step size set to 1, so that a two-dimensional grid is formed in the (C, δ) coordinate system; each pair of C and δ values on the grid is a potential solution and represents one set of SVM parameters; the mean prediction accuracy of each parameter pair is computed by K-fold cross-validation and plotted as contour lines, giving a contour map from which the optimal (C, δ) parameter pair is determined.

4. The image retrieval method based on image classification according to claim 3, characterized in that: after the optimal (C, δ) parameter pair has been determined by the coarse grid search, a fine grid search is carried out, that is, a search region is selected on the existing contour map, namely the region with the highest prediction accuracy, and the search step size is reduced for a second search.

5. The image retrieval method based on image classification according to claim 4, characterized in that: the reduced search step size is 0.1.

6. The image retrieval method based on image classification according to claim 1, characterized in that: in step D, there are R classifiers $D = \{D_0, D_1, \ldots, D_{R-1}\}$ and the problem to be handled has C classes; for an input sample x, the output of classifier $D_i$ is a C-dimensional vector $D_i(x) = [d_{i,0}(x), d_{i,1}(x), \ldots, d_{i,C-1}(x)]$, where $d_{i,j}(x)$ (j = 0, 1, ..., C-1) denotes the support that classifier $D_i$ gives to the judgment that sample x belongs to class j; the outputs of all classifiers are assembled into the class support matrix.

7. The image retrieval method based on image classification according to claim 1, characterized in that: in step D, template similarity matching means comparing the class support matrix of the sample to be classified with the decision templates of all classes; the most similar class is the class of the current sample.