CN103034871B - An image classification method based on spatial aggregation - Google Patents

An image classification method based on spatial aggregation

Info

Publication number
CN103034871B
CN103034871B (application CN201210560743.5A)
Authority
CN
China
Prior art keywords
feature
image
local feature
pooling
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210560743.5A
Other languages
Chinese (zh)
Other versions
CN103034871A (en)
Inventor
王亮 (Wang Liang)
黄永祯 (Huang Yongzhen)
刘锋 (Liu Feng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation, Chinese Academy of Sciences
Original Assignee
Institute of Automation, Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation, Chinese Academy of Sciences
Priority to CN201210560743.5A
Publication of CN103034871A
Application granted
Publication of CN103034871B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image classification method based on spatial aggregation. In the method, local features are first extracted and a visual dictionary is learned with a clustering algorithm; the features are then encoded with a coding algorithm, and the features of different spatial regions are pooled and concatenated; finally, features are selected with a general feature selection method, and the selected features serve as the image representation used to train a classifier for classifying images. The method selects the most discriminative and most robust features from combinations of different spatial regions as the image representation, and can thus reflect the spatial distribution and co-occurrence of features in the images of a given class. The method achieves classification accuracy superior to conventional algorithms with a small number of features.

Description

An image classification method based on spatial aggregation
Technical field
The present invention relates to the field of pattern recognition, and in particular to an image classification method based on spatial aggregation.
Background Art
At present, traditional image classification methods lack the ability to express the spatial information of images effectively. This is one of the main reasons why computer vision systems still lag far behind the human visual system in recognition accuracy. Conventional image classification methods often fail to exploit spatial information well. The spatial pyramid matching algorithm, for example, merely concatenates the representations of a small number of spatial regions; although it offers some robustness, the spatial information it reflects is weak in both efficiency and discriminative power. Some methods use the absolute spatial positions of features directly, but because feature positions shift easily, such methods often perform well on aligned databases and very poorly on unaligned ones.
Therefore, since previous methods can hardly meet the needs of image classification, the present invention proposes an image classification method based on spatial aggregation to express the spatial information of features in an image; the method is insensitive to offsets of individual features and can flexibly describe their spatial distribution. Because the dimensionality of the resulting image representation is very high, a conventional feature selection method can be used to select features, and the selected features serve as the final image representation.
Summary of the invention
In order to solve the problems existing in the prior art, the object of the present invention is to provide an image classification method based on spatial aggregation, the method comprising the following steps:
Step S1, collecting multiple images, building an image classification database, and dividing the database into a training set and a test set;
Step S2, extracting the local features of all images in the database;
Step S3, randomly sampling a number of local features from the local features of the training-set images, and learning a visual dictionary D = [d_1, d_2, ..., d_K] with a clustering algorithm, wherein K denotes the size of the visual dictionary, i.e. the number of cluster centers, and each d_i is a column vector representing a visual word, i.e. a cluster center;
Step S4, encoding the local features of all images extracted in step S2;
Step S5, spatially dividing each image in the database into multiple rectangular blocks, and pooling the local features within each rectangular block as the feature representation of that block;
Step S6, merging spatially adjacent rectangular blocks into a region by pooling, and taking the pooled result of the merged blocks as the feature representation of the resulting region;
Step S7, pooling all pairs of the equal-size, non-overlapping regions obtained in step S6, and concatenating the pooled results as the feature representation of the image;
Step S8, following steps S5-S7, obtaining the feature representations of all images in the training set, and selecting the most discriminative features as the final representation of the images in the training and test sets;
Step S9, training a support vector machine on the most discriminative features selected in step S8 to obtain an image classifier;
Step S10, extracting the most discriminative features of each image in the test set and feeding them into the classifier for classification, thereby obtaining the classification result for that image.
According to the method of the present invention, both the spatial distribution of a single feature and the co-occurrence of multiple features can be described. Taking blocks as the primitive makes the description of feature spatial positions more robust, and considering the spatial combinations of various regions helps mine more spatial information.
Brief Description of the Drawings
Fig. 1 is a flowchart of the image classification method based on spatial aggregation of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in more detail below with reference to specific embodiments and the accompanying drawing.
Traditional image classification methods can be divided into five parts: extracting local features, training a visual dictionary, representing images, training a classifier, and classifying new images. On this basis, the present invention first represents images by spatial aggregation, then performs feature selection on the representations of all images, and takes only the selected features as the final image representation.
Fig. 1 is a flowchart of the image classification method based on spatial aggregation of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S1, collecting multiple images, building an image classification database, and dividing the database into a training set and a test set;
Step S2, extracting the local features of all images in the database;
In this step, the local features of an image can be obtained by applying a local feature descriptor or a local feature detector in a dense sampling manner, e.g. the scale-invariant feature transform (SIFT) or speeded-up robust features (SURF).
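As an illustration only, the dense SIFT extraction of this step might be sketched in Python with OpenCV as follows; the grid step and patch size are assumed values not specified in the description.

```python
# A minimal sketch of dense SIFT extraction, assuming OpenCV's SIFT
# implementation; step and patch_size are illustrative parameters.
import cv2
import numpy as np

def dense_sift(image_path, step=8, patch_size=16):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    # Place keypoints on a regular grid instead of running a detector,
    # which realizes the "dense sampling" of this step.
    kps = [cv2.KeyPoint(float(x), float(y), patch_size)
           for y in range(step, gray.shape[0] - step, step)
           for x in range(step, gray.shape[1] - step, step)]
    kps, desc = sift.compute(gray, kps)
    # Keypoint coordinates are kept for the spatial blocks of step S5.
    coords = np.array([kp.pt for kp in kps])
    return desc.astype(np.float64), coords
```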
Step S3, randomly sampling a number of local features from the local features of the training-set images, and learning a visual dictionary with a clustering algorithm;
In this step, after the local features have been randomly sampled, a clustering algorithm of the prior art (such as the K-means clustering algorithm) is used to train a visual dictionary D = [d_1, d_2, ..., d_K], wherein K denotes the size of the visual dictionary, i.e. the number of cluster centers, and each d_i is a column vector representing a visual word, i.e. a cluster center.
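A minimal sketch of this dictionary-learning step with scikit-learn is given below; MiniBatchKMeans is an assumed stand-in for plain K-means, chosen here only because the embodiment samples one million features.

```python
# Learn the visual dictionary D by clustering randomly sampled local
# features; each cluster center is one visual word d_i.
from sklearn.cluster import MiniBatchKMeans

def learn_dictionary(sampled_features, K=1024, seed=0):
    # sampled_features: (n, d) array of local features from the training set.
    km = MiniBatchKMeans(n_clusters=K, random_state=seed, n_init=3)
    km.fit(sampled_features)
    # Store one visual word per column, matching the column-vector
    # convention d_i used in the text.
    return km.cluster_centers_.T  # shape (d, K)
```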
Step S4, encoding the local features of all images extracted in step S2;
The encoding may apply any of the coding schemes commonly used in the prior art to a local feature f_i; local linear coding is taken as an example below.
Applying local linear coding to the i-th local feature f_i comprises the following steps:
Step S41, computing the intermediate variable α_i* with the formula
α_i* = (Δ_i^T Δ_i + βI)^(-1) · 1,
wherein Δ_i = [f_i - c_1, f_i - c_2, ..., f_i - c_M]; 1 ∈ R^(M×1) is a column vector whose elements are all 1, R^(M×1) being the space of M × 1 vectors; c_k is the k-th of the M visual words nearest to feature f_i; Δ_i^T denotes the transpose of Δ_i; β is a constant, usually chosen as 10^(-4); I ∈ R^(M×M) is an identity matrix, R^(M×M) being the space of M × M matrices; (·)^(-1) denotes the matrix inverse;
Step S42, since α_i must satisfy 1^T α_i = 1, normalizing α_i* to obtain the intermediate variable α_i;
Step S43, obtaining the final code v_i of local feature f_i from the value of α_i, wherein the entries of v_i at the several visual words nearest to f_i take the corresponding values in α_i, and the entries at all other visual words are zero.
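The three sub-steps can be sketched in NumPy as follows; the number of nearest visual words M is an assumed parameter, since the description does not fix its value.

```python
# A sketch of local linear coding (steps S41-S43) for one feature f_i.
import numpy as np

def llc_encode(f_i, D, M=5, beta=1e-4):
    # f_i: (d,) local feature; D: (d, K) dictionary, one word per column.
    d, K = D.shape
    # Find the M visual words nearest to f_i.
    dists = np.linalg.norm(D - f_i[:, None], axis=0)
    nearest = np.argsort(dists)[:M]
    # Step S41: Delta_i = [f_i - c_1, ..., f_i - c_M] and
    # alpha* = (Delta^T Delta + beta*I)^{-1} 1.
    Delta = f_i[:, None] - D[:, nearest]            # (d, M)
    G = Delta.T @ Delta + beta * np.eye(M)          # (M, M)
    alpha_star = np.linalg.solve(G, np.ones(M))
    # Step S42: normalize so that 1^T alpha = 1.
    alpha = alpha_star / alpha_star.sum()
    # Step S43: scatter the M coefficients into a K-dimensional code v_i.
    v = np.zeros(K)
    v[nearest] = alpha
    return v
```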
Step S5, spatially dividing each image in the database into multiple rectangular blocks, and pooling the local features within each rectangular block as the feature representation of that block;
In this step, an image may for example be divided into regular rectangular blocks along its height and width (e.g. 4 × 4), and the local features inside each block are then max-pooled, yielding the visual-word responses b_1, b_2, ..., b_16 on the 16 rectangular blocks, i.e. the feature representations of the blocks.
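This block-wise max pooling might be sketched as follows, assuming the codes and feature coordinates produced by the earlier sketches.

```python
# Max-pool the codes of the features falling in each block of a
# regular grid (the 4 x 4 grid of the example), giving b_1 ... b_16.
import numpy as np

def pool_blocks(codes, coords, img_w, img_h, grid=4):
    # codes: (n, K) feature codes; coords: (n, 2) feature positions (x, y).
    K = codes.shape[1]
    blocks = np.zeros((grid * grid, K))
    bx = np.minimum((coords[:, 0] / img_w * grid).astype(int), grid - 1)
    by = np.minimum((coords[:, 1] / img_h * grid).astype(int), grid - 1)
    for idx in range(grid * grid):
        mask = (by * grid + bx) == idx
        if mask.any():
            blocks[idx] = codes[mask].max(axis=0)   # max pooling
    return blocks  # one row per rectangular block
```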
Step S6, merging spatially adjacent rectangular blocks into a region by pooling, and taking the pooled result of the merged blocks as the feature representation of the resulting region;
The pooling may use any pooling method commonly used in the prior art; max pooling is taken as an example below.
The merging step is specifically: spatially adjacent rectangular blocks are merged into a region by max pooling, and the region may be of arbitrary size. For a 1 × 2 region composed of b_1 and b_2, for instance, the feature representation of the region is obtained by
r_1 = max(b_1, b_2),
where max takes the element-wise maximum of the two vectors.
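Continuing the sketch, adjacent blocks can be merged into regions of every rectangular size by element-wise max pooling; the recorded size and top-left position are bookkeeping details added here for the pairing of step S7.

```python
# Merge spatially adjacent blocks of the grid into rectangular regions
# of all sizes; each region representation is the element-wise maximum
# of its blocks, r = max(b_i, b_j, ...).
import numpy as np

def merge_regions(blocks, grid=4):
    B = blocks.reshape(grid, grid, -1)      # (row, col, K)
    regions = []
    for h in range(1, grid + 1):            # region height in blocks
        for w in range(1, grid + 1):        # region width in blocks
            for top in range(grid - h + 1):
                for left in range(grid - w + 1):
                    patch = B[top:top + h, left:left + w]
                    regions.append(((h, w), (top, left),
                                    patch.max(axis=(0, 1))))
    return regions  # list of (size, position, representation)
```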
Step S7, pooling all pairs of the equal-size, non-overlapping regions obtained in step S6, and concatenating the pooled results as the feature representation of the image;
The pooling may use any pooling method commonly used in the prior art; min pooling is taken as an example below.
In this step, min pooling is applied to every combination of two equal-size, non-overlapping regions, i.e. p_k = min(r_i, r_j), where k is the index of the combination and r_i, r_j are the feature representations of the two regions. The final feature representation x_i of the image is then the concatenation over all region combinations, that is, x_i = (p_1; p_2; ...; p_N), where N is the number of region combinations.
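A sketch of this pairwise min pooling, consuming the output of the merge_regions sketch above; the overlap test uses the block positions recorded there.

```python
# Min-pool every pair of equal-size, non-overlapping regions and
# concatenate the results into the final image representation.
import numpy as np
from itertools import combinations

def pair_pool(regions):
    # regions: list of ((h, w), (top, left), vector) from merge_regions.
    parts = []
    for (s1, p1, r1), (s2, p2, r2) in combinations(regions, 2):
        if s1 == s2 and not _overlap(p1, p2, s1):
            parts.append(np.minimum(r1, r2))   # p_k = min(r_i, r_j)
    return np.concatenate(parts)               # x = (p_1; p_2; ...; p_N)

def _overlap(p1, p2, size):
    (t1, l1), (t2, l2), (h, w) = p1, p2, size
    return not (l1 + w <= l2 or l2 + w <= l1 or
                t1 + h <= t2 or t2 + h <= t1)
```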
Each pooling operation above may be replaced by another pooling scheme, such as sum pooling or weighted-sum pooling, to reflect other spatial relationships; sketches of these variants follow.
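The per-vector weights of weighted-sum pooling are assumed here to be supplied by the caller, since the description does not specify them.

```python
# Alternative pooling operators that may replace max or min pooling.
import numpy as np

def sum_pool(vectors):
    # vectors: (n, K) stack of codes or region representations.
    return np.sum(vectors, axis=0)

def weighted_sum_pool(vectors, weights):
    # weights: (n,) weights, one per vector; assumed given by the caller.
    return weights @ vectors
```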
Step S8, following steps S5-S7, obtaining the feature representations of all images in the training set, and selecting the most discriminative features as the final representation of the images in the training and test sets;
Here, the method for selecting the most discriminative features may be any conventional feature selection method, such as grafting, an incremental algorithm of the prior art that handles large-scale data conveniently and yields the selected most discriminative features.
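The grafting algorithm itself is not available in common Python libraries; purely as an illustrative stand-in, the sketch below ranks feature dimensions by the weights of an L1-regularized linear SVM and keeps the strongest ones. This substitutes a different selection technique for the grafting named in the text.

```python
# A stand-in for grafting: rank feature dimensions by the weights of
# an L1-regularized linear SVM and keep the most discriminative ones.
import numpy as np
from sklearn.svm import LinearSVC

def select_features(X_train, y_train, n_keep=5000):
    svm = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000)
    svm.fit(X_train, y_train)
    scores = np.abs(svm.coef_).sum(axis=0)        # aggregate over classes
    selected = np.argsort(scores)[::-1][:n_keep]  # strongest dimensions
    return selected  # index with X[:, selected] for train and test
```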
Step S9, training a support vector machine on the most discriminative features selected in step S8 to obtain an image classifier;
Step S10, extracting the most discriminative features of each image in the test set and feeding them into the classifier for classification, thereby obtaining the classification result for that image.
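Steps S9 and S10 might be sketched with scikit-learn as follows; the linear kernel is an assumption, as the description only specifies a support vector machine.

```python
# Train an SVM on the selected features (step S9) and classify the
# test images with it (step S10).
from sklearn.svm import LinearSVC

def train_and_classify(X_train, y_train, X_test, selected):
    clf = LinearSVC(max_iter=5000)
    clf.fit(X_train[:, selected], y_train)
    return clf.predict(X_test[:, selected])
```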
To verify the effect of the present invention, a scene classification database is taken as an example below. The database contains more than 4000 images showing 15 different scene categories. According to the content of these images, the present invention assigns to each image the class label of the scene it shows. The specific steps are as follows:
Step S1, randomly selecting 100 images from each scene class to form the training image set, with all remaining images forming the test set;
Step S2, extracting densely sampled SIFT local features from all images;
Step S3, randomly sampling one million local features from the training set, and learning a visual dictionary of 1024 visual words with the K-means algorithm;
Step S4, extracting the local features of all images and encoding the extracted features by local linear coding;
Step S5, spatially dividing each image into 4 × 4 rectangular blocks, and max-pooling the features inside each block as the representation of that block;
Step S6, merging spatially adjacent blocks by max pooling into regions of size 1 × 1, 1 × 2, 1 × 3, 1 × 4, 2 × 1, 2 × 2, 2 × 3, 2 × 4, 3 × 1, 3 × 2, 4 × 1 and 4 × 2, obtaining the representations of the different regions;
Step S7, min-pooling all pairs of equal-size, non-overlapping regions, and concatenating the pooled results of every region combination as the image representation on which feature selection is performed;
Step S8, applying the grafting algorithm to the representations of the training-set images for feature selection, and keeping the most discriminative features as the final representation of the training and test images;
Step S9, feeding the final representations (the selected features) of the training images into a support vector machine to train a classifier;
Step S10, feeding the final representations of the test images into the classifier obtained in step S9 for classification.
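Under the assumptions stated for the sketches above, the embodiment might be strung together as follows; the dataset lists, labels and image sizes are placeholders, and the helper functions are the ones sketched earlier in this description.

```python
# An end-to-end sketch composing the earlier illustrative functions
# (dense_sift, llc_encode, pool_blocks, merge_regions, pair_pool,
# select_features, train_and_classify); paths and labels are placeholders.
import numpy as np

def represent(image_path, img_w, img_h, D):
    desc, coords = dense_sift(image_path)
    codes = np.stack([llc_encode(f, D) for f in desc])
    blocks = pool_blocks(codes, coords, img_w, img_h, grid=4)
    return pair_pool(merge_regions(blocks, grid=4))

# D = learn_dictionary(sampled_features, K=1024)
# X_train = np.stack([represent(p, w, h, D) for (p, w, h) in train_items])
# X_test  = np.stack([represent(p, w, h, D) for (p, w, h) in test_items])
# selected = select_features(X_train, y_train)
# predictions = train_and_classify(X_train, y_train, X_test, selected)
```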
The specific embodiments described above further illustrate the objects, technical solutions and beneficial effects of the present invention. It should be understood that the above are merely specific embodiments of the present invention and do not limit the present invention; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (8)

1. An image classification method based on spatial aggregation, characterized in that the method comprises the following steps:
Step S1, collecting multiple images, building an image classification database, and dividing the database into a training set and a test set;
Step S2, extracting the local features of all images in the database;
Step S3, randomly sampling a number of local features from the local features of the training-set images, and learning a visual dictionary D = [d_1, d_2, ..., d_K] with a clustering algorithm, wherein K denotes the size of the visual dictionary, i.e. the number of cluster centers, and each d_i is a column vector representing a visual word, i.e. a cluster center;
Step S4, encoding the local features of all images extracted in step S2;
Step S5, spatially dividing each image in the database into multiple rectangular blocks, and pooling the local features within each rectangular block as the feature representation of that block;
Step S6, merging spatially adjacent rectangular blocks into a region by pooling, and taking the pooled result of the merged blocks as the feature representation of the resulting region;
Step S7, pooling all pairs of the equal-size, non-overlapping regions obtained in step S6, and concatenating the pooled results as the feature representation of the image;
Step S8, following steps S5-S7, obtaining the feature representations of all images in the training set, and selecting the most discriminative features as the final representation of the images in the training and test sets;
Step S9, training a support vector machine on the most discriminative features selected in step S8 to obtain an image classifier;
Step S10, extracting the most discriminative features of each image in the test set and feeding them into the classifier for classification, thereby obtaining the classification result for that image;
wherein in step S4, local linear coding is used to encode the local features of all images, and applying local linear coding to the i-th local feature f_i comprises:
Step S41, computing the intermediate variable α_i* with the formula
α_i* = (Δ_i^T Δ_i + βI)^(-1) · 1,
wherein Δ_i = [f_i - c_1, f_i - c_2, ..., f_i - c_M]; 1 ∈ R^(M×1) is a column vector whose elements are all 1, R^(M×1) being the space of M × 1 vectors; c_k is the k-th of the M visual words nearest to local feature f_i; Δ_i^T denotes the transpose of Δ_i; β is a constant; I ∈ R^(M×M) is an identity matrix, R^(M×M) being the space of M × M matrices; (·)^(-1) denotes the matrix inverse;
Step S42, normalizing α_i* to obtain the intermediate variable α_i;
Step S43, obtaining the final code v_i of local feature f_i from the value of α_i, wherein the entries of v_i at the several visual words nearest to f_i take the corresponding values in α_i, and the entries at all other visual words are zero.
2. The method according to claim 1, characterized in that the local features of all images in the database are extracted by dense sampling.
3. The method according to claim 1, characterized in that in step S2, the local features are scale-invariant feature transform (SIFT) features or speeded-up robust (SURF) features.
4. The method according to claim 1, characterized in that in step S3, the clustering algorithm is the K-means clustering algorithm.
5. The method according to claim 1, characterized in that in step S5, max pooling, sum pooling or weighted-sum pooling is used to pool the local features within each rectangular block.
6. The method according to claim 1, characterized in that in step S6, max pooling, sum pooling or weighted-sum pooling is used to merge spatially adjacent rectangular blocks into a region.
7. The method according to claim 1, characterized in that in step S7, min pooling, sum pooling or weighted-sum pooling is used to pool the equal-size, non-overlapping pairs of regions obtained in step S6.
8. The method according to claim 1, characterized in that in step S8, the grafting algorithm is used to select the most discriminative features.
CN201210560743.5A 2012-12-20 2012-12-20 An image classification method based on spatial aggregation Active CN103034871B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210560743.5A CN103034871B (en) 2012-12-20 2012-12-20 An image classification method based on spatial aggregation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210560743.5A CN103034871B (en) 2012-12-20 2012-12-20 An image classification method based on spatial aggregation

Publications (2)

Publication Number Publication Date
CN103034871A CN103034871A (en) 2013-04-10
CN103034871B (en) 2015-09-23

Family

ID=48021749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210560743.5A Active CN103034871B (en) 2012-12-20 2012-12-20 An image classification method based on spatial aggregation

Country Status (1)

Country Link
CN (1) CN103034871B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325290B (en) * 2020-03-20 2023-06-06 西安邮电大学 Traditional Chinese painting image classification method based on multi-view fusion multi-example learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254194A (en) * 2011-07-19 2011-11-23 清华大学 Supervised manifold learning-based scene classifying method and device
CN102609718A (en) * 2012-01-15 2012-07-25 江西理工大学 Method for generating vision dictionary set by combining different clustering algorithms

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Timo Dickscheid et al., "Coding Images with Local Features", Int J Comput Vis, 2011, pp. 154-174 *
Liu Feng et al., "A multi-feature-based classification method for hyperspectral remote sensing images", Geography and Geo-Information Science, May 2009, vol. 25, no. 3, pp. 19-22, 41 *
Liu Pingping et al., "A fast local feature description algorithm", Acta Automatica Sinica, Jan. 2010, vol. 36, no. 1, pp. 40-45 *
Zhang Wenjun et al., "A mean/standard-deviation-based algorithm for selecting initial cluster centers for K-means", Journal of Remote Sensing, Sept. 2006, vol. 10, no. 5, pp. 715-721 *
Li Xiaoli et al., "A Chinese web page classifier combining support vector machines with unsupervised clustering", Chinese Journal of Computers, Jan. 2001, vol. 24, no. 1, pp. 62-68 *

Also Published As

Publication number Publication date
CN103034871A (en) 2013-04-10

Similar Documents

Publication Publication Date Title
CN110533045B Luggage X-ray contraband image semantic segmentation method combining an attention mechanism
CN105574550B A vehicle identification method and device
CN102662949B Method and system for retrieving a specified object based on multi-feature fusion
CN107679250A A multi-task hierarchical image retrieval method based on deep auto-encoding convolutional neural networks
CN104408483B SAR texture image classification method based on a deep neural network
CN109063649B Pedestrian re-identification method based on a Siamese pedestrian-alignment residual network
CN104268593A Multiple-sparse-representation face recognition method for solving the small-sample-size problem
CN110210534B Multi-bag fusion-based high-resolution remote sensing image scene multi-label classification method
CN104167013B Volume rendering method for highlighting a target area in volume data
CN107292336A A polarimetric SAR image classification method based on DCGAN
CN104751175B SAR image multi-class labeled scene classification method based on an incremental support vector machine
CN103177265B High-definition image classification method based on kernel functions and sparse coding
CN107958067A A large-scale e-commerce image retrieval system based on label-free automatic feature extraction
CN111652273B Deep learning-based RGB-D image classification method
Dewi et al. Taiwan stop sign recognition with customize anchor
CN105740790A Color face recognition method based on multi-kernel dictionary learning
CN107767416A A method for recognizing pedestrian orientation in low-resolution images
CN107085731A An image classification method based on RGB-D fusion features and sparse coding
CN103440508A Remote sensing image target recognition method based on the visual bag-of-words model
CN105740917B Semi-supervised multi-view feature selection method for remote sensing images with label learning
CN114332544A Fine-grained image classification method and device based on image block scoring
CN108229505A Image classification method based on Fisher multi-level dictionary learning
CN103246895B Image classification method based on depth information
Dong et al. New quantitative approach for the morphological similarity analysis of urban fabrics based on a convolutional autoencoder
CN103839074A Image classification method based on matching sketch line segment information with a spatial pyramid

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant