CN109325434A - Image scene classification method based on a multi-feature probabilistic topic model - Google Patents
Image scene classification method based on a multi-feature probabilistic topic model
- Publication number
- CN109325434A (publication number); CN201811077631.8A / CN201811077631A (application number)
- Authority
- CN
- China
- Prior art keywords
- image
- distribution
- visual
- theme
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/35—Categorising the entire scene, e.g. birthday party or wedding scene
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
- G06V10/422—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation for representing the structure of the pattern or shape of an object therefor
- G06V10/424—Syntactic representation, e.g. by using alphabets or grammars
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to an image scene classification method based on a multi-feature probabilistic topic model, comprising: partitioning all images in the training set into 9 × 9 blocks and extracting local color, SIFT, and texture features from each block; representing the extracted features as vectors and generating a visual dictionary with the K-means algorithm; quantizing the features against the visual words in the dictionary to obtain the visual-word distribution of the training image set; applying an LDA model to obtain the topic assignment Z of each visual word in the training samples and the topic probability distribution θ of each training image; feeding the resulting visual topic distributions into a KNN-SVM classifier and optimizing its parameters K, V, and Z; for each unclassified image in the test set, learning the corresponding visual topic distribution with the LDA model; and feeding the LDA topic distributions into the KNN-SVM classifier.
Description
Technical field
The present invention relates to an image scene classification method that fuses color features, SIFT features, and texture features, and in particular to an image scene classification method based on the LDA probabilistic topic model.
Background art
Scene image classification is both a basic problem in robotics research and a vital task in the field of computer vision. In recent years, with the rapid development of machine vision technology, numerous scene classification methods and techniques covering a very wide range of approaches have emerged. Scene image classification refers to observing the content contained in a given image and then judging the category of the scene in which it was photographed. In robotics research, estimating a robot's position and orientation in its environment in real time usually requires building a system that performs mapping and localization simultaneously, and scene image classification is a key link in developing such a system. In computer vision, the rapid development of Internet multimedia technology has produced massive amounts of complex image data; to analyze and manage these data effectively, semantic labels must be attached according to image content, and scene image classification is precisely an important way to solve this kind of problem.
Common scenes can be roughly divided into four classes: natural scenes, city scenes, indoor scenes, and event scenes. Because the constituent elements of different scenes differ considerably, the same classification method often performs very differently on different scene datasets, and this difference is especially pronounced between outdoor and indoor scenes. Early scene image classification mainly adopted methods based on low-level features and methods based on scene image structure; later, methods based on visual vocabularies became dominant. The research methods of scene image classification can therefore be roughly divided into three classes: methods based on low-level features, methods based on scene image structure, and methods based on visual vocabularies.
In the history of scene classification, SIFT (scale-invariant feature transform) is a popular image descriptor. It can identify the same target appearing in different images and maintains a certain stability under translation, scaling, rotation, illumination changes, and even partial occlusion, giving it powerful and outstanding discriminative ability. Texture features are also a kind of global feature: they describe the surface properties of the scenery corresponding to an image or image region. However, since texture is only a characteristic of an object's surface, it cannot fully reflect the essential attributes of the object, so high-level image content cannot be obtained using texture features alone. Unlike color features, texture features are not pixel-based; they require statistical computation over regions containing multiple pixels. In pattern matching this regional character is a considerable advantage, since matching does not fail because of local deviations. As a statistical feature, texture often has rotation invariance and strong resistance to noise.
Summary of the invention
The object of the present invention is to provide a scene image classification method using a probabilistic topic model that fuses color features, texture features, and SIFT features; the present invention can significantly improve classification performance. The technical solution is as follows:
An image scene classification method based on a multi-feature probabilistic topic model, comprising the following steps:
Step 1: Randomly select two-thirds of the images in the dataset as the training set; partition all training images into 9 × 9 blocks and extract local color, SIFT, and texture features from each block.
Step 2: Represent the extracted features as vectors and generate a visual dictionary with the K-means algorithm.
Step 3: Quantize the features against the visual words in the dictionary to obtain the visual-word distribution of the training image set.
Step 4: Apply the LDA model to obtain the topic assignment Z of each visual word in the training samples and the topic probability distribution θ of each training image.
Step 5: Feed the resulting visual topic distributions into the KNN-SVM classifier, experiment with the images in the training set, and optimize the parameters K, V, and Z of the KNN-SVM classifier.
Step 6: For each unclassified image in the test set, repeat steps 1 to 3 to obtain its visual-word distribution; using the topic assignments z obtained from the training images in step 4 as the topic distribution of each word in the test image, learn the corresponding visual topic distribution with the LDA model.
Step 7: Feed the LDA topic distributions into the KNN-SVM classifier.
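The visual-dictionary stage (Steps 2 and 3 above) can be sketched in plain NumPy. The patent does not specify an implementation, so the dictionary size `k`, the iteration count, and the random initialization here are illustrative assumptions:

```python
import numpy as np

def build_visual_dictionary(features, k, n_iter=20, seed=0):
    """Cluster block feature vectors into k visual words (K-means).

    features: (n_samples, dim) float array of block features.
    Returns the (k, dim) array of cluster centers (the visual dictionary).
    """
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(n_iter):
        # Assign each feature to its nearest visual word.
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned features.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return centers

def quantize(features, centers):
    """Map each feature vector to the index of its nearest visual word."""
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)
```

Counting the quantized indices per image then gives the visual-word histogram that the LDA stage consumes.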
Brief description of the drawings
Fig. 1 is a schematic diagram of the HSV color space structure.
Fig. 2 is a flowchart of the proposed method.
Specific embodiment
1. Color feature extraction and representation
The HSV color space is used. The HSV color space is defined according to human visual perception and comprises three color attributes: hue, saturation, and value (brightness); these three dimensions are mutually independent. In the HSV color space the luminance component is independent of the color information of the image, while the hue and saturation components match human visual perception of color, so HSV expresses color characteristics in a way closer to the human visual system; these are two distinguishing features of HSV, and they make the HSV color space very suitable for describing digital images. Because the three HSV components are mutually independent and correspond to distinct visual characteristics, the human eye can perceive changes in each color component independently, and changes in hue in particular are judged accurately. For these reasons, in applications such as digital image processing and image scene classification, the HSV color model is better suited to describing image content.
Ordinary digital images are represented in the RGB color model, so when extracting and representing color features with the HSV color model, RGB color values must first be converted to HSV values. With R, G, B normalized to [0, 1], let MAX = max(R, G, B) and MIN = min(R, G, B). The conversion formulas are:
V = MAX
S = (MAX − MIN) / MAX if MAX ≠ 0, otherwise S = 0
H' = (G − B) / (MAX − MIN) if MAX = R
H' = 2 + (B − R) / (MAX − MIN) if MAX = G
H' = 4 + (R − G) / (MAX − MIN) if MAX = B
Then:
H = H' × 60 (adding 360° when the result is negative)
where H ∈ [0°, 360°], S ∈ [0, 1], V ∈ [0, 1]
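The RGB-to-HSV conversion can be sketched directly in code; a minimal implementation of the standard formulas, with inputs assumed normalized to [0, 1]:

```python
def rgb_to_hsv(r, g, b):
    """Convert normalized RGB (each in [0, 1]) to HSV.

    Returns (H, S, V) with H in degrees [0, 360), S and V in [0, 1].
    """
    mx, mn = max(r, g, b), min(r, g, b)
    v = mx                                   # V = max(R, G, B)
    s = 0.0 if mx == 0 else (mx - mn) / mx   # saturation
    if mx == mn:                             # achromatic: hue undefined, use 0
        return 0.0, s, v
    d = mx - mn
    if mx == r:
        hp = (g - b) / d
    elif mx == g:
        hp = 2 + (b - r) / d
    else:
        hp = 4 + (r - g) / d
    h = hp * 60                              # H = H' × 60
    if h < 0:
        h += 360                             # wrap negative hues into [0, 360)
    return h, s, v
```

For example, pure red (1, 0, 0) maps to hue 0°, full saturation, full value.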
2. SIFT feature representation
SIFT (Scale Invariant Feature Transform) features have good adaptability and robustness and remain invariant under transformations of the image such as scaling, rotation, translation, and affine mappings. SIFT is an image local feature representation method based on scale space. The steps of the SIFT algorithm are detailed as follows:
Step 1: Construct and initialize the scale space
Scale-space theory is used to simulate the multi-scale characteristics of image data. The scale space of a two-dimensional image can be realized by convolution with a Gaussian kernel; the scale transformation formulas are as follows:
L(x, y, σ) = G(x, y, σ) * I(x, y)
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)
where G(x, y, σ) is a variable-scale Gaussian function, (x, y) are the spatial coordinates of an image pixel, and σ is the scale coordinate, whose size reflects the smoothness of the image: the smaller the scale, the more visible and finer the details of the corresponding image and the higher the resolution; the larger the scale, the smoother the image.
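The two formulas above can be sketched in NumPy; the separable Gaussian kernel, its truncation radius, and k = √2 are conventional choices, not fixed by the patent:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Blur a 2-D image with a separable Gaussian kernel: L = G(σ) * I."""
    radius = max(1, int(3 * sigma))          # conventional 3σ truncation
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()                             # normalize so the kernel sums to 1
    # Convolve rows, then columns (the Gaussian is separable).
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out

def difference_of_gaussians(img, sigma, k=np.sqrt(2)):
    """D(x, y, σ) = L(x, y, kσ) − L(x, y, σ)."""
    return gaussian_blur(img, k * sigma) - gaussian_blur(img, sigma)
```

On a constant image the DoG response is zero away from the borders, which is a quick sanity check of the implementation.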
Step 2: Detect the extreme points of the DOG scale space
Each sample point is compared with all of its neighboring points (in both the scale domain and the image domain) in order to find the extreme points of the scale space. A point whose value is the minimum or maximum among all corresponding points in its own scale layer and the adjacent layers is taken as a feature point of the image at that scale. Since the first layer has no layer above it and the last layer has no layer below it, the first and last layers of each group of images cannot take part in the extremum comparison; for these two special layers, approximate extremum feature points are found with the difference-of-Gaussians algorithm, so as to guarantee the continuity of the scale-space variation.
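The 26-neighbour extremum test described above can be sketched as follows (a brute-force version for clarity; practical SIFT implementations vectorize this loop):

```python
import numpy as np

def dog_extrema(dog_stack):
    """Find scale-space extrema in a stack of DoG layers.

    dog_stack: array of shape (n_scales, H, W). A pixel in an interior layer
    is an extremum if it is the unique maximum (or minimum) of the 3×3×3 cube
    around it: 8 neighbours in its own layer and 9 in each adjacent layer.
    Returns a list of (scale, row, col) tuples.
    """
    n, h, w = dog_stack.shape
    points = []
    for s in range(1, n - 1):           # first/last layers have no neighbour layer
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                cube = dog_stack[s-1:s+2, i-1:i+2, j-1:j+2]
                v = dog_stack[s, i, j]
                # Require a strict extremum: v appears exactly once in the cube.
                if v == cube.max() and (cube == v).sum() == 1:
                    points.append((s, i, j))
                elif v == cube.min() and (cube == v).sum() == 1:
                    points.append((s, i, j))
    return points
```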
Step 3: Remove poor-quality feature points
The DOG operator produces large responses along edges. Removing low-contrast feature points and unstable edge response points yields more accurate keypoint locations and scales.
Step 4: Assign keypoint orientations
The orientation parameter of each keypoint is computed and assigned from the gradient orientation distribution of the pixels in the keypoint's neighborhood; this guarantees the rotation invariance of the feature point operator.
Specifically, the gradient orientations of the neighborhood pixels of a keypoint are represented with a statistical histogram. The peak of the histogram represents the principal orientation of the neighborhood gradients at that keypoint; any peak reaching about 80% of the energy of the main peak can be used as an auxiliary orientation of the keypoint. A keypoint may therefore have only one principal orientation, or one principal orientation and several auxiliary orientations. After the four steps above, each keypoint carries three pieces of information: its scale, position, and orientation, and the size of an image's SIFT feature set is determined by the number of keypoints chosen. In practice, Lowe suggests describing each keypoint with 4 × 4 seed points, each seed point holding 8 orientation-gradient values, finally forming a 128-dimensional SIFT feature vector per keypoint. Normalization and standardization can then eliminate the influence of illumination variations on the image. Thus the SIFT feature vector eliminates the influence of scaling, translation, rotation, and illumination.
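The orientation-assignment step can be sketched with a magnitude-weighted orientation histogram. The 36-bin layout and the 80% auxiliary-peak rule follow the description above, while the patch size and the gradient operator are assumptions of this sketch:

```python
import numpy as np

def dominant_orientations(patch, n_bins=36, peak_ratio=0.8):
    """Orientation assignment for one keypoint from its neighbourhood patch.

    Builds a gradient-orientation histogram (36 bins of 10 degrees) weighted
    by gradient magnitude, and returns the principal orientation plus any
    auxiliary orientations whose peak reaches `peak_ratio` of the main peak.
    """
    dy, dx = np.gradient(patch.astype(float))
    mag = np.hypot(dx, dy)                           # gradient magnitude
    ang = np.degrees(np.arctan2(dy, dx)) % 360       # orientation in [0, 360)
    hist = np.zeros(n_bins)
    bins = (ang / (360 / n_bins)).astype(int) % n_bins
    np.add.at(hist, bins.ravel(), mag.ravel())       # magnitude-weighted vote
    main = hist.max()
    centers = (np.arange(n_bins) + 0.5) * (360 / n_bins)
    return [centers[b] for b in range(n_bins) if hist[b] >= peak_ratio * main]
```

A horizontal intensity ramp, for instance, yields a single dominant orientation in the first (0°–10°) bin.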
3. Texture feature extraction
Texture is expressed by a pixel and the gray-level distribution of its surrounding spatial neighborhood, i.e., local texture information; the repetition of local texture information to varying degrees constitutes global texture information. While texture features embody the properties of a global feature, they also describe the surface properties of the scenery corresponding to an image or image region. However, since texture is only a characteristic of an object's surface, it cannot fully reflect the essential attributes of the object, so high-level image content cannot be obtained using texture features alone. Unlike color features, texture features are not pixel-based; they require statistical computation over regions containing multiple pixels. In pattern matching this regional character is a considerable advantage, since matching does not fail because of local deviations.
The gray-level co-occurrence matrix method is used. The co-occurrence matrix is defined by the joint probability density of pixels at two positions; it reflects not only the distribution of brightness but also the positional distribution between pixels of the same or similar brightness, and is a second-order statistical feature of the variation of image brightness. The gray-level co-occurrence matrix of an image reflects integrated information about the image's gray levels with respect to direction, adjacent interval, and amplitude of variation; it is the basis for analyzing the local patterns of an image and their arrangement rules.
Take any point (x, y) in an N × N image and another point (x + a, y + b) offset from it, and let the gray values of this point pair be (g1, g2). As the point (x, y) moves over the entire image, various (g1, g2) values are obtained; if the number of gray levels is k, there are k^2 possible (g1, g2) combinations. For the whole image, count the number of occurrences of each (g1, g2) value, arrange the counts into a square matrix, and then normalize them by the total number of (g1, g2) occurrences into probabilities of occurrence P(g1, g2). The resulting square matrix is called the gray-level co-occurrence matrix.
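The construction just described maps directly to code; a minimal gray-level co-occurrence matrix for a single offset (a, b):

```python
import numpy as np

def glcm(img, a, b, levels):
    """Gray-level co-occurrence matrix for offset (a, b).

    img: 2-D integer array with gray values in [0, levels).
    Counts pairs (g1, g2) = (img[x, y], img[x+a, y+b]) over every position
    where the offset stays inside the image, then normalizes the k × k
    count matrix into joint probabilities P(g1, g2).
    """
    h, w = img.shape
    counts = np.zeros((levels, levels))
    for x in range(h):
        for y in range(w):
            if 0 <= x + a < h and 0 <= y + b < w:
                counts[img[x, y], img[x + a, y + b]] += 1
    return counts / counts.sum()     # normalize: probabilities sum to 1
```

Texture statistics such as contrast, energy, and entropy are then computed from P(g1, g2), typically over several offsets (directions).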
4. Latent Dirichlet Allocation (LDA) model scene classification
The extracted color, SIFT, and texture features are fused and quantized into visual words. LDA modeling is then used to reduce the dimensionality of the image and obtain its visual topic distribution representation, and a KNN-SVM classifier (KNN: k-Nearest Neighbor classification algorithm) performs scene classification on the test images according to the images' visual topic distribution probabilities.
Step 1: Randomly select two-thirds of the images in the dataset as the training set. Partition all training images into 9 × 9 blocks and extract local color, SIFT, and texture features from each block.
Step 2: Represent the extracted features as vectors and generate a visual dictionary with the K-means algorithm.
Step 3: Quantize the features against the visual words in the dictionary to obtain the visual-word distribution of the training image set.
Step 4: Apply the LDA model to obtain the topic assignment Z of each visual word in the training samples and the topic probability distribution θ of each training image.
Step 5: Feed the resulting visual topic distributions into the KNN-SVM classifier, experiment with the images in the training set, and optimize the parameters K, V, and Z of the KNN-SVM classifier.
Step 6: For each unclassified image in the test set, repeat steps 1 to 3 to obtain its visual-word distribution; using the topic assignments z obtained from the training images in step 4 as the topic distribution of each word in the test image, learn the corresponding visual topic distribution with the LDA model.
Step 7: Feed the LDA topic distributions into the KNN-SVM classifier.
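As a rough sketch of the final classification stage, the KNN half of the KNN-SVM classifier can be illustrated on LDA topic distributions θ. The SVM refinement is omitted, and the Euclidean distance between θ vectors is an assumption of this sketch (the patent does not fix the metric):

```python
import numpy as np

def knn_classify(train_thetas, train_labels, test_theta, k=3):
    """Classify a test image from its LDA topic distribution θ.

    The k training images whose topic distributions are closest to the test
    image's θ vote on its scene label (majority vote).
    """
    d = np.linalg.norm(np.asarray(train_thetas) - np.asarray(test_theta), axis=1)
    nearest = np.argsort(d)[:k]                 # indices of the k closest θ
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)     # majority vote
```

In the patent's full pipeline an SVM would further refine the decision among the neighbours; here the majority vote alone illustrates the idea.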
Claims (1)
1. An image scene classification method based on a multi-feature probabilistic topic model, comprising the following steps:
Step 1: randomly select two-thirds of the images in the dataset as the training set; partition all training images into 9 × 9 blocks and extract local color, SIFT, and texture features from each block;
Step 2: represent the extracted features as vectors and generate a visual dictionary with the K-means algorithm;
Step 3: quantize the features against the visual words in the dictionary to obtain the visual-word distribution of the training image set;
Step 4: apply the LDA model to obtain the topic assignment Z of each visual word in the training samples and the topic probability distribution θ of each training image;
Step 5: feed the resulting visual topic distributions into the KNN-SVM classifier, experiment with the images in the training set, and optimize the parameters K, V, and Z of the KNN-SVM classifier;
Step 6: for each unclassified image in the test set, repeat steps 1 to 3 to obtain its visual-word distribution; using the topic assignments z obtained from the training images in step 4 as the topic distribution of each word in the test image, learn the corresponding visual topic distribution with the LDA model;
Step 7: feed the LDA topic distributions into the KNN-SVM classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811077631.8A CN109325434A (en) | 2018-09-15 | 2018-09-15 | A kind of image scene classification method of the probability topic model of multiple features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811077631.8A CN109325434A (en) | 2018-09-15 | 2018-09-15 | A kind of image scene classification method of the probability topic model of multiple features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109325434A true CN109325434A (en) | 2019-02-12 |
Family
ID=65265647
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811077631.8A Pending CN109325434A (en) | 2018-09-15 | 2018-09-15 | A kind of image scene classification method of the probability topic model of multiple features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109325434A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111724349A (en) * | 2020-05-29 | 2020-09-29 | 同济大学 | Image smudge recognition method based on HSV and SVM |
CN111797875A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Scene modeling method and device, storage medium and electronic equipment |
CN113223668A (en) * | 2021-04-15 | 2021-08-06 | 中南民族大学 | Capsule endoscopy image redundant data screening method |
CN113447771A (en) * | 2021-06-09 | 2021-09-28 | 上海交通大学 | Partial discharge pattern recognition method based on SIFT-LDA characteristics |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102622607A (en) * | 2012-02-24 | 2012-08-01 | 河海大学 | Remote sensing image classification method based on multi-feature fusion |
US20120213426A1 (en) * | 2011-02-22 | 2012-08-23 | The Board Of Trustees Of The Leland Stanford Junior University | Method for Implementing a High-Level Image Representation for Image Analysis |
CN104268546A (en) * | 2014-05-28 | 2015-01-07 | 苏州大学 | Dynamic scene classification method based on topic model |
CN106156798A (en) * | 2016-07-25 | 2016-11-23 | 河海大学 | Scene image classification method based on annular space pyramid and Multiple Kernel Learning |
CN106250919A (en) * | 2016-07-25 | 2016-12-21 | 河海大学 | The scene image classification method that combination of multiple features based on spatial pyramid model is expressed |
CN107273928A (en) * | 2017-06-14 | 2017-10-20 | 上海海洋大学 | A kind of remote sensing images automatic marking method based on weight Fusion Features |
CN107644235A (en) * | 2017-10-24 | 2018-01-30 | 广西师范大学 | Image automatic annotation method based on semi-supervised learning |
- 2018-09-15: CN CN201811077631.8A patent/CN109325434A/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120213426A1 (en) * | 2011-02-22 | 2012-08-23 | The Board Of Trustees Of The Leland Stanford Junior University | Method for Implementing a High-Level Image Representation for Image Analysis |
CN102622607A (en) * | 2012-02-24 | 2012-08-01 | 河海大学 | Remote sensing image classification method based on multi-feature fusion |
CN104268546A (en) * | 2014-05-28 | 2015-01-07 | 苏州大学 | Dynamic scene classification method based on topic model |
CN106156798A (en) * | 2016-07-25 | 2016-11-23 | 河海大学 | Scene image classification method based on annular space pyramid and Multiple Kernel Learning |
CN106250919A (en) * | 2016-07-25 | 2016-12-21 | 河海大学 | The scene image classification method that combination of multiple features based on spatial pyramid model is expressed |
CN107273928A (en) * | 2017-06-14 | 2017-10-20 | 上海海洋大学 | A kind of remote sensing images automatic marking method based on weight Fusion Features |
CN107644235A (en) * | 2017-10-24 | 2018-01-30 | 广西师范大学 | Image automatic annotation method based on semi-supervised learning |
Non-Patent Citations (4)
Title |
---|
He Gang et al.: "Scene classification considering both feature-level and decision-level fusion", Journal of Computer Applications (计算机应用) * |
Sun Wei et al.: "Research on indoor scene classification with multi-feature fusion", Journal of Guangdong University of Technology (广东工业大学学报) * |
Zeng Peilong: "Research on image scene classification based on probabilistic topic models", China Master's Theses Full-text Database, Information Science and Technology * |
Chen Hongjuan: "Image scene classification based on probabilistic latent semantic analysis", China Master's Theses Full-text Database, Information Science and Technology * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111797875A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Scene modeling method and device, storage medium and electronic equipment |
CN111797875B (en) * | 2019-04-09 | 2023-12-01 | Oppo广东移动通信有限公司 | Scene modeling method and device, storage medium and electronic equipment |
CN111724349A (en) * | 2020-05-29 | 2020-09-29 | 同济大学 | Image smudge recognition method based on HSV and SVM |
CN111724349B (en) * | 2020-05-29 | 2022-09-20 | 同济大学 | Image smudge recognition method based on HSV and SVM |
CN113223668A (en) * | 2021-04-15 | 2021-08-06 | 中南民族大学 | Capsule endoscopy image redundant data screening method |
CN113447771A (en) * | 2021-06-09 | 2021-09-28 | 上海交通大学 | Partial discharge pattern recognition method based on SIFT-LDA characteristics |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cimpoi et al. | Deep filter banks for texture recognition and segmentation | |
Zhu et al. | Learning a discriminative model for the perception of realism in composite images | |
Cimpoi et al. | Deep convolutional filter banks for texture recognition and segmentation | |
Kuo et al. | Data-efficient graph embedding learning for PCB component detection | |
Narihira et al. | Learning lightness from human judgement on relative reflectance | |
CN112686812B (en) | Bank card inclination correction detection method and device, readable storage medium and terminal | |
CN109325434A (en) | A kind of image scene classification method of the probability topic model of multiple features | |
CN107967482A (en) | Icon-based programming method and device | |
CN109948566B (en) | Double-flow face anti-fraud detection method based on weight fusion and feature selection | |
CN107784284B (en) | Face recognition method and system | |
Bappy et al. | Real estate image classification | |
CN110728238A (en) | Personnel re-detection method of fusion type neural network | |
CN112418262A (en) | Vehicle re-identification method, client and system | |
Smiatacz | Normalization of face illumination using basic knowledge and information extracted from a single image | |
Selinger et al. | Improving appearance-based object recognition in cluttered backgrounds | |
CN113011506B (en) | Texture image classification method based on deep fractal spectrum network | |
CN108960285A (en) | A kind of method of generating classification model, tongue body image classification method and device | |
Abraham | Digital image forgery detection approaches: A review and analysis | |
CN108876849B (en) | Deep learning target identification and positioning method based on auxiliary identification | |
CN111950565B (en) | Abstract picture image direction identification method based on feature fusion and naive Bayes | |
Roy et al. | Detection and classification of geometric shape objects for industrial applications | |
Prasomphan et al. | Feature extraction for image matching in wat phra chetuphon wimonmangklararam balcony painting with sift algorithms | |
HS et al. | A novel method to recognize object in Images using Convolution Neural Networks | |
Blanc-Talon et al. | Advanced Concepts for Intelligent Vision Systems: 12th International Conference, ACIVS 2010, Sydney, Australia, December 13-16, 2010, Proceedings, Part I | |
Alyosef | Large scale partial-and near-duplicate image retrieval using spatial information of local features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190212 |