CN103440501A

CN103440501A - Scene classification method based on nonparametric space judgment hidden Dirichlet model

Info

Publication number: CN103440501A
Application number: CN2013103928915A
Authority: CN
Inventors: 牛振兴; 王斌; 高新波; 宗汝; 郑昱; 李洁
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2013-09-01
Filing date: 2013-09-01
Publication date: 2013-12-11

Abstract

The invention discloses a scene classification method based on a non-parametric space judgment hidden Dirichlet model. It mainly solves the shortcomings of existing classification methods that do not contain scene space information. The implementation steps are: (1) input image; (2) extract image block features; (3) initialize non-parametric space decision hidden Dirichlet model parameters; (4) establish non-parametric space decision hidden Dirichlet model; (5) ) image scene classification. The invention utilizes image blocks containing spatial information, can describe image scenes more abundantly, and improves the accuracy rate of image scene classification.

Description

A Scene Classification Method Based on Nonparametric Spatial Decision Hidden Dirichlet Model

技术领域technical field

本发明属于图像处理技术领域，更进一步涉及模式识别技术领域中的一种基于非参数空间判决隐狄利克雷（Nonparametric Spatial Discriminative LatentDirichlet Allocation，NS-DiscLDA）模型的场景分类方法。本发明可用于对自然图像的场景分类，提高场景分类准确率。The invention belongs to the technical field of image processing, and further relates to a scene classification method based on a Nonparametric Spatial Discriminative Latent Dirichlet Allocation (NS-DiscLDA) model in the technical field of pattern recognition. The invention can be used for scene classification of natural images and improves the accuracy of scene classification.

背景技术Background technique

场景分类是图像理解的基本任务之一，它在场景识别中有非常重要的作用。传统的场景分类通常基于三种方法：其一，基于图谱分析的图像集合的场景分类方法；其二，基于监督流形学习的场景分类方法；其三，基于目标及其空间关系特性的场景分类方法。Scene classification is one of the basic tasks of image understanding, and it plays a very important role in scene recognition. Traditional scene classification is usually based on three methods: first, a scene classification method based on image collections based on graph analysis; second, a scene classification method based on supervised manifold learning; third, a scene classification method based on objects and their spatial relationship characteristics method.

清华大学申请的专利“基于谱图分析的图像集合的场景分类方法及装置”（申请号：201110221407.3申请日：2011-08-03申请公布号：CN102542285A）中公开了一种场景分类方法。该方法通过交互时间确定隶属度，主要解决现有方法中非线性数据丢失的问题，进而提高分类准确率。该方法实施方式是：首先提取图像集合的SIFT特征集合，并得到K个聚类和K个码字，根据任意图像的SIFT特征和码字建立有权图谱，确定有权图谱与任意节点的欧氏距离最近的K’个节点，得到节点集合对应的权重矩阵，然后根据权重矩阵得到拉普拉斯算子矩阵，对拉普拉斯算子矩阵进行运算得到任意图像的每一个SIFT特征与K个码字之间的交互时间，根据交互时间确定隶属度，最后根据隶属度确定码字分配结果，根据分配结果对场景进行分类。但是，该专利申请的方法存在的不足之处是：单纯利用分类器对图像场景进行分类，缺失了场景中语义信息，进而降低了场景分类的准确率。A scene classification method is disclosed in the patent "Scene classification method and device based on image collection based on spectrogram analysis" (application number: 201110221407.3 application date: 2011-08-03 application publication number: CN102542285A) applied by Tsinghua University. This method determines the degree of membership through interaction time, which mainly solves the problem of non-linear data loss in existing methods, thereby improving the classification accuracy. The implementation of the method is as follows: first extract the SIFT feature set of the image set, and obtain K clusters and K codewords, establish a weighted map according to the SIFT features and codewords of any image, and determine the weighted map and the European node of any node. K' nodes with the closest distance to obtain the weight matrix corresponding to the node set, and then obtain the Laplacian operator matrix according to the weight matrix, and perform operations on the Laplacian operator matrix to obtain each SIFT feature of any image and K The interaction time between codewords, the membership degree is determined according to the interaction time, and finally the codeword assignment result is determined according to the membership degree, and the scene is classified according to the assignment result. However, the disadvantage of the method of this patent application is that simply using a classifier to classify image scenes lacks semantic information in the scenes, thereby reducing the accuracy of scene classification.

清华大学申请的专利“基于监督流形学习的场景分类方法及装置”（申请号：201110202756.0申请日：2011-07-19申请公布号：CN102254194A）中公开了一种场景分类方法。该方法利用流形学习对图像场景进行了分类，主要解决现有方法没有考虑高维特征点的流形特征的问题。该方法实施方式是：首先提取图像特征并获取特征的聚类中心组成的码本，然后获取每个流形结构上的各个特征到码字上的度量，计算测试图像的特征到码字的隶属度并得到直方图向量，最后利用支持向量机对直方图向量进行学习，得到图像的场景类别。但是，该专利申请的方法存在的不足之处是：该方法采用了流形学习技术，然而流形学习技术的分类能力较弱，从而导致了场景分类的准确率降低，另外该方法的计算复杂度太高，导致场景分类速度降低。A scene classification method is disclosed in the patent "Scene Classification Method and Device Based on Supervised Manifold Learning" (application number: 201110202756.0 application date: 2011-07-19 application publication number: CN102254194A) applied by Tsinghua University. This method uses manifold learning to classify image scenes, and mainly solves the problem that existing methods do not consider the manifold characteristics of high-dimensional feature points. The implementation of the method is as follows: first extract image features and obtain the codebook composed of the cluster centers of the features, then obtain the metrics of each feature on each manifold structure to the codeword, and calculate the membership of the feature of the test image to the codeword degree and get the histogram vector, and finally use the support vector machine to learn the histogram vector to get the scene category of the image. However, the disadvantages of the method of this patent application are: the method uses manifold learning technology, but the classification ability of manifold learning technology is weak, which leads to a decrease in the accuracy of scene classification, and the calculation of this method is complicated If the degree is too high, the speed of scene classification will be reduced.

中国科学院电子学研究所申请的专利“一种基于目标及其空间关系特性的图像场景分类方法”（申请号：201110214985.4申请日：2011-07-29申请公布号：CN102902976A）中公开了一种场景分类方法。该方法通过融合主题之间的空间关系特性提高场景分类准确率。该方法实施方式是：首先定义一种空间关系直方图，表征目标之间的空间关系，然后采用融合主题之间空间关系特性的概率隐含语义分析模型，建立图像模型，最后用支持向量机分类场景图像。但是，该专利申请的方法存在的不足之处是：由于该方法采用了pLSA主题模型建模的方法，然而pLSA主题模型缺失先验信息，导致细节信息丢失进而降低了场景分类的准确率。A scene is disclosed in the patent "A method for classifying image scenes based on objects and their spatial relationship characteristics" (application number: 201110214985.4 application date: 2011-07-29 application publication number: CN102902976A) applied by the Institute of Electronics, Chinese Academy of Sciences Classification. This method improves the accuracy of scene classification by fusing the spatial relationship characteristics between topics. The implementation of the method is as follows: firstly define a spatial relationship histogram to characterize the spatial relationship between objects, then adopt a probabilistic implicit semantic analysis model that integrates the spatial relationship characteristics between topics, establish an image model, and finally use a support vector machine to classify scene image. However, the disadvantage of the method of this patent application is that since the method uses the pLSA topic model modeling method, the pLSA topic model lacks prior information, resulting in the loss of detailed information and reducing the accuracy of scene classification.

发明内容Contents of the invention

本发明的目的在于克服上述现有技术的不足，提出基于非参数空间判决隐狄利克雷模型的场景分析方法，更为全面地描述图像，提高场景分类准确率。The purpose of the present invention is to overcome the shortcomings of the above-mentioned prior art, and propose a scene analysis method based on a non-parametric spatial decision hidden Dirichlet model, to describe images more comprehensively and improve the accuracy of scene classification.

实现本发明的技术思路是，将图像均匀分割为许多8×8的图像块，提取图像块的SIFT特征，获取图像块的空间坐标，利用图像块的特征和空间坐标建立非参数空间判决隐狄利克雷模型，使得模型中包含图像块的空间信息，达到更为全面地描述图像，提高场景分类准确率的目的。The technical idea of realizing the present invention is to divide the image evenly into many 8×8 image blocks, extract the SIFT features of the image blocks, obtain the spatial coordinates of the image blocks, and use the features and spatial coordinates of the image blocks to establish a non-parametric spatial decision hidden The Likelet model enables the model to include the spatial information of the image block, so as to achieve a more comprehensive description of the image and improve the accuracy of scene classification.

为实现上述目的，本发明包括如下主要步骤：To achieve the above object, the present invention comprises the following main steps:

（1）输入图像：输入人工标注场景类别的训练图像。(1) Input image: Input the training image manually labeled scene category.

（2）提取图像块特征。(2) Extract image block features.

将训练图像分成多个8×8的图像块，对每个图像块提取SIFT特征，记录每个图像块的空间坐标。Divide the training image into multiple 8×8 image blocks, extract SIFT features for each image block, and record the spatial coordinates of each image block.

（3）初始化模型参数：对非参数空间判决隐狄利克雷模型进行手工初始化，获得场景元素空间分布参数。(3) Initialize model parameters: Manually initialize the non-parametric spatial decision hidden Dirichlet model to obtain the spatial distribution parameters of scene elements.

（4）建立非参数空间判决隐狄利克雷模型。(4) Establish a non-parametric spatial decision hidden Dirichlet model.

估计模型中每个主题的单词分布的参数，对图像块的特征和空间坐标进行统计建模，建立非参数空间判决隐狄利克雷模型。The parameters of word distribution for each topic in the model are estimated, the features and spatial coordinates of image patches are statistically modeled, and a non-parametric spatial decision latent Dirichlet model is established.

（5）图像场景分类。(5) Image scene classification.

根据非参数空间判决隐狄利克雷模型，预测测试图像的类别标记，完成图像场景的分类。According to the non-parametric space decision hidden Dirichlet model, the category label of the test image is predicted, and the classification of the image scene is completed.

本发明与现有方法相比具有如下优点：Compared with existing methods, the present invention has the following advantages:

第一，本发明在提取图像块特征时记录图像块的空间坐标，克服了现有技术中不包含空间信息的缺点，使得本发明的图像信息更加完整，提高了图像场景分类准确率。First, the present invention records the spatial coordinates of the image block when extracting the features of the image block, which overcomes the disadvantage of not including spatial information in the prior art, makes the image information of the present invention more complete, and improves the accuracy of image scene classification.

第二，本发明对图像块的特征和空间坐标进行统计建模，使图像块的特征通过空间坐标相互联系，克服了现有技术图像表示方法中图像块的信息不相关的缺点，使得本发明的图像表示方法保持了图像信息的全面性。Second, the present invention performs statistical modeling on the features and spatial coordinates of the image blocks, so that the features of the image blocks are related to each other through the spatial coordinates, which overcomes the disadvantage that the information of the image blocks in the image representation method of the prior art is irrelevant, and makes the present invention The image representation method keeps the comprehensiveness of image information.

第三，本发明所建立的非参数空间判决隐狄利克雷模型利用了主题的单词分布进行建模，使建模更加容易，克服了现有技术中图像建模能力差的缺点，使得本发明表现出了更好的建模能力。Third, the non-parametric space decision hidden Dirichlet model established by the present invention utilizes the word distribution of the topic for modeling, which makes modeling easier, overcomes the shortcomings of poor image modeling capabilities in the prior art, and makes the present invention Demonstrated better modeling ability.

附图说明Description of drawings

图1为本发明的流程图；Fig. 1 is a flowchart of the present invention;

图2为本发明的非参数空间判决隐狄利克雷模型图。Fig. 2 is a diagram of the hidden Dirichlet model of non-parametric space decision in the present invention.

具体实施方式Detailed ways

下面结合附图对本发明作进一步的详细描述。The present invention will be described in further detail below in conjunction with the accompanying drawings.

参照附图1，对本发明实现的步骤如下：With reference to accompanying drawing 1, the step that the present invention is realized is as follows:

步骤1，输入图像。Step 1, input image.

输入人工标注场景类别的训练图像。本发明的人工标注是指对所有训练图像分别标注自然图像类别标记。Input training images with human-annotated scene categories. The manual labeling in the present invention refers to labeling the natural image category marks on all the training images respectively.

步骤2，提取图像块特征。Step 2, extract image block features.

SIFT特征提取的步骤如下：The steps of SIFT feature extraction are as follows:

第一步，将拟提取SIFT特征的图像块组成图像块集合。In the first step, the image blocks to be extracted with SIFT features are composed into an image block set.

第二步，为尺度值σ选取0.5、0.8、1.1、1.4、1.7五个尺度值，将五个尺度值分别带入下式，得到不同尺度的五个高斯函数。In the second step, five scale values of 0.5, 0.8, 1.1, 1.4, and 1.7 are selected for the scale value σ, and the five scale values are respectively brought into the following formula to obtain five Gaussian functions of different scales.

$G G ((x x,, y the y,, σ σ)) = = \frac{11}{22 π π {σ σ}^{22}} {e e}^{- - (({x x}^{22} + + {y the y}^{22}))}$

其中，G(x,y,σ)表示在σ尺度值下的高斯函数，x、y分别表示图像块像素点对应的横、纵坐标值。Among them, G(x, y, σ) represents the Gaussian function under the σ scale value, and x and y represent the abscissa and ordinate values corresponding to the pixels of the image block, respectively.

第三步，将第一步图像块集合中每个图像块分别与五个不同尺度的高斯函数卷积，获得第一阶五层图像集。In the third step, each image block in the image block set in the first step is convoluted with five Gaussian functions of different scales to obtain a first-order five-layer image set.

第四步，将第一阶五层图像集的每幅图像隔点采样，获得第二阶五层图像集。In the fourth step, each image in the first-order five-layer image set is sampled at intervals to obtain a second-order five-layer image set.

第五步，将第二阶五层图像集的每幅图像隔点采样，获得第三阶五层图像集。In the fifth step, each image in the second-order five-layer image set is sampled at intervals to obtain a third-order five-layer image set.

第六步，将同层相邻阶的图像相减，获得二阶五层差分图像集。The sixth step is to subtract images of adjacent stages in the same layer to obtain a second-order five-layer differential image set.

第七步，获得所有图像的二阶五层差分图像集，所有图像的二阶五层差分图像集就是高斯差分尺度空间。The seventh step is to obtain the second-order five-layer difference image set of all images, and the second-order five-layer difference image set of all images is the Gaussian difference scale space.

第八步，将高斯差分尺度空间中图像的每个像素点，分别与该像素点位置相邻的8个像素点和同阶上下层图像位置相邻的18个像素点进行灰度值大小比较，判断该像素点是否是极值点，如果该像素点是极值点，则标记为特征点，否则，不标记。In the eighth step, each pixel of the image in the Gaussian difference scale space is compared with the 8 pixels adjacent to the pixel and the 18 pixels adjacent to the upper and lower images of the same order to compare the gray value , to determine whether the pixel is an extreme point, if the pixel is an extreme point, it is marked as a feature point, otherwise, it is not marked.

第九步，按照下式，计算高斯差分尺度空间中每个特征点的主曲率比值。The ninth step is to calculate the principal curvature ratio of each feature point in the Gaussian difference scale space according to the following formula.

$C C = = \frac{{((α α + + β β))}^{22}}{αβ αβ}$

其中，C表示高斯差分尺度空间中特征点的主曲率比值，α，β分别表示高斯差分尺度空间中的特征点在图像像素点横、纵坐标方向的梯度值。Among them, C represents the principal curvature ratio of the feature points in the Gaussian difference scale space, and α and β respectively represent the gradient values of the feature points in the Gaussian difference scale space in the horizontal and vertical coordinate directions of the image pixel.

第十步，判断高斯差分尺度空间中每个特征点的主曲率比值是否小于主曲率比值阈值10，如果小于，则标记该特征点为关键点，否则，不标记。The tenth step is to judge whether the principal curvature ratio of each feature point in the Gaussian difference scale space is less than the principal curvature ratio threshold 10, and if so, mark the feature point as a key point, otherwise, do not mark it.

第十一步，按照下式，计算高斯差分尺度空间中图像每个关键点的梯度幅值。In the eleventh step, calculate the gradient magnitude of each key point of the image in the Gaussian difference scale space according to the following formula.

$m m ((x x,, y the y)) = = \sqrt{{[[L L ((x x + + 11,, y the y)) - - L L ((x x - - 11,, y the y))]]}^{22} + + {[[L L ((x x,, y the y + + 11)) - - L L ((x x,, y the y - - 11))]]}^{22}}$

其中，m(x,y)表示关键点的梯度幅值，x、y分别表示关键点对应的横、纵坐标值，L表示关键点在尺度空间中的尺度。Among them, m(x, y) represents the gradient magnitude of the key point, x and y represent the horizontal and vertical coordinate values corresponding to the key point, respectively, and L represents the scale of the key point in the scale space.

第十二步，按照下式，计算高斯差分尺度空间中图像每个关键点的梯度方向。In the twelfth step, calculate the gradient direction of each key point of the image in the Gaussian difference scale space according to the following formula.

θ(x,y)=αtan2{[L(x,y+1)-L(x,y-1)]/[L(x+1,y)-L(x-1,y)]}θ(x,y)=αtan2{[L(x,y+1)-L(x,y-1)]/[L(x+1,y)-L(x-1,y)]}

其中，θ(x,y)表示关键点的梯度方向，x、y分别表示关键点对应的横、纵坐标值，α表示关键点在横坐标方向的梯度值，L表示关键点在尺度空间中的尺度。Among them, θ(x, y) represents the gradient direction of the key point, x and y represent the horizontal and vertical coordinate values corresponding to the key point respectively, α represents the gradient value of the key point in the horizontal coordinate direction, and L represents the key point in the scale space scale.

第十三步，统计高斯差分尺度空间中每个关键点周边8×8个像素点的梯度方向和幅值，获得梯度直方图，其中，梯度直方图的横轴是梯度方向角，纵轴是梯度方向角对应的幅值。The thirteenth step is to count the gradient direction and magnitude of 8×8 pixels around each key point in the Gaussian difference scale space to obtain a gradient histogram, where the horizontal axis of the gradient histogram is the gradient direction angle, and the vertical axis is The magnitude corresponding to the gradient orientation angle.

第十四步，将高斯差分尺度空间的坐标轴旋转为关键点的方向，计算每个关键点子区域的8维向量表示，将所有关键点子区域的8维向量组合，获得中每个关键点的128维SIFT特征。The fourteenth step is to rotate the coordinate axis of the Gaussian difference scale space to the direction of the key point, calculate the 8-dimensional vector representation of each key point sub-region, combine the 8-dimensional vectors of all key point sub-regions, and obtain the value of each key point in 128-dimensional SIFT features.

步骤3，初始化模型参数。Step 3, initialize the model parameters.

对非参数空间判决隐狄利克雷模型进行手工初始化，获得场景元素空间分布参数。The non-parametric spatial decision latent Dirichlet model is manually initialized to obtain the spatial distribution parameters of scene elements.

模型初始化的步骤如下：The steps of model initialization are as follows:

第一步，查找训练图像中是否存在场景元素空间布局信息，如果存在，转到第二步，否则，转到第三步。The first step is to find out whether the spatial layout information of scene elements exists in the training image, if it exists, go to the second step, otherwise, go to the third step.

第二步，将训练图像的场景元素空间布局信息作为场景元素空间分布参数。In the second step, the spatial layout information of the scene elements of the training image is used as the spatial distribution parameter of the scene elements.

第三步，将训练图像均匀分成多个8×8的图像块，由统计图像块的方法得到场景元素空间分布参数。In the third step, the training image is evenly divided into multiple 8×8 image blocks, and the spatial distribution parameters of scene elements are obtained by the method of statistical image blocks.

步骤4，建立非参数空间判决隐狄利克雷模型。Step 4, establishing a non-parametric spatial decision hidden Dirichlet model.

下面结合附图2，对本发明建立非参数空间判决隐狄利克雷模型的过程作进一步的详细描述。The process of establishing the hidden Dirichlet model of non-parametric space decision in the present invention will be further described in detail below in conjunction with FIG. 2 .

本发明建立非参数空间判决隐狄利克雷模型的步骤如下：The steps of the present invention to establish the hidden Dirichlet model of non-parametric space decision are as follows:

第一步，根据场景元素空间的分布参数，估计图像主题的概率分布R。In the first step, the probability distribution R of image subjects is estimated according to the distribution parameters of the scene element space.

第二步，对图像d(d=1,2,...,D)在主题上的概率分布θ_d随机采样，得到图像块主题的采样样本z_dn，α表示变量θ_d的分布参数,k(k=1,2,...,K₁)表示主题。The second step is to randomly sample the probability distribution θ _d of the image d (d=1,2,...,D) on the topic, and obtain the sampling sample z _dn of the image block topic, and α represents the distribution parameter of the variable θ _d , k (k=1,2,...,K ₁ ) represents a topic.

第三步，根据图像块主题的采样样本z_dn，估计图像块主题的单词分布参数φ_k，β表示变量φ_k的分布参数。The third step is to estimate the word distribution parameter φ _k of the image block topic according to the sampling sample z _dn of the image block topic, and β represents the distribution parameter of the variable φ _k .

第四步，根据图像块主题的单词分布参数φ_k，建立非参数空间判决隐狄利克雷模型，其中，π表示类别标记的分布，本发明中为均匀分布，y_d表示图像d的类别标记，T表示场景的映射矩阵，N_d表示图像块，u_dn表示图像块dn隐主题，w_dn表示图像块dn单词号，l_dn表示图像块dn的空间坐标。The fourth step is to establish a non-parametric spatial decision hidden Dirichlet model according to the word distribution parameter φ _k of the subject of the image block, where π represents the distribution of category labels, which is uniform distribution in the present invention, and y _d represents the category label of image d , T represents the mapping matrix of the scene, N _d represents the image block, u _dn represents the hidden theme of the image block dn, w _dn represents the word number of the image block dn, l _dn represents the space coordinates of the image block dn.

步骤5，图像场景分类。Step 5, image scene classification.

第一步，将测试图像带入非参数空间判决隐狄利克雷模型，得到测试图像类别概率分布。In the first step, the test image is brought into the non-parametric space decision hidden Dirichlet model, and the probability distribution of the test image category is obtained.

第二步，将测试图像类别概率分布中概率最大的类别作为测试图像的类别标记。In the second step, the category with the highest probability in the category probability distribution of the test image is used as the category label of the test image.

本发明的效果可以通过以下仿真实验做进一步的说明。The effects of the present invention can be further illustrated by the following simulation experiments.

1.仿真条件1. Simulation conditions

本发明是在中央处理器为Intel(R)Core i3-5302.93GHZ、内存4G、WINDOWS7操作系统上，运用MATLAB软件进行的仿真。数据库采用LabelMe数据库和UIUC-Sports数据库。The present invention is the emulation that uses MATLAB software to carry out on the central processor being Intel (R) Core i3-5302.93GHZ, internal memory 4G, WINDOWS7 operating system. The database uses LabelMe database and UIUC-Sports database.

2.仿真内容2. Simulation content

本发明在图像场景数据库上进行场景分类仿真实验。测试图像采用LabelMe数据库和UIUC-Sports数据库。LabelMe数据库包含8个场景分类，分别是“高速公路”，“城市”，“高建筑物”，“街道”，“森林”，“海岸”，“山脉”和“乡村”。UIUC-Sports数据库包含8个场景分类，分别是“羽毛球”，“地掷球”，“槌球”，“马球”，“攀岩”，“赛艇”，“帆船”和“滑雪”。The invention performs scene classification simulation experiments on the image scene database. The test images use LabelMe database and UIUC-Sports database. The LabelMe database contains 8 scene classifications, namely "highway", "city", "high building", "street", "forest", "coast", "mountain" and "country". The UIUC-Sports database contains 8 scene categories, namely "badminton", "boccia", "croquet", "polo", "rock climbing", "rowing", "sailing" and "skiing".

本发明以分类准确率为指标对方法性能进行评测，仿真对比了不同场景分类方法对图像场景进行分类的准确率，对比的多种场景分类方法包括空间隐狄利克雷（Spatial Latent Dirichlet Allocation，sLDA）方法、判决隐狄利克雷（Discriminative Latent Dirichlet Allocation，DiscLDA）方法、空间判决隐狄利克雷（Spatial Discriminative Latent Dirichlet Allocation，S-DiscLDA）方法和本发明方法。对比实验结果如下表所示。The present invention evaluates the performance of the method with the classification accuracy rate as an index, and compares the accuracy rate of different scene classification methods for classifying image scenes in simulation. The comparison of multiple scene classification methods includes spatial latent Dirichlet Allocation (SLDA) ) method, Discriminative Latent Dirichlet Allocation (DiscLDA) method, Spatial Discriminative Latent Dirichlet Allocation (S-DiscLDA) method and the method of the present invention. The comparison experiment results are shown in the table below.

由上表可见，在两种数据库上进行场景分类实验，本发明的分类准确率是四种方法中最高的。这是因为本发明突出了包含空间信息的图像块，所以能够更好地描述图像场景，由此获得在分类准确率上优于其他场景分类方法的效果，进一步验证了本发明的先进性。It can be seen from the above table that the classification accuracy rate of the present invention is the highest among the four methods in the scene classification experiments on the two databases. This is because the present invention highlights the image blocks containing spatial information, so it can better describe the image scene, thereby obtaining the effect of classification accuracy superior to other scene classification methods, which further verifies the advanced nature of the present invention.

Claims

1. A scene classification method based on non-parametric space judgment hidden Dirichlet model, it is characterized in that, comprising the following steps:

(1) Input image: input the training image manually marked with the scene category;

(2) Extract image block features:

Divide the training image into multiple 8×8 image blocks, extract SIFT features for each image block, and record the spatial coordinates of each image block;

(3) Initialize model parameters:

Manually initialize the non-parametric spatial decision hidden Dirichlet model to obtain the spatial distribution parameters of scene elements;

(4) Establish a non-parametric spatial decision hidden Dirichlet model:

Estimate the parameters of the word distribution for each topic in the model, statistically model the features and spatial coordinates of image patches, and build a non-parametric spatial decision latent Dirichlet model;

(5) Image scene classification:

According to the non-parametric space decision hidden Dirichlet model, the category label of the test image is predicted, and the classification of the image scene is completed.

2. the scene classification method based on non-parametric space judgment hidden Dirichlet model according to claim 1, is characterized in that, the labeling scene category described in step (1) refers to labeling natural image category mark respectively to all training images .

3. the scene classification method based on non-parametric space judgment hidden Dirichlet model according to claim 1, is characterized in that, the described step of extracting SIFT feature in step (2) is as follows:

In the first step, the image blocks to be extracted with SIFT features are composed into an image block set;

In the second step, five scale values of 0.5, 0.8, 1.1, 1.4, and 1.7 are selected for the scale value σ, and the five scale values are respectively brought into the following formula to obtain five Gaussian functions of different scales;

G G ((x x,, y the y,, σ σ)) = = \frac{11}{22 π π {σ σ}^{22}} {e e}^{- - (({x x}^{22} + + {y the y}^{22}))}

Among them, G(x, y, σ) represents the Gaussian function under the σ scale value, and x and y represent the horizontal and vertical coordinate values corresponding to the pixels of the image block, respectively;

In the third step, each image block in the image block set in the first step is convoluted with five Gaussian functions of different scales to obtain a first-order five-layer image set;

In the fourth step, each image of the first-order five-layer image set is sampled at intervals to obtain a second-order five-layer image set;

In the fifth step, each image of the second-order five-layer image set is sampled at intervals to obtain a third-order five-layer image set;

The sixth step is to subtract images of adjacent stages in the same layer to obtain a second-order five-layer differential image set;

The seventh step is to obtain the second-order five-layer difference image set of all images, and the second-order five-layer difference image set of all images is the Gaussian difference scale space;

The eighth step is to compare the gray value of each pixel of the image in the Gaussian difference scale space, the 8 pixels adjacent to the pixel and the 18 pixels adjacent to the upper and lower images of the same order , to determine whether the pixel is an extreme point, if the pixel is an extreme point, mark it as a feature point, otherwise, do not mark it;

The ninth step is to calculate the principal curvature ratio of each feature point in the Gaussian difference scale space according to the following formula;

C C = = \frac{{((α α + + β β))}^{22}}{αβ αβ}

Among them, C represents the principal curvature ratio of the feature points in the Gaussian difference scale space, and α and β respectively represent the gradient values of the feature points in the Gaussian difference scale space in the horizontal and vertical coordinate directions of the image pixels;

The tenth step is to judge whether the principal curvature ratio of each feature point in the Gaussian difference scale space is less than the principal curvature ratio threshold 10, and if it is less than, mark the feature point as a key point, otherwise, do not mark;

In the eleventh step, calculate the gradient magnitude of each key point of the image in the Gaussian difference scale space according to the following formula;

m m ((x x,, y the y)) = = \sqrt{{[[L L ((x x + + 11,, y the y)) - - L L ((x x - - 11,, y the y))]]}^{22} + + {[[L L ((x x,, y the y + + 11)) - - L L ((x x,, y the y - - 11))]]}^{22}}

Among them, m(x, y) represents the gradient amplitude of the key point, x and y represent the horizontal and vertical coordinate values corresponding to the key point respectively, and L represents the scale of the key point in the scale space;

In the twelfth step, calculate the gradient direction of each key point of the image in the Gaussian difference scale space according to the following formula;

θ(x,y)=αtan2{[L(x,y+1)-L(x,y-1)]/[L(x+1,y)-L(x-1,y)]}

Among them, θ(x, y) represents the gradient direction of the key point, x and y represent the horizontal and vertical coordinate values corresponding to the key point respectively, α represents the gradient value of the key point in the horizontal coordinate direction, and L represents the key point in the scale space scale;

The thirteenth step is to count the gradient direction and magnitude of 8×8 pixels around each key point in the Gaussian difference scale space to obtain a gradient histogram, where the horizontal axis of the gradient histogram is the gradient direction angle, and the vertical axis is The magnitude corresponding to the gradient direction angle;

The fourteenth step is to rotate the coordinate axis of the Gaussian difference scale space to the direction of the key point, calculate the 8-dimensional vector representation of each key point sub-region, combine the 8-dimensional vectors of all key point sub-regions, and obtain the value of each key point in 128-dimensional SIFT features.

4. the scene classification method based on nonparametric space decision hidden Dirichlet model according to claim 1, is characterized in that, the step of the described model initialization of step (3) is as follows:

The first step is to find out whether there is scene element space layout information in the training image, if it exists, go to the second step, otherwise, go to the third step;

In the second step, the spatial layout information of the scene elements of the training image is used as the spatial distribution parameter of the scene elements;

In the third step, the training image is evenly divided into multiple 8×8 image blocks, and the spatial distribution parameters of scene elements are obtained by the method of statistical image blocks.

5. the scene classification method based on non-parametric space judgment hidden Dirichlet model according to claim 1, is characterized in that, the step of setting up non-parametric space judgment hidden Dirichlet model described in step (4) is as follows:

In the first step, the probability distribution of image topics is estimated according to the distribution parameters of the scene element space;

In the second step, the probability distribution of the image theme is randomly sampled to obtain sampling samples of the image block theme;

The third step is to estimate the word distribution parameters of the image block topic according to the sampling samples of the image block topic;

The fourth step is to establish a non-parametric spatial decision hidden Dirichlet model according to the word distribution parameters of the image block topic.

6. the scene classification method based on non-parametric space judgment hidden Dirichlet model according to claim 1, is characterized in that, the step of the prediction test image classification mark described in step (5) is as follows:

In the first step, the test image is brought into the non-parametric space decision hidden Dirichlet model to obtain the test image category probability distribution;

In the second step, the category with the highest probability in the category probability distribution of the test image is used as the category label of the test image.