CN104408158B - A viewpoint tracking method based on geometric reconstruction and semantic fusion - Google Patents

A viewpoint tracking method based on geometric reconstruction and semantic fusion

Info

Publication number
CN104408158B
Authority
CN
China
Prior art date
Application number
CN201410733763.7A
Other languages
Chinese (zh)
Other versions
CN104408158A (en)
Inventor
汪萌
张鹿鸣
郭丹
田绪婷
Original Assignee
合肥工业大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 合肥工业大学
Priority to CN201410733763.7A
Publication of CN104408158A publication Critical patent/CN104408158A/en
Application granted
Publication of CN104408158B publication Critical patent/CN104408158B/en


Abstract

The invention discloses a viewpoint tracking method based on geometric reconstruction and semantic fusion, characterized by the following steps: 1. construct the visual feature set of the subgraphs; 2. fuse semantic features; 3. apply the active learning algorithm; 4. optimize and update; 5. sort the subgraphs and link them in sequence to obtain the viewpoint tracking path. The invention can quickly and accurately detect each salient region of an image and improves the accuracy of viewpoint tracking, thereby improving the ability to predict viewpoint transfer routes.

Description

A viewpoint tracking method based on geometric reconstruction and semantic fusion

Technical Field

[0001] The invention belongs to the technical fields of image cognitive reconstruction, image enhancement, and image classification, and relates in particular to a viewpoint tracking method based on geometric reconstruction and semantic fusion.

Background

[0002] Viewpoint tracking is an intelligent image analysis method whose purpose is to imitate human visual attention shifts in order to quickly find the visual information a user is most interested in and to interpret complex scenes; it is one of the hot research topics in computer vision. Viewpoint tracking can be applied to image understanding, image compression, image classification, image retargeting, information retrieval, and other areas.

[0003] With the development of modern sensing and information processing technology, viewpoint tracking has also advanced greatly, but it still faces the following problems:

[0004] First: the local and global image information extracted during viewpoint tracking is often incomplete.

[0005] For example, in 2010 Jia Li et al. proposed a visual saliency prediction method in the article "Probabilistic Multi-Task Learning for Visual Saliency Estimation in Video", published in the top international journal International Journal of Computer Vision. The method estimates visual saliency with a multi-task framework built on low-dimensional visual features and task-related factors. However, because it considers only low-dimensional visual features and lacks a description of the semantic correlation between the salient regions of the target image, the saliency model cannot recover the global and local information of the image, so extracted local or global image information is lost.

[0006] Second: many viewpoint tracking techniques ignore the geometric structure between image regions, which makes tracking insufficiently precise.

[0007] For example, in 2011 Feng Lu et al. published "Inferring Human Gaze from Appearance via Adaptive Linear Regression" at the top international conference IEEE International Conference on Computer Vision (ICCV 2011). The article predicts gaze shifts with adaptive linear regression, learning a mapping function from high-dimensional features to the low-dimensional features of the target feature space. However, because the prediction ignores the geometric structure of the salient image regions, it can only estimate fixed viewpoints; its range of application is therefore limited and, from an engineering standpoint, such methods are not yet practical.

[0008] To date, therefore, no viewpoint tracking method exists that is both highly accurate and suitable for engineering applications.

Summary of the Invention

[0009] To address the above shortcomings of the prior art, the invention proposes a viewpoint tracking method based on geometric reconstruction and semantic fusion, aiming to quickly and accurately detect each salient region of an image and to improve the accuracy of viewpoint tracking, thereby improving the ability to predict viewpoint transfer routes.

[0010] To solve the technical problem, the invention adopts the following technical scheme:

[0011] The viewpoint tracking method based on geometric reconstruction and semantic fusion of the invention is characterized by the following steps:

[0012] Step 1. Construct the visual feature set of the subgraphs:

[0013] Step 1.1. Use a clustering method to divide the source image into l sub-regions. Treating each sub-region as a node, construct subgraphs containing several nodes, thereby obtaining the subgraph set G = {G1, G2, …, Gn, …, GN}, where N is the total number of subgraphs; Gn = (Vn, En) is the n-th subgraph of the set G; Vn = {v_{n,1}, …, v_{n,tn}} indicates that the n-th subgraph contains tn nodes, v_{n,ti} being the ti-th node of Gn; En is the set of geometric connecting edges among the tn nodes; 1 ≤ tn ≤ l.

[0014] Step 1.2. Use a classical image color feature extraction method to obtain the color feature M_n^C = [m_{n,1}^C, …, m_{n,tn}^C] ∈ R^{dC×tn} of the n-th subgraph Gn, where m_{n,ti}^C is the color vector of the ti-th node of Gn and dC is the dimension of the color feature M_n^C; 1 ≤ ti ≤ tn.

[0015] Use a classical image texture feature extraction method to obtain the texture feature M_n^TE = [m_{n,1}^TE, …, m_{n,tn}^TE] ∈ R^{dTE×tn} of Gn, where m_{n,ti}^TE is the texture vector of the ti-th node v_{n,ti} of Gn and dTE is the dimension of the texture features.

[0016] Extract the geometric structure feature M_n^P ∈ R^{tn×tn} of the n-th subgraph Gn with formula (1):

[0017] [M_n^P]_{ti,tj} = θ_{ti,tj}  (1)

[0018] In formula (1), [M_n^P]_{ti,tj} is the element in row ti and column tj of the geometric structure feature M_n^P, and θ_{ti,tj} is the horizontal angle of the vector from the region center of the ti-th node v_{n,ti} of Gn to the region center of the tj-th node v_{n,tj}; 1 ≤ ti, tj ≤ tn.

[0019] Step 1.3. Transpose the color feature M_n^C, the texture feature M_n^TE, and the geometric structure feature M_n^P of the n-th subgraph Gn and connect them in sequence to obtain the visual feature matrix Y_n = [(M_n^C)^T, (M_n^TE)^T, (M_n^P)^T]. Use a feature fusion method to convert Y_n into a visual feature vector y_n ∈ R^{dY×1}, where dY is the vector dimension of y_n.

[0020] Step 1.4. Repeat step 1.2 and step 1.3 to obtain in turn the visual feature set Y = [y_1, …, y_N] of all subgraphs.

[0021] Step 2. Fuse semantic features:

[0022] Step 2.1. Compute the visual feature distance dGW(Gi, Gj) between the i-th subgraph Gi and the j-th subgraph Gj with formula (2):

[0023] dGW(Gi, Gj) = ||M̂_i M̂_i^T − M̂_j M̂_j^T||  (2)

[0024] In formula (2), M̂_i is an orthonormal basis of the visual feature matrix Y_i of the i-th subgraph Gi and M̂_j is an orthonormal basis of the visual feature matrix Y_j of the j-th subgraph Gj; 1 ≤ i, j ≤ N.

[0025] Step 2.2. Obtain the semantic tag set Tag = {tag_1, tag_2, …, tag_c, …, tag_C} of the source image from network resources, where tag_c is the c-th tag, c ∈ [1, C], and C is the total number of tags. Define N_c as the number of images retrieved with the c-th tag. Define the tag vector b_n = [b_{n,1}, b_{n,2}, …, b_{n,c}, …, b_{n,C}] ∈ R^{1×C} of the n-th subgraph Gn, where b_{n,c} = 1 indicates that Gn contains the c-th tag tag_c and b_{n,c} = 0 indicates that it does not.

[0026] Step 2.3. Construct the semantic similarity vector [b_i ∩ b_j] ∈ R^{1×C} of the i-th subgraph Gi and the j-th subgraph Gj, where b_{i,c} ∩ b_{j,c} is the logical AND of the c-th element b_{i,c} of the tag vector b_i of Gi and the c-th element b_{j,c} of the tag vector b_j of Gj. Compute the semantic similarity distance Is(i, j) between Gi and Gj with formula (3) (equation image Figure CN104408158BD00071).

[0029] Step 2.4. Construct the semantic difference vector [b_i ⊕ b_j] ∈ R^{1×C} of Gi and Gj, where b_{i,c} ⊕ b_{j,c} is the logical XOR of the c-th element b_{i,c} of b_i and the c-th element b_{j,c} of b_j.

[0030] Compute the semantic difference distance Id(i, j) between Gi and Gj with formula (4) (equation image Figure CN104408158BD00073).

[0031] Step 2.5. Use formula (5) (equation image Figure CN104408158BD00074) to obtain in turn the element in row i and column j of the fusion transformation matrix E_n, thereby obtaining E_n.

[0033] Construct the semantic fusion feature matrix R ∈ R^{N×N} of the source image with formula (6):

[0034] R = E_1 W_1 E_1^T + … + E_n W_n E_n^T + … + E_N W_N E_N^T  (6)

[0035] In formula (6), W_n ∈ R^{N×N} is the semantic fusion matrix of the n-th subgraph Gn and is a diagonal matrix; the h-th diagonal element of W_n is [W_n]_{h,h} = Is(n, h) − Id(n, h), 1 ≤ h ≤ N.

[0036] Step 2.6. Solve formula (7) with the Lagrange multiplier method to obtain the linear approximate optimal solution of the linear projection matrix U:

[0037] U = argmin_U tr(U^T L U)  (7)

[0038] s.t. U^T U = I_d ∈ R^{d×d}

[0039] In formula (7), L = Y R Y^T.

[0040] Step 2.7. Obtain the semantic encoding Y' of Y with formula (8):

[0041] Y' = U^T Y  (8)

[0042] Step 3. Active learning algorithm:

[0043] Step 3.1. Use formula (9) to obtain the element d_{i,j} in row i and column j of the encoding distance matrix D ∈ R^{N×N} of the semantic encoding Y', thereby obtaining D:

[0044] d_{i,j} = ||y'_i − y'_j||  (9)

[0045] In formula (9), d_{i,j} is the geometric distance between the semantic encoding y'_i of the i-th subgraph Gi and the semantic encoding y'_j of the j-th subgraph Gj; 1 ≤ i, j ≤ N.

[0046] Step 3.2. Define the significance parameter matrix set of all subgraphs as A = [a_1, a_2, …, a_n, …, a_N] ∈ R^{N×N}, where a_n ∈ R^{N×1} is the significance parameter vector of the n-th subgraph Gn. Treating a_n as one block, solve formula (10) (equation image Figure CN104408158BD00084) with the block coordinate descent method to obtain the initial value a_n^0 of a_n, thereby obtaining the initial values A^0 of all significance parameter matrices.

[0048] In formula (10), the first term (Figure CN104408158BD00085) is the partial residual of the optimization; the second (Figure CN104408158BD00086) is the sparsity-inducing regularization penalty, where μ is the local parameter controlling the penalty degree and λ is the global parameter controlling the penalty degree. d_{n,i} is the element in row n and column i of the encoding distance matrix D, and a_{n,i} is the i-th element of the significance parameter vector a_n of the n-th subgraph Gn.

[0049] Step 4. Optimize and update:

[0050] Step 4.1. Initialize k = 0.

[0051] Step 4.2. Define a_n^k as the update value of the significance parameter vector a_n of the n-th subgraph Gn at the k-th update, with a_n^0 as obtained in step 3.2.

[0052] Set the projection radius g, a positive integer with g ∈ [1, N+1]. Use the SVD decomposition method to perform an orthogonal l1-norm projection of radius g on each element of the k-th update value a_n^k, obtaining the projection vector of a_n^k and its n-th element (formula images Figure CN104408158BD00087–BD000810).

[0053] At the k-th update, define the penalty cost matrix of the source image as Ξ^k (Figure CN104408158BD000811–BD000812), where ξ_g^k is the g-th penalty cost vector of Ξ^k and ξ_{g,n}^k is the n-th element of the penalty cost vector of the g-th subgraph (Figure CN104408158BD000813–BD000814).

[0054] Compute the n-th element ξ_{g,n}^k with formula (11):

[0055] (equation image Figure CN104408158BD000815)  (11)

[0056] Step 4.3. Solve formula (12) with the primal-dual method to obtain the optimized update value a_n^{k+1} (Figure CN104408158BD000816):

[0057] (equation image Figure CN104408158BD000817)  (12)

[0058] In formula (12), t is the update convergence step factor, and ∇f(a_n^k) (Figure CN104408158BD000818–BD000819) is the differential of the function f at the point a_n^k.

[0059] Step 4.4. Assign the value k+1 to k; repeat step 4.2 and step 4.3 to update a_n^k until the update value a_n^{k+1} converges, thereby obtaining in turn the solution A* = [a_1*, a_2*, …, a_N*] of the significance parameter matrix set A (formula images Figure CN104408158BD00091–BD00094).

[0060] Step 5. Compute scores from the solution A*, sort the obtained results in descending order, select the subgraphs corresponding to the top m results, and link them in sequence to obtain the viewpoint tracking path.

[0061] Compared with the prior art, the beneficial effects of the invention are:

[0062] 1. Compared with the feature extraction of previously studied viewpoint tracking techniques, the feature information of the subgraphs extracted by the invention includes not only low-dimensional color and texture information but also the geometric structure relations among the salient image regions and high-dimensional visual semantic information. The subgraphs therefore carry richer information, which solves the problem that existing saliency models cannot recover the global and local information of the image and thus lose extracted local or global information, so that the local and global characteristics of the original image are preserved more completely during viewpoint tracking.

[0063] 2. By using an active learning algorithm with geometric structure preservation to select salient subgraphs, the invention overcomes the imprecise selection of salient subgraphs in previous viewpoint tracking, which ignored the geometric structure of the salient image regions; the selected salient subgraphs are therefore more representative. Compared with other viewpoint tracking methods in the field, the geometric structure features between subgraphs adopted by the invention make viewpoint tracking more accurate.

[0064] 3. The mathematical model of this viewpoint tracking method contains fairly comprehensive information and takes geometric structure and visual semantic information into account, so the accuracy achieved by the invention is relatively high. The computation is simple, which reduces the computational complexity of the target tracking method; in a laboratory environment the tracking speed reaches 25 frames per second. If the viewpoint tracking method of the invention is ported to run on a DSP chip, the excellent computing power of the DSP chip allows the invention to achieve very good runtime performance.

Brief Description of the Drawings

[0065] Fig. 1 is the workflow of the viewpoint tracking technique of the invention;

[0066] Fig. 2 is a schematic diagram of the subgraph construction process of the invention;

[0067] Fig. 3 shows the effect of viewpoint transfer route prediction with the viewpoint tracking method of the invention.

Detailed Description

[0068] In this embodiment, a viewpoint tracking method based on geometric reconstruction and semantic fusion, as shown in Fig. 1, proceeds as follows:

[0069] Step 1. Construct the visual feature set of the subgraphs. This stage obtains the visual feature matrix of the source image, containing color features, texture features, and geometric structure features.

[0070] Step 1.1. Use a clustering method to divide the source image into l sub-regions. Treating each sub-region as a node, construct subgraphs containing several nodes, thereby obtaining the subgraph set G = {G1, G2, …, Gn, …, GN}, where N is the total number of subgraphs and Gn is the n-th subgraph of the set G. Vn = {v_{n,1}, …, v_{n,tn}} indicates that the n-th subgraph contains tn nodes, and v_{n,ti} is the ti-th node of Gn. En is the set of geometric connecting edges among the tn nodes of Gn; the connecting edges must keep Gn a connected graph, so the number of edges is at least tn − 1; 1 ≤ tn ≤ l. Once l is set in this method, tn takes each integer value between 1 and l. Fig. 2 illustrates the subgraph construction process with the 10 subgraphs obtained when l = 3: the left side of the enlarged view is the source image; on the right, row A shows the 3 subgraphs corresponding to node 1 of the source image, row B the 3 subgraphs corresponding to node 2, and row C the 4 subgraphs corresponding to node 3. The method achieves a good tracking effect with a fairly small number of subgraph nodes l, for example l = 10.

[0071] Clustering methods are an extremely important and widely used class of algorithms in image segmentation. They are based on similarity, making the data objects within one cluster as similar as possible while making objects in different clusters as different as possible. The invention can therefore use the SLIC segmentation method (simple linear iterative clustering) described in "SLIC Superpixels Compared to State-of-the-art Superpixel Methods" to divide the source image into sub-regions. SLIC is a superpixel segmentation based on a clustering algorithm, computed in the 5-dimensional space formed by the LAB color space and the x, y pixel coordinates. It can segment both color and grayscale images, and the number of regions to produce can be set manually. Using this method in the invention improves efficiency, and the resulting sub-regions preserve the information of the source image well, with high accuracy.
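As a concrete illustration of this segmentation step, the following is a minimal sketch assuming scikit-image's slic as the SLIC implementation; the input path and parameter values are illustrative assumptions, not values fixed by the patent:

```python
# Sketch of step 1.1: SLIC superpixel segmentation and region centers.
# Assumes scikit-image; "source.jpg" and the parameters are placeholders.
import numpy as np
from skimage import io
from skimage.segmentation import slic

image = io.imread("source.jpg")              # hypothetical source image
l = 10                                       # number of sub-regions (l = 10 above)
labels = slic(image, n_segments=l, compactness=10.0, start_label=0)

# Region centers become the graph nodes; they also feed the angle feature of Eq. (1).
centers = np.array([np.argwhere(labels == r).mean(axis=0)
                    for r in np.unique(labels)])
print(labels.shape, centers.shape)
```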

[0072] Step 1.2. Use a classical image color feature extraction method to obtain the color feature M_n^C ∈ R^{dC×tn} of the n-th subgraph Gn, where m_{n,ti}^C is the color vector of the ti-th node v_{n,ti} of Gn and dC is the dimension of the color feature M_n^C, 1 ≤ ti ≤ tn; dC is generally 9.

[0073] Color features are global features describing the surface properties of the scene corresponding to an image or image region; they are pixel-based features. Classical image color feature extraction methods mainly include the color histogram method, the color set method, and the color moment method. The invention can, for example, extract the color feature of a subgraph with the color moment method: since color distribution information concentrates in the low-order moments, the first, second, and third moments suffice to express the color distribution of an image; and since each pixel has three color channels of the color space, the color moments of an image are described by 9 components. Using this method makes the extracted color features more complete and accurate.

[0074] Use a classical image texture feature extraction method to obtain the texture feature M_n^TE ∈ R^{dTE×tn} of the n-th subgraph Gn, where m_{n,ti}^TE is the texture vector of the ti-th node of Gn and dTE is the dimension of the texture feature; dTE is generally 128.

[0075] The texture feature is computed on a dense grid of uniformly sized cells, and overlapping local contrast normalization is applied to improve performance. The main idea of the texture feature is that the shape and appearance of a local object in an image can be well described by the density distribution of gradients and edge directions. The concrete implementation in the invention is as follows:

[0076] First divide each subgraph into small connected regions called cells, then collect the histogram of gradient or edge directions over the pixels of each cell; finally, combining these histograms yields the texture feature descriptor. To improve performance, the cells are grouped into spatially connected blocks, and the local histograms are contrast-normalized within each block of the image as follows: first compute the density of the histograms within the block, then normalize each cell in the block by this density. After this normalization, better results are obtained under illumination changes and shadows. Features extracted in this way therefore remain largely invariant to geometric and photometric deformations of the image and perform better in experiments.

[0077] Extract the geometric structure feature M_n^P ∈ R^{tn×tn} of the n-th subgraph Gn with formula (1):

[0078] [M_n^P]_{ti,tj} = θ_{ti,tj}  (1)

[0079] In formula (1), [M_n^P]_{ti,tj} is the element in row ti and column tj of the geometric structure feature M_n^P, and θ_{ti,tj} is the horizontal angle of the vector from the region center of the ti-th node v_{n,ti} of Gn to the region center of the tj-th node v_{n,tj}; 1 ≤ ti, tj ≤ tn.
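Formula (1) reduces to pairwise angles between region centers; a small sketch follows, with arctan2 assumed as the concrete definition of the "horizontal angle":

```python
# Sketch of Eq. (1): the tn x tn matrix of horizontal angles between region centers.
import numpy as np

def geometric_feature(centers: np.ndarray) -> np.ndarray:
    """centers: (tn, 2) array of (x, y) region centers of one subgraph."""
    d = centers[None, :, :] - centers[:, None, :]   # d[ti, tj] = center_tj - center_ti
    return np.arctan2(d[..., 1], d[..., 0])         # angle of the ti -> tj vector

M_p = geometric_feature(np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]))
print(np.degrees(M_p))   # angle from node 0 to node 1 is 0 deg, to node 2 is 90 deg
```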

[0080] Step 1.3. Transpose the color feature M_n^C, the texture feature M_n^TE, and the geometric structure feature M_n^P of the n-th subgraph Gn and connect them in sequence to obtain the visual feature matrix Y_n = [(M_n^C)^T, (M_n^TE)^T, (M_n^P)^T]. Use a feature fusion method to convert Y_n into the visual feature vector y_n ∈ R^{dY×1}, where dY is the vector dimension of y_n.

[0081] In the invention the low-level visual features of each subgraph comprise three parts: the color feature, the texture feature, and the geometric structure feature. These three features serve as the three inputs of a sensor and are analyzed and processed jointly, then fused into a single vector with the feature fusion method mentioned in "Subspaces Indexing Model on Grassmann Manifold for Image Search". The advantage of this feature fusion is that it achieves considerable information compression, which benefits real-time processing; and since the extracted features are directly related to the selection of the salient subgraphs, the fusion result conveys the feature information of the selected salient subgraphs to the greatest extent.

[0082] Step 1.4. Repeat step 1.2 and step 1.3 to obtain in turn the visual feature set Y = [y_1, …, y_N] of all subgraphs.

[0083] Step 2. Fuse semantic features:

[0084] Step 2.1. Compute the visual feature distance dGW(Gi, Gj) between the i-th subgraph Gi and the j-th subgraph Gj with formula (2):

[0085] dGW(Gi, Gj) = ||M̂_i M̂_i^T − M̂_j M̂_j^T||  (2)

[0086] In formula (2), M̂_i is an orthonormal basis of the visual feature matrix Y_i of the i-th subgraph Gi and M̂_j is an orthonormal basis of the visual feature matrix Y_j of the j-th subgraph Gj; 1 ≤ i, j ≤ N. This visual feature distance is in fact the Golub-Werman distance mentioned in the article "Similarity and affine invariant distances between 2D point sets"; it is introduced to describe the local structural characteristics between subgraphs more clearly.
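A hedged sketch of the Golub-Werman distance of formula (2): orthonormal bases via QR, then the spectral norm of the difference of the two subspace projectors (one standard form of this distance; the patent's exact norm lives in the formula image and is assumed here):

```python
# Sketch of Eq. (2): Golub-Werman distance between two visual feature matrices.
import numpy as np

def gw_distance(Y_i: np.ndarray, Y_j: np.ndarray) -> float:
    Q_i, _ = np.linalg.qr(Y_i)            # orthonormal basis of span(Y_i)
    Q_j, _ = np.linalg.qr(Y_j)
    P_i, P_j = Q_i @ Q_i.T, Q_j @ Q_j.T   # orthogonal projectors onto the spans
    return float(np.linalg.norm(P_i - P_j, ord=2))

print(gw_distance(np.random.rand(10, 3), np.random.rand(10, 3)))
```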

[0087] Step 2.2. Obtain the semantic tag set Tag = {tag_1, tag_2, …, tag_c, …, tag_C} of the source image from network resources, where tag_c is the c-th tag, c ∈ [1, C], and C is the total number of tags. Define N_c as the number of images retrieved with the c-th tag. Define the tag vector b_n = [b_{n,1}, b_{n,2}, …, b_{n,c}, …, b_{n,C}] ∈ R^{1×C} of the n-th subgraph Gn, where b_{n,c} = 1 indicates that Gn contains the c-th tag tag_c and b_{n,c} = 0 indicates that it does not.

[0088] Step 2.3. Construct the semantic similarity vector [b_i ∩ b_j] ∈ R^{1×C} of Gi and Gj, where b_{i,c} ∩ b_{j,c} is the logical AND of the c-th element b_{i,c} of the tag vector b_i of Gi and the c-th element b_{j,c} of the tag vector b_j of Gj; for example, with C = 4, b_i = [1,0,1,0] and b_j = [0,1,1,0], we have [b_i ∩ b_j] = [0,0,1,0]. Compute the semantic similarity distance Is(i, j) between Gi and Gj with formula (3) (equation image Figure CN104408158BD001113).

[0091] Step 2.4. Construct the semantic difference vector [b_i ⊕ b_j] ∈ R^{1×C} of Gi and Gj, where b_{i,c} ⊕ b_{j,c} is the logical XOR of the c-th element b_{i,c} of b_i and the c-th element b_{j,c} of b_j; for example, with C = 4, b_i = [1,0,1,0] and b_j = [0,1,1,0], we have [b_i ⊕ b_j] = [1,1,0,0]. Compute the semantic difference distance Id(i, j) between Gi and Gj with formula (4):

[0092] (equation image Figure CN104408158BD00123)  (4)
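The tag-vector operations of steps 2.3–2.4 can be sketched directly; the weighting inside formulas (3)–(4) lives in the unreproduced equation images, so the per-tag weights w_c built from the retrieval counts N_c below are an illustrative assumption only:

```python
# Sketch of steps 2.3-2.4: AND / XOR tag vectors and illustrative distances.
import numpy as np

b_i = np.array([1, 0, 1, 0], dtype=bool)        # C = 4 example from the text
b_j = np.array([0, 1, 1, 0], dtype=bool)
N_c = np.array([120.0, 40.0, 80.0, 10.0])       # hypothetical retrieval counts

and_vec = b_i & b_j                             # semantic similarity vector [0,0,1,0]
xor_vec = b_i ^ b_j                             # semantic difference vector [1,1,0,0]

w = N_c / N_c.sum()                             # assumed tag weights, not Eq. (3)/(4)
I_s = float((and_vec * w).sum())
I_d = float((xor_vec * w).sum())
print(and_vec.astype(int), xor_vec.astype(int), I_s, I_d)
```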

[0093] Step 2.5. Use formula (5) (equation image Figure CN104408158BD00124) to obtain in turn the element in row i and column j of the fusion transformation matrix E_n, thereby obtaining E_n ∈ R^{N×N}, 1 ≤ n ≤ N.

[0095] Construct the semantic fusion feature matrix R ∈ R^{N×N} of the source image with formula (6):

[0096] R = E_1 W_1 E_1^T + … + E_n W_n E_n^T + … + E_N W_N E_N^T  (6)

[0097] In formula (6), W_n ∈ R^{N×N} is the semantic fusion matrix of the n-th subgraph Gn and is a diagonal matrix; the h-th diagonal element of W_n is [W_n]_{h,h} = Is(n, h) − Id(n, h).

[0098] Step 2.6. Solve formula (7) with the Lagrange multiplier method to obtain the linear approximate optimal solution of the linear projection matrix U:

U = argmin_U tr(U^T L U), s.t. U^T U = I_d ∈ R^{d×d}, with L = Y R Y^T  (7)

[0102] In many extremum problems the independent variables of a function are subject to constraints; here the constraint U^T U = I_d ∈ R^{d×d} applies, so the Lagrange multiplier method solves the optimization well and keeps the problem simple and clear. In fact, the linear approximate optimal solution of the linear projection matrix U consists of the d eigenvectors of the matrix L corresponding to its d smallest eigenvalues.

[0103] Step 2.7. Obtain the semantic encoding Y' of Y with formula (8):

[0104] Y' = U^T Y  (8)
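Steps 2.6–2.7 together amount to an eigendecomposition; a minimal sketch follows, using the fact noted above that U is formed by the d eigenvectors of L = Y R Y^T with the smallest eigenvalues (which automatically satisfies U^T U = I_d):

```python
# Sketch of steps 2.6-2.7: solve Eq. (7) by eigendecomposition, then apply Eq. (8).
import numpy as np

def semantic_encoding(Y: np.ndarray, R: np.ndarray, d: int) -> np.ndarray:
    L = Y @ R @ Y.T                          # (dY, dY), symmetric when R is
    eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    U = eigvecs[:, :d]                       # d smallest -> minimizes tr(U^T L U)
    return U.T @ Y                           # Y' = U^T Y, shape (d, N)

dY, N, d = 16, 8, 4
Y = np.random.rand(dY, N)
R = np.random.rand(N, N); R = (R + R.T) / 2  # toy symmetric fusion matrix
assert semantic_encoding(Y, R, d).shape == (d, N)
```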

[0105] Step 3. Active learning algorithm:

[0106] Step 3.1. Use formula (9) to obtain the element d_{i,j} in row i and column j of the encoding distance matrix D ∈ R^{N×N} of the semantic encoding Y', thereby obtaining D:

[0107] d_{i,j} = ||y'_i − y'_j||  (9)

[0108] In formula (9), d_{i,j} is the geometric distance between the semantic encoding y'_i of the i-th subgraph Gi and the semantic encoding y'_j of the j-th subgraph Gj; 1 ≤ i, j ≤ N.
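Formula (9) is a plain pairwise distance matrix over the columns of Y'; a sketch with the Euclidean norm assumed:

```python
# Sketch of Eq. (9): N x N distance matrix between semantic encodings.
import numpy as np
from scipy.spatial.distance import cdist

def encoding_distance_matrix(Y_prime: np.ndarray) -> np.ndarray:
    cols = Y_prime.T                 # row i is the encoding y'_i of subgraph Gi
    return cdist(cols, cols)         # D[i, j] = ||y'_i - y'_j||_2

D = encoding_distance_matrix(np.random.rand(4, 8))
assert D.shape == (8, 8)
```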

[0109] Step 3.2. Define the significance parameter matrix set of all subgraphs as A = [a_1, a_2, …, a_n, …, a_N] ∈ R^{N×N}, where a_n ∈ R^{N×1} is the significance parameter vector of the n-th subgraph Gn; it in fact expresses how important Gn is to the linear reconstruction of all the other subgraphs. Treating a_n as one block, solve formula (10) (equation image Figure CN104408158BD00133) with the block coordinate descent method to obtain the initial value a_n^0 of the significance parameter vector of Gn, thereby obtaining the initial values A^0 of all significance parameter matrices.

[0111] In formula (10), the first term (Figure CN104408158BD00134) is the partial residual of the optimization; the second (Figure CN104408158BD00135) is the sparsity-inducing regularization penalty, in which the infinity norm is introduced to strengthen the sparsity of the significance parameter vectors of all subgraphs and ease their subsequent optimization. μ is the local parameter controlling the penalty degree and λ is the global parameter controlling the penalty degree; μ and λ are generally chosen within the range 0 to 1, and optimally μ = 0.01 and λ = 0.1. d_{n,i} is the element in row n and column i of the encoding distance matrix D, and a_{n,i} is the i-th element of the significance parameter vector a_n of Gn. A hedged sketch of this initialization follows below.
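Since formula (10) itself survives only as an image, the sketch below is heavily hedged: a least-squares partial residual on D plus l1 (weight μ) and infinity-norm (weight λ) penalties is one plausible reading of the surrounding text, minimized by cycling over the blocks a_n with a simple subgradient step:

```python
# Hedged sketch of step 3.2: block coordinate descent on an assumed objective.
import numpy as np

def init_significance(D: np.ndarray, mu=0.01, lam=0.1, iters=200, lr=1e-3):
    N = D.shape[0]
    A = np.zeros((N, N))
    for _ in range(iters):
        for n in range(N):                      # one block = one vector a_n
            a = A[:, n]
            resid = D @ a - D[:, n]             # assumed partial residual term
            grad = D.T @ resid + mu * np.sign(a)
            e = np.zeros(N)
            i = int(np.argmax(np.abs(a)))
            e[i] = lam * np.sign(a[i])          # subgradient of lam * ||a||_inf
            A[:, n] = a - lr * (grad + e)
    return A

A0 = init_significance(np.abs(np.random.rand(6, 6)))
print(A0.shape)
```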

[0112] Step 4. Optimize and update:

[0113] Step 4.1. Initialize k = 0.

[0114] Step 4.2. Define a_n^k as the update value of the significance parameter vector a_n of the n-th subgraph Gn at the k-th update, with a_n^0 as obtained in step 3.2.

[0115] Set the projection radius g, a positive integer with g ∈ [1, N+1]. Use the SVD decomposition method to perform an orthogonal l1-norm projection of radius g on each element of the k-th update value a_n^k, obtaining the projection vector of a_n^k and its n-th element (formula images Figure CN104408158BD00139–BD001311).

[0116] The SVD decomposition method is the singular value decomposition method; its role here is to convert the update value a_n^k into a subspace spanned by a set of orthogonal basis vectors.
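One standard concrete form of an l1-norm projection of radius g is the sort-and-threshold projection onto the l1 ball (Duchi et al.); it is sketched here as an assumed stand-in for the projection step above:

```python
# Sketch of the l1-norm projection of radius g used in step 4.2.
import numpy as np

def project_l1_ball(v: np.ndarray, g: float) -> np.ndarray:
    if np.abs(v).sum() <= g:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                       # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > (css - g))[0][-1]
    theta = (css[rho] - g) / (rho + 1.0)               # shrinkage threshold
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

p = project_l1_ball(np.array([0.7, -1.2, 0.1, 2.0]), g=2.0)
print(p, np.abs(p).sum())                              # l1 norm now <= 2.0
```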

[0117] At the k-th update, define the penalty cost matrix of the source image as Ξ^k (Figure CN104408158BD001312–BD001313), where ξ_g^k is the g-th penalty cost vector of Ξ^k and ξ_{g,n}^k (Figure CN104408158BD001314–BD001315) is the n-th element of the penalty cost vector of the g-th subgraph.

[0118] Compute the n-th element ξ_{g,n}^k with formula (11):

[0119] (equation image Figure CN104408158BD001316)  (11)

[0120] Step 4.3. Solve formula (12) with the primal-dual method to obtain the optimized update value a_n^{k+1} (Figure CN104408158BD001317):

[0121] (equation image Figure CN104408158BD001318)  (12)

[0122] In formula (12), t is the update convergence step factor, and ∇f(a_n^k) (Figure CN104408158BD001319–BD001320) is the differential of the function f at the point a_n^k.

[0123] The invention uses the primal-dual method to optimize and solve for the update value, which makes the optimization problem simple and easy to solve; applying the primal-dual form to the block-wise solution of step 3.2 reduces computation and improves the efficiency of the invention. The concrete formation of the primal-dual problem is described in the reference "Proximal Methods for Hierarchical Sparse Coding". A schematic stand-in for this update follows below.
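The exact primal-dual iteration of formula (12) is in the unreproduced image; the sketch below shows only the same update pattern: one proximal-gradient step with step factor t and soft-thresholding as the proximal operator of an assumed l1 penalty:

```python
# Schematic stand-in for the Eq. (12) update (not the patent's exact iteration).
import numpy as np

def soft_threshold(v: np.ndarray, tau: float) -> np.ndarray:
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_step(a_k, grad_f, t=0.1, mu=0.01):
    """a_{k+1} = prox_{t*mu*||.||_1}(a_k - t * grad_f(a_k))."""
    return soft_threshold(a_k - t * grad_f(a_k), t * mu)

a = np.array([0.5, -0.2, 1.0])
print(proximal_step(a, grad_f=lambda x: 2 * x))   # toy smooth loss f(x) = ||x||^2
```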

[0124] Step 4.4. Assign the value k+1 to k; repeat step 4.2 and step 4.3 to update a_n^k until the update value a_n^{k+1} converges, thereby obtaining in turn the solution A* = [a_1*, …, a_N*] of the significance parameter matrix set A (formula image Figure CN104408158BD00143).

[0125] Step 5. Compute scores from the solution A* (formula image Figure CN104408158BD00144), sort the obtained results in descending order, select the subgraphs corresponding to the top m results, and link them in sequence to obtain the viewpoint tracking path. m can generally be chosen within the range 3 to 5, for example m = 4.
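A sketch of step 5, with the per-subgraph score taken as the norm of its optimized significance vector (an assumption, since the scoring formula is an image):

```python
# Sketch of step 5: score, sort descending, link the top-m subgraphs.
import numpy as np

def viewpoint_path(A_star: np.ndarray, m: int = 4) -> list:
    scores = np.linalg.norm(A_star, axis=0)   # one score per subgraph (assumed rule)
    order = np.argsort(scores)[::-1]          # descending
    return order[:m].tolist()                 # linked in sequence = tracking path

print("viewpoint tracking path:", viewpoint_path(np.random.rand(6, 6), m=4))
```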

[0126] Fig. 3 shows the prediction of viewpoint tracking routes by the tracking technique of the invention on the NUSEF and OSIE data sets. The invention overcomes the lack of geometric and semantic correlation descriptions between the salient regions of the target image in current viewpoint tracking methods, which causes low modeling accuracy, and achieves a very good tracking effect.

[0127] The above is only a preferred embodiment of the invention; based on it, other researchers can achieve the same effect in other related fields such as image enhancement and image analysis. It must be noted that any equivalent replacement or change of relevant parameters made by a person skilled in the art within the technical scope disclosed by the invention, according to the technical scheme and inventive concept of the invention, shall be covered by the protection scope of the invention.

Claims (1)

1. A viewpoint tracking method based on geometric reconstruction and semantic fusion, characterized by the following steps:

Step 1. Construct the visual feature set of the subgraphs:

Step 1.1. Use a clustering method to divide the source image into l sub-regions; treating each sub-region as a node, construct subgraphs containing several nodes, thereby obtaining the subgraph set G = {G1, G2, …, Gn, …, GN}, where N is the total number of subgraphs; Gn is the n-th subgraph of the set G; Vn = {v_{n,1}, …, v_{n,tn}} indicates that the n-th subgraph contains tn nodes, v_{n,ti} being the ti-th node of Gn; En is the set of geometric connecting edges among the tn nodes; 1 ≤ tn ≤ l;

Step 1.2. Use a classical image color feature extraction method to obtain the color feature M_n^C = [m_{n,1}^C, …, m_{n,tn}^C] ∈ R^{dC×tn} of the n-th subgraph Gn, where m_{n,ti}^C is the color vector of the ti-th node of Gn and dC is the dimension of the color feature M_n^C, 1 ≤ ti ≤ tn;

use a classical image texture feature extraction method to obtain the texture feature M_n^TE ∈ R^{dTE×tn} of Gn, where m_{n,ti}^TE is the texture vector of the ti-th node v_{n,ti} of Gn and dTE is the dimension of the texture features;

extract the geometric structure feature M_n^P ∈ R^{tn×tn} of Gn with formula (1):

[M_n^P]_{ti,tj} = θ_{ti,tj}  (1)

in formula (1), [M_n^P]_{ti,tj} is the element in row ti and column tj of the geometric structure feature M_n^P, and θ_{ti,tj} is the horizontal angle of the vector from the region center of the ti-th node v_{n,ti} of Gn to the region center of the tj-th node v_{n,tj}; 1 ≤ ti, tj ≤ tn;

Step 1.3. Transpose the color feature M_n^C, the texture feature M_n^TE, and the geometric structure feature M_n^P of Gn and connect them in sequence to obtain the visual feature matrix Y_n = [(M_n^C)^T, (M_n^TE)^T, (M_n^P)^T]; use a feature fusion method to convert Y_n into a visual feature vector y_n ∈ R^{dY×1}, where dY is the vector dimension of y_n;

Step 1.4. Repeat step 1.2 and step 1.3 to obtain in turn the visual feature set Y = [y_1, …, y_N] of all subgraphs;

Step 2. Fuse semantic features:

Step 2.1. Compute the visual feature distance dGW(Gi, Gj) between the i-th subgraph Gi and the j-th subgraph Gj with formula (2):

dGW(Gi, Gj) = ||M̂_i M̂_i^T − M̂_j M̂_j^T||  (2)

in formula (2), M̂_i is an orthonormal basis of the visual feature matrix Y_i of the i-th subgraph Gi and M̂_j is an orthonormal basis of the visual feature matrix Y_j of the j-th subgraph Gj; 1 ≤ i, j ≤ N;

Step 2.2. Obtain the semantic tag set Tag = {tag_1, tag_2, …, tag_c, …, tag_C} of the source image from network resources, where tag_c is the c-th tag, c ∈ [1, C], and C is the total number of tags; define N_c as the number of images retrieved with the c-th tag; define the tag vector b_n = [b_{n,1}, b_{n,2}, …, b_{n,c}, …, b_{n,C}] ∈ R^{1×C} of the n-th subgraph Gn, where b_{n,c} = 1 indicates that Gn contains the c-th tag tag_c and b_{n,c} = 0 indicates that it does not;

Step 2.3. Construct the semantic similarity vector [b_i ∩ b_j] ∈ R^{1×C} of the subgraphs Gi and Gj, where b_{i,c} ∩ b_{j,c} is the logical AND of the c-th element b_{i,c} of the tag vector b_i of Gi and the c-th element b_{j,c} of the tag vector b_j of Gj; compute the semantic similarity distance Is(i, j) between Gi and Gj with formula (3) (equation image Figure CN104408158BC00032);

Step 2.4. Construct the semantic difference vector [b_i ⊕ b_j] ∈ R^{1×C} of Gi and Gj, where b_{i,c} ⊕ b_{j,c} is the logical XOR of the c-th element b_{i,c} of b_i and the c-th element b_{j,c} of b_j; compute the semantic difference distance Id(i, j) between Gi and Gj with formula (4) (equation image Figure CN104408158BC00034);

Step 2.5. Use formula (5) (equation image Figure CN104408158BC00036) to obtain in turn the element in row i and column j of the fusion transformation matrix E_n, thereby obtaining E_n ∈ R^{N×N};

construct the semantic fusion feature matrix R ∈ R^{N×N} of the source image with formula (6):

R = E_1 W_1 E_1^T + … + E_n W_n E_n^T + … + E_N W_N E_N^T  (6)

in formula (6), W_n ∈ R^{N×N} is the semantic fusion matrix of the n-th subgraph Gn and is a diagonal matrix; the h-th diagonal element of W_n is [W_n]_{h,h} = Is(n, h) − Id(n, h);

Step 2.6. Solve formula (7) with the Lagrange multiplier method to obtain the linear approximate optimal solution of the linear projection matrix U:

U = argmin_U tr(U^T L U), s.t. U^T U = I_d ∈ R^{d×d}, with L = Y R Y^T  (7)

Step 2.7. Obtain the semantic encoding Y' of Y with formula (8):

Y' = U^T Y  (8)

Step 3. Active learning algorithm:

Step 3.1. Use formula (9) to obtain the element d_{i,j} in row i and column j of the encoding distance matrix D ∈ R^{N×N} of the semantic encoding Y':

d_{i,j} = ||y'_i − y'_j||  (9)

in formula (9), d_{i,j} is the geometric distance between the semantic encoding y'_i of the i-th subgraph Gi and the semantic encoding y'_j of the j-th subgraph Gj; 1 ≤ i, j ≤ N;

Step 3.2. Define the significance parameter matrix set of all subgraphs as A = [a_1, a_2, …, a_n, …, a_N] ∈ R^{N×N}, where a_n ∈ R^{N×1} is the significance parameter vector of the n-th subgraph Gn; treating a_n as one block, solve formula (10) (equation image Figure CN104408158BC00042) with the block coordinate descent method to obtain the initial value a_n^0 of a_n, thereby obtaining the initial values A^0 of all significance parameter matrices;

in formula (10), the first term is the partial residual of the optimization and the second is the sparsity-inducing regularization penalty, where μ is the local parameter controlling the penalty degree and λ is the global parameter controlling the penalty degree; d_{n,i} is the element in row n and column i of the encoding distance matrix D, and a_{n,i} is the i-th element of the significance parameter vector a_n of Gn;

Step 4. Optimize and update:

Step 4.1. Initialize k = 0;

Step 4.2. Define a_n^k as the update value of the significance parameter vector a_n of Gn at the k-th update, with a_n^0 as obtained in step 3.2;

set the projection radius g, a positive integer with g ∈ [1, N+1]; use the SVD decomposition method to perform an orthogonal l1-norm projection of radius g on each element of the k-th update value a_n^k, obtaining the projection vector of a_n^k and its n-th element;

at the k-th update, define the penalty cost matrix of the source image as Ξ^k, where ξ_g^k is the g-th penalty cost vector of Ξ^k and ξ_{g,n}^k is the n-th element of the penalty cost vector of the g-th subgraph; compute the n-th element ξ_{g,n}^k with formula (11) (equation image Figure CN104408158BC000410);

Step 4.3. Solve formula (12) (equation image Figure CN104408158BC000411) with the primal-dual method to obtain the optimized update value a_n^{k+1}; in formula (12), t is the update convergence step factor and ∇f(a_n^k) is the differential of the function f at the point a_n^k;

Step 4.4. Assign the value k+1 to k; repeat step 4.2 and step 4.3 to update a_n^k until the update value a_n^{k+1} converges, thereby obtaining in turn the solution A* = [a_1*, …, a_N*] of the significance parameter matrix set A;

Step 5. Compute scores from the solution A* (formula image Figure CN104408158BC000415), sort the obtained results in descending order, select the subgraphs corresponding to the top m results, and link them in sequence to obtain the viewpoint tracking path.
CN201410733763.7A 2014-12-05 2014-12-05 A viewpoint tracking method based on geometric reconstruction and semantic fusion CN104408158B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410733763.7A CN104408158B (en) 2014-12-05 2014-12-05 A viewpoint tracking method based on geometric reconstruction and semantic fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410733763.7A CN104408158B (en) 2014-12-05 2014-12-05 A viewpoint tracking method based on geometric reconstruction and semantic fusion

Publications (2)

Publication Number Publication Date
CN104408158A CN104408158A (en) 2015-03-11
CN104408158B (en) 2017-10-03

Family

ID=52645789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410733763.7A CN104408158B (en) 2014-12-05 2014-12-05 A viewpoint tracking method based on geometric reconstruction and semantic fusion

Country Status (1)

Country Link
CN (1) CN104408158B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6628821B1 (en) * 1996-05-21 2003-09-30 Interval Research Corporation Canonical correlation analysis of image/control-point location coupling for the automatic location of control points
CN101923715A (en) * 2010-09-02 2010-12-22 西安电子科技大学 Image segmentation method based on texture information constrained clustering of particle swarm optimization space
CN103700088A (en) * 2013-12-01 2014-04-02 北京航空航天大学 Image set unsupervised co-segmentation method based on deformable graph structure representation


Also Published As

Publication number Publication date
CN104408158A (en) 2015-03-11

Similar Documents

Publication Publication Date Title
Yang et al. Saliency detection via graph-based manifold ranking
Liu et al. Multiview Hessian discriminative sparse coding for image annotation
Gao et al. 3-D object retrieval and recognition with hypergraph analysis
Simonyan et al. Two-stream convolutional networks for action recognition in videos
Minhas et al. Human action recognition using extreme learning machine based on visual vocabularies
Yu et al. Pairwise constraints based multiview features fusion for scene classification
Yao et al. Coupled action recognition and pose estimation from multiple views
Zhang et al. Actively learning human gaze shifting paths for semantics-aware photo cropping
Liu et al. A part‐aware surface metric for shape analysis
CN102663409A (en) Pedestrian tracking method based on HOG-LBP
Bu et al. Learning high-level feature by deep belief networks for 3-D model retrieval and recognition
Li et al. Deep supervised discrete hashing
Liu et al. Multi-objective convolutional learning for face labeling
Zhang et al. Pose-robust face recognition via sparse representation
Iscen et al. Efficient diffusion on region manifolds: Recovering small objects with compact cnn representations
Leng et al. 3D object retrieval with stacked local convolutional autoencoder
Liu et al. Deep convolutional neural networks for thermal infrared object tracking
Dang et al. Discriminative features for identifying and interpreting outliers
CN101770584B (en) Extraction method for identification characteristic of high spectrum remote sensing data
CN103514456A (en) Image classification method and device based on compressed sensing multi-core learning
Zou et al. Df-net: Unsupervised joint learning of depth and flow using cross-task consistency
Wang et al. Beyond low-rank representations: Orthogonal clustering basis reconstruction with optimized graph structure for multi-view spectral clustering
Liu et al. Graph-based characteristic view set extraction and matching for 3D model retrieval
Xu et al. Structured attention guided convolutional neural fields for monocular depth estimation
CN101916376B (en) Local spline embedding-based orthogonal semi-monitoring subspace image classification method

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
GR01 Patent grant