CN107515905A - Sketch-based interactive image search and fusion method - Google Patents

Sketch-based interactive image search and fusion method

Info

Publication number
CN107515905A
Authority
CN
China
Prior art keywords
image
sub
sketch
contour
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710652876.8A
Other languages
English (en)
Other versions
CN107515905B (zh)
Inventor
王敬宇
戚琦
赵宇
王晶
廖建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201710652876.8A
Publication of CN107515905A
Application granted
Publication of CN107515905B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/56Information retrieval; Database structures therefor; File system structures therefor of still image data having vectorial format
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/758Involving statistics of pixels or of feature values, e.g. histogram matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

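A sketch-based interactive image search and fusion method comprising the following steps: (1) building an index file for the image library; (2) obtaining image retrieval results from a sketch; (3) fusing images. Compared with Sketch2Photo, the method greatly shortens the overall processing time and offers freer user interaction. Compared with Photosketcher, it provides higher retrieval accuracy, greatly reduces the number of searches a user has to run, and supplies the user with more suitable and richer material.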

Description

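Sketch-based interactive image search and fusion method
Technical Field
The present invention relates to a sketch-based interactive image search and fusion method. It belongs to the field of information technology, and in particular to the field of computer vision.
Background Art
With the spread of social networks, demand for image processing keeps growing, and some simple, easy-to-use image processing applications have become very popular. The reason behind their success is simply that they give non-professional users a simple and convenient platform for editing images. So far, however, no software offers users an easy way to compose images freely.
As early as around 2010, researchers were already studying how to compose an image conveniently and quickly. The core problem is how to obtain the target objects and the background image conveniently and quickly. Only two retrieval approaches are currently available: text-based image retrieval and sketch-based image retrieval. Text retrieval alone often fails to find objects with a specific shape, while current sketch retrieval alone suffers a considerable loss in retrieval accuracy. Sketch2Photo (see Chen, T., Cheng, M., Tan, P., Shamir, A., Hu, S. 2009. Sketch2Photo: Internet Image Montage. ACM Trans. Graph. 28, 5, Article 124 (December 2009), 10 pages. DOI=10.1145/1618452.1618470) combines the two approaches and automatically composes a realistic picture from a sketch. However, processing Internet images online and filtering them through successive stages to find images suitable for composition often takes a long time, which does not meet users' needs. Meeting the time requirement calls for offline preprocessing of the images in advance, which in turn requires a self-built image library. Photosketcher (see Eitz M, Richter R, Hildebrand K, Boubekeur T, Alexa M. Photosketcher: interactive sketch-based image synthesis. IEEE Comput Graph Appl. 2011 Nov-Dec;31(6):56-66. doi:10.1109/MCG.2011.67) retrieves images from an offline image library and needs no additional text information. This speeds up retrieval, but the accuracy is unsatisfactory. One reason is that the feature extraction method used by Photosketcher is limited with respect to the position, orientation and size of the image content, and the BoVW model cannot take the spatial positions of feature points into account.
In existing sketch retrieval techniques, the pictures in the image library are usually icon-like pictures or scenes; no one has yet retrieved pictures of natural scenes from everyday life. Moreover, for the present application scenario, the focus of retrieval is to reduce the number of searches a user has to run and to return more accurate and richer object material.
How to use a sketch to retrieve pictures of natural everyday scenes and compose the retrieved objects with a target scene has therefore become a technical problem in urgent need of a solution in computer vision.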
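Summary of the Invention
In view of this, the object of the present invention is to provide a method that uses pictures of complex everyday scenes as the picture library: the user only needs to input a sketch of an object, and the method returns scene images in which that object appears and composes the retrieved object with a target scene.
To achieve the above object, the present invention proposes a sketch-based interactive image search and fusion method comprising the following steps:
(1) Building an index file for the image library: each source image in the image library is segmented into sub-images that each contain a single object, and the mapping between sub-images and source images is recorded; the contour of the object in each sub-image is obtained, and its feature vectors are computed with the GF-HOG algorithm; following the Bag of Visual Words (BoVW) model, the resulting feature vectors are clustered to obtain a visual dictionary; a statistical histogram of visual word frequencies is then computed for each sub-image; and the index file of the image library is built as an inverted index;
(2) Obtaining image retrieval results from a sketch: the feature vectors of the sketch input by the user are computed; the statistical histogram of the sketch is obtained using the visual dictionary from step (1); using this histogram and the index file from step (1), the similarity between the sketch and each sub-image is computed and the sub-images are ranked by similarity; the ranking is adjusted by feedback based on the label information of the sub-images; and, using the mapping recorded in step (1), the source images corresponding to the sub-images are returned to the user;
(3) Fusing images: the required object is cut out of the images retrieved in step (2) with the GrabCut algorithm, and the cut-out object is placed into the background image with Poisson blending, thereby fusing the images.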
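Obtaining the contour of the object in a sub-image in step (1) comprises the following steps:
(1101) the object detection algorithm YOLO is applied to every image in the image library to detect the objects in it, yielding sub-images that each contain a single object, together with their label information and label confidence;
(1102) for each of these sub-images, the salient region detection algorithm SaliencyCut is applied to separate the foreground, i.e. the object, from the background, producing a binarized image;
(1103) the Canny algorithm is applied to the binarized image to obtain the contour of the object.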
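Computing the feature vectors from the object contour of a sub-image in step (1) comprises the following steps:
(1201) first, the binarized contour map M is taken as input, where M(x,y)=1 denotes a contour pixel, M(x,y)=0 denotes a non-contour pixel, and x and y are the row and column coordinates of the pixel; the gradient orientation θ(x,y) of each contour pixel is computed with the following formula, yielding the sparse gradient orientation field Ψ of the contour map M:
(1202) keeping the gradient orientations of the contour pixels unchanged, the gradient orientations of the non-contour pixels are interpolated to obtain the dense gradient orientation field Θ_Ω; so that Θ_Ω is smooth over the whole image domain Ω ∈ R², a Laplacian smoothness constraint is imposed on it, as in the following formula:
In this formula, Θ denotes the gradient orientation to be solved for at each pixel, Ω denotes the whole image domain, ∫∫_Ω denotes integration over the whole image domain, ∇ denotes the gradient operator, v is the guidance field obtained by taking the gradient of the sparse gradient orientation field Ψ, i.e. v = ∇Ψ, ‖·‖² denotes the squared norm of its argument, ∂Ω denotes the contour pixels, and θ is the gradient orientation of the contour pixels;
(1203) subject to Dirichlet boundary conditions, the above formula is solved through the following Poisson equation:
In this formula, Δ denotes the Laplace operator and div the divergence operator; in discrete form the equation can be written as follows:
Here, for any pixel p in the image, N_p denotes the set of the four neighbours of p, so that |N_p| = 4 under 4-connectivity; q denotes a point in N_p; ∂Ω denotes the contour pixels; and v_pq = θ_pq. The equation can be solved as a system of linear equations, which yields the dense gradient orientation field Θ_Ω;
(1204) once the dense gradient orientation field Θ_Ω has been obtained, it is sampled at multiple scales with the HOG algorithm, centred on the contour pixels, to construct the feature vectors of the contour map.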
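In step (1), the feature vectors computed for the objects are clustered with the k-means clustering method.
Building the index file of the image library as an inverted index in step (1) comprises the following steps:
(1301) following the BoVW model, the word-frequency histograms of all sub-images are stacked into a histogram matrix with N rows and K columns, where N is the number of sub-images in the image library and K is the number of cluster centres, and the matrix is saved to a file;
(1302) the histogram matrix is traversed column by column, the indices of the images whose value in each column is non-zero are collected, and the result is written to a file, which gives the required inverted index file.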
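Step (2) comprises the following steps:
(21) the feature vectors of the input sketch are computed by the method described in step (1);
(22) using the visual dictionary obtained in step (1), the frequencies of the visual words are counted to obtain the statistical histogram Q of the sketch;
(23) using the inverted index structure and the matrix obtained in steps (1301) and (1302), the similarity between the query sketch and each sub-image is computed; the similarity is defined by the following formula:
In this formula, Q denotes the statistical histogram of the query sketch, D_i denotes the statistical histogram of sub-image i in the image library, N is the number of sub-images in the image library, p is the index of a cluster centre in the visual dictionary, f_p is the number of sub-images in the image library that contain the visual word W_p, and f_{Q,p} and f_{D_i,p} are the frequencies of the visual word W_p in the query sketch and in sub-image i, respectively;
(24) step (23) yields the similarity S_i between sub-image i and the user's sketch; the feedback value F_T of each category that appears in the Top-k results is then obtained with the following formula:
In the above formula, C_i is the confidence of the label that YOLO returns for sub-image i, T_i is the label of sub-image i, and T is a category label. With the feedback values F_T of the category labels obtained in this way, the similarity of the sub-images in the Top-n results is recomputed with the following formula, where n is generally a natural number greater than or equal to k, S_i is the similarity of sub-image i before feedback, and S'_i is the recomputed similarity of sub-image i;
the Top-n results are then re-ranked by S'_i;
(25) using the mapping recorded in step (1), the source images corresponding to the k sub-images with the highest similarity are returned.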
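The formulas for the feedback value F_T and the recomputed similarity S'_i are not reproduced in this text, so the following is only a sketch of how label feedback of this kind could work. The two formulas used here are assumptions chosen for illustration, not the patent's own: F_T is taken as the share of confidence-weighted similarity that label T holds in the Top-k, and S'_i = S_i · (1 + F_{T_i}). The function name and the tuple layout are hypothetical.

```python
def rerank_with_label_feedback(results, k, n):
    """Re-rank retrieval results using label feedback.

    results: list of (image_id, similarity, label, confidence), best first.
    ASSUMED formulas (not taken from the source):
      F_T  = sum of S_i * C_i over Top-k items with label T,
             divided by the same sum over all Top-k items
      S'_i = S_i * (1 + F_{T_i})
    """
    top_k = results[:k]
    total = sum(s * c for _, s, _, c in top_k)
    feedback = {}
    if total > 0:
        for _, s, label, c in top_k:
            feedback[label] = feedback.get(label, 0.0) + s * c / total
    # Labels that never appear in the Top-k get no boost.
    rescored = [(i, s * (1.0 + feedback.get(label, 0.0)), label, c)
                for i, s, label, c in results[:n]]
    rescored.sort(key=lambda r: r[1], reverse=True)
    return rescored


results = [
    (1, 1.0, "horse", 0.9),
    (2, 0.9, "dog", 0.5),
    (3, 0.8, "horse", 0.8),
    (4, 0.7, "horse", 0.9),
]
reranked = rerank_with_label_feedback(results, k=3, n=4)
print([r[0] for r in reranked])
```

In this toy run the dominant "horse" label lifts sub-images 3 and 4 above the lone "dog" result, which is the qualitative effect the feedback step describes.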
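Step (3) comprises the following steps:
(31) for each result returned by sketch retrieval, the object is cut out of the image with the GrabCut algorithm, and the cut-out result is kept in a candidate area for later use;
(32) once all objects have been cut out and placed in the candidate area, they are all placed on the background picture and their size and position are adjusted; Poisson blending is then used to merge the objects into the background, producing a natural-looking picture.
The beneficial effects of the present invention are as follows: compared with Sketch2Photo, the overall processing time is greatly shortened and user interaction is freer; compared with Photosketcher, the method provides higher retrieval accuracy, greatly reduces the number of searches a user has to run, and supplies the user with more suitable and richer material.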
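Brief Description of the Drawings
Fig. 1 is a flowchart of the sketch-based interactive image search and fusion method proposed by the present invention.
Fig. 2 is an image used to build the library in an embodiment of the present invention.
Fig. 3 is the result of applying step (1101) to Fig. 2.
Fig. 4 is the sub-image containing a single object, obtained by cropping the picture with the rectangular box in Fig. 3.
Fig. 5 is the binarized image obtained by applying step (1102) to the image in Fig. 4.
Fig. 6 is the contour map obtained by applying step (1103) to Fig. 5.
Fig. 7 is the sparse gradient orientation field obtained by applying step (1201) to the contour map in Fig. 6.
Fig. 8 is the dense gradient orientation field obtained by applying step (1203) to the sparse gradient orientation field in Fig. 7.
Fig. 9 is a schematic diagram of the HOG algorithm in step (1204) of the present invention.
Fig. 10 is a query image used in an embodiment of the present invention.
Fig. 11 shows the Top-10 results ranked by similarity for the query in Fig. 10 without label feedback.
Fig. 12 shows the Top-10 results ranked by similarity for the query in Fig. 10 with label feedback.
Fig. 13 shows some examples of searched and fused pictures in an embodiment of the present invention.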
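Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings.
Referring to Fig. 1, the sketch-based interactive image search and fusion method proposed by the present invention comprises the following steps:
(1) Building an index file for the image library: each source image in the image library is segmented into sub-images that each contain a single object, and the mapping between sub-images and source images is recorded; the contour of the object in each sub-image is obtained, and its feature vectors are computed with the GF-HOG algorithm (see Rui Hu, Mark Barnard, John Collomosse. Gradient field descriptor for sketch based retrieval and localization. ICIP 2010. doi:10.1109/ICIP.2010.5649331); following the Bag of Visual Words model (BoVW, see Sivic J, Zisserman A. Video Google: A Text Retrieval Approach to Object Matching in Videos[C]. IEEE Computer Society, 2003:1470.), the resulting feature vectors are clustered to obtain a visual dictionary; a statistical histogram of visual word frequencies is then computed for each sub-image; and the index file of the image library is built as an inverted index;
(2) Obtaining image retrieval results from a sketch: the feature vectors of the sketch input by the user are computed; the statistical histogram of the sketch is obtained using the visual dictionary from step (1); using this histogram and the index file from step (1), the similarity between the sketch and each sub-image is computed and the sub-images are ranked by similarity; the ranking is adjusted by feedback based on the label information of the sub-images; and, using the mapping recorded in step (1), the source images corresponding to the sub-images are returned to the user;
(3) Fusing images: the required object is cut out of the images retrieved in step (2) with the GrabCut algorithm (see Carsten Rother, Vladimir Kolmogorov, Andrew Blake. "GrabCut": Interactive Foreground Extraction using Iterated Graph Cuts. SIGGRAPH '04, ACM. doi:10.1145/1186562.1015720), and the cut-out object is placed into the background image with Poisson blending (see Patrick Perez, Michel Gangnet, et al. Poisson Image Editing. 2003 ACM 0730-0301/03/0700-0313), thereby fusing the images.
Obtaining the contour of the object in a sub-image in step (1) comprises the following steps:
(1101) the object detection algorithm YOLO (see Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi. You Only Look Once: Unified, Real-Time Object Detection. CVPR 2016. doi:10.1109/CVPR.2016.91) is applied to every image in the image library to detect the objects in it, yielding sub-images that each contain a single object, together with their label information and label confidence;
Referring to Fig. 2, which is an image used to build the library in an embodiment of the present invention, the YOLO algorithm produces the result shown in Fig. 3. Fig. 3 shows that the algorithm boxes the object in Fig. 2 accurately and assigns it the label "horse" with a label confidence of 0.92. Using the coordinates of the rectangular box in Fig. 3, Fig. 2 can be cropped to give the result shown in Fig. 4.
Separating the objects in this way makes it possible to search for a single object and reduces interference from other objects. Obtaining the object labels at the same time adds semantic information to sketch retrieval, which further improves its accuracy.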
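The cropping and bookkeeping in step (1101) can be sketched as follows. This is a minimal illustration that does not call YOLO: the `Detection` record, the `crop_sub_images` function and the box layout (top, left, bottom, right) are hypothetical names and conventions introduced here, and the image is a plain nested list standing in for pixel data.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector output (hypothetical layout): box, label, confidence."""
    label: str
    confidence: float
    box: tuple  # (top, left, bottom, right); bottom and right are exclusive


def crop_sub_images(source_id, image, detections):
    """Crop one sub-image per detection and record its source image."""
    sub_images = []
    for det in detections:
        top, left, bottom, right = det.box
        pixels = [row[left:right] for row in image[top:bottom]]
        sub_images.append({
            "source_id": source_id,       # mapping back to the source image
            "label": det.label,           # semantic label used later for feedback
            "confidence": det.confidence,
            "pixels": pixels,
        })
    return sub_images


# A 4x6 toy "image" whose pixel value encodes its row and column.
image = [[10 * r + c for c in range(6)] for r in range(4)]
subs = crop_sub_images("fig2", image, [Detection("horse", 0.92, (1, 2, 3, 5))])
print(subs[0]["pixels"])
```

Keeping `source_id` on every sub-image is what later allows step (25) to return the source image for a matched sub-image.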
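(1102) for each of these sub-images, the salient region detection algorithm SaliencyCut (see Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Philip H. S. Torr, and Shi-Min Hu. Global Contrast Based Salient Region Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2014. doi:10.1109/TPAMI.2014.2345401) is applied to separate the foreground, i.e. the object, from the background, producing a binarized image;
Referring to Fig. 5, which is the binarized picture obtained by processing Fig. 4 with the SaliencyCut algorithm: the white part is the object and the black part is the background.
The salient region detection algorithm preserves the basic contour of the object while effectively filtering out interference from the background, which yields a high-quality set of contour maps.
(1103) the Canny algorithm (see Canny J. A Computational Approach To Edge Detection[J]. Pattern Analysis & Machine Intelligence IEEE Transactions on, 1986, pami-8(6):184–203.) is applied to the binarized image to obtain the contour of the object.
Referring to Fig. 6, which is the result of extracting the contour from the picture in Fig. 5 with the Canny algorithm.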
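On a clean binary mask, the contour is essentially the set of foreground pixels that touch the background. The sketch below uses that simple boundary test as a stand-in for the Canny detector named in step (1103); it is not Canny itself (no smoothing, non-maximum suppression or hysteresis), and the function name `mask_contour` is hypothetical.

```python
def mask_contour(mask):
    """Return a map that is 1 where a foreground pixel touches the background.

    Simplified stand-in for Canny on a binary mask: a pixel is a contour
    pixel if it is foreground and at least one 4-neighbour is background
    or lies outside the image.
    """
    h, w = len(mask), len(mask[0])
    contour = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or mask[ny][nx] == 0:
                    contour[y][x] = 1
                    break
    return contour


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
contour = mask_contour(mask)
for row in contour:
    print(row)
```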
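Computing the feature vectors from the object contour of a sub-image in step (1) comprises the following steps:
(1201) first, the binarized contour map M is taken as input, where M(x,y)=1 denotes a contour pixel, M(x,y)=0 denotes a non-contour pixel, and x and y are the row and column coordinates of the pixel; the gradient orientation θ(x,y) of each contour pixel is computed with the following formula, yielding the sparse gradient orientation field Ψ of the contour map M:
Referring to Fig. 6 and Fig. 7: Fig. 6 is the input binarized contour map M, and Fig. 7 is a visualisation of the computed sparse gradient orientation field Ψ.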
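The orientation formula of step (1201) is not reproduced in this text. The sketch below assumes the usual form, the arctangent of the ratio of the two partial derivatives, and makes two further assumptions that should be read as illustrative choices: the derivatives are central differences taken on the filled foreground mask from step (1102) rather than on the thin contour map (central differences of a one-pixel-wide straight line are zero on the line itself), and `math.atan2` is used instead of a plain arctangent to avoid division by zero. The function name is hypothetical.

```python
import math


def sparse_orientation_field(mask, contour):
    """Orientation at contour pixels, None everywhere else.

    mask:    filled binary foreground mask (assumed input for derivatives)
    contour: binary contour map M, 1 on contour pixels
    """
    h, w = len(mask), len(mask[0])

    def at(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour.
        return mask[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    field = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if contour[y][x] == 1:
                d_row = (at(y + 1, x) - at(y - 1, x)) / 2.0  # derivative along rows
                d_col = (at(y, x + 1) - at(y, x - 1)) / 2.0  # derivative along columns
                field[y][x] = math.atan2(d_row, d_col)
    return field


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
contour = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
field = sparse_orientation_field(mask, contour)
print(field[1][2], field[2][1])
```

The field stays undefined (`None`) off the contour, which is what makes it sparse and motivates the interpolation in steps (1202) and (1203).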
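(1202) keeping the gradient orientations of the contour pixels unchanged, the gradient orientations of the non-contour pixels are interpolated to obtain the dense gradient orientation field Θ_Ω; so that Θ_Ω is smooth over the whole image domain Ω ∈ R², a Laplacian smoothness constraint is imposed on it, as in the following formula:
In this formula, Θ denotes the gradient orientation to be solved for at each pixel, Ω denotes the whole image domain, ∫∫_Ω denotes integration over the whole image domain, ∇ denotes the gradient operator, v is the guidance field obtained by taking the gradient of the sparse gradient orientation field Ψ, i.e. v = ∇Ψ, ‖·‖² denotes the squared norm of its argument, ∂Ω denotes the contour pixels, and θ is the gradient orientation of the contour pixels;
(1203) subject to Dirichlet boundary conditions, the above formula is solved through the following Poisson equation:
In this formula, Δ denotes the Laplace operator and div the divergence operator; in discrete form the equation can be written as follows:
Here, for any pixel p in the image, N_p denotes the set of the four neighbours of p, so that |N_p| = 4 under 4-connectivity; q denotes a point in N_p; ∂Ω denotes the contour pixels; and v_pq = θ_pq. The equation can be solved as a system of linear equations, which yields the dense gradient orientation field Θ_Ω.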
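The three displayed formulas of steps (1202) and (1203) are not reproduced in this text. A form consistent with the symbol definitions given above, and with the standard Poisson formulation of Pérez et al. cited in this document, would be the following. This is a reconstruction under that assumption, not a copy of the original formulas; in particular, the flattened expression for v_pq is read here as a difference of the orientations at p and q.

```latex
\begin{align}
\Theta_{\Omega} &= \arg\min_{\Theta} \iint_{\Omega}
    \lVert \nabla\Theta - v \rVert^{2}
    \quad \text{s.t.} \quad
    \Theta\big|_{\partial\Omega} = \theta\big|_{\partial\Omega},
    \qquad v = \nabla\Psi \\[4pt]
\Delta\Theta &= \operatorname{div} v \ \ \text{over } \Omega
    \quad \text{s.t.} \quad
    \Theta\big|_{\partial\Omega} = \theta\big|_{\partial\Omega} \\[4pt]
|N_p|\,\Theta_p - \sum_{q \in N_p \cap \Omega} \Theta_q
    &= \sum_{q \in N_p \cap \partial\Omega} \theta_q
     + \sum_{q \in N_p} v_{pq},
    \qquad v_{pq} = \theta_p - \theta_q
\end{align}
```

The last line gives one linear equation per unknown pixel p, which is why the text can say the field is obtained by solving a linear system.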
Referring to Fig. 8, which is a visualisation of the dense gradient orientation field Θ_Ω obtained by solving the equation.
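The interpolation described in steps (1202) and (1203) can be sketched with a simple iterative solver. The version below is an illustration under stated assumptions rather than the patent's solver: the guidance term is set to zero, so the result is a harmonic (Laplace) interpolation of the contour values; Gauss-Seidel relaxation stands in for a direct linear solve; pixels at the image border simply average the neighbours that exist; and angles are treated as plain real numbers, ignoring wrap-around. The function name and the `iterations` parameter are hypothetical.

```python
def dense_orientation_field(sparse, iterations=500):
    """Fill the None entries of `sparse` by Gauss-Seidel relaxation.

    Entries that are not None (contour pixels) are kept fixed, which plays
    the role of the Dirichlet boundary condition. Every other pixel is
    repeatedly replaced by the mean of its 4-neighbours inside the image.
    """
    h, w = len(sparse), len(sparse[0])
    fixed = [[v is not None for v in row] for row in sparse]
    dense = [[v if v is not None else 0.0 for v in row] for row in sparse]
    for _ in range(iterations):
        for y in range(h):
            for x in range(w):
                if fixed[y][x]:
                    continue
                neighbours = [
                    dense[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w
                ]
                dense[y][x] = sum(neighbours) / len(neighbours)
    return dense


# Left column fixed at 0.0, right column fixed at 1.0, interior unknown.
sparse = [[0.0, None, None, None, 1.0] for _ in range(3)]
dense = dense_orientation_field(sparse)
print([round(v, 3) for v in dense[1]])
```

For real image sizes a sparse direct or multigrid solver would be far faster than this pure-Python loop; the sketch only shows the structure of the problem.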
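(1204) once the dense gradient orientation field Θ_Ω has been obtained, it is sampled at multiple scales with the HOG algorithm (see N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in CIVR, New York, NY, USA, 2007, pp. 401-408, ACM), centred on the contour pixels, to construct the feature vectors of the contour map.
In the embodiment, orientations are quantized into 9 directions, and a 3-by-3 window centred on a contour pixel is constructed, so the window contains 9 sub-windows. To obtain scale invariance, the side length of each sub-window is set to 7, 11 and 15 pixels in turn for the orientation statistics, and each sub-window yields a 9-dimensional vector. The vectors of the 9 sub-windows are concatenated and the result is normalized, giving an 81-dimensional feature vector. Fig. 9 illustrates the algorithm. The feature vectors at the 3 scales for one contour pixel in Fig. 8 are given here: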
a7=[0,0,0.366116,0.146446,0,0,0,0,0,0,0,0.0313814,0.188288,0.198749,0.0941441,0,0,0,0,0,0,0,0,0.156907,0.355656,0,0,0,0.135986,0.376576,0,0,0,0,0,0,0.0523023,0.115065,0.0732232,0.0313814,0.0313814,0.0523023,0.0732232,0.0627627,0.0209209,0,0,0,0,0,0,0.0836837,0.428879,0,0.0104605,0.0627627,0.0313814,0.0836837,0.135986,0.104605,0.0523023,0,0.0313814,0.0836837,0.135986,0.0418418,0.0418418,0.0523023,0.0418418,0.0732232,0.0104605,0.0313814,0.0523023,0,0,0,0.0104605,0,0.0836837,0.355656,0.0104605]T
a11=[0,0,0.325097,0.15462,0,0,0,0,0,0,0,0.00792921,0.174443,0.186336,0.111009,0,0,0,0,0,0,0,0,0.0951505,0.356814,0.0277522,0,0,0.0356814,0.416283,0.0277522,0,0,0,0,0,0.0277522,0.0951505,0.0792921,0.0277522,0.0237876,0.0475752,0.0832567,0.0792921,0.0158584,0,0,0,0,0,0,0.0594691,0.420248,0,0.00792921,0.0317168,0.0277522,0.0237876,0.138761,0.162549,0.0673983,0,0.019823,0.130832,0.122903,0.0555044,0.0475752,0.0436106,0.0317168,0.0277522,0,0.019823,0.00792921,0.0039646,0,0,0,0,0.0475752,0.412319,0.00792921]T
a15=[0,0,0.327283,0.141753,0,0,0,0,0,0,0,0.0020846,0.170938,0.168853,0.122992,0.00416921,0,0,0,0,0,0,0,0.0604535,0.335621,0.0729612,0,0,0.0416921,0.396075,0.0145922,0.00833842,0,0.0020846,0,0.00625381,0.0187614,0.0750458,0.089638,0.0333537,0.0291845,0.0458613,0.0854688,0.079215,0.0125076,0,0,0,0,0,0,0.0416921,0.427344,0,0.0125076,0.0541997,0.0333537,0.0270999,0.0771304,0.183445,0.0729612,0,0.00833842,0.223053,0.0812996,0.0354383,0.0312691,0.0437767,0.0145922,0.020846,0.00833842,0.010423,0.0020846,0.00416921,0,0.0020846,0,0.0020846,0.0125076,0.437767,0.00833842]T
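The windowed sampling of step (1204) can be sketched as below. The layout follows the text (a 3-by-3 grid of sub-windows, 9 orientation bins, sub-window sides of 7, 11 and 15 pixels, 81 values per scale). Several details are assumptions made for this illustration: orientations are folded into [0, π) before binning, each pixel casts an unweighted vote, pixels outside the image are skipped, and the final normalization is L2. The function name is hypothetical.

```python
import math


def gf_hog_descriptor(theta, cy, cx, cell, bins=9):
    """81-dimensional orientation histogram around pixel (cy, cx).

    theta: dense orientation field as a 2-D list of angles in radians
    cell:  side length of one sub-window in pixels (e.g. 7, 11 or 15)
    """
    h, w = len(theta), len(theta[0])
    half = (3 * cell) // 2              # the 3x3 window spans 3 * cell pixels
    top, left = cy - half, cx - half
    hist = [0.0] * (9 * bins)
    for gy in range(3):
        for gx in range(3):
            base = (gy * 3 + gx) * bins  # offset of this sub-window's 9 bins
            for y in range(top + gy * cell, top + (gy + 1) * cell):
                for x in range(left + gx * cell, left + (gx + 1) * cell):
                    if 0 <= y < h and 0 <= x < w:
                        angle = theta[y][x] % math.pi
                        b = min(int(angle / math.pi * bins), bins - 1)
                        hist[base + b] += 1.0
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# A constant field: every sub-window votes only for bin 0.
theta = [[0.0] * 50 for _ in range(50)]
descriptors = {cell: gf_hog_descriptor(theta, 25, 25, cell) for cell in (7, 11, 15)}
print(len(descriptors[7]), round(descriptors[7][0], 4))
```

Computing one descriptor per scale at every contour pixel gives the set of feature vectors that the next step clusters into visual words.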
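In step (1), the feature vectors computed for the objects are clustered with the k-means clustering method.
In the embodiment, the number of cluster centres is K=5000, which gives a visual dictionary matrix with 5000 rows and 81 columns, each row being the visual word of one cluster centre. Given all feature vectors of a picture, each feature vector is compared with the cluster centres to find the nearest one, and word frequencies are counted per cluster centre, giving a 5000-dimensional word-frequency histogram.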
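The quantization described above can be sketched as follows, using a toy dictionary of 3 words in 2 dimensions in place of the 5000-by-81 dictionary. The function names are hypothetical, squared Euclidean distance is assumed as the distance measure, and the histogram is assumed to hold relative frequencies (counts divided by the number of feature vectors).

```python
def nearest_word(descriptor, dictionary):
    """Index of the cluster centre closest to `descriptor` (squared L2)."""
    best, best_dist = 0, float("inf")
    for idx, centre in enumerate(dictionary):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, centre))
        if dist < best_dist:
            best, best_dist = idx, dist
    return best


def word_histogram(descriptors, dictionary):
    """Relative frequency of each visual word among the descriptors."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[nearest_word(d, dictionary)] += 1
    total = len(descriptors)
    return [c / total for c in counts] if total else [0.0] * len(dictionary)


dictionary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # K = 3 toy visual words
descriptors = [[0.1, 0.0], [0.9, 0.1], [1.0, 0.0], [0.1, 0.8]]
hist = word_histogram(descriptors, dictionary)
print(hist)
```

With K=5000 and tens of thousands of descriptors, the linear scan in `nearest_word` would normally be replaced by a vectorized or tree-based nearest-neighbour search.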
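Building the index file of the image library as an inverted index in step (1) comprises the following steps:
(1301) following the BoVW model, the word-frequency histograms of all sub-images are stacked into a histogram matrix with N rows and K columns, where N is the number of sub-images in the image library and K is the number of cluster centres, and the matrix is saved to a file;
In the embodiment of the present invention, the inventors used the Microsoft COCO validation image set (see http://mscoco.org/dataset/#download). It contains 40K pictures, each containing several kinds of objects, and step (1101) splits them into 88266 sub-images, i.e. N=88266; with K=5000 cluster centres this gives a matrix with 88266 rows and 5000 columns. One row of the matrix is given below; it is the word-frequency histogram of the sub-image in Fig. 4.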
R=[0,0,0,0,0,0,0,0,0,0,0,0.00468933,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0.000586166,0.00293083,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00644783,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.003517,0,0,0,0,0,0.00293083,0,0,0.000586166,0,0,0.000586166,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0.000586166,0.00117233,0,0.003517,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00468933,0,0,0,0,0,0,0,0,0.000586166,0,0.0017585,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.007034,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0.000586166,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0.00586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0.000586166,0,0,0,0,0,0,0,0.00293083,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00586166,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0052755,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0.0052755,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0.00410317,0,0,0,0,0.00117233,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0.00410317,0,0.0017585,0.00117233,0.00586166,0,0,0,0.003517,0,0.00410317,0,0,0,0,0,0,0,0.0052755,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0.00644783,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0.00644783,0,0,0,0,0,0,0,0,0,0,0,0.003517,0,0,0,0.000586166,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.003517,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0.0017585,0,0,0.000586166,0,0,0,0,0,0,0.00234467,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00410317,0,0.0017585,0.00234467,0,0,0.00293083,0.0017585,0,0,0,0,0,0,0.00410317,0,0,0,0,0,0,0,0,0,0.003517,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0.00410317,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0.00468933,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0.003517,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0.0017585,0,0,0,0,0.0017585,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0.000586166,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.003517,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0.00468933,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0.00117233,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0.000586166,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0052755,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0.0017585,0.0017585,0.003517,0,0,0,0,0,0.00996483,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0.000586166,0,0,0,0,0,0,0,0,0.00410317,0.000586166,0.00410317,0,0,0,0,0,0.0017585,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0.00117233,0,0,0,0,0.00468933,0,0.0017585,0,0,0,0,0,0,0,0.00468933,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.0017585,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0.000586166,0,0.00117233,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0.00293083,0,0,0,0.00117233,0,0,0,0,0,0.0052755,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0.0017585,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0.00117233,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.00644783,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0.000586166,0.00293083,0,0,0.00117233,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166
,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00410317,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0.00820633,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0.00293083,0.000586166,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00410317,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00468933,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0.00410317,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.010551,0,0,0,0.000586166,0.000586166,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0.0017585,0,0,0,0,0,0.000586166,0,0,0,0.00117233,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0.00117233,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0.00117233,0,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0.000586166,0,0,0,0.00117233,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00468933,0,0,0,0,0,0.0269637,0,0,0,0,0,0,0,0.00937866,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0.003517,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0.000586166,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0.00644783,0.00234467,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0052755,0,0,0,0,0,0,0,0,0,0.00
0586166,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0.00117233,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0.00293083,0,0,0,0,0.00410317,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0.00234467,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00644783,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.003517,0,0.003517,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0.00586166,0,0,0.00117233,0.00117233,0,0,0.003517,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0.00410317,0,0,0,0.0052755,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0.00410317,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.0134818,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.00234467,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0.000586166,0,0,0.000586166
,0,0,0,0,0,0.000586166,0,0,0,0,0.000586166,0,0,0,0,0,0,0.007034,0,0,0.00117233,0,0,0,0,0,0,0,0.0052755,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00586166,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0.0017585,0.00410317,0,0,0,0,0.00117233,0,0,0.000586166,0,0,0.0087925,0,0,0,0,0,0,0.00410317,0.00762016,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0.00410317,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0.00117233,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00468933,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0.000586166,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0.00117233,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0.0017585,0,0,0.0017585,0.00410317,0.00293083,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0111372,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0.00117233,0,0,0,0,0,0,0,0.00293083,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.0152403,0,0,0,0,0,0,0.0123095,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0.00468933,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0.00234467,0,0.00586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0.00644783,0,0,0,0,0.00996483,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0.00293083,0,0,0,0,0,0,0,0.0169988,0,0,0,0,0.00234467,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0.00586166,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0.00762016,0,0,0,0,0.00234467,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0.000586166,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0.0017585,0.003517,0,0,0,0,0,0,0,0,0,0,0.000586166,0.000586166,0.00117233,0,0,0,0,0,0.003517,0,0,0,0,0,0,0,0,0,0,0,0,0.0052755,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00293083,0,0,0,0,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.010551,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0.00410317,0,0,0,0,0,0,0,0.000586166,0,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0,0,0.003517,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0.00117233,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0.0017585,0.000586166,0,0,0,0,0,0,0,0,0,0,0,0.00762016,0,0,0,0.0017585,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.00234467,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.00762016,0,0,0,0.003517,0,0.003517,0,0,0.003517,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.000586166,0,0,0,0,0,0.000586166,0]
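(1302) the histogram matrix is traversed column by column, the indices of the images whose value in each column is non-zero are collected, and the result is written to a file, which gives the required inverted index file.
The result of step (1301) shows that the final matrix is sparse. The computation can therefore be accelerated by recording the indices of the pictures with non-zero values and building an inverted index. The statistics for one column of the histogram matrix are given below: the numbers are picture indices, and the vector indicates that the pictures with these indices contain feature vectors assigned to that cluster centre.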
I=[86 89 108 375 383 554 623 706 871 939 967 1027 1030 1166 11961274 1592 1603 1627 1697 1733 1922 1973 2023 2095 2145 2172 2244 2383 24212463 2553 2722 2887 2905 2917 2940 3046 3119 3187 3330 3339 3384 3394 34073416 3632 3829 4028 4268 4362 4542 4554 4559 4619 4640 4676 4691 4700 47504952 4955 4965 5077 5144 5155 5184 5279 5292 5388 5394 5443 5641 5665 56935720 5731 5743 5750 5836 5934 5998 6018 6250 6259 6372 6450 6540 6596 65976664 6758 6760 6777 6809 6903 7071 7151 7193 7264 7283 7309 7361 7471 76537654 7748 7769 7838 7854 7939 7981 7988 8004 8006 8042 8069 8278 8391 84628514 8629 8728 8808 8834 8835 8873 8982 9011 9147 9267 9512 9545 9631 96969916 10037 10165 10282 10388 10730 11011 11079 11096 11137 11246 11282 1137411380 11381 11513 11574 11668 11680 11718 11727 11761 11790 11875 11956 1202812216 12240 12266 12300 12388 12509 12585 12611 12638 12692 12703 12742 1275012793 12958 13024 13028 13047 13058 13179 13204 13256 13321 13391 13705 1383213855 13881 14264 14296 14416 14527 14531 14627 14631 14710 14858 14973 1499115100 15164 15210 15419 15428 15436 15521 15584 15597 15703 15782 15981 1602916157 16277 16431 16477 16489 16667 16732 16974 17127 17419 17444 17557 1766517671 17685 17735 17875 17881 17923 17950 17992 18017 18024 18054 18221 1829318297 18308 18313 18482 18497 18556 18654 18669 18713 18928 19025 19068 1921619268 19277 19311 19383 19481 19504 19624 19633 19659 19731 19889 20166 2040020465 20479 20488 20583 20752 20961 21172 21254 21304 21351 21409 21462 2169721757 21771 21799 21904 22203 22252 22311 22654 22754 22786 22808 22810 2290122902 23039 23101 23261 23272 23373 23403 23464 23481 23572 23647 23721 2378123805 23839 23957 23981 23983 24058 24117 24232 24346 24393 24682 24774 2482924831 24874 24888 24982 25140 25241 25299 25340 25390 25452 25467 25985 2603726048 26163 26285 26311 26313 26318 26405 26529 26710 26712 26718 26842 2713727202 27289 27309 27445 27472 27819 27906 27951 27976 27996 28005 28051 2816028206 28286 28371 28496 28502 28568 
28612 28658 28731 28981 29056 29137 2916529183 29398 29455 29460 29577 29641 29650 29722 29744 29777 29815 29872 2996729994 30213 30270 30274 30303 30330 30550 30666 30995 31022 31220 31260 3138031422 31489 31491 31660 31884 31957 31990 32019 32053 32081 32096 32108 3214532147 32195 32277 32451 32527 32686 32773 32797 32817 32892 33141 33203 3325233273 33322 33333 33364 33390 33414 33473 33502 33638 33641 33847 34009 3415134241 34299 34309 34329 34465 34516 34541 34651 34753 34817 34927 34967 3502635034 35050 35100 35240 35249 35474 35509 35516 35598 35709 35804 35890 3593735966 36135 36178 36224 36267 36368 36477 36547 36627 36669 36722 36835 3689036903 37123 37241 37251 37273 37331 37516 37520 37527 37733 37817 38037 3812038351 38467 38505 38531 39027 39140 39355 39665 39685 39708 39791 39842 3998340297 40429 40474 40842 40903 41193 41483 41503 41532 41616 41740 41746 4176141808 42067 42136 42149 42178 42560 42686 42783 42829 42926 43007 43066 4325743831 43856 44080 44269 44278 44411 44469 44500 44761 44763 44941 45009 4507945147 45190 45283 45303 45350 45384 45460 45751 45932 46231 46265 46273 4645946827 46829 46887 46996 47016 47020 47102 47118 47130 47253 47395 47658 4775747821 47866 47924 47972 47987 48099 48449 48454 48495 48527 48589 48617 4869648784 48883 48906 49310 49337 49415 49440 49471 49604 49641 49653 49659 4972049732 49796 49799 49824 49944 50248 50403 50433 50632 50644 50831 50839 5086850936 51292 51323 51389 51827 51830 51852 52049 52086 52146 52245 52289 5231052740 52762 52963 53098 53111 53134 53157 53347 53356 53491 53678 53776 5384154041 54341 54557 54650 54926 55132 55143 55211 55286 55695 55717 55746 5578955965 56297 56435 56715 56809 56844 56968 57109 57116 57123 57174 57265 5739157803 57848 57872 57947 58043 58102 58285 58498 58550 58741 58921 58935 5893758938 59152 59198 59224 59260 59477 59493 59606 59613 59682 59718 59836 6002160039 60103 60239 60467 60585 60650 60680 60762 60777 60824 60895 60935 6094460962 60989 61065 61113 61177 
61582 61758 61790 61867 61972 61987 62186 6239462434 62527 62543 62701 62806 62838 62928 62937 62955 62982 63038 63048 6322863258 63405 63455 63488 63552 63662 64105 64137 64148 64409 64537 64623 6505665405 65456 65517 65644 65646 65693 65879 65972 65977 66051 66261 66565 6657766784 67110 67142 67149 67746 67868 68049 68493 68541 68734 68988 69053 6906669087 69156 69313 69686 69745 70293 70444 70659 70679 70724 70906 71168 7117971889 71980 71996 72154 72260 72289 72349 72462 72597 72703 72724 72782 7282672971 73312 73380 73485 73573 73606 73612 73636 74305 74317 74332 74351 7443774792 74958 75295 75388 75536 75537 75541 75554 75564 75568 75598 75703 7588276145 76331 76351 76602 76825 77065 77195 77370 77376 77523 77545 77634 7784877921 77927 78033 78243 78540 78609 78792 78811 79038 79092 79242 79273 7964679719 79791 79863 80028 80122 80154 80187 80208 80421 80462 80797 81251 8138881459 81603 81655 81689 82004 82102 82121 82364 82425 82466 82603 82774 8278482961 83084 83167 83273 83285 83382 83522 83658 83682 83869 83972 84022 8410184327 84665 84840 85067 85424 86210 86275 86303 86372 86398 86448 86747 8683587119 87268 87297 87428 87489 87677 87732 87740 87862 88042 88200]T
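Step (2) specifically comprises the following operations:
(21) Compute the feature vector of the input sketch using the method described in step (1);
(22) Using the visual dictionary obtained in step (1), count how often each visual word occurs to obtain the sketch's statistical histogram Q;
See Fig. 10; the image shown in Fig. 10 serves as the query sketch;
(23) Using the inverted index structure and the matrix obtained in steps (1301) and (1302), compute the similarity between the query sketch and each sub-image. The similarity is defined as:
$$\mathrm{Sim}(Q,D_i)=\frac{1}{B_Q B_{D_i}}\sum_{p\in Q\cap D_i} f_{Q,p}\, f_{D_i,p}\, \mathrm{IDF}_p$$
In this formula, $Q$ is the statistical histogram of the query sketch, $D_i$ is the statistical histogram of sub-image $i$ in the image library, $N$ is the number of sub-images in the image library, $p$ is the index of a cluster center in the visual dictionary, $f_p$ is the number of sub-images in the library that contain visual word $W_p$, and $f_{Q,p}$ and $f_{D_i,p}$ are the frequencies of visual word $W_p$ in the query sketch and in sub-image $i$, respectively;
The formula expresses the cosine similarity of two vectors, augmented with the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm commonly used in document retrieval. $\mathrm{IDF}_p$ is the inverse document frequency of TF-IDF, and $f_{Q,p}$ and $f_{D_i,p}$ are its term frequencies. See the TF-IDF algorithm for details.
Taking Fig. 10 as the query sketch, the similarities S, image labels T, and image indices i of the Top-50 images are listed below; the images corresponding to the Top-10 are shown in Fig. 11.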
S=[2.19008,1.22887,0.978853,0.915278,0.89948,0.886331,0.884973,0.880953,0.879824,0.838481,0.838048,0.836581,0.8161,0.769038,0.747189,0.711824,0.71155,0.708184,0.703801,0.701853,0.697603,0.694958,0.679824,0.665309,0.664681,0.647052,0.642052,0.634729,0.63425,0.633312,0.633138,0.632802,0.619647,0.619475,0.616024,0.613952,0.60772,0.606761,0.593584,0.593071,0.592695,0.591643,0.590981,0.588569,0.571987,0.571159,0.569874,0.566539,0.564236,0.560989]
T=[bird,bird,bird,bird,bird,bird,surfboard,motorbike,person,knife,bird,bird,bird,,bird,motorbike,,bear,bird,,bird,person,bird,dog,,bird,person,bird,bird,bird,,bird,bird,bird,bird,bottle,bird,,,bird,bird,bench,carrot,surfboard,bird,elephant,bird,bird,,,]
i=[8222,2608,1032,4400,3581,9818,1391,3149,7339,4391,3433,4180,9524,1406,8501,8573,68,9558,4947,8923,9411,6145,3008,301,5224,6028,594,9678,4020,2959,6495,5134,3660,4638,8502,8137,4131,7880,8982,1638,9528,4798,9165,6185,2616,6379,4373,3198,7251,7315]
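In this embodiment, labels whose accuracy $C_i<0.5$ are unreliable and would introduce errors into the feedback, so the inventors ignored them; this is why some entries in T are blank.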
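The similarity computation of step (23) can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the text does not reproduce the definitions of $\mathrm{IDF}_p$, $B_Q$ and $B_{D_i}$, so the sketch assumes the conventional $\mathrm{IDF}_p=\log(N/f_p)$ and takes the $B$ terms to be the Euclidean norms of the histograms. The function names and the toy histograms are hypothetical.

```python
import math

def build_inverted_index(hist):
    """Column-wise pass over the N x K histogram matrix (cf. steps 1301-1302):
    for each visual word p, list the sub-images whose frequency is non-zero."""
    n_words = len(hist[0])
    return {p: [i for i, row in enumerate(hist) if row[p] != 0]
            for p in range(n_words)}

def rank_sub_images(query, hist, index):
    """Score only the sub-images that share at least one visual word with the query."""
    n = len(hist)
    scores = {}
    for p, f_q in enumerate(query):
        postings = index[p]
        if f_q == 0 or not postings:
            continue
        idf = math.log(n / len(postings))  # assumed form of IDF_p
        for i in postings:
            scores[i] = scores.get(i, 0.0) + f_q * hist[i][p] * idf
    norm_q = math.sqrt(sum(f * f for f in query))  # assumed B_Q
    ranked = []
    for i, s in scores.items():
        norm_d = math.sqrt(sum(f * f for f in hist[i]))  # assumed B_Di
        ranked.append((s / (norm_q * norm_d), i))
    return sorted(ranked, reverse=True)

# Toy library: 3 sub-images, 3 visual words (hypothetical values).
hist = [[1, 0, 0],
        [0, 1, 0],
        [1, 1, 0]]
index = build_inverted_index(hist)
ranked = rank_sub_images([1, 0, 0], hist, index)
```

Because the inverted index lists only the sub-images that contain a given visual word, sub-images sharing no word with the query are never visited, which is the practical reason for building the index in the first place.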
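(24) With the similarity $S_i$ between sub-image $i$ and the user's input sketch computed in step (23), the feedback value $F_T$ of each category appearing in the Top-k results is obtained by:
$$F_T=\sum_{T_i=T}\frac{S_i}{\sum_{i=1}^{k}S_i}\cdot C_i$$
In the formula above, $C_i$ is the accuracy of the label of sub-image $i$ returned by YOLO, $T_i$ is the label of sub-image $i$, and $T$ is a given category label. With the feedback value $F_T$ of each category label obtained this way, the similarities of the Top-n sub-images are recomputed by the following formula, where $n$ is generally a natural number greater than or equal to $k$, $S_i$ is the similarity of sub-image $i$ before feedback, and $S'_i$ is its recomputed similarity:
$$S'_i=S_i^{\,1-F_T}\quad \text{s.t.}\ T_i=T$$
In this embodiment the inventors chose k=10 and n=50: the feedback values of the categories appearing in the Top-10 are computed, and the similarities of the Top-50 images are then recomputed. Taking Fig. 10 as the query sketch and starting from the results of step (23), the computation of $F_T$ is illustrated below, where C is the label accuracy of the Top-10 images: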
C=[0.98185,0.978674,0.982609,0.987828,0.881285,0.996606,0.50939,0.540037,0.635041,0.668688],
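For images without label information, $F_T=0$ is used when computing the feedback value, i.e., $S'_i=S_i$. The recomputed similarities S′ are as follows: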
S′=[1.31416,1.07447,0.992579,0.96962,0.963754,0.958821,0.889592,0.885986,0.885792,0.846343,0.940286,0.939712,0.93163,0.769038,0.903424,0.722784,0.71155,0.708184,0.884784,0.701853,0.882061,0.708438,0.874161,0.665309,0.664681,0.859239,0.657248,0.8535,0.853275,0.852835,0.633138,0.852596,0.846377,0.846295,0.844649,0.613952,0.840664,0.606761,0.593584,0.833545,0.833361,0.591643,0.590981,0.602009,0.823097,0.571159,0.822036,0.820356,0.564236,0.560989]
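Re-sort $S'_i$ within the Top-n;
Re-sorting the S′ above yields the following similarities S″, image labels T, and image indices i′; the images corresponding to the Top-10 are shown in Fig. 12.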
S″=[1.31416,1.07447,0.992579,0.96962,0.963754,0.958821,0.940286,0.939712,0.93163,0.903424,0.889592,0.885986,0.885792,0.884784,0.882061,0.874161,0.859239,0.8535,0.853275,0.852835,0.852596,0.846377,0.846343,0.846295,0.844649,0.840664,0.833545,0.833361,0.823097,0.822036,0.820356,0.769038,0.722784,0.71155,0.708438,0.708184,0.701853,0.665309,0.664681,0.657248,0.633138,0.613952,0.606761,0.602009,0.593584,0.591643,0.590981,0.571159,0.564236,0.560989]
T=[bird,bird,bird,bird,bird,bird,bird,bird,bird,bird,surfboard,motorbike,person,bird,bird,bird,bird,bird,bird,bird,bird,bird,knife,bird,bird,bird,bird,bird,bird,bird,bird,,motorbike,,person,bear,,dog,,person,,bottle,,surfboard,,bench,carrot,elephant,,,]
i′=[8222,2608,1032,4400,3581,9818,3433,4180,9524,8501,1391,3149,7339,4947,9411,3008,6028,9678,4020,2959,5134,3660,4391,4638,8502,4131,1638,9528,2616,4373,3198,1406,8573,68,6145,9558,8923,301,5224,594,6495,8137,7880,6185,8982,4798,9165,6379,7251,7315]
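Intuitively, the user therefore does not need to enter any text: from the shape alone, the system determines the few categories the sketch most likely belongs to and preferentially returns the objects of those categories whose shapes are most similar. In other words, the more faithful the user's drawing, the more satisfying the returned results. This can be seen by comparing Fig. 11 with Fig. 12: before feedback, the results contain objects whose shapes merely resemble a bird; by gathering statistics over the initial results, the system determines that the sketch of Fig. 5 looks most like a bird, so after feedback is applied to the similarities, images labeled as birds move up in the ranking and other objects move down.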
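The feedback step can be checked numerically against the figures listed in this embodiment. The sketch below applies $F_T=\sum_{T_i=T}\frac{S_i}{\sum_{i=1}^{k}S_i}\cdot C_i$ and $S'_i=S_i^{1-F_T}$ to the first 12 entries of S and T (truncated from the Top-50 only for brevity) with k=10. The function name is hypothetical, and blank labels are represented as `None`.

```python
def feedback_rerank(scores, labels, confidences, k):
    """scores: Top-n similarities in descending order; labels: class label or None;
    confidences: label accuracies C_i of the Top-k entries."""
    total = sum(scores[:k])
    feedback = {}
    for s, t, c in zip(scores[:k], labels[:k], confidences[:k]):
        if t is not None:
            feedback[t] = feedback.get(t, 0.0) + s / total * c
    # Unlabeled images and classes absent from the Top-k get F_T = 0, so S' = S.
    rescored = [s ** (1.0 - feedback.get(t, 0.0)) for s, t in zip(scores, labels)]
    order = sorted(range(len(scores)), key=lambda i: rescored[i], reverse=True)
    return rescored, order

# First 12 entries of S and T, and the Top-10 accuracies C, from the embodiment.
S = [2.19008, 1.22887, 0.978853, 0.915278, 0.89948, 0.886331,
     0.884973, 0.880953, 0.879824, 0.838481, 0.838048, 0.836581]
T = ["bird"] * 6 + ["surfboard", "motorbike", "person", "knife", "bird", "bird"]
C = [0.98185, 0.978674, 0.982609, 0.987828, 0.881285, 0.996606,
     0.50939, 0.540037, 0.635041, 0.668688]
rescored, order = feedback_rerank(S, T, C, k=10)
```

Under these inputs the rescored values agree with the listed S′ to within rounding, and the two birds originally ranked 11th and 12th overtake the surfboard, motorbike, person and knife entries, as in S″. Note that a similarity above 1 is reduced by the exponent while one below 1 is raised, so feedback compresses the scores of the favoured class toward 1.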
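(25) Using the mapping from step (1), return the source images corresponding to the k sub-images with the highest similarity.
Source images are returned mainly for the following reason: almost all current sketch retrieval systems directly use icon-like images of a single object as their image library, and most people overlook the correlation between objects. In practice, however, such correlated objects are very likely to appear in the scene the user has in mind. For example, a jumping dog often appears in the same scene as a frisbee, so when the user draws a jumping dog, retrieving an image that also contains the frisbee greatly reduces the number of searches the user has to perform. The present invention makes this possible, which no previous sketch retrieval system offered.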
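Step (25) amounts to a lookup through the sub-image-to-source mapping recorded in step (1). A minimal sketch, assuming the mapping is held in a dictionary and that a source image reached through several sub-images is returned only once (the text does not say how repeats are handled); the function name and file names are hypothetical.

```python
def source_images(ranked_sub_ids, sub_to_source, k):
    """Return the source images of the k best sub-images, in rank order, without repeats."""
    seen = set()
    result = []
    for sub_id in ranked_sub_ids[:k]:
        source = sub_to_source[sub_id]
        if source not in seen:
            seen.add(source)
            result.append(source)
    return result

# Hypothetical mapping: sub-images 8222 and 1032 were cropped from the same source image.
sub_to_source = {8222: "scene_a.jpg", 2608: "scene_b.jpg", 1032: "scene_a.jpg"}
top_sources = source_images([8222, 2608, 1032], sub_to_source, k=3)
```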
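Step (3) specifically comprises the following operations:
(31) For the results returned by sketch retrieval, cut the object out of the image with the Grabcut algorithm and keep the cut-out in a candidate area for later use;
(32) Once all objects have been cut out and placed in the candidate area, place all of them on the background image, adjust their size and position, and then use Poisson fusion to blend the objects into the background, producing a natural-looking image.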
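The blending in step (32) can be illustrated with a small pure-Python solver for the discrete Poisson equation, iterated with Gauss-Seidel sweeps. This is a sketch under stated assumptions, not the system's implementation: images are single-channel lists of floats, the mask (which in the method would come from the Grabcut cut-out) is given and does not touch the image border, and the function name and toy data are hypothetical.

```python
def poisson_blend(src, dst, mask, sweeps=200):
    """Blend src into dst where mask is True.
    Inside the mask the result keeps the gradients of src; on the mask
    boundary it matches dst (Dirichlet condition)."""
    h, w = len(dst), len(dst[0])
    # Start from the pasted source inside the mask and the background elsewhere.
    out = [[src[y][x] if mask[y][x] else dst[y][x] for x in range(w)]
           for y in range(h)]
    for _ in range(sweeps):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                if not mask[y][x]:
                    continue
                total = 0.0
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    total += out[ny][nx]              # neighbour: unknown or fixed background
                    total += src[y][x] - src[ny][nx]  # guidance term g_p - g_q
                out[y][x] = total / 4.0
    return out

# Toy check: the source is the background plus a constant offset, so its
# gradients equal the background's and blending should recover the background.
dst = [[10.0 * x + y for x in range(5)] for y in range(5)]
src = [[v + 50.0 for v in row] for row in dst]
mask = [[1 <= y <= 3 and 1 <= x <= 3 for x in range(5)] for y in range(5)]
blended = poisson_blend(src, dst, mask)
```

The toy check shows why this kind of fusion looks natural: a constant brightness offset between the cut-out and the background disappears, because only the gradients of the cut-out are kept while its boundary values are taken from the background.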
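See Fig. 13, which shows several examples of image search and fusion in the embodiment of the present invention.
In example a, the user searches by sketch for images of a "sign" and a "car", cuts the "sign" and the "car" out of those images, places them on a "street" background image, adjusts their size and position, and fuses them to obtain the final result.
In example b, the user searches by sketch for images of a "bird in flight with wings spread" and a "grazing horse", cuts both out of those images, places them on a "grassland" background image, adjusts their size and position, and fuses them to obtain the final result.
In example c, the user searches by sketch for images of a "snowboarder" and, from the returned results, selects an image containing several "skiers", because skiers in other poses are also material the user wants. The user then cuts two "skiers" out of the image, places them on a "ski resort" background image, adjusts their size and position, and fuses them to obtain the final result.
In example d, suppose the user wants to compose a scene of baseball players playing on a baseball field. In the systems mentioned earlier, the user would typically have to search separately for the player at each position to collect the required material, which costs both time and effort. In the present invention, the user draws a sketch of a "batter" and, from the returned results, selects an image containing players at several positions, since those players are also needed in the composite; this saves the time otherwise spent on repeated searches. The three "players" in the image are cut out and placed on a "baseball field" background image. The same method is then used to search for a "pitcher", who is cut out and placed on the background image; after the size and position of all the material have been adjusted, fusion produces the final result.
The inventors conducted extensive experiments on the "Flickr160" database and the Microsoft COCO validation dataset; the experimental results show that the method of the present invention is very effective.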

Claims (7)

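1. An interactive sketch-based image search and fusion method, characterized in that the method comprises the following steps:
(1) Building an index file for the image library, specifically: segmenting the source images in the image library into sub-images each containing a single object, and recording the mapping between them; extracting the contour of the object in each sub-image and computing its feature vector with the GF-HOG algorithm; clustering the resulting feature vectors according to the BoVW (bag of visual words) model to obtain a visual dictionary; then computing, for each sub-image, the statistical histogram of its visual-word frequencies; and building the index file of the image library as an inverted index;
(2) Obtaining image retrieval results from a sketch, specifically: computing the feature vector of the sketch input by the user; obtaining the statistical histogram of the sketch from the visual dictionary obtained in step (1); using that histogram and the index file obtained in step (1) to compute the similarity between the sketch and each sub-image, and sorting the sub-images by similarity; applying feedback to the ranking using the label information of the sub-images; and returning to the user the source images corresponding to the sub-images according to the mapping described in step (1);
(3) Image fusion, specifically: cutting the required objects out of the retrieved images obtained in step (2) with the Grabcut algorithm; and placing the cut-out objects into a background image with the Poisson fusion method to achieve image fusion.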
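2. The interactive sketch-based image search and fusion method according to claim 1, characterized in that extracting the contour of the object in a sub-image in step (1) comprises the following operations:
(1101) Using the object detection algorithm YOLO, detect the objects in each image in the image library to obtain sub-images each containing a single object, together with their label information and label accuracy;
(1102) For each such sub-image, perform salient-region detection with the salient-region detection algorithm SaliencyCut, separating the foreground (the object) from the background to form a binarized image;
(1103) For the binarized image, compute the contour of the object with the Canny algorithm.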
3. The interactive sketch-based image search and fusion method according to claim 1, characterized in that computing the feature vector from the object contour obtained in a sub-image in step (1) comprises the following operations:
(1201) First, take the binarized contour map $M$ as input, where $M(x,y)=1$ denotes a contour pixel, $M(x,y)=0$ denotes a non-contour pixel, and $x$ and $y$ are the row and column coordinates of the pixel; the gradient orientation $\theta(x,y)$ of each contour pixel is obtained by the following formula, giving the sparse gradient orientation field $\Psi$ of the contour map $M$:
$$\theta(x,y)=\arctan\left(\frac{\delta M/\delta y}{\delta M/\delta x}\right)\qquad \forall_{x,y}\ M(x,y)=1$$
(1202) Keeping the gradient orientations of the contour pixels fixed, interpolate the gradient orientations of the non-contour pixels to obtain the dense gradient orientation field $\Theta_\Omega$; in addition, so that $\Theta_\Omega$ is smooth over the entire image coordinates $\Omega\in\mathbb{R}^2$, a Laplacian smoothness constraint is imposed on it, as follows:
$$\underset{\Theta}{\operatorname{argmin}}\iint_{\Omega}\|\nabla\Theta-v\|^{2}\qquad \text{s.t.}\ \Theta|_{\partial\Omega}=\theta|_{\partial\Omega}$$
In this formula, $\Theta$ is the pixel gradient orientation to be solved for, $\Omega$ denotes the entire image coordinates, $\iint_\Omega$ integrates the enclosed quantity over the whole image coordinate system, $\nabla$ is the gradient operator, $v$ is the guidance field obtained by taking the gradient of the sparse gradient orientation field $\Psi$, i.e., $v=\nabla\Psi$, $\|\cdot\|^2$ is the squared norm of the enclosed quantity, $\partial\Omega$ denotes the contour pixels, and $\theta$ is the gradient orientation of the contour pixels;
(1203) Subject to the Dirichlet boundary condition, the above is solved via the following Poisson equation:
$$\Delta\Theta=\operatorname{div} v\ \text{over}\ \Omega\qquad \text{s.t.}\ \Theta|_{\partial\Omega}=\theta|_{\partial\Omega}$$
In this formula, $\Delta$ is the Laplacian operator and div is the divergence operator; in discrete form the equation can be written as:
$$|N_p|\,\Theta_p-\sum_{q\in N_p,\ q\notin\partial\Omega}\Theta_q=\sum_{q\in N_p\cap\partial\Omega}\theta_q+\sum_{q\in N_p}v_{pq}$$
where, for any pixel $p$ in the image, $N_p$ is the set of the four neighbors of $p$ (so $|N_p|=4$ under 4-connectivity), $q$ is a point in $N_p$, $\partial\Omega$ denotes the contour pixels, and $v_{pq}=\theta_p-\theta_q$; this equation can be solved with linear algebra methods, yielding the dense gradient orientation field $\Theta_\Omega$;
(1204) After obtaining the dense gradient orientation field $\Theta_\Omega$, with the contour pixels as centers, sample $\Theta_\Omega$ at multiple scales using the HOG algorithm to construct the feature vector of the contour map.
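4. The interactive sketch-based image search and fusion method according to claim 1, characterized in that the feature vectors computed for the objects in step (1) are clustered using the k-means clustering method.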
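5. The interactive sketch-based image search and fusion method according to claim 1, characterized in that building the index file of the image library as an inverted index in step (1) comprises the following operations:
(1301) According to the BoVW model, merge the word-frequency histograms of all sub-images into a histogram matrix with N rows and K columns, where N is the number of sub-images in the image library and K is the number of cluster centers, and save the matrix to a file;
(1302) Traverse the histogram matrix column by column, record for each column the indices of the images whose value is non-zero, and write the result to a file, thereby obtaining the required inverted index file.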
6. The interactive sketch-based image search and fusion method according to claim 1 or 5, characterized in that step (2) comprises the following operations:
(21) Compute the feature vector of the input sketch using the method described in step (1);
(22) Using the visual dictionary obtained in step (1), count how often each visual word occurs to obtain the sketch's statistical histogram Q;
(23) Using the inverted index structure and the matrix obtained in steps (1301) and (1302), compute the similarity between the query sketch and each sub-image. The similarity is defined as:
$$\mathrm{Sim}(Q,D_i)=\frac{1}{B_Q B_{D_i}}\sum_{p\in Q\cap D_i} f_{Q,p}\, f_{D_i,p}\, \mathrm{IDF}_p$$
In this formula, $Q$ is the statistical histogram of the query sketch, $D_i$ is the statistical histogram of sub-image $i$ in the image library, $N$ is the number of sub-images in the image library, $p$ is the index of a cluster center in the visual dictionary, $f_p$ is the number of sub-images in the library that contain visual word $W_p$, and $f_{Q,p}$ and $f_{D_i,p}$ are the frequencies of visual word $W_p$ in the query sketch and in sub-image $i$, respectively;
(24) With the similarity $S_i$ between sub-image $i$ and the user's input sketch computed in step (23), the feedback value $F_T$ of each category appearing in the Top-k results is obtained by:
$$F_T=\sum_{T_i=T}\frac{S_i}{\sum_{i=1}^{k}S_i}\cdot C_i$$
In the formula above, $C_i$ is the accuracy of the label of sub-image $i$ returned by YOLO, $T_i$ is the label of sub-image $i$, and $T$ is a given category label. With the feedback value $F_T$ of each category label obtained this way, the similarities of the Top-n sub-images are recomputed by the following formula, where $n$ is generally a natural number greater than or equal to $k$, $S_i$ is the similarity of sub-image $i$ before feedback, and $S'_i$ is its recomputed similarity:
$$S'_i=S_i^{\,1-F_T}\quad \text{s.t.}\ T_i=T$$
Re-sort $S'_i$ within the Top-n;
(25) Using the mapping from step (1), return the source images corresponding to the k sub-images with the highest similarity.
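7. The interactive sketch-based image search and fusion method according to claim 1, characterized in that step (3) comprises the following operations:
(31) For the results returned by sketch retrieval, cut the object out of the image with the Grabcut algorithm and keep the cut-out in a candidate area for later use;
(32) Once all objects have been cut out and placed in the candidate area, place all of them on the background image, adjust their size and position, and then use Poisson fusion to blend the objects into the background, producing a natural-looking image.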
CN201710652876.8A 2017-08-02 2017-08-02 Interactive sketch-based image search and fusion method Active CN107515905B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710652876.8A CN107515905B (zh) 2017-08-02 2017-08-02 Interactive sketch-based image search and fusion method

Publications (2)

Publication Number Publication Date
CN107515905A CN107515905A (zh) 2017-12-26
CN107515905B CN107515905B (zh) 2020-06-26

Family

ID=60723085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710652876.8A Active CN107515905B (zh) 2017-08-02 2017-08-02 Interactive sketch-based image search and fusion method

Country Status (1)

Country Link
CN (1) CN107515905B (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102426705A (zh) * 2011-09-30 2012-04-25 北京航空航天大学 Video scene behavior stitching method
CN104778242A (zh) * 2015-04-09 2015-07-15 复旦大学 Hand-drawn sketch image retrieval method and system based on dynamic image segmentation
US20150269191A1 (en) * 2014-03-20 2015-09-24 Beijing University Of Technology Method for retrieving similar image based on visual saliencies and visual phrases
CN105808665A (zh) * 2015-12-17 2016-07-27 北京航空航天大学 Novel image retrieval method based on hand-drawn sketches
CN106126581A (zh) * 2016-06-20 2016-11-16 复旦大学 Hand-drawn sketch image retrieval method based on deep learning
CN106649487A (zh) * 2016-10-09 2017-05-10 苏州大学 Image retrieval method based on objects of interest

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JINGYU WANG et al.: "MindCamera: Interactive Sketch-Based Image Retrieval and Synthesis", IEEE ACCESS *
JOSEPH REDMON et al.: "You Only Look Once: Unified, Real-Time Object Detection", 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION *
PATRICK PÉREZ et al.: "Poisson image editing", ACM TRANSACTIONS ON GRAPHICS *
RUI HU et al.: "Gradient field descriptor for sketch based retrieval and localization", 2010 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING *
赵宇 (ZHAO Yu): "Sketch-based interactive image search and fusion system", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
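CN108536769A (zh) * 2018-03-22 2018-09-14 深圳市安软慧视科技有限公司 Image analysis method, search method and device, computer device, and storage medium
CN108536769B (zh) * 2018-03-22 2023-01-03 深圳市安软慧视科技有限公司 Image analysis method, search method and device, computer device, and storage medium
CN109711437A (zh) * 2018-12-06 2019-05-03 武汉三江中电科技有限责任公司 Transformer component recognition method based on the YOLO network model
CN109858570A (zh) * 2019-03-08 2019-06-07 京东方科技集团股份有限公司 Image classification method and system, computer device, and medium
US11144799B2 (en) 2019-03-08 2021-10-12 Beijing Boe Optoelectronics Technology Co., Ltd. Image classification method, computer device and medium
CN112364199A (zh) * 2021-01-13 2021-02-12 太极计算机股份有限公司 Picture search system
CN113392245A (zh) * 2021-06-16 2021-09-14 南京大学 Text summarization and image-text retrieval generation method for crowdsourced testing task publication
CN113392245B (zh) * 2021-06-16 2023-12-26 南京大学 Text summarization and image-text retrieval generation method for crowdsourced testing task publication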

Also Published As

Publication number Publication date
CN107515905B (zh) 2020-06-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant