WO2018223370A1 - 一种基于时空约束的视频显著性检测方法及系统 - Google Patents

一种基于时空约束的视频显著性检测方法及系统 Download PDF

Info

Publication number
WO2018223370A1
WO2018223370A1 PCT/CN2017/087709 CN2017087709W WO2018223370A1 WO 2018223370 A1 WO2018223370 A1 WO 2018223370A1 CN 2017087709 W CN2017087709 W CN 2017087709W WO 2018223370 A1 WO2018223370 A1 WO 2018223370A1
Authority
WO
WIPO (PCT)
Prior art keywords
energy
motion
current frame
saliency
map
Prior art date
Application number
PCT/CN2017/087709
Other languages
English (en)
French (fr)
Inventor
邹文斌
陈宇环
王振楠
李霞
徐晨
Original Assignee
深圳大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳大学 filed Critical 深圳大学
Priority to PCT/CN2017/087709 priority Critical patent/WO2018223370A1/zh
Publication of WO2018223370A1 publication Critical patent/WO2018223370A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion

Definitions

  • the invention belongs to the technical field of video, and in particular relates to a video saliency detection method and system based on space-time constraints.
  • the significance detection aims to predict visually relatively attractive areas. It has a wide range of applications in video classification, video retrieval, video summary, scene understanding, target tracking, etc. It is the basis and key issue of computer vision. Since motion information is an important clue for video saliency detection, unlike static image saliency detection, only spatial information is considered. The saliency detection of video should consider both motion information and spatial information.
  • the technical problem to be solved by the present invention is to provide a video saliency detection method and system based on spatiotemporal constraints, which aims to solve the problem of insufficient robustness of the existing video saliency detection method in complex scenarios.
  • the present invention is implemented in this way, a video saliency detection method based on space-time constraints, comprising:
  • a saliency global optimization model for constructing a spatiotemporal constraint is constructed according to the reliable target region, the reliable background region, and the mixed motion energy map, and the saliency global optimization model is solved to obtain a saliency map of the current frame.
  • the method further includes:
  • the current frame to be detected of the to-be-detected video is subjected to super-pixel segmentation, and the current frame obtained by the super-pixel segmentation includes:
  • the optical flow field motion estimation of the current frame is calculated by using a pyramid LK optical flow method according to the current frame and the previous frame of the current frame obtained by superpixel segmentation.
  • r i denotes a super pixel whose index is i in the super pixel set
  • N denotes the number of elements of the super pixel set
  • P t (r j ) denotes an average spatial position of the super pixel r j to mA (r i) represents the average similarity measure between r i and other super-pixels, expressed in ⁇ i r i superpixel used mA (ri) weighted average spatial position to Md (r i) representing the motion distribution Energy, then:
  • calculating the motion edge energy of the current frame according to the optical flow field motion estimation specifically includes:
  • the motion edge energy is calculated from the optical flow field motion estimate using a Sobel edge detector.
  • the balance parameter is represented by ⁇ , and the value range is [0, 1], where r i represents the super pixel with index i in the super pixel set, and S t-1 represents the significantness of the previous frame.
  • M h (r i ) represents the motion history energy of the super pixel r i
  • M e (r i ) represents the motion edge energy of the super pixel r i
  • M d (r i ) represents the motion of the super pixel r i .
  • the energy is distributed, and the mixed motion energy map is represented by M(r i ):
  • the initial target segmentation region is calculated according to the mixed motion energy map, and extracting the reliable target region and the reliable background region from the initial target segmentation region includes:
  • the reliable target area and the reliable background area are extracted from the super pixel set by a clustering method.
  • E(S) represents the target energy function
  • the target energy function is obtained by the following steps:
  • F(r i ) is the foreground a priori in the foreground term, indicating the magnitude of the probability that the superpixel r i belongs to the foreground
  • w b (r i ) is the background prior in the background term, indicating the super pixel r i is the probability magnitude of the background
  • w ij (r i , r j ) is a smoothing hypothesis, indicating the apparent similarity of two adjacent superpixels
  • the foreground prior F(r i ) in the foreground term is obtained by the following formula, namely:
  • M(r i ) represents the mixed motion energy of the super pixel r i
  • A(r i ) represents the average apparent similarity of the super pixel r i and the super pixel in the reliable target region
  • the background prior in the background term is represented by the superpixel r i and the average apparent similarity of the superpixels in the reliable background region.
  • the obtaining the saliency global optimization model to obtain the saliency map of the current frame comprises:
  • a saliency map of the current frame is obtained by solving the saliency global optimization model by a restricted least squares method.
  • the embodiment of the invention further provides a video saliency detection system based on space-time constraints, comprising:
  • An energy calculation unit configured to perform super-pixel segmentation on a current frame to be detected of the to-be-detected video, to obtain a current frame and a super-pixel set after super-pixel segmentation, and calculate an optical flow according to the current frame and a previous frame of the current frame Field motion estimation, calculating motion distribution energy and motion edge energy of the current frame according to the optical flow field estimation, acquiring a saliency map of the previous frame, and calculating motion according to the current frame and the previous frame a historical energy, generating a mixed motion energy map according to the saliency map of the previous frame, the motion distribution energy, the motion edge energy, and the motion history energy;
  • a saliency map calculation unit for obtaining an initial target segmentation region of the mixed motion energy map, extracting a reliable target region and a reliable background region from the initial target segmentation region, according to the reliable target region, the reliable background region, and
  • the hybrid energy motion graph constructs a salient global optimization model of spatiotemporal constraints, and the salient global optimization model is solved to obtain a salient map of the current frame.
  • the energy calculation unit is specifically configured to:
  • the present invention is further configured to perform super-pixel segmentation on the current frame to be detected by a simple linear iterative clustering algorithm to obtain a current frame after super-pixel segmentation, according to the current frame and the previous frame obtained by super-pixel segmentation.
  • the pyramid LK optical flow method calculates an optical flow field motion estimate for the current frame.
  • r i denotes a super pixel whose index is i in the super pixel set
  • N denotes the number of elements of the super pixel set
  • P t (r j ) denotes an average spatial position of the super pixel r j to mA (r i) represents the average similarity measure between r i and other super-pixels
  • ⁇ i to r i represents the superpixel used mA (r i) weighted average spatial position to Md (r i) representing the motion Distribution energy
  • the balance parameter is represented by ⁇ , and the value range is [0, 1], where r i represents the super pixel with index i in the super pixel set, and S t-1 represents the significantness of the previous frame.
  • M h (r i ) represents the motion history energy of the super pixel r i
  • M e (r i ) represents the motion edge energy of the super pixel r i
  • M d (r i ) represents the motion of the super pixel r i .
  • the energy is distributed, and the mixed motion energy map is represented by M(r i ):
  • the saliency map calculation unit is specifically configured to:
  • the Otsu method binarizing the mixed motion energy map to obtain a binary image, performing a digital image morphology opening operation on the binary image to obtain the initial target segmentation region, and calculating the initial a feature of a superpixel in a target segmentation region, the feature comprising a two-dimensional spatial location, a color feature, and a mixed motion energy value, and representing the superpixel in the superpixel set by the feature, using a clustering method from the The reliable target area and the reliable background area are extracted from a set of superpixels.
  • E(S) represents the target energy function
  • the saliency map calculation unit acquires the target energy function by the following steps:
  • F(r i ) is the foreground a priori in the foreground term, indicating the magnitude of the probability that the superpixel r i belongs to the foreground
  • w b (r i ) is the background prior in the background term, indicating the super pixel r i is the probability magnitude of the background
  • w ij (r i , r j ) is a smoothing hypothesis, indicating the apparent similarity of two adjacent superpixels
  • the foreground prior F(r i ) in the foreground term is obtained by the following formula, namely:
  • M(r i ) represents the mixed motion energy of the super pixel r i
  • A(r i ) represents the average apparent similarity of the super pixel r i and the super pixel in the reliable target region
  • the background prior in the background item is represented by a superpixel r i and an average similarity of superpixels in the reliable background region;
  • the saliency map calculation unit is further configured to solve the saliency global optimization model by a restricted least squares method to obtain a saliency map of the current frame.
  • the present invention has the beneficial effects that the embodiment of the present invention uses the motion information and the spatial information to establish the mixed motion energy.
  • a reliable region detection algorithm based on multi-dimensional feature clustering is proposed to extract a reliable significant target and a reliable background.
  • the region which in turn establishes a significant global optimization model of space-time constraints.
  • the embodiments of the present invention use various types of motion features and spatial features such as motion distribution energy of the region layer, motion edge energy of the edge layer, motion history energy of the pixel layer, and saliency map of the previous frame. The robustness and stability of the significance detection.
  • FIG. 1 is a flowchart of a video saliency detection method based on spatiotemporal constraints according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for detecting video saliency based on space-time constraints according to another embodiment of the present invention
  • FIG. 3 is a diagram showing the use effect of a video saliency detection method based on spatiotemporal constraints according to another embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a video saliency detection system based on spatiotemporal constraints according to another embodiment of the present invention.
  • FIG. 1 is a diagram showing a video saliency detection method based on space-time constraints according to an embodiment of the present invention, including:
  • S101 Perform super-pixel segmentation on the current frame to be detected of the to-be-detected video, to obtain a current frame and a super-pixel set after super-pixel segmentation.
  • the super-pixel segmentation of the current frame to be detected may be a simple linear iterative clustering (SLIC), but is not limited to this method.
  • SLIC simple linear iterative clustering
  • the super-pixel segmentation process detects a pre-processing operation of the current frame, and performs super-pixel segmentation to obtain a super-pixel set.
  • the various motion energies in the subsequent steps are super-pixel level representations. Therefore, in this step, the current to be detected needs to be detected first.
  • the frame is superpixel divided to obtain the current frame after superpixel segmentation.
  • calculating the optical flow field motion estimation of the current frame may adopt a pyramid LK (Lucas-Kanade) optical flow method, but is not limited to this method.
  • the motion distribution energy is a motion feature of the region layer.
  • the motion distribution energy value for each superpixel r i in the image is proposed and calculated, and is defined as follows:
  • r i represents the super pixel with index i in the super pixel set
  • N represents the number of elements of the super pixel set
  • P t (r j ) represents the average spatial position of the super pixel r j
  • mA(r i ) r i is the average similarity measure between other super pixel
  • ⁇ i r i represents the superpixel used mA (r i) weighted average spatial position
  • Md (r i) representing the energy distribution of the motion.
  • the motion edge energy is the motion feature of the edge layer, and its purpose is to extract the contour features of the moving target.
  • the motion edge energy can be calculated from the acquired optical flow field using a Sobel edge detector, but is not limited to this method.
  • the motion history energy is an image change detection performed on a pixel level. The closer the time when the pixel changes, the larger the energy value, and the farther the pixel changes, the smaller the corresponding energy value.
  • the calculation method of the mixed motion energy map may be, but not limited to, the following method, and the mixed motion energy map is represented by Mr i , then:
  • represents a balance parameter
  • the value range is [0, 1]
  • r i represents a super pixel with an index i in the super pixel set
  • S t-1 represents a saliency map of the previous frame
  • M h For the motion history energy, the most recent motion of the image is characterized, and the closer the moment of the pixel's recent motion is to the current frame, the higher the value.
  • M e and M d mainly detect the distribution of the edge and motion of the moving object.
  • ri in M h (ri) represents a super pixel with index i
  • M e (r i ) represents the motion edge energy of super pixel r i
  • M d (r i ) represents the motion distribution of super pixel r i . energy. .
  • the initial target segmentation area may be calculated using, but not limited to, the following method:
  • the obtained mixed motion energy map is binarized to obtain a binary image
  • the binary image is subjected to the opening operation of the digital image morphology to obtain the initial target segmentation region.
  • the cluster-based reliable target region and the reliable background region extraction are performed, and the steps are as follows:
  • the plurality of features include, but are not limited to, a two-dimensional spatial position, a color feature, and a mixed motion energy value, etc., and the superpixels are represented by the features;
  • a clustering method is used to extract a reliable target area and a reliable background area in the super pixel set.
  • S108 Construct a saliency global optimization model of the spatiotemporal constraint according to the reliable target region, the reliable background region, and the mixed motion energy map, and solve the saliency global optimization model to obtain a saliency map of the current frame.
  • the present embodiment constructs a significant global optimization model of the minimum target energy function of the space-time constraint based on the quadratic programming theory to calculate the video frame.
  • the significance value is defined as follows:
  • E(S) represents the target energy function
  • corresponding energy minimum objective functions are respectively designed for foreground items, background items and smoothing items, and the obtained three energy minimum objective functions are combined into one target energy function E(S).
  • the sub-item of the target energy function is designed as follows:
  • F(r i ) is the foreground a priori in the foreground term, indicating the magnitude of the probability that the superpixel r i belongs to the foreground
  • w b (r i ) is the background prior in the background term, indicating the super pixel r
  • the magnitude of the probability that i belongs to the background, w ij (r i , r j ) is a smoothing hypothesis, indicating the apparent similarity of two adjacent superpixels.
  • the calculation method of the foreground a priori F(r i ) in the foreground item may be, but is not limited to, the following method:
  • M(r i ) represents the mixed motion energy of the super pixel r i
  • A(r i ) represents the average apparent similarity of the super pixels in the reliable target region of the super pixel r i ;
  • the background a priori w b (r i ) in the background term can be expressed by the superpixel r i and the average apparent similarity of the superpixels in the resulting reliable background region, but is not limited to this method.
  • a super-pixel is used as a data node to establish an undirected connection graph, and a reliable target region and a reliable background region are used as positive and negative sample labels, and the constraint condition is constructed into a spatio-temporal confidence propagation model based on semi-supervised learning theory, which is a significant value propagation.
  • the foreground item, the background item, the smoothing item and the constraint in the model can be weighted according to different foreground or background priors, and have versatility and flexibility.
  • the solution of the saliency global optimization model is a convex quadratic optimization problem, which can be solved by the constrained least squares method.
  • two video frames are required to perform iterative calculation, which is embodied in: 1 that the saliency map calculation of the current frame requires a significant picture of the previous frame; 2 the generation of the current frame optical flow field also requires a previous video. Frames, ie two video frames, generate an optical flow field.
  • the saliency map of the first frame (first frame) in the video to be detected cannot be calculated because there is no information of the previous frame, so there is no way to calculate the saliency map of the previous frame, and the optical flow field cannot be generated;
  • the second frame in the to-be-detected video can calculate the saliency map, but since the first frame does not calculate the saliency map, the calculated input is less than the “saliency map of the previous frame”.
  • the specific calculation flow is shown in FIG. 2 .
  • the video to be detected starts from the third frame and is calculated according to the flow shown in FIG. 1 above.
  • Fig. 3 shows an example obtained by the above-described embodiment provided by the present invention, wherein Fig. 3a shows the current frame, Fig. 3b shows the previous frame, Fig. 3c shows the saliency of the previous frame, and Fig. 3d shows the motion distribution.
  • Energy Fig. 3e shows the motion edge energy
  • Fig. 3f shows the motion history energy
  • Fig. 3g shows the mixed motion energy
  • Fig. 3h shows the reliable region
  • Fig. 3i shows the saliency map of the current frame.
  • “energy” is essentially a set of values, each value corresponding to the energy value of one pixel or super pixel; the “energy map” is only a visualization of the energy value, the energy value The stronger the color, the whiter it is. The smaller the energy value, the darker the color.
  • FIG. 4 shows a video saliency detection system based on spatiotemporal constraints according to an embodiment of the present invention, including:
  • the energy calculation unit 401 is configured to perform super-pixel segmentation on the current frame to be detected of the to-be-detected video, to obtain a current frame and a super-pixel set after super-pixel segmentation, and calculate light according to the current frame and the previous frame of the current frame.
  • Flow field estimation calculating motion distribution energy and motion edge energy of the current frame according to the optical flow motion estimation, acquiring a saliency map of the previous frame, and calculating according to the current frame and the previous frame a motion history energy, generating a mixed motion energy map according to the salient map of the previous frame, the motion distribution energy, the motion edge energy, and the motion history energy;
  • a saliency map calculation unit 402 configured to obtain an initial target segmentation region of the mixed motion energy map, and extract a reliable target region and a reliable background region from the initial target segmentation region, according to the reliable target region, the reliable background
  • the region and the mixed energy motion map construct a significant global optimization model of spatiotemporal constraints, and the salient global optimization model is solved to obtain a saliency map of the current frame.
  • the energy calculation unit 401 is specifically configured to:
  • the present invention is further configured to perform super-pixel segmentation on the current frame to be detected by a simple linear iterative clustering algorithm to obtain a current frame after super-pixel segmentation, according to the current frame and the previous frame obtained by super-pixel segmentation.
  • the pyramid LK optical flow method calculates an optical flow field motion estimate for the current frame.
  • R & lt superpixel i represents the index is a super set of pixel i
  • N represents the number of elements in a super-set of pixels to P t (r j) represents an average over the spatial location of the pixels r j in mA (r i) represents the average similarity measure between r i and other super-pixels
  • ⁇ i to r i represents the superpixel used mA (r i) weighted average spatial position to Md (r i) representing the motion Distribution energy
  • the balance parameter is represented by ⁇ , and the value range is [0, 1], where r i represents the super pixel with index i in the super pixel set, and S t-1 represents the significantness of the previous frame.
  • ri in M h (ri) represents a super pixel with index i
  • M e (r i ) represents the motion edge energy of super pixel r i
  • M d (r i ) represents the motion distribution of super pixel r i .
  • Energy, expressed by M(r i ) is the mixed motion energy diagram:
  • saliency map calculation unit 402 is specifically configured to:
  • the Otsu method binarizing the mixed motion energy map to obtain a binary image, performing a digital image morphology opening operation on the binary image to obtain the initial target segmentation region, and calculating the initial a feature of a superpixel in a target segmentation region, the feature comprising a two-dimensional spatial location, a color feature, and a mixed motion energy value, and representing the superpixel in the superpixel set by the feature, using a clustering method from the The reliable target area and the reliable background area are extracted from a set of superpixels.
  • E (S) represents a target energy function
  • the saliency map calculation unit 402 acquires the target energy function by the following steps:
  • F(r i ) is the foreground a priori in the foreground term, indicating the magnitude of the probability that the superpixel r i belongs to the foreground
  • w b (r i ) is the background prior in the background term, indicating the super pixel r i is the probability magnitude of the background
  • w ij (r i , r j ) is a smoothing hypothesis, indicating the apparent similarity of two adjacent superpixels
  • the foreground prior F(r i ) in the foreground term is obtained by the following formula, namely:
  • M(r i ) represents the mixed motion energy of the super pixel r i
  • A(r i ) represents the average apparent similarity of the super pixel r i and the super pixel in the reliable target region
  • the background prior in the background item is represented by a superpixel r i and an average similarity of superpixels in the reliable background region;
  • the saliency map calculation unit 402 is further configured to solve the saliency global optimization model by a restricted least squares method to obtain a saliency map of the current frame.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

适用于视频检测领域,提供了视频显著性检测方法,包括:对待检测当前帧进行超像素分割得到超像素分割后的当前帧,根据当前帧和上一帧,计算光流场运动估计并计算得到运动分布能量和运动边缘能量,根据当前帧和上一帧计算运动历史能量,并由上述特征和上一帧的显著图生成混合运动能量图;获得混合运动能量图的初始目标分割区域并提取可靠目标区域和可靠背景区域,根据可靠目标区域、可靠背景区域和混合运动能量图构建显著性全局优化模型并求解,当前帧的显著图。采用区域层的运动分布能量、边缘层的运动边缘能量、像素层的运动历史能量和上一帧显著图等多种运动特征和空间特征,增强了显著性检测的鲁棒性和稳定性。

Description

一种基于时空约束的视频显著性检测方法及系统 技术领域
本发明属于视频技术领域,尤其涉及一种基于时空约束的视频显著性检测方法及系统。
背景技术
显著性检测旨在预测视觉上相对引人注意的区域,其在视频分类、视频检索、视频摘要、场景理解、目标跟踪等领域有广泛的应用,是计算机视觉的基础和关键问题。由于运动信息是视频显著性检测的重要线索,因此不同于静态图像显著性检测只考虑空间信息,视频的显著性检测要同时考虑运动信息和空间信息。
如何提取显著目标的运动信息是视频显著性检测的关键问题。目前,大多数方法采用光流场估计显著目标的运动,然而光流场对于光照变化和局部扰动非常敏感,导致不稳定的运动估计结果;另外一些方法采用边缘检测和运动连续性来估计目标的运动,但在复杂背景下的鲁棒性不足。
此外,如何根据运动信息和空间信息建立整体的显著性检测框架是视频显著性检测的另一个重要议题。目前,大多数方法首先提取视频空间信息和运动信息,然后分别建立空域显著图和时域显著图,进而将两者进行线性融合或动态融合作为视频显著性检测结果。这种框架没有将运动信息和空间信息实质性的进行融合,而只是将运动信息生成的显著图作为空域显著图的先验信息或补充信息,在面对复杂场景时不能完整地凸显显著目标和有效抑制复杂背景。
发明内容
本发明所要解决的技术问题在于提供一种基于时空约束的视频显著性检测方法及系统,旨在解决现有视频显著性检测方法在复杂场景下鲁棒性不足的问题。
本发明是这样实现的,一种基于时空约束的视频显著性检测方法,包括:
对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧和超像素集合;
根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计;
根据所述光流场运动估计计算所述当前帧的运动分布能量和运动边缘能量;
获取所述上一帧的显著图;
根据所述当前帧和所述上一帧,计算运动历史能量;
根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图;
获得所述混合运动能量图的初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域;
根据所述可靠目标区域、所述可靠背景区域和所述混合运动能量图构建时空约束的显著性全局优化模型,求解所述显著性全局优化模型得到所述当前帧的显著图。
进一步地,所述获取所述上一帧的显著图之前,还包括:
判断所述上一帧是否为所述待检测视频的第一帧;
若是,则根据所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合能量运动图;
若否,则执行所述根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合能量运动图的步骤。
进一步地,所述对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧包括:
通过简单线性迭代聚类算法对所述待检测当前帧进行超像素分割,得到超像素分割后的当前帧;
则所述根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计包括:
根据超像素分割后得到所述当前帧和所述当前帧的上一帧,采用金字塔LK光流法计算所述当前帧的光流场运动估计。
进一步地,以ri表示所述超像素集合中索引为i的超像素,以N表示超像素集合的元素个数,以Pt(rj)表示超像素rj的平均空间位置,以mA(ri)表示ri与其它超像素之间的平均相似性度量,以μi表示超像素ri使用mA(ri)加权后的平均空间位置,以Md(ri)表示所述运动分布能量,则:
Figure PCTCN2017087709-appb-000001
进一步地,根据所述光流场运动估计计算所述当前帧的运动边缘能量具体包括:
使用Sobel边缘检测器从所述光流场运动估计计算所述运动边缘能量。
进一步地,以γ表示平衡参数,其取值范围为[0,1],以ri表示所述超像素集合中索引为i的超像素,以St-1表示所述上一帧的显著图,以Mh(ri)表示超像素ri的运动历史能量,以 Me(ri)表示超像素ri的运动边缘能量,以Md(ri)表示超像素ri的运动分布能量,以M(ri)表示所述混合运动能量图,则:
Figure PCTCN2017087709-appb-000002
进一步地,所述根据所述混合运动能量图计算初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域包括:
利用大津法,对所述混合运动能量图进行二值化处理,得到二值图像;
对所述二值图像进行数字图像形态学的开操作处理,得到所述初始目标分割区域;
计算所述初始目标分割区域中超像素的特征,所述特征包括二维空间位置、颜色特征和混合运动能量值,并以所述特征表示所述超像素集合中的超像素;
采用聚类的方法,从所述超像素集合中提取所述可靠目标区域和所述可靠背景区域。
进一步地,所述显著性全局优化模型表示为:
Figure PCTCN2017087709-appb-000003
其中,E(S)表示目标能量函数,S={s1,s2,...,sN}表示待求解的超像素的显著性值序列,si的取值范围为[0,1],N表示所述超像素集合的元素个数,Φ表示前景项,Γ表示背景项,Ψ表示平滑项,
Figure PCTCN2017087709-appb-000004
表示空间上相邻的超像素对的集合,Θ(S)=k表示时空约束条件;
所述目标能量函数通过以下步骤获得:
分别对所述前景项、所述背景项和所述平滑项设计能量最小目标函数,将得到的三个能量最小目标函数组合形成所述目标能量函数,其中,以φ(si)表示所述前景项的能量最小目标函数,以Γ(si)表示所述背景项的能量最小目标函数,以ψ(si,sj)表示所述平滑项的能量最小目标函数,则:
φ(si)=F(ri)(1-si)2
Γ(si)=wb(ri)si 2               ;
ψ(si,sj)=wij(ri,rj)(si-sj)2
其中,F(ri)为所述前景项中的前景先验,表示超像素ri属于前景的概率大小,wb(ri)为所述背景项中的背景先验,表示超像素ri属于背景的概率大小,wij(ri,rj)为平滑假设,表示两个相邻超像素的表观相似度;
所述前景项中的前景先验F(ri)通过下述公式求得,即:
F(ri)=A(ri)M(ri);
其中,M(ri)表示超像素ri的混合运动能量,A(ri)表示超像素ri和所述可靠目标区域中超像素的平均表观相似度;
所述背景项中的背景先验采用超像素ri和所述可靠背景区域中超像素的平均表观相似度表示。
进一步地,所述求解所述显著性全局优化模型得到所述当前帧的显著图包括:
通过受限最小二乘法求解所述显著性全局优化模型,得到所述当前帧的显著图。
本发明实施例还提供了一种基于时空约束的视频显著性检测系统,包括:
能量计算单元,用于对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧和超像素集合,根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计,根据所述光流场运动估计计算所述当前帧的运动分布能量和运动边缘能量,获取所述上一帧的显著图,根据所述当前帧和所述上一帧,计算运动历史能量,根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图;
显著图计算单元,用于获得所述混合运动能量图的初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域,根据所述可靠目标区域、所述可靠背景区域和所述混合能量运动图构建时空约束的显著性全局优化模型,求解所述显著性全局优化模型得到所述当前帧的显著图。
进一步地,所述能量计算单元具体用于:
提取所述当前帧的上一帧,并判断所述上一帧是否为所述待检测视频的第一帧,若是,则根据所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图,若否,则执行所述根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图的步骤;
还用于通过简单线性迭代聚类算法对所述待检测当前帧进行超像素分割,得到超像素分割后的当前帧,根据超像素分割后得到的所述当前帧和所述上一帧,采用金字塔LK光流法计算所述当前帧的光流场运动估计。
进一步地,以ri表示所述超像素集合中索引为i的超像素,以N表示超像素集合的元素个数,以Pt(rj)表示超像素rj的平均空间位置,以mA(ri)表示ri与其它超像素之间的平均相似性度量,以μi表示超像素ri使用mA(ri)加权后的平均空间位置,以Md(ri)表示所述运动分布能量,则:
Figure PCTCN2017087709-appb-000005
进一步地,以γ表示平衡参数,其取值范围为[0,1],以ri表示所述超像素集合中索引为i的超像素,以St-1表示所述上一帧的显著图,以Mh(ri)表示超像素ri的运动历史能量,以 Me(ri)表示超像素ri的运动边缘能量,以Md(ri)表示超像素ri的运动分布能量,以M(ri)表示所述混合运动能量图,则:
Figure PCTCN2017087709-appb-000006
进一步地,所述显著图计算单元具体用于:
利用大津法,对所述混合运动能量图进行二值化处理,得到二值图像,对所述二值图像进行数字图像形态学的开操作处理,得到所述初始目标分割区域,计算所述初始目标分割区域中超像素的特征,所述特征包括二维空间位置、颜色特征和混合运动能量值,并以所述特征表示所述超像素集合中的超像素,采用聚类的方法,从所述超像素集合中提取所述可靠目标区域和所述可靠背景区域。
进一步地,所述显著性全局优化模型表示为:
Figure PCTCN2017087709-appb-000007
其中,E(S)表示目标能量函数,S={s1,s2,...,sN}表示待求解的超像素的显著性值序列,si的取值范围为[0,1],N表示所述超像素集合的元素个数,Φ表示前景项,Γ表示背景项,Ψ表示平滑项,
Figure PCTCN2017087709-appb-000008
表示空间上相邻的超像素对的集合,Θ(S)=k表示时空约束条件;
所述显著图计算单元通过以下步骤获取所述目标能量函数:
分别对所述前景项、所述背景项和所述平滑项设计能量最小目标函数,将得到的三个能量最小目标函数组合形成所述目标能量函数,其中,以φ(si)表示所述前景项的能量最小目标函数,以Γ(si)表示所述背景项的能量最小目标函数,以表示ψ(si,sj)所述平滑项的能量最小目标函数,则:
φ(si)=F(ri)(1-si)2
Γ(si)=wb(ri)si 2               ;
ψ(si,sj)=wij(ri,rj)(si-sj)2
其中,F(ri)为所述前景项中的前景先验,表示超像素ri属于前景的概率大小,wb(ri)为所述背景项中的背景先验,表示超像素ri属于背景的概率大小,wij(ri,rj)为平滑假设,表示两个相邻超像素的表观相似度;
所述前景项中的前景先验F(ri)通过下述公式求得,即:
F(ri)=A(ri)M(ri);
其中,M(ri)表示超像素ri的混合运动能量,A(ri)表示超像素ri和所述可靠目标区域中超像素的平均表观相似度;
所述背景项中的背景先验采用超像素ri和所述可靠背景区域中超像素的平均相似度表示;
所述显著图计算单元还用于通过受限最小二乘法求解所述显著性全局优化模型,得到所述当前帧的显著图。
本发明与现有技术相比,有益效果在于:本发明实施例使用运动信息和空间信息建立混合运动能量,在此基础上提出基于多维特征聚类的可靠区域检测算法提取可靠显著目标与可靠背景区域,进而建立时空约束的显著性全局优化模型。在特征方面,本发明实施例采用区域层的运动分布能量、边缘层的运动边缘能量、像素层的运动历史能量和上一帧显著图等多种运动特征和空间特征,这些特征优劣互补增强了显著性检测的鲁棒性和稳定性。
附图说明
图1是本发明实施例提供的一种基于时空约束的视频显著性检测方法的流程图;
图2是本发明另一实施例提供的一种基于时空约束的视频显著性检测方法的流程图;
图3是本发明另一实施例提供的一种基于时空约束的视频显著性检测方法的使用效果图。
图4是本发明另一实施例提供的一种基于时空约束的视频显著性检测系统的结构示意图。
具体实施方式
为了使本发明的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本发明进行进一步详细说明。应当理解,此处所描述的具体实施例仅仅用以解释本发明,并不用于限定本发明。
图1示出了本发明实施例提供的一种基于时空约束的视频显著性检测方法,包括:
S101,对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧和超像素集合。
在本步骤中,对待检测当前帧进行超像素分割可以采用简单线性迭代聚类算法(SLIC,Simple Linear Iterative Clustering),但不限于此方法。超像素分割对待检测当前帧的一个预处理操作,进行超像素分割后得到超像素集合,后续步骤中的各种运动能量都是超像素级别的表示,因此在本步骤中需要先将待检测当前帧进行超像素分割,得到超像素分割后的当前帧。
S102,根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计。
在本步骤中,计算当前帧的光流场运动估计可以采用金字塔LK(Lucas-Kanade)光流法,但不限于此方法。
S103,根据所述光流场运动估计计算所述当前帧的运动分布能量和运动边缘能量。
运动分布能量是区域层的运动特征,本实施例中提出并计算针对图像中每一个超像素ri的运动分布能量值,其定义如下式:
Figure PCTCN2017087709-appb-000009
其中,ri表示所述超像素集合中索引为i的超像素,N表示超像素集合的元素个数,Pt(rj)表示超像素rj的平均空间位置,mA(ri)则是ri与其它超像素之间的平均相似性度量,μi表示超像素ri使用mA(ri)加权后的平均空间位置,Md(ri)表示所述运动分布能量。
运动边缘能量是边缘层的运动特征,其目的是提取运动目标的轮廓特征。可以使用Sobel边缘检测器从获取的光流场计算运动边缘能量,但不限于此方法。
S104,获取所述上一帧的显著图。
S105,根据所述当前帧和所述上一帧,计算运动历史能量。
具体地,运动历史能量是在像素层次上进行的图像变化检测,像素发生改变的时间越近其能量值越大,像素发生改变的时间越远则其对应的能量值越小。
S106,根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图。
在本步骤中,由上述各步骤提取的各种特征通过融合得到混合运动能量图。混合运动能量图的计算方法可以采用但不限于以下方法,以Mri表示所述混合运动能量图,则:
Figure PCTCN2017087709-appb-000010
其中,γ表示平衡参数,其取值范围为[0,1],ri表示所述超像素集合中索引为i的超像素,St-1表示所述上一帧的显著图,Mh为运动历史能量,表征图像最近的运动情况,其像素最近运动的时刻越接近当前帧,其值越高。Me和Md主要检测运动物体的边缘和运动的分布。而上述公式中Mh(ri)中的ri表示索引为i的超像素,Me(ri)表示超像素ri的运动边缘能量,Md(ri)表示超像素ri的运动分布能量。。
S107,获得所述混合运动能量图的初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域。
在本步骤中,可以采用但不限于以下方法计算初始目标分割区域:
第一步,利用大津法,对获得的混合运动能量图进行二值化操作,获得二值图像;
第二步,对二值图像进行数字图像形态学的开操作处理,获得初始目标分割区域。
在上述获得的初始目标分割区域的基础上,进行基于聚类的可靠目标区域和可靠背景区域提取,其步骤如下:
第一步,计算初始目标分割区域中超像素的多种特征,多种特征包括但不限于二维空间位置、颜色特征和混合运动能量值等,并以这些特征表示超像素;
第二步,采用聚类的方法,提取出所述超像素集合中可靠目标区域和可靠背景区域。
S108,根据所述可靠目标区域、所述可靠背景区域和所述混合运动能量图构建时空约束的显著性全局优化模型,求解所述显著性全局优化模型得到所述当前帧的显著图。
在本步骤中,利用上述步骤获得的可靠目标区域、可靠背景区域和混合运动能量,本实施例基于二次规划理论构建一个时空约束的最小化目标能量函数的显著性全局优化模型来计算视频帧的显著性值。本实施例提出的显著性全局优化模型定义如下:
Figure PCTCN2017087709-appb-000011
其中,E(S)表示目标能量函数,S={s1,s2,...,sN}表示待求解的超像素的显著性值序列,si的取值范围为[0,1],N表示所述超像素集合的元素个数,Φ表示前景项,Γ表示背景项,Ψ表示平滑项,
Figure PCTCN2017087709-appb-000012
表示空间上相邻的超像素对的集合,Θ(S)=k表示时空约束条件。
目标能量函数的定义:
本实施例中针对前景项、背景项和平滑项分别设计相应的能量最小目标函数,将得到的三个能量最小目标函数组合于一个目标能量函数E(S)中。该目标能量函数的分项设计如下:
φ(si)=F(ri)(1-si)2
Γ(si)=wb(ri)si 2               ;
ψ(si,sj)=wij(ri,rj)(si-sj)2
其中,F(ri)为所述前景项中的前景先验,表示超像素ri属于前景的概率大小,wb(ri)为所述背景项中的背景先验,表示超像素ri属于背景的概率大小,wij(ri,rj)为平滑假设,表示两个相邻超像素的表观相似度。
前景项中的前景先验F(ri)的计算方法可以采用但不限于以下方法:
F(ri)=A(ri)M(ri);
其中,M(ri)表示超像素ri的混合运动能量,A(ri)表示超像素ri所述可靠目标区域中超像素的平均表观相似度;
背景项中的背景先验wb(ri)可以采用超像素ri和所得的可靠背景区域中超像素的平均表 观相似度表示,但不限于此方法。
时空约束条件的定义:
本实施例以超像素为数据节点建立无向连接图,将可靠目标区域和可靠背景区域作为正负样本标签,基于半监督学习理论把约束条件构建成时空置信度传播模型,为显著值的传播提供支撑。
需要注意的是,在本实施例中,模型中前景项、背景项、平滑项和约束条件都可依据不同的前景或背景先验对它们进行加权设计,具备通用性和灵活性。
在本步骤中,显著性全局优化模型的求解是一个凸二次优化问题,可通过受限最小二乘法进行求解。
在本实施例中的具体使用中,需要两个视频帧进行迭代计算,体现在:①当前帧的显著图计算需要上一帧的显著图;②当前帧光流场的生成也需要上一视频帧,即两个视频帧生成一个光流场。
待检测视频中首帧(第一帧)的显著图不能计算,因为没有上一帧的信息,因此没办法计算上一帧的显著图,并且不能生成光流场;
待检测视频中的第二帧是可以计算显著图,但因为第一帧没有计算出显著图,因此其计算的输入少了“上一帧的显著图”,具体计算流程如图2所示。
待检测视频从第三帧开始,按照上述图1所示的流程进行计算。
图3示出了通过本发明提供的上述实施例进行运算后得到的一个实例,其中图3a表示当前帧,图3b表示上一帧,图3c表示上一帧的显著图,图3d表示运动分布能量,图3e表示运动边缘能量,图3f表示运动历史能量,图3g表示混合运动能量,图3h表示可靠区域,图3i表示当前帧的显著图。在本实施例中,“能量”,其实质上是数值的集合,每一个数值对应一个像素或超像素的能量值;所说的“能量图”,只是对这种能量值的可视化,能量值越强颜色越白,能量值越小颜色越黑。
图4示出了本发明实施例提供的一种基于时空约束的视频显著性检测系统,包括:
能量计算单元401,用于对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧和超像素集合,根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计,根据所述光流场运动估计计算所述当前帧的运动分布能量和运动边缘能量,获取所述上一帧的显著图,根据所述当前帧和所述上一帧,计算运动历史能量,根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图;
显著图计算单元402,用于获得所述混合运动能量图的初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域,根据所述可靠目标区域、所述可靠背景 区域和所述混合能量运动图构建时空约束的显著性全局优化模型,求解所述显著性全局优化模型得到所述当前帧的显著图。
进一步地,能量计算单元401具体用于:
提取所述当前帧的上一帧,并判断所述上一帧是否为所述待检测视频的第一帧,若是,则根据所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图,若否,则执行所述根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图的步骤;
还用于通过简单线性迭代聚类算法对所述待检测当前帧进行超像素分割,得到超像素分割后的当前帧,根据超像素分割后得到的所述当前帧和所述上一帧,采用金字塔LK光流法计算所述当前帧的光流场运动估计。
进一步地,以ri表示所述超像素集合中索引为i的超像素,以N表示超像素集合的元素个数,以Pt(rj)表示超像素rj的平均空间位置,以mA(ri)表示ri与其它超像素之间的平均相似性度量,以μi表示超像素ri使用mA(ri)加权后的平均空间位置,以Md(ri)表示所述运动分布能量,则:
Figure PCTCN2017087709-appb-000013
进一步地,以γ表示平衡参数,其取值范围为[0,1],以ri表示所述超像素集合中索引为i的超像素,以St-1表示所述上一帧的显著图,Mh(ri)中的ri表示索引为i的超像素,以Me(ri)表示超像素ri的运动边缘能量,以Md(ri)表示超像素ri的运动分布能量,以M(ri)表示所述混合运动能量图,则:
Figure PCTCN2017087709-appb-000014
进一步地,显著图计算单元402具体用于:
利用大津法,对所述混合运动能量图进行二值化处理,得到二值图像,对所述二值图像进行数字图像形态学的开操作处理,得到所述初始目标分割区域,计算所述初始目标分割区域中超像素的特征,所述特征包括二维空间位置、颜色特征和混合运动能量值,并以所述特征表示所述超像素集合中的超像素,采用聚类的方法,从所述超像素集合中提取所述可靠目标区域和所述可靠背景区域。
进一步地,所述显著性全局优化模型表示为:
Figure PCTCN2017087709-appb-000015
其中,E(S)表示目标能量函数,S={s1,s2,...,sN}表示待求解的超像素的显著性值序列,si的取值范围为[0,1],N表示所述超像素集合的元素个数,Φ表示前景项,Γ表示背景 项,Ψ表示平滑项,
Figure PCTCN2017087709-appb-000016
表示空间上相邻的超像素对的集合,Θ(S)=k表示时空约束条件;
显著图计算单元402通过以下步骤获取所述目标能量函数:
分别对所述前景项、所述背景项和所述平滑项设计能量最小目标函数,将得到的三个能量最小目标函数组合形成所述目标能量函数,其中,以φ(si)表示所述前景项的能量最小目标函数,以Γ(si)表示所述背景项的能量最小目标函数,以表示ψ(si,sj)所述平滑项的能量最小目标函数,则:
φ(si)=F(ri)(1-si)2
Γ(si)=wb(ri)si 2               ;
ψ(si,sj)=wij(ri,rj)(si-sj)2
其中,F(ri)为所述前景项中的前景先验,表示超像素ri属于前景的概率大小,wb(ri)为所述背景项中的背景先验,表示超像素ri属于背景的概率大小,wij(ri,rj)为平滑假设,表示两个相邻超像素的表观相似度;
所述前景项中的前景先验F(ri)通过下述公式求得,即:
F(ri)=A(ri)M(ri);
其中,M(ri)表示超像素ri的混合运动能量,A(ri)表示超像素ri和所述可靠目标区域中超像素的平均表观相似度;
所述背景项中的背景先验采用超像素ri和所述可靠背景区域中超像素的平均相似度表示;
显著图计算单元402还用于通过受限最小二乘法求解所述显著性全局优化模型,得到所述当前帧的显著图。
以上所述仅为本发明的较佳实施例而已,并不用以限制本发明,凡在本发明的精神和原则之内所作的任何修改、等同替换和改进等,均应包含在本发明的保护范围之内。

Claims (15)

  1. 一种基于时空约束的视频显著性检测方法,其特征在于,包括:
    对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧和超像素集合;
    根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计;
    根据所述光流场运动估计计算所述当前帧的运动分布能量和运动边缘能量;
    获取所述上一帧的显著图;
    根据所述当前帧和所述上一帧,计算运动历史能量;
    根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图;
    获得所述混合运动能量图的初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域;
    根据所述可靠目标区域、所述可靠背景区域和所述混合运动能量图构建时空约束的显著性全局优化模型,求解所述显著性全局优化模型得到所述当前帧的显著图。
  2. 如权利要求1所述的视频显著性检测方法,其特征在于,所述获取所述上一帧的显著图之前,还包括:
    判断所述上一帧是否为所述待检测视频的第一帧;
    若是,则根据所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合能量运动图;
    若否,则执行所述根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合能量运动图的步骤。
  3. 如权利要求1所述的视频显著性检测方法,其特征在于,所述对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧包括:
    通过简单线性迭代聚类算法对所述待检测当前帧进行超像素分割,得到超像素分割后的当前帧;
    则所述根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计包括:
    根据超像素分割后得到所述当前帧和所述当前帧的上一帧,采用金字塔LK光流法计算所述当前帧的光流场运动估计。
  4. 如权利要求1所述的视频显著性检测方法,其特征在于,以ri表示所述超像素集合中索引为i的超像素,以N表示超像素集合的元素个数,以Pt(rj)表示超像素rj的平均空间 位置,以mA(ri)表示ri与其它超像素之间的平均相似性度量,以μi表示超像素ri使用mA(ri)加权后的平均空间位置,以Md(ri)表示所述运动分布能量,则:
    Figure PCTCN2017087709-appb-100001
  5. 如权利要求1所述的视频显著性检测方法,其特征在于,根据所述光流场运动估计计算所述当前帧的运动边缘能量具体包括:
    使用Sobel边缘检测器从所述光流场运动估计计算所述运动边缘能量。
  6. 如权利要求1所述的视频显著性检测方法,其特征在于,以γ表示平衡参数,其取值范围为[0,1],以ri表示所述超像素集合中索引为i的超像素,以St-1表示所述上一帧的显著图,以Mh(ri)表示超像素ri的运动历史能量,以Me(ri)表示超像素ri的运动边缘能量,以Md(ri)表示超像素ri的运动分布能量,以M(ri)表示所述混合运动能量图,则:
    Figure PCTCN2017087709-appb-100002
  7. 如权利要求6所述的视频显著性检测方法,其特征在于,所述根据所述混合运动能量图计算初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域包括:
    利用大津法,对所述混合运动能量图进行二值化处理,得到二值图像;
    对所述二值图像进行数字图像形态学的开操作处理,得到所述初始目标分割区域;
    计算所述初始目标分割区域中超像素的特征,所述特征包括二维空间位置、颜色特征和混合运动能量值,并以所述特征表示所述超像素集合中的超像素;
    采用聚类的方法,从所述超像素集合中提取所述可靠目标区域和所述可靠背景区域。
  8. 如权利要求7所述的视频显著性检测方法,其特征在于,所述显著性全局优化模型表示为:
    Figure PCTCN2017087709-appb-100003
    其中,E(S)表示目标能量函数,S={s1,s2,...,sN}表示待求解的超像素的显著性值序列,si的取值范围为[0,1],N表示所述超像素集合的元素个数,Φ表示前景项,Γ表示背景项,Ψ表示平滑项,
    Figure PCTCN2017087709-appb-100004
    表示空间上相邻的超像素对的集合,Θ(S)=k表示时空约束条件;
    所述目标能量函数通过以下步骤获得:
    分别对所述前景项、所述背景项和所述平滑项设计能量最小目标函数,将得到的三个能量最小目标函数组合形成所述目标能量函数,其中,以φ(si)表示所述前景项的能量最小目 标函数,以Γ(si)表示所述背景项的能量最小目标函数,以ψ(si,sj)表示所述平滑项的能量最小目标函数,则:
    Figure PCTCN2017087709-appb-100005
    其中,F(ri)为所述前景项中的前景先验,表示超像素ri属于前景的概率大小,wb(ri)为所述背景项中的背景先验,表示超像素ri属于背景的概率大小,wij(ri,rj)为平滑假设,表示两个相邻超像素的表观相似度;
    所述前景项中的前景先验F(ri)通过下述公式求得,即:
    F(ri)=A(ri)M(ri);
    其中,M(ri)表示超像素ri的混合运动能量,A(ri)表示超像素ri和所述可靠目标区域中超像素的平均表观相似度;
    所述背景项中的背景先验采用超像素ri和所述可靠背景区域中超像素的平均表观相似度表示。
  9. 如权利要求1所述的视频显著性检测方法,其特征在于,所述求解所述显著性全局优化模型得到所述当前帧的显著图包括:
    通过受限最小二乘法求解所述显著性全局优化模型,得到所述当前帧的显著图。
  10. 一种基于时空约束的视频显著性检测系统,其特征在于,包括:
    能量计算单元,用于对待检测视频的待检测当前帧进行超像素分割,得到超像素分割后的当前帧和超像素集合,根据所述当前帧和所述当前帧的上一帧,计算光流场运动估计,根据所述光流场运动估计计算所述当前帧的运动分布能量和运动边缘能量,获取所述上一帧的显著图,根据所述当前帧和所述上一帧,计算运动历史能量,根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图;
    显著图计算单元,用于获得所述混合运动能量图的初始目标分割区域,从所述初始目标分割区域中提取可靠目标区域和可靠背景区域,根据所述可靠目标区域、所述可靠背景区域和所述混合能量运动图构建时空约束的显著性全局优化模型,求解所述显著性全局优化模型得到所述当前帧的显著图。
  11. 如权利要求10所述的视频显著性检测系统,其特征在于,所述能量计算单元具体用于:
    提取所述当前帧的上一帧,并判断所述上一帧是否为所述待检测视频的第一帧,若是, 则根据所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图,若否,则执行所述根据所述上一帧的显著图、所述运动分布能量、所述运动边缘能量和所述运动历史能量生成混合运动能量图的步骤;
    还用于通过简单线性迭代聚类算法对所述待检测当前帧进行超像素分割,得到超像素分割后的当前帧,根据超像素分割后得到的所述当前帧和所述上一帧,采用金字塔LK光流法计算所述当前帧的光流场运动估计。
  12. 如权利要求10所述的视频显著性检测系统,其特征在于,以ri表示所述超像素集合中索引为i的超像素,以N表示超像素集合的元素个数,以Pt(rj)表示超像素rj的平均空间位置,以mA(ri)表示ri与其它超像素之间的平均相似性度量,以μi表示超像素ri使用mA(ri)加权后的平均空间位置,以Md(ri)表示所述运动分布能量,则:
    Figure PCTCN2017087709-appb-100006
  13. 如权利要求10所述的视频显著性检测系统,其特征在于,以γ表示平衡参数,其取值范围为[0,1],以ri表示所述超像素集合中索引为i的超像素,以St-1表示所述上一帧的显著图,以Mh(ri)表示超像素ri的运动历史能量,以Me(ri)表示超像素ri的运动边缘能量,以Md(ri)表示超像素ri的运动分布能量,以M(ri)表示所述混合运动能量图,则:
    Figure PCTCN2017087709-appb-100007
  14. 如权利要求13所述的视频显著性检测系统,其特征在于,所述显著图计算单元具体用于:
    利用大津法,对所述混合运动能量图进行二值化处理,得到二值图像,对所述二值图像进行数字图像形态学的开操作处理,得到所述初始目标分割区域,计算所述初始目标分割区域中超像素的特征,所述特征包括二维空间位置、颜色特征和混合运动能量值,并以所述特征表示所述超像素集合中的超像素,采用聚类的方法,从所述超像素集合中提取所述可靠目标区域和所述可靠背景区域。
  15. 如权利要求14所述的视频显著性检测系统,其特征在于,所述显著性全局优化模型表示为:
    Figure PCTCN2017087709-appb-100008
    其中,E(S)表示目标能量函数,S={s1,s2,...,sN}表示待求解的超像素的显著性值序列,si的取值范围为[0,1],N表示所述超像素集合的元素个数,Φ表示前景项,Γ表示背景 项,Ψ表示平滑项,
    Figure PCTCN2017087709-appb-100009
    表示空间上相邻的超像素对的集合,Θ(S)=k表示时空约束条件;
    所述显著图计算单元通过以下步骤获取所述目标能量函数:
    分别对所述前景项、所述背景项和所述平滑项设计能量最小目标函数,将得到的三个能量最小目标函数组合形成所述目标能量函数,其中,以φ(si)表示所述前景项的能量最小目标函数,以Γ(si)表示所述背景项的能量最小目标函数,以表示ψ(si,sj)所述平滑项的能量最小目标函数,则:
    Figure PCTCN2017087709-appb-100010
    其中,F(ri)为所述前景项中的前景先验,表示超像素ri属于前景的概率大小,wb(ri)为所述背景项中的背景先验,表示超像素ri属于背景的概率大小,wij(ri,rj)为平滑假设,表示两个相邻超像素的表观相似度;
    所述前景项中的前景先验F(ri)通过下述公式求得,即:
    F(ri)=A(ri)M(ri);
    其中,M(ri)表示超像素ri的混合运动能量,A(ri)表示超像素ri和所述可靠目标区域中超像素的平均表观相似度;
    所述背景项中的背景先验采用超像素ri和所述可靠背景区域中超像素的平均相似度表示;
    所述显著图计算单元还用于通过受限最小二乘法求解所述显著性全局优化模型,得到所述当前帧的显著图。
PCT/CN2017/087709 2017-06-09 2017-06-09 一种基于时空约束的视频显著性检测方法及系统 WO2018223370A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/087709 WO2018223370A1 (zh) 2017-06-09 2017-06-09 一种基于时空约束的视频显著性检测方法及系统

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/087709 WO2018223370A1 (zh) 2017-06-09 2017-06-09 一种基于时空约束的视频显著性检测方法及系统

Publications (1)

Publication Number Publication Date
WO2018223370A1 true WO2018223370A1 (zh) 2018-12-13

Family

ID=64566366

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/087709 WO2018223370A1 (zh) 2017-06-09 2017-06-09 一种基于时空约束的视频显著性检测方法及系统

Country Status (1)

Country Link
WO (1) WO2018223370A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022028407A1 (zh) * 2020-08-03 2022-02-10 影石创新科技股份有限公司 一种全景视频剪辑方法、装置、存储介质及设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160004929A1 (en) * 2014-07-07 2016-01-07 Geo Semiconductor Inc. System and method for robust motion detection
CN105427292A (zh) * 2015-11-11 2016-03-23 南京邮电大学 一种基于视频的显著目标检测方法
CN105959707A (zh) * 2016-03-14 2016-09-21 合肥工业大学 基于运动感知的静态背景视频压缩算法
CN106778776A (zh) * 2016-11-30 2017-05-31 武汉大学深圳研究院 一种基于位置先验信息的时空域显著度检测方法
CN107392917A (zh) * 2017-06-09 2017-11-24 深圳大学 一种基于时空约束的视频显著性检测方法及系统

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160004929A1 (en) * 2014-07-07 2016-01-07 Geo Semiconductor Inc. System and method for robust motion detection
CN105427292A (zh) * 2015-11-11 2016-03-23 南京邮电大学 一种基于视频的显著目标检测方法
CN105959707A (zh) * 2016-03-14 2016-09-21 合肥工业大学 基于运动感知的静态背景视频压缩算法
CN106778776A (zh) * 2016-11-30 2017-05-31 武汉大学深圳研究院 一种基于位置先验信息的时空域显著度检测方法
CN107392917A (zh) * 2017-06-09 2017-11-24 深圳大学 一种基于时空约束的视频显著性检测方法及系统

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022028407A1 (zh) * 2020-08-03 2022-02-10 影石创新科技股份有限公司 一种全景视频剪辑方法、装置、存储介质及设备

Similar Documents

Publication Publication Date Title
CN107392917B (zh) 一种基于时空约束的视频显著性检测方法及系统
KR102153607B1 (ko) 영상에서의 전경 검출 장치 및 방법
WO2019218824A1 (zh) 一种移动轨迹获取方法及其设备、存储介质、终端
JP6631179B2 (ja) 前景画像分割方法及び装置、プログラム、並びに記録媒体
Yun et al. Scene conditional background update for moving object detection in a moving camera
CN109086724B (zh) 一种加速的人脸检测方法及存储介质
JP2018509678A (ja) ターゲット取得の方法及び装置
WO2019071976A1 (zh) 基于区域增长和眼动模型的全景图像显著性检测方法
Huang et al. Motion detection with pyramid structure of background model for intelligent surveillance systems
KR100572768B1 (ko) 디지탈 영상 보안을 위한 사람 얼굴 객체 자동검출 방법
Sengar et al. Foreground detection via background subtraction and improved three-frame differencing
CN112270745B (zh) 一种图像生成方法、装置、设备以及存储介质
EP3149707A1 (en) Method and apparatus for object tracking and segmentation via background tracking
Ait Abdelali et al. An adaptive object tracking using Kalman filter and probability product kernel
Zhu et al. Shadow removal with background difference method based on shadow position and edges attributes
Roy et al. A comprehensive survey on computer vision based approaches for moving object detection
Shao et al. Cast shadow detection based on the YCbCr color space and topological cuts
WO2018223370A1 (zh) 一种基于时空约束的视频显著性检测方法及系统
Dai et al. Robust and accurate moving shadow detection based on multiple features fusion
Zhu Moving Objects Detection and Segmentation Based on Background Subtraction and Image Over-Segmentation.
Liu et al. [Retracted] Mean Shift Fusion Color Histogram Algorithm for Nonrigid Complex Target Tracking in Sports Video
Malavika et al. Moving object detection and velocity estimation using MATLAB
Zhang et al. Integral channel features for particle filter based object tracking
Zhong et al. Confidence-based color modeling for online video segmentation
Lindstrom et al. Background and foreground modeling using an online EM algorithm

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17913028

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11.03.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 17913028

Country of ref document: EP

Kind code of ref document: A1