CN102568006A

CN102568006A - Visual saliency algorithm based on motion characteristic of object in video

Info

Publication number: CN102568006A
Application number: CN2012100069309A
Authority: CN
Inventors: 王杜瑶; 黄素娟; 谭刚; 沈慧; 王铎成
Original assignee: University of Shanghai for Science and Technology
Current assignee: University of Shanghai for Science and Technology
Priority date: 2011-03-02
Filing date: 2012-01-11
Publication date: 2012-07-11
Anticipated expiration: 2032-01-11
Also published as: CN102568006B

Abstract

The invention discloses a visual saliency algorithm based on the motion characteristic of an object in video. The visual saliency algorithm specifically comprises the following steps of: (1) calculating a motion vector of a brightness component of a current frame by using a block match-based motion estimation calculation method; (2) acquiring the motion vector from which the mean is removed; (3) performing Gaussian filter on the motion vector after the mean is removed; (4) calculating squares of components in the horizontal direction and components in the vertical direction of the motion vector respectively to obtain visual saliency maps of the horizontal direction and the vertical direction; and (5) acquiring a final visual saliency map. According to the method, when motion factors of the object in the video are considered, the motion saliency is extracted according to the local motion vector characteristic of the saliency object in the video by reducing the influence of the global motion of the object in the video on the visual saliency, so that the accuracy of the visual saliency algorithm can be effectively improved; and the algorithm provided by the invention has low complexity.

Description

Visual saliency algorithm based on object motion characteristics in video

Technical Field

The invention relates to a visual saliency algorithm based on object motion characteristics in a video, and belongs to the technical field of computer vision and image processing.

Background

With the development of information technology, the amount of video information contacted by people in life is more and more huge, and how to efficiently extract a salient object in a video draws more and more researchers' attention, and the visual saliency has wide application in the aspect of video signal processing, such as the fields of video retrieval, video compression, video monitoring, video tracking and the like.

In the aspect of video retrieval, because the video data volume is very large, the retrieval accuracy can be effectively improved by extracting the salient objects in the video and taking the salient objects as the characteristics of the video.

In video compression, as video resolution is higher and higher nowadays, efficient video compression algorithms are also one of the hot spots of research. Meanwhile, a video compression algorithm combined with a human eye vision model is one of key technologies of the next generation of video coding and decoding, so that the visual saliency is particularly important as an important aspect of the human eye vision model.

In the aspect of video monitoring, the intelligent degree of video monitoring can be effectively improved by the visual saliency, so that the visual saliency research has very important significance.

In the aspect of video tracking, people mainly track the motion of a salient object generally, and the accuracy of video tracking can be effectively improved through a visual saliency algorithm.

The visual saliency has wide application in video signal processing, so the visual saliency has very important significance for visual saliency research, and the visual saliency mainly extracts a salient region in an image video according to visual characteristics. At present, the research on the visual saliency mainly aims at the image saliency, and an image saliency algorithm mainly utilizes the characteristics of color, brightness and the like of an image to calculate the saliency of the image, but the image saliency algorithm does not utilize the motion characteristics of a video, so that the effect of directly applying the image saliency algorithm to video saliency detection is poor. However, the video saliency algorithm is less researched, and the defect of high algorithm complexity exists. In general, the motion of an object in a video is mainly composed of two aspects: one is global motion due to motion of the camera and the other is local motion due to relative motion of objects in the video. Compared with the global motion of the local motion of the object in the video, the local motion of the object in the video is higher in significance, and the global motion of the object in the video is lower in significance，Without reducing the impact of global motion of objects in the video on the visual saliency in advance, the accuracy of the saliency algorithm will be reduced.

The relative motion and the global motion of the object in the video can be obtained through a motion estimation algorithm of block matching. For example, a publication entitled "a new diamond search fast matching motion estimation algorithm" (this author is shan zhu, published in 2000, "image processing society of electronics and electronics engineers", 2000,9(2): 287-290) describes a motion estimation algorithm for block matching, which is an algorithm that compares a current block with a corresponding block in a previous frame, calculates an error of the two blocks, takes the block with the smallest error as a best matching block, and shifts a motion between the two blocks as a motion vector. Bilinear interpolation algorithm (K.R. Castleman, Vermilion, digital image processing, electronic industry Press) can be used to obtain motion vector consistent with current frame size2006: 96-97), the interpolation algorithm is to obtain the pixel value to be solved by weighted average of the pixel values of the four nearest points around the pixel value。

Disclosure of Invention

The invention aims to provide a visual saliency algorithm based on object motion characteristics in a video aiming at the problems in the prior art. According to the algorithm, the influence of the global motion of the object in the video on the visual saliency is reduced by detecting the motion of the object in the video, and the accuracy of the visual saliency algorithm can be effectively improved.

In order to achieve the purpose, the invention adopts the following scheme:

a visual saliency algorithm based on object motion characteristics in a video comprises the following specific steps:

(1) calculating a motion vector of the brightness component of the current frame by adopting a motion estimation algorithm based on block matching;

(2) obtaining the motion vector with the mean value removed;

(3)、performing Gaussian filtering on the motion vector with the mean value removed;

(4) respectively calculating the squares of the horizontal direction component and the vertical direction component of the motion vector to obtain visual saliency maps in the horizontal direction and the vertical direction;

(5) and acquiring a final visual saliency map.

The step (1) of calculating the motion vector of the luminance component of the current frame by using the motion estimation algorithm based on block matching specifically comprises the following steps:

(11) extracting a current frame original image and a previous frame original image of the video stream needing visual saliency detection;

(12) respectively extracting the brightness components of the two adjacent frames;

(13) partitioning the brightness component of the current frame of the video stream according to the size of a 16 multiplied by 16 pixel block;

(14) calculating the motion vector of each pixel block in the brightness component of the current frame by adopting a motion estimation algorithm based on block matching;

(15) amplifying the motion vector by using a bilinear interpolation algorithm to obtain a motion vector with the same size as the current frameV(t)。

The obtaining of the motion vector with the mean value removed in the step (2) specifically includes the following steps:

(21)、averaging the motion vectors of each pixel block in the current frame brightness component to obtain the average value of the motion vectorsV _m(t)；

(22)、The motion vector calculated in the step (15)V(t) Subtracting the mean of the motion vectorsV _m(t) Obtaining a motion vector with the mean value removed;

the gaussian filtering of the motion vector with the mean value removed in the step (3) includes the following specific steps:

(31)、filtering the motion vector after the mean value is removed by adopting a Gaussian filter to obtain a filtered motion vector;

(32)、setting the boundary of the filtered motion vector as 0 to obtain the final filtered motion vectorV _f(t)。

The step (4) of calculating the squares of the horizontal direction component and the vertical direction component of the motion vector respectively to obtain the visual saliency maps in the horizontal direction and the vertical direction includes the following specific steps:

(41)、filtering the horizontal component of the motion vectorV _fx(t) Squaring to obtain a visual saliency map in the horizontal directionS _x(t)；

(42)、Filtering the vertical component of the motion vectorV _fy(t) Squaring to obtain a visual saliency map in the vertical directionS _y(t)。

Obtaining the final visual saliency map in step (5) above specifically includes:

adding the visual saliency maps in the horizontal direction and the vertical direction and mapping the visual saliency maps to 0-255 to obtain a final visual saliency mapS(t) Visual saliency mapS(t) The calculation expression of (a) is:

whereinS _x(t) Is a visual saliency map in a horizontal direction,S _y(t) Is a visual saliency map in a vertical direction,S(t) In order to be the final visual saliency map,

indicating a rounding down. The larger the gray value in the saliency map is, the higher the saliency of the corresponding region of the current frame is; the smaller the gray value in the saliency map is, the lower the saliency of the corresponding region of the current frame is.

Compared with the prior art, the visual saliency algorithm based on the motion characteristics of the objects in the video has the following prominent substantive features and remarkable advantages: when the method considers the motion factors of the objects in the video, the motion significance is extracted according to the local motion vector characteristics of the significant objects in the video by reducing the influence of the global motion of the objects in the video on the visual significance, the accuracy of the visual significance algorithm can be effectively improved, and the algorithm disclosed by the invention is low in complexity.

Drawings

FIG. 1 is a block flow diagram of a visual saliency algorithm based on object motion characteristics in video in accordance with the present invention;

FIG. 2 is a flow chart of step (1) described in FIG. 1;

FIG. 3 is a flow chart of step (2) depicted in FIG. 1;

FIG. 4 is a flowchart of step (3) described in FIG. 1;

FIG. 5 is a flowchart of step (4) described in FIG. 1;

FIG. 6 is a 16 th frame original image in a coastguard video sequence;

FIG. 7 is a 17 th frame original image in a coastguard video sequence;

FIG. 8 is a visual saliency map corresponding to FIG. 6;

fig. 9 is a visual saliency map corresponding to fig. 7.

Detailed Description

The invention is described in further detail below with reference to the figures and the detailed description of the invention.

As shown in fig. 1-9, the visual saliency algorithm based on the motion characteristics of the object in the video according to the present invention has the following steps:

(1)、the motion vector of the brightness component of the current frame is calculated by adopting a motion estimation algorithm based on block matching, and the specific implementation steps are as follows:

(15) amplifying the motion vector by using a bilinear interpolation algorithm to obtain a motion vector with the same size as the current frameV(t)；

(2) And obtaining the motion vector with the mean value removed, wherein the specific implementation steps are as follows:

(21)、averaging the motion vectors to obtain a mean value of the motion vectorsV _m(t)；

(22)、Motion vector is converted intoV(t) Subtracting the mean of the motion vectorsV _m(t) Obtaining a motion vector with the mean value removed;

(3)、and performing Gaussian filtering on the motion vector after the mean value is removed, wherein the specific implementation steps are as follows:

(32)、setting the boundary of the corresponding filtered motion vector as 0 to obtain the final filtered motion vectorV _f(t)；

(4)、Respectively calculating the squares of the horizontal direction component and the vertical direction component of the motion vector to obtain the visual saliency maps in the horizontal direction and the vertical direction, and specifically implementing the steps as follows:

(42)、Filtering the vertical component of the motion vectorV _fy(t) Squaring to obtain a visual saliency map in the vertical directionS _y(t)；

(5)、Acquiring a final visual saliency map, wherein the specific implementation steps are as follows:

adding the visual saliency maps in the horizontal direction and the vertical direction and mapping the visual saliency maps to 0-255 to obtain a final visual saliency mapS(t). Visual saliency mapS(t) The calculation expression of (a) is:

wherein,S _x(t) Is a visual saliency map in a horizontal direction,S _y(t) Is a visual saliency map in a vertical direction,S(t) In order to be the final visual saliency map,

Claims

1. A visual saliency algorithm based on object motion characteristics in a video comprises the following specific steps:

(2) obtaining the motion vector with the mean value removed;

(3) performing Gaussian filtering on the motion vector with the mean value removed;

(5) and acquiring a final visual saliency map.

2. The visual saliency algorithm based on object motion features in video according to claim 1, characterized in that said step (1) of calculating the motion vector of the luminance component of the current frame by using a motion estimation algorithm based on block matching comprises the following specific steps:

3. The visual saliency algorithm based on object motion features in video according to claim 1, characterized in that the obtaining of the motion vector with the mean value removed in step (2) specifically comprises the following steps:

(21) averaging the motion vectors of each pixel block in the brightness component of the current frame to obtain a motion vector average valueV _m(t)；

(22) And (3) calculating the motion vector in the step (15)V(t) Subtracting the mean of the motion vectorsV _m(t) And obtaining the motion vector with the mean value removed.

4. The visual saliency algorithm based on object motion features in video according to claim 1, characterized in that said step (3) of gaussian filtering the motion vector after mean removal comprises the following specific steps:

(31) filtering the motion vector after the mean value is removed by adopting a Gaussian filter to obtain a filtered motion vector;

(32) setting the boundary of the filtered motion vector to be 0 to obtain the final filtered motion vectorV _f(t)。

5. The visual saliency algorithm based on motion features of objects in video according to claim 1, characterized in that said step (4) of calculating the squares of the horizontal direction component and the vertical direction component of the motion vector respectively to obtain the visual saliency maps in the horizontal direction and the vertical direction comprises the following specific steps:

(41) horizontal component of the filtered motion vectorV _fx(t) Squaring to obtain a visual saliency map in the horizontal directionS _x(t)；

(42) The vertical component of the filtered motion vectorV _fy(t) Squaring to obtain a visual saliency map in the vertical directionS _y(t)。

6. The algorithm for visual saliency based on object motion features in videos as claimed in claim 1, wherein said step (5) of obtaining a final visual saliency map specifically comprises:

indicating a rounding down.