CN104778238B - Method and device for analyzing video saliency - Google Patents
Method and device for analyzing video saliency
- Publication number
- CN104778238B (application CN201510157666.2A)
- Authority
- CN
- China
- Prior art keywords
- saliency
- key frame
- video stream
- analyzed
- luminance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The present invention relates to the technical field of video data mining, and in particular to a method and device for analyzing video saliency. The method includes: for a video stream to be analyzed, extracting all key frames; for each key frame, obtaining the saliency map corresponding to that key frame; obtaining the audio descriptor of the video stream to be analyzed; and associating the saliency maps of all extracted key frames with the audio descriptor of the video stream, to obtain the video saliency map of the video stream. By combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features to extract saliency maps, and simultaneously aligning and matching these with the audio data, a comprehensive saliency of the video data is obtained, which effectively improves the accuracy of video saliency analysis.
Description
Technical field
The present invention relates to the technical field of video data mining, and in particular to a method and device for analyzing video saliency.
Background technology
With the development of digital technology, video data is growing explosively. Fast semantic parsing and information extraction from video help improve the storage, classification, and retrieval speed of massive video collections, and improve management efficiency.
Existing video saliency analysis methods mainly divide each frame of the video into superpixel units, compute a likelihood measure for each superpixel unit, fuse the resulting measures into a saliency map of the superpixel units, and finally apply bilateral Gaussian filtering to that map to obtain the video saliency map.
However, the prior art does not consider the influence of luminance information and audio information on video saliency analysis, so an accurate and comprehensive saliency of the video data cannot be obtained.
Summary of the invention
To address the defect in the prior art that an accurate and comprehensive saliency of video data cannot be obtained, the invention provides a method and device for analyzing video saliency.
A method for analyzing video saliency provided by the invention includes:
for a video stream to be analyzed, extracting all key frames;
for each key frame, obtaining the saliency map corresponding to that key frame;
obtaining the audio descriptor of the video stream to be analyzed;
associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream.
Further, the step of obtaining, for each key frame, the saliency map corresponding to that key frame includes:
extracting, for each key frame, the color feature, luminance feature, and motion feature of the key frame;
determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, the step of determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature includes:
fusing the color feature of the key frame to obtain its color-unit saliency map, fusing the luminance feature of the key frame to obtain its luminance-unit saliency map, and fusing the motion feature of the key frame to obtain its motion-unit saliency map;
superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
Further, the step of obtaining the audio descriptor of the video stream to be analyzed includes:
obtaining the audio descriptor of the video stream to be analyzed using an audio saliency model.
Further, the step of associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream, includes:
synchronously associating the saliency maps of all extracted key frames with the audio descriptor of the video stream in time order, to obtain the video saliency map of the video stream.
In another aspect, the invention also provides a device for analyzing video saliency, including:
a key-frame extraction module, configured to extract all key frames from a video stream to be analyzed;
a first acquisition module, configured to obtain, for each key frame, the saliency map corresponding to that key frame;
a second acquisition module, configured to obtain the audio descriptor of the video stream to be analyzed;
an association module, configured to associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream.
Further, the first acquisition module includes a feature-extraction submodule and a determination submodule:
the feature-extraction submodule is configured to extract, for each key frame, the color feature, luminance feature, and motion feature of the key frame;
the determination submodule is configured to determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, the determination submodule is specifically configured to:
fuse the color feature of the key frame to obtain its color-unit saliency map, fuse the luminance feature of the key frame to obtain its luminance-unit saliency map, and fuse the motion feature of the key frame to obtain its motion-unit saliency map;
superimpose the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
Further, the second acquisition module is specifically configured to:
obtain the audio descriptor of the video stream to be analyzed using an audio saliency model.
Further, the association module is specifically configured to:
synchronously associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
With the method and device for analyzing video saliency provided by the invention, saliency maps are extracted by combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features, and are simultaneously aligned and matched with the audio data, so that a comprehensive saliency of the video data is obtained and the accuracy of video saliency analysis is effectively improved.
Brief description of the drawings
The features and advantages of the present invention can be understood more clearly with reference to the accompanying drawings, which are schematic and should not be understood as limiting the invention in any way. In the drawings:
Fig. 1 is a flowchart of the video saliency analysis method in one embodiment of the invention;
Fig. 2 is a flowchart of obtaining the saliency map of each key frame in one embodiment of the invention;
Fig. 3 is a structural diagram of the video saliency analysis device in one embodiment of the invention.
Detailed description of the embodiments
The technical solution of the present invention is further elaborated below in conjunction with the drawings and embodiments.
Fig. 1 shows the flowchart of the video saliency analysis method in this embodiment. As shown in Fig. 1, the method provided by this embodiment includes:
S1: for a video stream to be analyzed, extract all key frames.
The key frames can be extracted according to the difference between adjacent frames. For example, a frame is taken arbitrarily from the acquired video as the first key frame; it is then judged whether the grayscale difference between the adjacent frame and the first key frame exceeds a predetermined threshold. If it does, that adjacent frame becomes the second key frame; otherwise the next frame is judged, and so on until all key frames in the video have been extracted.
The key frames may also be extracted by any other well-known approach; this embodiment describes one by way of example only and does not limit it.
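The frame-difference rule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the flat grayscale-frame representation, and the threshold value are all assumptions made for the example.

```python
def extract_key_frames(frames, threshold):
    """Frame-difference key-frame picker: the first frame is a key frame,
    and each later frame becomes a key frame when its mean absolute
    grayscale difference from the last key frame exceeds the threshold.
    Each frame is a flat list of grayscale values."""
    if not frames:
        return []
    keys = [0]
    for i in range(1, len(frames)):
        ref = frames[keys[-1]]
        diff = sum(abs(a - b) for a, b in zip(frames[i], ref)) / len(ref)
        if diff > threshold:
            keys.append(i)
    return keys
```

With a threshold of 5, a sequence whose third frame jumps in brightness yields key frames at indices 0 and 2; small fluctuations after that are ignored.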
S2: for each key frame, obtain the saliency map corresponding to that key frame.
S3: obtain the audio descriptor of the video stream to be analyzed. In this embodiment, the audio descriptor of the video can be obtained using an audio saliency model.
S4: associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream. Specifically, the saliency maps of all extracted key frames are synchronously associated with the audio descriptor of the video stream in time order.
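A minimal sketch of the time-ordered association in S4, assuming each saliency map and each audio descriptor carries a timestamp, and that each map is paired with the most recent descriptor at or before it. The pairing rule and all names are hypothetical; the patent only specifies that the association is synchronized in time order.

```python
def associate(saliency_maps, audio_descriptors):
    """Merge per-key-frame saliency maps with audio descriptors in time
    order. Both inputs are lists of (timestamp, value) pairs; each map
    is attached to the latest descriptor at or before its timestamp."""
    audio = sorted(audio_descriptors)
    result = []
    for t, smap in sorted(saliency_maps):
        latest = None
        for at, desc in audio:
            if at <= t:
                latest = desc
            else:
                break
        result.append((t, smap, latest))
    return result
```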
As shown in Fig. 2, step S2 of obtaining, for each key frame, the saliency map corresponding to that key frame includes:
S21: for each key frame, extract the color feature, luminance feature, and motion feature of the key frame.
For example, the color, luminance, and motion features can be extracted as follows. First, a 9-level Gaussian pyramid of the input image is built: level 0 is the input image itself, and levels 1 to 8 are subimages obtained by filtering the input image with a 5×5 Gaussian filter and subsampling, with sizes ranging from 1/2 to 1/256 of the input image. Then the color, luminance, and motion features are extracted at each pyramid level to form feature pyramids. Finally, feature values at different pyramid levels are subtracted from one another to obtain center-surround feature contrasts.
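The pyramid-and-contrast procedure can be sketched in simplified form as below. This is not the patent's exact pipeline: a 2×2 box average stands in for the 5×5 Gaussian filter, fewer than 9 levels are used, and the coarse level is upsampled by pixel replication before differencing; all function names are illustrative.

```python
def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (a box-filter
    stand-in for the 5x5 Gaussian filter-and-subsample step)."""
    h, w = len(img), len(img[0])
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1] +
              img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w // 2)] for r in range(h // 2)]

def build_pyramid(img, levels):
    """Level 0 is the input image; each further level halves the size."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr

def center_surround(pyr, c, s):
    """Center-surround contrast: upsample coarse level s to the size of
    fine level c by pixel replication, then take absolute differences."""
    fine, coarse = pyr[c], pyr[s]
    scale = 2 ** (s - c)
    return [[abs(fine[r][col] - coarse[r // scale][col // scale])
             for col in range(len(fine[0]))] for r in range(len(fine))]
```

A uniform image produces zero contrast everywhere, as expected: with no center-surround difference, nothing is salient.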
S22: determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, step S22 of determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature includes:
fusing the color feature of the key frame to obtain its color-unit saliency map, fusing the luminance feature of the key frame to obtain its luminance-unit saliency map, and fusing the motion feature of the key frame to obtain its motion-unit saliency map;
superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
For example, the extracted features are first normalized to the interval [0, 1] to eliminate feature-dependent amplitude differences. So that the few most salient points are evenly distributed over the whole feature map, and each feature map retains only a small number of salient points, an optimization iteration is then performed (the combined normalization-and-iteration process is denoted N(x)). This yields the unit saliency map for each feature class, and averaging gives the saliency map of the input image, S = (N(I) + N(C) + N(O)) / 3, where I, C, and O are the normalized luminance, color, and orientation features, respectively.
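The normalization-and-averaging step can be sketched as below. This sketch covers only the [0, 1] rescaling and the three-way average S = (N(I) + N(C) + N(O)) / 3; the iterative peak-suppression part of N(x) described above is deliberately omitted, and the function names are assumptions.

```python
def normalize_map(m):
    """Rescale a 2-D feature map into [0, 1] to remove the
    amplitude differences between feature classes."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    if hi == lo:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]

def combine(intensity, color, orientation):
    """Average the normalized per-feature maps into one saliency map:
    S = (N(I) + N(C) + N(O)) / 3, without the iterative refinement."""
    maps = [normalize_map(m) for m in (intensity, color, orientation)]
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / 3.0 for c in range(w)]
            for r in range(h)]
```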
With the video saliency analysis method provided by this embodiment, saliency maps are extracted by combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features, and are simultaneously aligned and matched with the audio data, so that a comprehensive saliency of the video data is obtained and the accuracy of video saliency analysis is effectively improved.
In another aspect, this embodiment also provides a device for analyzing video saliency, including:
a key-frame extraction module 101, configured to extract all key frames from a video stream to be analyzed;
a first acquisition module 102, configured to obtain, for each key frame, the saliency map corresponding to that key frame;
a second acquisition module 103, configured to obtain the audio descriptor of the video stream to be analyzed;
an association module 104, configured to associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream.
Further, the first acquisition module 102 includes a feature-extraction submodule and a determination submodule:
the feature-extraction submodule is configured to extract, for each key frame, the color feature, luminance feature, and motion feature of the key frame;
the determination submodule is configured to determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, the determination submodule is specifically configured to:
fuse the color feature of the key frame to obtain its color-unit saliency map, fuse the luminance feature of the key frame to obtain its luminance-unit saliency map, and fuse the motion feature of the key frame to obtain its motion-unit saliency map;
superimpose the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
Further, the second acquisition module 103 is specifically configured to:
obtain the audio descriptor of the video stream to be analyzed using an audio saliency model.
Further, the association module 104 is specifically configured to:
synchronously associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
With the video saliency analysis device provided by this embodiment, saliency maps are extracted by combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features, and are simultaneously aligned and matched with the audio data, so that a comprehensive saliency of the video data is obtained and the accuracy of video saliency analysis is effectively improved.
Although the embodiments of the present invention have been described in conjunction with the drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.
Claims (6)
1. A method for analyzing video saliency, characterized in that the method includes:
for a video stream to be analyzed, extracting all key frames;
for each key frame, obtaining the saliency map corresponding to that key frame;
obtaining the audio descriptor of the video stream to be analyzed;
associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream;
wherein the step of obtaining, for each key frame, the saliency map corresponding to that key frame further comprises: extracting, for each key frame, the color feature, luminance feature, and motion feature of the key frame, and determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature;
wherein the step of determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature further comprises:
fusing the color feature of the key frame to obtain its color-unit saliency map, fusing the luminance feature of the key frame to obtain its luminance-unit saliency map, fusing the motion feature of the key frame to obtain its motion-unit saliency map, and superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame;
wherein superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame further comprises:
normalizing the color feature, the luminance feature, and the motion feature; performing an optimization iteration on the normalized color feature, luminance feature, and motion feature; and averaging the optimized color feature, luminance feature, and motion feature to obtain the saliency map corresponding to the key frame.
2. The method according to claim 1, characterized in that the step of obtaining the audio descriptor of the video stream to be analyzed includes:
obtaining the audio descriptor of the video stream to be analyzed using an audio saliency model.
3. The method according to claim 1, characterized in that the step of associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream, includes:
synchronously associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
4. A device for analyzing video saliency, characterized in that the device includes:
a key-frame extraction module, configured to extract all key frames from a video stream to be analyzed;
a first acquisition module, configured to obtain, for each key frame, the saliency map corresponding to that key frame;
a second acquisition module, configured to obtain the audio descriptor of the video stream to be analyzed;
an association module, configured to associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream;
wherein the first acquisition module includes a feature-extraction submodule and a determination submodule:
the feature-extraction submodule is configured to extract, for each key frame, the color feature, luminance feature, and motion feature of the key frame; the determination submodule is configured to determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature;
the determination submodule is configured to fuse the color feature of the key frame to obtain its color-unit saliency map, fuse the luminance feature of the key frame to obtain its luminance-unit saliency map, fuse the motion feature of the key frame to obtain its motion-unit saliency map, and superimpose the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame, and is further configured to:
normalize the color feature, the luminance feature, and the motion feature; perform an optimization iteration on the normalized color feature, luminance feature, and motion feature; and average the optimized color feature, luminance feature, and motion feature to obtain the saliency map corresponding to the key frame.
5. The device according to claim 4, characterized in that the second acquisition module is specifically configured to:
obtain the audio descriptor of the video stream to be analyzed using an audio saliency model.
6. The device according to claim 4, characterized in that the association module is specifically configured to:
synchronously associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510157666.2A CN104778238B (en) | 2015-04-03 | 2015-04-03 | The analysis method and device of a kind of saliency |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104778238A CN104778238A (en) | 2015-07-15 |
CN104778238B true CN104778238B (en) | 2018-01-05 |
Family
ID=53619702
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510157666.2A Expired - Fee Related CN104778238B (en) | 2015-04-03 | 2015-04-03 | The analysis method and device of a kind of saliency |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104778238B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138991B (en) * | 2015-08-27 | 2016-08-31 | 山东工商学院 | A kind of video feeling recognition methods merged based on emotion significant characteristics |
CN106055570A (en) * | 2016-05-19 | 2016-10-26 | 中国农业大学 | Video retrieval device based on audio data and video retrieval method for same |
CN106529419B (en) * | 2016-10-20 | 2019-07-26 | 北京航空航天大学 | The object automatic testing method of saliency stacking-type polymerization |
CN109246474B (en) * | 2018-10-16 | 2021-03-02 | 维沃移动通信(杭州)有限公司 | Video file editing method and mobile terminal |
CN109862384A (en) * | 2019-03-13 | 2019-06-07 | 北京河马能量体育科技有限公司 | A kind of audio-video automatic synchronous method and synchronization system |
CN110399847B (en) * | 2019-07-30 | 2021-11-09 | 北京字节跳动网络技术有限公司 | Key frame extraction method and device and electronic equipment |
CN110458172A (en) * | 2019-08-16 | 2019-11-15 | 中国农业大学 | A kind of Weakly supervised image, semantic dividing method based on region contrast detection |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1595971A (en) * | 2003-09-10 | 2005-03-16 | 松下电器产业株式会社 | Image display method, image display program, and image display apparatus |
CN101470756A (en) * | 2007-12-20 | 2009-07-01 | 汤姆森许可贸易公司 | Method and device for calculating the silence of an audio video document |
CN102088597A (en) * | 2009-12-04 | 2011-06-08 | 成都信息工程学院 | Method for estimating video visual salience through dynamic and static combination |
Also Published As
Publication number | Publication date |
---|---|
CN104778238A (en) | 2015-07-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104778238B (en) | Method and device for analyzing video saliency | |
CN109145766A (en) | Model training method, device, recognition methods, electronic equipment and storage medium | |
CN106548169B (en) | Fuzzy literal Enhancement Method and device based on deep neural network | |
CN111160533B (en) | Neural network acceleration method based on cross-resolution knowledge distillation | |
CN111582104B (en) | Remote sensing image semantic segmentation method and device based on self-attention feature aggregation network | |
Zhuge et al. | Boundary-guided feature aggregation network for salient object detection | |
CN107562742A (en) | A kind of image processing method and device | |
CN109344740A (en) | Face identification system, method and computer readable storage medium | |
CN109978034A (en) | A kind of sound scenery identification method based on data enhancing | |
CN107958230A (en) | Facial expression recognizing method and device | |
CN105095860B (en) | character segmentation method and device | |
CN108597003A (en) | A kind of article cover generation method, device, processing server and storage medium | |
CN109284760A (en) | A kind of furniture detection method and device based on depth convolutional neural networks | |
CN109960988A (en) | Image analysis method, device, electronic equipment and readable storage medium storing program for executing | |
CN110263847A (en) | Track acquisition methods, device, computer equipment and storage medium | |
CN114117614A (en) | Method and system for automatically generating building facade texture | |
CN110046941A (en) | A kind of face identification method, system and electronic equipment and storage medium | |
CN110427915A (en) | Method and apparatus for output information | |
CN103839074B (en) | Image classification method based on matching of sketch line segment information and space pyramid | |
CN110826534B (en) | Face key point detection method and system based on local principal component analysis | |
CN111738310B (en) | Material classification method, device, electronic equipment and storage medium | |
CN109034070A (en) | A kind of displacement aliased image blind separating method and device | |
CN112560925A (en) | Complex scene target detection data set construction method and system | |
Meng et al. | IRIS: smart phone aided intelligent reimbursement system using deep learning | |
CN110674678A (en) | Method and device for identifying sensitive mark in video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20180105 Termination date: 20180403 |