CN104778238B - Method and device for analyzing video saliency - Google Patents
Method and device for analyzing video saliency
- Publication number
- CN104778238B (application CN201510157666.2A)
- Authority
- CN
- China
- Prior art keywords
- saliency
- key frame
- video stream
- analyzed
- luminance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The present invention relates to the technical field of video data mining, and in particular to a method and device for analyzing video saliency. The method includes: for a video stream to be analyzed, extracting all key frames; for each key frame, obtaining the saliency map corresponding to that key frame; obtaining the audio descriptor of the video stream to be analyzed; and associating the saliency maps of all extracted key frames with the audio descriptor of the video stream, to obtain the video saliency map of the video stream. By combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features to extract saliency maps, and simultaneously aligning and matching these with the audio data, a comprehensive saliency of the video data is obtained, which effectively improves the accuracy of video saliency analysis.
Description
Technical field
The present invention relates to the technical field of video data mining, and in particular to a method and device for analyzing video saliency.
Background technology
With the development of digital technology, video data is growing explosively. Fast semantic parsing and information extraction from video help improve the storage, classification, and retrieval speed of massive video collections, and improve management efficiency.
Existing video saliency analysis methods mainly divide each frame of the video into superpixel units, compute a likelihood measure for each superpixel unit, fuse the resulting measures into a saliency map of the superpixel units, and finally apply bilateral Gaussian filtering to that map to obtain the video saliency map.
However, the prior art does not consider the influence of luminance information and audio information on video saliency analysis, so an accurate and comprehensive saliency of the video data cannot be obtained.
Summary of the invention
To address the defect in the prior art that an accurate and comprehensive saliency of video data cannot be obtained, the invention provides a method and device for analyzing video saliency.
A method for analyzing video saliency provided by the invention includes:
for a video stream to be analyzed, extracting all key frames;
for each key frame, obtaining the saliency map corresponding to that key frame;
obtaining the audio descriptor of the video stream to be analyzed;
associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream.
Further, the step of obtaining, for each key frame, the saliency map corresponding to that key frame includes:
extracting, for each key frame, the color feature, luminance feature, and motion feature of the key frame;
determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, the step of determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature includes:
fusing the color feature of the key frame to obtain its color-unit saliency map, fusing the luminance feature of the key frame to obtain its luminance-unit saliency map, and fusing the motion feature of the key frame to obtain its motion-unit saliency map;
superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
Further, the step of obtaining the audio descriptor of the video stream to be analyzed includes:
obtaining the audio descriptor of the video stream to be analyzed using an audio saliency model.
Further, the step of associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream, includes:
synchronously associating the saliency maps of all extracted key frames with the audio descriptor of the video stream in time order, to obtain the video saliency map of the video stream.
In another aspect, the invention also provides a device for analyzing video saliency, including:
a key-frame extraction module, configured to extract all key frames from a video stream to be analyzed;
a first acquisition module, configured to obtain, for each key frame, the saliency map corresponding to that key frame;
a second acquisition module, configured to obtain the audio descriptor of the video stream to be analyzed;
an association module, configured to associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream.
Further, the first acquisition module includes a feature-extraction submodule and a determination submodule:
the feature-extraction submodule is configured to extract, for each key frame, the color feature, luminance feature, and motion feature of the key frame;
the determination submodule is configured to determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, the determination submodule is specifically configured to:
fuse the color feature of the key frame to obtain its color-unit saliency map, fuse the luminance feature of the key frame to obtain its luminance-unit saliency map, and fuse the motion feature of the key frame to obtain its motion-unit saliency map;
superimpose the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
Further, the second acquisition module is specifically configured to:
obtain the audio descriptor of the video stream to be analyzed using an audio saliency model.
Further, the association module is specifically configured to:
synchronously associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
With the method and device for analyzing video saliency provided by the invention, saliency maps are extracted by combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features, and are simultaneously aligned and matched with the audio data, so that a comprehensive saliency of the video data is obtained and the accuracy of video saliency analysis is effectively improved.
Brief description of the drawings
The features and advantages of the present invention can be understood more clearly with reference to the accompanying drawings, which are schematic and should not be understood as limiting the invention in any way. In the drawings:
Fig. 1 is a flowchart of the video saliency analysis method in one embodiment of the invention;
Fig. 2 is a flowchart of obtaining the saliency map of each key frame in one embodiment of the invention;
Fig. 3 is a structural diagram of the video saliency analysis device in one embodiment of the invention.
Detailed description of the embodiments
The technical solution of the present invention is further elaborated below in conjunction with the drawings and embodiments.
Fig. 1 shows the flowchart of the video saliency analysis method in this embodiment. As shown in Fig. 1, the method provided by this embodiment includes:
S1: for a video stream to be analyzed, extract all key frames.
The key frames can be extracted according to the difference between adjacent frames. For example, a frame is taken arbitrarily from the acquired video as the first key frame; it is then judged whether the grayscale difference between the adjacent frame and the first key frame exceeds a predetermined threshold. If it does, that adjacent frame becomes the second key frame; otherwise the next frame is judged, and so on until all key frames in the video have been extracted.
The key frames may also be extracted by any other well-known approach; this embodiment describes one by way of example only and does not limit it.
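The frame-difference rule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the flat grayscale-frame representation, and the threshold value are all assumptions made for the example.

```python
def extract_key_frames(frames, threshold):
    """Frame-difference key-frame picker: the first frame is a key frame,
    and each later frame becomes a key frame when its mean absolute
    grayscale difference from the last key frame exceeds the threshold.
    Each frame is a flat list of grayscale values."""
    if not frames:
        return []
    keys = [0]
    for i in range(1, len(frames)):
        ref = frames[keys[-1]]
        diff = sum(abs(a - b) for a, b in zip(frames[i], ref)) / len(ref)
        if diff > threshold:
            keys.append(i)
    return keys
```

With a threshold of 5, a sequence whose third frame jumps in brightness yields key frames at indices 0 and 2; small fluctuations after that are ignored.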
S2: for each key frame, obtain the saliency map corresponding to that key frame.
S3: obtain the audio descriptor of the video stream to be analyzed. In this embodiment, the audio descriptor of the video can be obtained using an audio saliency model.
S4: associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream. Specifically, the saliency maps of all extracted key frames are synchronously associated with the audio descriptor of the video stream in time order.
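A minimal sketch of the time-ordered association in S4, assuming each saliency map and each audio descriptor carries a timestamp, and that each map is paired with the most recent descriptor at or before it. The pairing rule and all names are hypothetical; the patent only specifies that the association is synchronized in time order.

```python
def associate(saliency_maps, audio_descriptors):
    """Merge per-key-frame saliency maps with audio descriptors in time
    order. Both inputs are lists of (timestamp, value) pairs; each map
    is attached to the latest descriptor at or before its timestamp."""
    audio = sorted(audio_descriptors)
    result = []
    for t, smap in sorted(saliency_maps):
        latest = None
        for at, desc in audio:
            if at <= t:
                latest = desc
            else:
                break
        result.append((t, smap, latest))
    return result
```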
As shown in Fig. 2, step S2 of obtaining, for each key frame, the saliency map corresponding to that key frame includes:
S21: for each key frame, extract the color feature, luminance feature, and motion feature of the key frame.
For example, the color, luminance, and motion features can be extracted as follows. First, a 9-level Gaussian pyramid of the input image is built: level 0 is the input image itself, and levels 1 to 8 are subimages obtained by filtering the input image with a 5×5 Gaussian filter and subsampling, with sizes ranging from 1/2 to 1/256 of the input image. Then the color, luminance, and motion features are extracted at each pyramid level to form feature pyramids. Finally, feature values at different pyramid levels are subtracted from one another to obtain center-surround feature contrasts.
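The pyramid-and-contrast procedure can be sketched in simplified form as below. This is not the patent's exact pipeline: a 2×2 box average stands in for the 5×5 Gaussian filter, fewer than 9 levels are used, and the coarse level is upsampled by pixel replication before differencing; all function names are illustrative.

```python
def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (a box-filter
    stand-in for the 5x5 Gaussian filter-and-subsample step)."""
    h, w = len(img), len(img[0])
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1] +
              img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w // 2)] for r in range(h // 2)]

def build_pyramid(img, levels):
    """Level 0 is the input image; each further level halves the size."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr

def center_surround(pyr, c, s):
    """Center-surround contrast: upsample coarse level s to the size of
    fine level c by pixel replication, then take absolute differences."""
    fine, coarse = pyr[c], pyr[s]
    scale = 2 ** (s - c)
    return [[abs(fine[r][col] - coarse[r // scale][col // scale])
             for col in range(len(fine[0]))] for r in range(len(fine))]
```

A uniform image produces zero contrast everywhere, as expected: with no center-surround difference, nothing is salient.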
S22: determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, step S22 of determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature includes:
fusing the color feature of the key frame to obtain its color-unit saliency map, fusing the luminance feature of the key frame to obtain its luminance-unit saliency map, and fusing the motion feature of the key frame to obtain its motion-unit saliency map;
superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
For example, the extracted features are first normalized to the interval [0, 1] to eliminate feature-dependent amplitude differences. So that the few most salient points are evenly distributed over the whole feature map, and each feature map retains only a small number of salient points, an optimization iteration is then performed (the combined normalization-and-iteration process is denoted N(x)). This yields the unit saliency map for each feature class, and averaging gives the saliency map of the input image, S = (N(I) + N(C) + N(O)) / 3, where I, C, and O are the normalized luminance, color, and orientation features, respectively.
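The normalization-and-averaging step can be sketched as below. This sketch covers only the [0, 1] rescaling and the three-way average S = (N(I) + N(C) + N(O)) / 3; the iterative peak-suppression part of N(x) described above is deliberately omitted, and the function names are assumptions.

```python
def normalize_map(m):
    """Rescale a 2-D feature map into [0, 1] to remove the
    amplitude differences between feature classes."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    if hi == lo:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]

def combine(intensity, color, orientation):
    """Average the normalized per-feature maps into one saliency map:
    S = (N(I) + N(C) + N(O)) / 3, without the iterative refinement."""
    maps = [normalize_map(m) for m in (intensity, color, orientation)]
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / 3.0 for c in range(w)]
            for r in range(h)]
```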
With the video saliency analysis method provided by this embodiment, saliency maps are extracted by combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features, and are simultaneously aligned and matched with the audio data, so that a comprehensive saliency of the video data is obtained and the accuracy of video saliency analysis is effectively improved.
In another aspect, this embodiment also provides a device for analyzing video saliency, including:
a key-frame extraction module 101, configured to extract all key frames from a video stream to be analyzed;
a first acquisition module 102, configured to obtain, for each key frame, the saliency map corresponding to that key frame;
a second acquisition module 103, configured to obtain the audio descriptor of the video stream to be analyzed;
an association module 104, configured to associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream.
Further, the first acquisition module 102 includes a feature-extraction submodule and a determination submodule:
the feature-extraction submodule is configured to extract, for each key frame, the color feature, luminance feature, and motion feature of the key frame;
the determination submodule is configured to determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature.
Further, the determination submodule is specifically configured to:
fuse the color feature of the key frame to obtain its color-unit saliency map, fuse the luminance feature of the key frame to obtain its luminance-unit saliency map, and fuse the motion feature of the key frame to obtain its motion-unit saliency map;
superimpose the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame.
Further, the second acquisition module 103 is specifically configured to:
obtain the audio descriptor of the video stream to be analyzed using an audio saliency model.
Further, the association module 104 is specifically configured to:
synchronously associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
With the video saliency analysis device provided by this embodiment, saliency maps are extracted by combining the static saliency cues of the video image, including color and luminance, with its spatial motion-change features, and are simultaneously aligned and matched with the audio data, so that a comprehensive saliency of the video data is obtained and the accuracy of video saliency analysis is effectively improved.
Although the embodiments of the present invention have been described in conjunction with the drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.
Claims (6)
1. A method for analyzing video saliency, characterized in that the method includes:
for a video stream to be analyzed, extracting all key frames;
for each key frame, obtaining the saliency map corresponding to that key frame;
obtaining the audio descriptor of the video stream to be analyzed;
associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream;
wherein the step of obtaining, for each key frame, the saliency map corresponding to that key frame further comprises: extracting, for each key frame, the color feature, luminance feature, and motion feature of the key frame, and determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature;
wherein the step of determining the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature further comprises:
fusing the color feature of the key frame to obtain its color-unit saliency map, fusing the luminance feature of the key frame to obtain its luminance-unit saliency map, fusing the motion feature of the key frame to obtain its motion-unit saliency map, and superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame;
wherein superimposing the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame further comprises:
normalizing the color feature, the luminance feature, and the motion feature; performing an optimization iteration on the normalized color feature, luminance feature, and motion feature; and averaging the optimized color feature, luminance feature, and motion feature to obtain the saliency map corresponding to the key frame.
2. The method according to claim 1, characterized in that the step of obtaining the audio descriptor of the video stream to be analyzed includes:
obtaining the audio descriptor of the video stream to be analyzed using an audio saliency model.
3. The method according to claim 1, characterized in that the step of associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream, includes:
synchronously associating the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
4. A device for analyzing video saliency, characterized in that the device includes:
a key-frame extraction module, configured to extract all key frames from a video stream to be analyzed;
a first acquisition module, configured to obtain, for each key frame, the saliency map corresponding to that key frame;
a second acquisition module, configured to obtain the audio descriptor of the video stream to be analyzed;
an association module, configured to associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed, to obtain the video saliency map of the video stream;
wherein the first acquisition module includes a feature-extraction submodule and a determination submodule:
the feature-extraction submodule is configured to extract, for each key frame, the color feature, luminance feature, and motion feature of the key frame; the determination submodule is configured to determine the saliency map corresponding to the key frame from its color feature, luminance feature, and motion feature;
the determination submodule is configured to fuse the color feature of the key frame to obtain its color-unit saliency map, fuse the luminance feature of the key frame to obtain its luminance-unit saliency map, fuse the motion feature of the key frame to obtain its motion-unit saliency map, and superimpose the color-unit, luminance-unit, and motion-unit saliency maps of the key frame to obtain the saliency map corresponding to the key frame, and is further configured to:
normalize the color feature, the luminance feature, and the motion feature; perform an optimization iteration on the normalized color feature, luminance feature, and motion feature; and average the optimized color feature, luminance feature, and motion feature to obtain the saliency map corresponding to the key frame.
5. The device according to claim 4, characterized in that the second acquisition module is specifically configured to:
obtain the audio descriptor of the video stream to be analyzed using an audio saliency model.
6. The device according to claim 4, characterized in that the association module is specifically configured to:
synchronously associate the saliency maps of all extracted key frames with the audio descriptor of the video stream to be analyzed in time order, to obtain the video saliency map of the video stream.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510157666.2A CN104778238B (en) | 2015-04-03 | 2015-04-03 | The analysis method and device of a kind of saliency |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104778238A CN104778238A (en) | 2015-07-15 |
CN104778238B true CN104778238B (en) | 2018-01-05 |
Family
ID=53619702
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510157666.2A Expired - Fee Related CN104778238B (en) | 2015-04-03 | 2015-04-03 | The analysis method and device of a kind of saliency |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104778238B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138991B (en) * | 2015-08-27 | 2016-08-31 | 山东工商学院 | A kind of video feeling recognition methods merged based on emotion significant characteristics |
CN106055570A (en) * | 2016-05-19 | 2016-10-26 | 中国农业大学 | Video retrieval device based on audio data and video retrieval method for same |
CN106529419B (en) * | 2016-10-20 | 2019-07-26 | 北京航空航天大学 | The object automatic testing method of saliency stacking-type polymerization |
CN109246474B (en) * | 2018-10-16 | 2021-03-02 | 维沃移动通信(杭州)有限公司 | Video file editing method and mobile terminal |
CN109862384A (en) * | 2019-03-13 | 2019-06-07 | 北京河马能量体育科技有限公司 | A kind of audio-video automatic synchronous method and synchronization system |
CN110399847B (en) * | 2019-07-30 | 2021-11-09 | 北京字节跳动网络技术有限公司 | Key frame extraction method and device and electronic equipment |
CN110458172A (en) * | 2019-08-16 | 2019-11-15 | 中国农业大学 | A kind of Weakly supervised image, semantic dividing method based on region contrast detection |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1595971A (en) * | 2003-09-10 | 2005-03-16 | 松下电器产业株式会社 | Image display method, image display program, and image display apparatus |
CN101470756A (en) * | 2007-12-20 | 2009-07-01 | 汤姆森许可贸易公司 | Method and device for calculating the silence of an audio video document |
CN102088597A (en) * | 2009-12-04 | 2011-06-08 | 成都信息工程学院 | Method for estimating video visual salience through dynamic and static combination |
Also Published As
Publication number | Publication date |
---|---|
CN104778238A (en) | 2015-07-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104778238B (en) | Method and device for analyzing video saliency | |
CN109145766A (en) | Model training method, device, recognition methods, electronic equipment and storage medium | |
CN106548169B (en) | Fuzzy literal Enhancement Method and device based on deep neural network | |
CN111160533B (en) | Neural network acceleration method based on cross-resolution knowledge distillation | |
CN111582104B (en) | Remote sensing image semantic segmentation method and device based on self-attention feature aggregation network | |
Zhuge et al. | Boundary-guided feature aggregation network for salient object detection | |
CN107562742A (en) | A kind of image processing method and device | |
CN109344740A (en) | Face identification system, method and computer readable storage medium | |
CN109978034A (en) | A kind of sound scenery identification method based on data enhancing | |
CN107958230A (en) | Facial expression recognizing method and device | |
CN105095860B (en) | character segmentation method and device | |
CN108597003A (en) | A kind of article cover generation method, device, processing server and storage medium | |
CN109284760A (en) | A kind of furniture detection method and device based on depth convolutional neural networks | |
CN109960988A (en) | Image analysis method, device, electronic equipment and readable storage medium storing program for executing | |
CN110263847A (en) | Track acquisition methods, device, computer equipment and storage medium | |
CN114117614A (en) | Method and system for automatically generating building facade texture | |
CN110046941A (en) | A kind of face identification method, system and electronic equipment and storage medium | |
CN110427915A (en) | Method and apparatus for output information | |
CN103839074B (en) | Image classification method based on matching of sketch line segment information and space pyramid | |
CN110826534B (en) | Face key point detection method and system based on local principal component analysis | |
CN111738310B (en) | Material classification method, device, electronic equipment and storage medium | |
CN109034070A (en) | A kind of displacement aliased image blind separating method and device | |
CN112560925A (en) | Complex scene target detection data set construction method and system | |
Meng et al. | IRIS: smart phone aided intelligent reimbursement system using deep learning | |
CN110674678A (en) | Method and device for identifying sensitive mark in video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20180105 Termination date: 20180403 |