CN103824284B - Key frame extraction method based on visual attention model and system - Google Patents

Key frame extraction method based on visual attention model and system

Info

Publication number
CN103824284B
CN103824284B (application CN201410039072.7A)
Authority
CN
China
Prior art keywords
key frame
shot
saliency
temporal domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410039072.7A
Other languages
Chinese (zh)
Other versions
CN103824284A (en)
Inventor
纪庆革
赵杰
刘勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Zhongda Nansha Technology Innovation Industrial Park Co Ltd
National Sun Yat Sen University
Original Assignee
Guangzhou Zhongda Nansha Technology Innovation Industrial Park Co Ltd
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Zhongda Nansha Technology Innovation Industrial Park Co Ltd, National Sun Yat Sen University filed Critical Guangzhou Zhongda Nansha Technology Innovation Industrial Park Co Ltd
Priority to CN201410039072.7A
Publication of CN103824284A
Application granted
Publication of CN103824284B
Legal status: Expired - Fee Related

Abstract

The invention discloses a key frame extraction method and system based on a visual attention model. In the spatial domain, the method filters global contrast with binomial coefficients to detect saliency and extracts the target region with an adaptive threshold; this preserves the boundary of the salient target region well and yields uniform saliency within the region. In the temporal domain, the method defines motion saliency, estimates target motion with homography matrices, and uses key points in place of the full target for saliency detection. The spatial saliency data are then fused with the temporal data, and a boundary-extension method based on an energy function is proposed to obtain a bounding box that serves as the salient target region of the temporal domain. Finally, the method uses the salient target region to reduce the richness of the video and extracts key frames with a shot-adaptive method based on online clustering.

Description

Key frame extraction method and system based on a visual attention model
Technical field
The present invention relates to the field of video analysis technology, and in particular to a key frame extraction method and system based on a visual attention model.
Background technology
With the rapid development of Internet technology we have entered an era of information explosion, and network applications and multimedia technology are advancing quickly and are widely used. As a common carrier of network information, video is vivid and intuitive and has strong appeal and expressiveness, so it is used in many fields and video data are growing massively. Taking the well-known video website YouTube as an example, about 60 hours of video are uploaded by users every minute (data as of January 23, 2012), and the trend is still upward. How to store, manage and access massive video resources quickly and effectively has therefore become a major issue for current video applications. Because video has temporal continuity, under the traditional approach a user must browse a whole segment of video from beginning to end to grasp its content; irrelevant videos occupy a great deal of the user's time and waste considerable network bandwidth. Auxiliary information therefore needs to be added to videos to help users filter them. Mature systems currently rely on traditional textual labels: videos are classified manually and given artificial semantics such as titles and descriptions. Faced with massive video collections, this task is not only labor-intensive, but different people also understand the same video differently, and others cannot judge from the author's textual labels whether a video matches their own interests.
Therefore, an automated way to summarize videos effectively is urgently needed.
Summary of the invention
To overcome the deficiencies of the prior art, the present invention first provides a video key frame extraction method based on a visual attention model, with which key frames that represent a video shot well can be obtained effectively.
A further object of the present invention is to provide a video key frame extraction system based on a visual attention model.
To achieve these goals, the technical scheme of the present invention is:
A video key frame extraction method based on a visual attention model, including:
In the spatial domain, detecting saliency by filtering global contrast with binomial coefficients, and extracting the target region with an adaptive threshold; this preserves the boundary of the salient target region well and makes the saliency within the region more uniform.
In the temporal domain, defining motion saliency, estimating target motion with homography matrices, using key points in place of the target for saliency detection, fusing the spatial saliency data, and obtaining a bounding box as the salient target region of the temporal domain by a proposed boundary-extension method based on an energy function;
Reducing the richness of the video by means of the salient target region, and extracting key frames with a shot-adaptive method based on online clustering.
A video key frame extraction system based on a visual attention model, the system including a salient region extraction module and a key frame extraction module;
Specifically, the salient region extraction module includes:
a spatial salient region extraction module, for extracting the salient region in the spatial domain;
a temporal key point saliency acquisition module, for extracting the saliency values of the key points in the temporal domain;
a fusion module, for fusing the spatial salient region with the temporal key points to obtain the final salient region.
The key frame extraction module includes:
a static shot key frame extraction module, for key frame extraction from static shots;
a dynamic shot key frame extraction module, for key frame extraction from dynamic shots;
a shot adaptation module, for switching between the static shot key frame extraction module and the dynamic shot key frame extraction module.
Compared with the prior art, the beneficial effects of the present invention are: a video can be summarized automatically, and key frames that represent the video shots well can be obtained effectively.
Description of the drawings
Fig. 1 is a flowchart of key frame extraction for static shots according to the present invention.
Fig. 2 is a flowchart of key frame extraction for dynamic shots according to the present invention.
Fig. 3 is a flowchart of shot-adaptive key frame extraction according to the present invention.
Specific embodiment
The present invention is further described in detail below with reference to the accompanying drawings.
A specific embodiment of the video key frame extraction method based on a visual attention model disclosed by the present invention is as follows:
First, in the spatial domain, saliency is detected by filtering global contrast with binomial coefficients, and the target region is extracted with an adaptive threshold. The concrete method is as follows:
(11) The binomial coefficients are constructed from Pascal's triangle, and the normalization factor of the Nth layer is 2^N. The 4th layer is selected, so the filter coefficients are B4 = (1/16)[1 4 6 4 1];
(12) Let I be the original stimulus intensity, Ī the mean of the surrounding stimulus intensities, and I_B4 the convolution of I with B4. The strength of a pixel's stimulus is measured with the vector form of the CIELAB color space, and the contrast of a stimulus is the Euclidean distance between two CIELAB vectors, so the stimulus of pixel (x, y) is detected as
S(x, y) = ||I_B4(x, y) − Ī||    (1)
(13) After the saliency measurement set S_s = (s11, s12, ..., sNM) is obtained, the target region is extracted with an adaptive threshold, where sij (0 ≤ i ≤ N, 0 ≤ j ≤ M) is the saliency of pixel (i, j) and M and N are the width and height of the image, respectively.
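By way of illustration only, steps (11)-(12) can be sketched as follows, assuming an OpenCV/NumPy environment; the function name, the use of the global image mean for Ī, and the separable application of B4 are assumptions of this sketch, not the patent's reference implementation.

```python
import cv2
import numpy as np

# 4th layer of Pascal's triangle, normalised by 2^4 (step (11))
B4 = np.array([1, 4, 6, 4, 1], dtype=np.float32) / 16.0

def spatial_saliency(bgr_image):
    # Work in CIELAB so that contrast is the Euclidean distance between colour vectors.
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB).astype(np.float32)
    # Separable binomial filtering gives the smoothed stimulus I_B4.
    smoothed = cv2.sepFilter2D(lab, -1, B4, B4)
    # Mean stimulus Ibar, taken here over the whole image (an assumption of this sketch).
    mean_lab = lab.reshape(-1, 3).mean(axis=0)
    # S(x, y) = ||I_B4(x, y) - Ibar||   (formula (1))
    return np.linalg.norm(smoothed - mean_lab, axis=2)

# saliency = spatial_saliency(cv2.imread("frame.png"))
```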
Specifically, adaptive-threshold extraction of the target region is realized by the following method:
(21) The global saliency of pixel (x, y) is defined as
Sg(x, y) = (1/A) Σ_{i=0}^{N} Σ_{j=0}^{M} ||I_B4(x, y) − I(i, j)||    (2)
where A is the area of detection, I_B4(x, y) is the stimulus intensity of pixel (x, y) after the original image has been filtered with B4, I(i, j) is the original stimulus intensity of pixel (i, j), and M and N are the width and height of the image, respectively;
(22) The computation is accelerated with a histogram: the original stimulus intensities I are mapped into the stimulus space I_B4, and the saliency of the stimulus I_B4(I) finally perceived by the user is
S(I_B4(I)) = [1 / ((m − 1) D(I_B4(I)))] Σ_{i=1}^{m} ( D(I_B4(I)) − ||I_B4(I) − I_B4(I_i)|| ) Sg(I_B4(I_i))    (3)
where D(I_B4(I)) is the distance between the stimulus I_B4(I) and its m nearest stimuli, and m is a manual control parameter, taken as 8 in this embodiment;
(23) The threshold Ts is varied to assign foreground and background regions, and the threshold that yields the minimum energy function is then taken as the optimal threshold. The energy function with Ts as threshold is defined as
E(I, Ts, λ, σ) = λ Σ_{n=1}^{N} f(Ts, Sn) Sn + V(I, Ts, σ)    (4)
where Sn is obtained from formula (2), λ is the weight of the salient-target energy (λ = 1.0 in this embodiment), N is the total number of pixels of the image, f(Ts, Sn) = max(0, sign(Sn − Ts)), and V(I, Ts, σ) measures the similarity of a pixel to its surrounding stimuli: the pixels judged salient under the current Ts and their 8-neighbourhoods form the pair set Pair over which V is computed, Dist(p, q) is the spatial distance between two points, and σ is a manual control parameter, taken as 10.0 in this embodiment.
Therefore, given an image and its saliency map, Ts is estimated by minimizing the energy function; a pixel is labeled 1 when it belongs to the salient target and 0 otherwise, and the parameters λ and σ must be set manually in advance.
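As an illustration of step (23), the threshold search can be sketched as below; because the exact form of the smoothness term V is only described informally above, it is passed in as a callable, and the candidate grid is an assumption of this sketch.

```python
import numpy as np

def select_threshold(saliency, smoothness_term, lam=1.0, n_candidates=64):
    """Scan candidate thresholds Ts and return the one minimising formula (4)."""
    candidates = np.linspace(saliency.min(), saliency.max(), n_candidates)
    best_ts, best_energy = candidates[0], np.inf
    for ts in candidates:
        foreground = saliency > ts                        # f(Ts, Sn) = max(0, sign(Sn - Ts))
        data_term = lam * saliency[foreground].sum()      # lambda * sum of f(Ts, Sn) * Sn
        energy = data_term + smoothness_term(foreground)  # + V(I, Ts, sigma)
        if energy < best_energy:
            best_ts, best_energy = ts, energy
    return best_ts

# Pixels with saliency > best_ts are labelled 1 (salient target), the rest 0.
```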
Next, in the temporal domain, motion saliency is defined, target motion is estimated with homography matrices, key points are used in place of the target for saliency detection, the spatial saliency data are then fused, and a boundary-extension method based on an energy function is proposed to obtain a bounding box as the salient target region of the temporal domain. The concrete method is as follows:
(31) Given an image, the key points of the image are obtained with the FAST (Features from Accelerated Segment Test) feature point detection algorithm, which has good real-time performance;
(32) Given two adjacent frames, fast corresponding-point matching is carried out with FLANN (Fast Library for Approximate Nearest Neighbors);
(33) A homography matrix (Homography Matrix) H is used to describe the motion of the key points. Because a single H describes only one form of motion, while a video segment may contain several forms of motion, multiple H are needed to describe the different motions. In this embodiment the RANSAC algorithm is applied iteratively to obtain a series of homography estimates H = {H1, H2, ..., Hn} (an illustrative sketch of steps (31)-(33) is given after this list);
(34) The temporal saliency of a key point is defined as
St(p_m) = (A_m / (W × H)) Σ_{i=1}^{n} A_i D(p_m, H_i)    (5)
where A_m is the area covered by all key points of motion state H_m, and W and H are the width and height of the video image;
(35) The spatial saliency values are fused with the temporal saliency values obtained for the key points;
(36) A bounding box is obtained as the salient target region of the temporal domain with the method based on energy-function boundary extension.
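The keypoint detection, matching and multiple-homography estimation of steps (31)-(33) can be sketched with OpenCV as below. The patent only names FAST and FLANN, so the use of ORB descriptors for matching, the LSH index parameters and the iteration limits are assumptions of this sketch.

```python
import cv2
import numpy as np

def estimate_motions(prev_gray, curr_gray, max_models=5, min_inliers=8):
    """FAST keypoints, FLANN matching, and iterative RANSAC homography fitting."""
    fast = cv2.FastFeatureDetector_create()
    orb = cv2.ORB_create()   # descriptors for matching (an assumption; the text names only FAST + FLANN)
    kp1 = fast.detect(prev_gray, None)
    kp2 = fast.detect(curr_gray, None)
    kp1, des1 = orb.compute(prev_gray, kp1)
    kp2, des2 = orb.compute(curr_gray, kp2)

    # FLANN with an LSH index for binary descriptors.
    flann = cv2.FlannBasedMatcher(dict(algorithm=6, table_number=6, key_size=12, multi_probe_level=1), {})
    matches = flann.match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches])
    dst = np.float32([kp2[m.trainIdx].pt for m in matches])

    homographies = []
    while len(src) >= min_inliers and len(homographies) < max_models:
        H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        if H is None or mask.sum() < min_inliers:
            break
        homographies.append(H)                  # one motion state H_m
        keep = mask.ravel() == 0                # drop its inliers and refit on the rest
        src, dst = src[keep], dst[keep]
    return homographies
```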
Specifically, the fusion of the spatial saliency values with the temporal saliency values of the key points is realized by the following method:
(41) A contrast of motion saliency is defined from the temporal saliency value St of each key point, obtained from formula (5), and the mean of the key points' temporal saliency values;
(42) The motion saliency should single out targets that remain well discriminated in the spatial domain, so the range over which the temporal saliency St is gathered must be limited: if pi is the i-th key point of St, then pi must satisfy the condition defined with respect to the mean spatial saliency value;
(43) A temporal weight and a spatial weight are defined, and for the key points satisfying (42) the temporal and spatial saliency values are added with these weights.
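Since the contrast measure, the selection rule of step (42) and the weight formulas of step (43) are shown only in the original figures, the fusion can only be sketched generically here; the selection rule and the externally supplied weights w_t and w_s below are assumptions of this sketch.

```python
import numpy as np

def fuse_saliency(spatial_vals, temporal_vals, w_t, w_s):
    """Weighted fusion of spatial and temporal key-point saliency (steps (41)-(43))."""
    spatial_vals = np.asarray(spatial_vals, dtype=np.float32)
    temporal_vals = np.asarray(temporal_vals, dtype=np.float32)
    fused = spatial_vals.copy()
    # Step (42): only key points that stay discriminative in the spatial domain receive
    # the temporal contribution; the rule used here is an assumed placeholder.
    selected = spatial_vals >= spatial_vals.mean()
    fused[selected] = w_t * temporal_vals[selected] + w_s * spatial_vals[selected]
    return fused
```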
Specifically, salient-target region extraction in the temporal domain is realized by the following method:
A salient key point p in the spatial domain is taken as the seed point, and the seed region is a rectangular bounding box B. Let bi denote the four sides of the bounding box B, numbered i ∈ {1, 2, 3, 4} for top, bottom, left and right. The boundary-extension algorithm is as follows:
Initialization: all four vertices of the bounding box B are set to the position of the key point p, so that p is an interior point of B.
Step 1: in increasing order from i = 1, compute the saliency energy E_outer(i) on the outer boundary of bi and the saliency energy E_inner(i) on its inner boundary, the energy being computed as in formula (4); then compute the outward-extension weight w(i) of the border from E_outer(i), E_inner(i) and li, where li is the length of the i-th side of the current bounding box B.
Step 2: if w(i) ≥ ε, the i-th side is extended outward by one pixel. ε is the threshold for the extension decision and must be set in advance; in the experiments of this embodiment it is set to 0.8 Ts', where Ts' is the mean spatial saliency inside the bounding box.
Step 3: if no side was extended in step 2, stop the algorithm and output the bounding box B; otherwise, repeat steps 1 and 2.
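The boundary-extension loop can be sketched as follows; the strip energies are abstracted into a callable, and because the exact formula for w(i) is not reproduced in the text, the (E_outer − E_inner) / l_i ratio used here is an assumption of this sketch.

```python
def extend_bounding_box(seed, strip_energy, img_w, img_h, eps):
    """seed = (x, y) salient key point p; strip_energy(box, side, 'outer'|'inner') returns
    the saliency energy of the one-pixel strip just outside/inside that side of the box."""
    x, y = seed
    box = {"top": y, "bottom": y, "left": x, "right": x}   # initialised to the key point p
    while True:
        grown = False
        for side in ("top", "bottom", "left", "right"):
            length = box["right"] - box["left"] + 1 if side in ("top", "bottom") \
                else box["bottom"] - box["top"] + 1
            # Assumed extension weight w(i) built from E_outer(i), E_inner(i) and l_i.
            w = (strip_energy(box, side, "outer") - strip_energy(box, side, "inner")) / length
            if w < eps:
                continue                                    # this side is not extended
            if side == "top" and box["top"] > 0:
                box["top"] -= 1; grown = True
            elif side == "bottom" and box["bottom"] < img_h - 1:
                box["bottom"] += 1; grown = True
            elif side == "left" and box["left"] > 0:
                box["left"] -= 1; grown = True
            elif side == "right" and box["right"] < img_w - 1:
                box["right"] += 1; grown = True
        if not grown:
            return box                                      # no side extended: output bounding box B
```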
Finally, the richness of the video is reduced by means of the salient target region, and key frames are extracted with the shot-adaptive method based on online clustering. The concrete method is as follows:
(51) The RGB color of the salient region is converted to the HSV color space, and the H (hue) and S (saturation) components are used to compute a hue-saturation histogram. Let Hp(i) be the i-th bin of the hue-saturation histogram of the salient target region of frame p; in this embodiment the Bhattacharyya distance is used to measure the visual distance between two frames (an illustrative sketch follows step (52)).
(52) Key frames are extracted with the shot-adaptive method based on online clustering, with the clustering mode for static shots as the main mode and the clustering mode for dynamic shots as the supplement. For a static shot, online clustering is performed on the basis of the hue-saturation histogram of the salient region, and any frame of a cluster is chosen as a key frame. For a dynamic shot, the salient moving target is tracked first, the tracking of the salient moving target is then used as the basis for online clustering, and the position information of the salient target is used as the basis for extracting key frames from the clusters.
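Step (51) can be illustrated with OpenCV's histogram utilities as below; the bin counts and normalization are assumptions of this sketch.

```python
import cv2

def hs_histogram(bgr_region, h_bins=30, s_bins=32):
    """Hue-saturation histogram of a salient region (step (51))."""
    hsv = cv2.cvtColor(bgr_region, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [h_bins, s_bins], [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 1, cv2.NORM_MINMAX)
    return hist

def visual_distance(hist_p, hist_q):
    """Bhattacharyya distance used as the visual distance D_sal between two frames."""
    return cv2.compareHist(hist_p, hist_q, cv2.HISTCMP_BHATTACHARYYA)
```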
Specifically, as shown in Fig. 1, online clustering of a static shot is realized by the following steps:
Initialization: first compute the hue-saturation histogram f1 of the first frame of the static shot, set the initial cluster count N = 1, and take f1 as the centroid vector C1 of cluster Cell1, i.e. C1 = f1.
S11: if the current frame p belongs to a static shot, compute the hue-saturation histogram Hp of the current frame.
S12: compute the visual distance between p and the centroid of every cluster, and find the cluster with the minimum visual distance Dsal(p, Cm), where m is the index of that cluster.
S13: compare Dsal(p, Cm) with the threshold εc. When Dsal(p, Cm) ≤ εc, p is assigned to cluster Cellm and Hp then replaces the centroid of Cellm; otherwise a new cluster CellN+1 is added, Hp is taken as its centroid vector CN+1, and the cluster count is updated to N = N + 1.
S14: repeat S11, S12 and S13 for all frames of static shots.
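An illustrative version of the static-shot online clustering (S11-S14), reusing hs_histogram and visual_distance from the previous sketch; returning the first frame of each cluster as its key frame is one of the admissible choices ("any frame of a cluster").

```python
import numpy as np

def cluster_static_shot(frames, eps_c):
    centroids = []                    # centroid histogram C_m of each cluster
    members = []                      # frame indices belonging to each cluster
    for idx, frame in enumerate(frames):
        h = hs_histogram(frame)                               # S11
        if centroids:
            dists = [visual_distance(h, c) for c in centroids]
            m = int(np.argmin(dists))                          # S12: nearest cluster Cell_m
            if dists[m] <= eps_c:
                members[m].append(idx)                         # S13: join Cell_m ...
                centroids[m] = h                               # ... and replace its centroid
                continue
        centroids.append(h)                                    # S13: otherwise open Cell_{N+1}
        members.append([idx])
    return [cluster[0] for cluster in members]                 # one key frame per cluster
```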
Specifically, as shown in Fig. 2, key frame extraction for a dynamic shot is realized by the following steps:
Initialization: obtain the first frame of the dynamic shot.
S21: obtain the tracked target region and initialize or resample the particles; fetch the next frame of the video, and if the frame is empty, terminate.
S22: obtain the FAST feature vectors, match them with the FLANN algorithm, and update the feature vector weights; if the feature vectors are insufficient, terminate.
S23: update the weight of each particle, compute the key frame weight and the target region, and jump back to S21.
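The dynamic-shot loop (S21-S23) is described above only at the level of a particle-filter tracking outline, so it is sketched here with a placeholder tracker interface; none of the method names below correspond to a real library API.

```python
def process_dynamic_shot(first_frame, next_frame, tracker):
    """Skeleton of S21-S23; `tracker` is a hypothetical particle-filter tracker object."""
    region = tracker.init_target(first_frame)          # initialisation on the first frame
    key_frame_candidates = []
    while True:
        frame = next_frame()                            # S21: fetch the next frame
        if frame is None:
            break                                       # empty frame: shot is finished
        tracker.resample_particles(region)              # S21: initialise / resample particles
        features = tracker.match_fast_flann(frame)      # S22: FAST features matched via FLANN
        if not features:
            break                                       # not enough feature matches
        region, weight = tracker.update_particles(frame, features)   # S23
        key_frame_candidates.append((frame, weight, region))
    return key_frame_candidates
```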
The key frame extraction system based on a visual attention model disclosed by the present invention includes a salient region extraction module and a key frame extraction module.
The salient region extraction module includes:
a spatial salient region extraction module, for extracting the salient region in the spatial domain;
a temporal key point saliency acquisition module, for extracting the saliency values of the key points in the temporal domain;
a fusion module, for fusing the spatial salient region with the temporal key points to obtain the final salient region.
The key frame extraction module includes:
a static shot key frame extraction module, for key frame extraction from static shots;
a dynamic shot key frame extraction module, for key frame extraction from dynamic shots;
a shot adaptation module, for switching between the static shot key frame extraction module and the dynamic shot key frame extraction module.
The above is only a preferred embodiment of the present invention and is not intended to limit its protection scope. It should be understood that the present invention is not limited to the implementations described here, which are provided to help those skilled in the art practice the invention. Any person skilled in the art can readily make further improvements and refinements without departing from the spirit and scope of the invention, so any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention shall fall within the scope of the claims of the present invention.

Claims (8)

1. A key frame extraction method based on a visual attention model, for extracting key frames from a video, characterized by including:
in the spatial domain, detecting saliency by filtering global contrast with binomial coefficients, and extracting the target region with an adaptive threshold;
in the temporal domain, defining motion saliency, estimating target motion with homography matrices, using key points in place of the target for saliency detection, fusing the spatial saliency data, and obtaining a bounding box as the salient target region of the temporal domain by a proposed boundary-extension method based on an energy function;
reducing the richness of the video by means of the salient target region, and extracting key frames with a shot-adaptive method based on online clustering;
in the spatial domain, saliency detection by filtering global contrast with binomial coefficients and extraction of the target region with an adaptive threshold are carried out as follows:
(11) the binomial coefficients are constructed from Pascal's triangle, and the normalization factor of the Nth layer is 2^N; the 4th layer is selected, so the filter coefficients are B4 = (1/16)[1 4 6 4 1];
(12) let I be the original stimulus intensity, Ī the mean of the surrounding stimulus intensities, and I_B4 the convolution of I with B4; the strength of a pixel's stimulus is measured with the vector form of the CIELAB color space, and the contrast of a stimulus is the Euclidean distance between two CIELAB vectors, so the stimulus of pixel (x, y) is detected as
S(x, y) = ||I_B4(x, y) − Ī||    (1)
(13) after the saliency measurement set S_s = (s11, s12, ..., sNM) is obtained, the target region is extracted with an adaptive threshold, where sij is the saliency of pixel (i, j), 0 ≤ i ≤ N, 0 ≤ j ≤ M, and M and N are the width and height of the image, respectively; adaptive-threshold extraction of the target region is realized by the following method:
(21) the global saliency of pixel (x, y) is defined as
Sg(x, y) = (1/A) Σ_{i=0}^{N} Σ_{j=0}^{M} ||I_B4(x, y) − I(i, j)||    (2)
where A is the area of detection, I_B4(x, y) is the stimulus intensity of pixel (x, y) after the original image has been filtered with B4, I(i, j) is the original stimulus intensity of pixel (i, j), and M and N are the width and height of the image, respectively;
(22) the computation is accelerated with a histogram: the original stimulus intensities I are mapped into the stimulus space I_B4, and the saliency of the stimulus I_B4(I) finally perceived by the user is
S(I_B4(I)) = [1 / ((m − 1) D(I_B4(I)))] Σ_{i=1}^{m} ( D(I_B4(I)) − ||I_B4(I) − I_B4(I_i)|| ) Sg(I_B4(I_i))    (3)
where D(I_B4(I)) is the distance between the stimulus I_B4(I) and its m nearest stimuli;
(23) the threshold Ts is varied to assign foreground and background regions, and the threshold that yields the minimum energy function is then taken as the optimal threshold; the energy function with Ts as threshold is defined as
E(I, Ts, λ, σ) = λ Σ_{n=1}^{N} f(Ts, Sn) Sn + V(I, Ts, σ)    (4)
where Sn is obtained from formula (2), λ is the weight of the salient-target energy, N is the total number of pixels of the image, f(Ts, Sn) = max(0, sign(Sn − Ts)), and V(I, Ts, σ) measures the similarity of a pixel to its surrounding stimuli: the pixels judged salient under the current Ts and their 8-neighbourhoods form the pair set Pair over which V is computed, Dist(p, q) is the spatial distance between two points, and σ is a control parameter.
2. The method according to claim 1, characterized in that, in the temporal domain, motion saliency is defined, target motion is estimated with homography matrices, key points are used in place of the target for saliency detection, the spatial saliency data are then fused, and a bounding box is obtained as the salient target region of the temporal domain by the proposed boundary-extension method based on an energy function, the concrete method being as follows:
(31) given an image, the key points of the image are obtained with the FAST feature point detection algorithm, which has good real-time performance;
(32) given two adjacent frames, fast corresponding-point matching is carried out with FLANN;
(33) the motion of the key points is described with multiple homography matrices H; using the RANSAC algorithm iteratively, a series of homography estimates H = {H1, H2, ..., Hn} is obtained;
(34) the temporal saliency of a key point is defined as
St(p_m) = (A_m / (W × H)) Σ_{i=1}^{n} A_i D(p_m, H_i)    (5)
where A_m is the area covered by all key points of motion state H_m, and W and H are the width and height of the video image;
(35) a bounding box is obtained as the salient target region of the temporal domain with the method based on energy-function boundary extension.
3. The method according to claim 2, characterized in that the fusion of the spatial saliency values with the temporal saliency values of the key points is realized by the following method:
(41) a contrast of motion saliency is defined from the temporal saliency value St of each key point, obtained from formula (5), and the mean of the key points' temporal saliency values;
(42) let pi be the i-th key point of St; pi must satisfy the condition defined with respect to the mean spatial saliency value;
(43) a temporal weight and a spatial weight are defined, and for the key points satisfying step (42) the temporal and spatial saliency values are added with these weights.
4. The method according to claim 2, characterized in that salient-target region extraction in the temporal domain is realized by the following method:
a salient key point p in the spatial domain is taken as the seed point, and the seed region is a rectangular bounding box B; let bi denote the four sides of the bounding box B, numbered i ∈ {1, 2, 3, 4} for top, bottom, left and right; the boundary-extension algorithm is as follows:
initialization: all four vertices of the bounding box B are set to the position of the key point p, so that p is an interior point of B;
step 1: in increasing order from i = 1, compute the saliency energy E_outer(i) on the outer boundary of bi and the saliency energy E_inner(i) on its inner boundary, the energy being computed as in formula (4); then compute the outward-extension weight w(i) of the border from E_outer(i), E_inner(i) and li, where li is the length of the i-th side of the current bounding box B;
step 2: if w(i) ≥ ε, the i-th side is extended outward by one pixel; ε is the preset threshold for the extension decision and is set to 0.8 Ts', where Ts' is the mean spatial saliency inside the bounding box;
step 3: if no side was extended in step 2, stop the algorithm and output the bounding box B; otherwise, repeat steps 1 and 2.
5. The method according to claim 1, characterized in that the richness of the video is reduced by means of the salient target region and key frames are extracted with the shot-adaptive method based on online clustering, the concrete method being as follows:
(51) the RGB color of the salient region is converted to the HSV color space, and the H and S components are used to compute a hue-saturation histogram; let Hp(i) be the i-th bin of the hue-saturation histogram of the salient target region of frame p; the Bhattacharyya distance is used to measure the visual distance between two frames p and q;
(52) key frames are extracted with the shot-adaptive method based on online clustering, with the clustering mode for static shots as the main mode and the clustering mode for dynamic shots as the supplement;
for a static shot, online clustering is performed on the basis of the hue-saturation histogram of the salient region, and any frame of a cluster is chosen as a key frame;
for a dynamic shot, the salient moving target is tracked first, the tracking of the salient moving target is then used as the basis for online clustering, and the position information of the salient target is used as the basis for extracting key frames from the clusters.
6. The method according to claim 5, characterized in that online clustering of a static shot is realized by the following steps:
initialization: first compute the hue-saturation histogram f1 of the first frame of the static shot, set the initial cluster count N = 1, and take f1 as the centroid vector C1 of cluster Cell1, i.e. C1 = f1;
S11: if the current frame p belongs to a static shot, compute the hue-saturation histogram Hp of the current frame;
S12: compute the visual distance between p and the centroid of every cluster, and find the cluster with the minimum visual distance Dsal(p, Cm), where m is the index of that cluster;
S13: compare Dsal(p, Cm) with the threshold εc; when Dsal(p, Cm) ≤ εc, p is assigned to cluster Cellm and Hp then replaces the centroid of Cellm; otherwise a new cluster CellN+1 is added, Hp is taken as its centroid vector CN+1, and the cluster count is updated to N = N + 1;
S14: repeat S11, S12 and S13 for all frames of static shots.
7. The method according to claim 5, characterized in that key frame extraction for a dynamic shot is realized by the following steps:
initialization: obtain the first frame of the dynamic shot;
S21: obtain the tracked target region and initialize or resample the particles; fetch the next frame of the video, and if the frame is empty, terminate;
S22: obtain the FAST feature vectors, match them with the FLANN algorithm, and update the feature vector weights; if the feature vectors are insufficient, terminate;
S23: update the weight of each particle, compute the key frame weight and the target region, and jump back to S21.
8. A system applying the key frame extraction method based on a visual attention model according to any one of claims 1 to 7, characterized by including a salient region extraction module and a key frame extraction module;
the salient region extraction module includes:
a spatial salient region extraction module, for extracting the salient region in the spatial domain;
a temporal key point saliency acquisition module, for extracting the saliency values of the key points in the temporal domain;
a fusion module, for fusing the spatial salient region with the temporal key points to obtain the final salient region;
the key frame extraction module includes:
a static shot key frame extraction module, for key frame extraction from static shots;
a dynamic shot key frame extraction module, for key frame extraction from dynamic shots;
a shot adaptation module, for switching between the static shot key frame extraction module and the dynamic shot key frame extraction module.
CN201410039072.7A 2014-01-26 2014-01-26 Key frame extraction method based on visual attention model and system Expired - Fee Related CN103824284B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410039072.7A CN103824284B (en) 2014-01-26 2014-01-26 Key frame extraction method based on visual attention model and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410039072.7A CN103824284B (en) 2014-01-26 2014-01-26 Key frame extraction method based on visual attention model and system

Publications (2)

Publication Number Publication Date
CN103824284A CN103824284A (en) 2014-05-28
CN103824284B true CN103824284B (en) 2017-05-10

Family

ID=50759326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410039072.7A Expired - Fee Related CN103824284B (en) 2014-01-26 2014-01-26 Key frame extraction method based on visual attention model and system

Country Status (1)

Country Link
CN (1) CN103824284B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598908B (en) * 2014-09-26 2017-11-28 浙江理工大学 A kind of crops leaf diseases recognition methods
CN104778721B (en) * 2015-05-08 2017-08-11 广州小鹏汽车科技有限公司 The distance measurement method of conspicuousness target in a kind of binocular image
CN105472380A (en) * 2015-11-19 2016-04-06 国家新闻出版广电总局广播科学研究院 Compression domain significance detection algorithm based on ant colony algorithm
CN106210444B (en) * 2016-07-04 2018-10-30 石家庄铁道大学 Motion state self adaptation key frame extracting method
CN107967476B (en) * 2017-12-05 2021-09-10 北京工业大学 Method for converting image into sound
CN110197107A (en) * 2018-08-17 2019-09-03 平安科技(深圳)有限公司 Micro- expression recognition method, device, computer equipment and storage medium
CN110322474B (en) * 2019-07-11 2021-06-01 史彩成 Image moving target real-time detection method based on unmanned aerial vehicle platform
CN110399847B (en) * 2019-07-30 2021-11-09 北京字节跳动网络技术有限公司 Key frame extraction method and device and electronic equipment
CN111191650B (en) * 2019-12-30 2023-07-21 北京市新技术应用研究所 Article positioning method and system based on RGB-D image visual saliency
CN111493935B (en) * 2020-04-29 2021-01-15 中国人民解放军总医院 Artificial intelligence-based automatic prediction and identification method and system for echocardiogram
CN112418012B (en) * 2020-11-09 2022-06-07 武汉大学 Video abstract generation method based on space-time attention model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2207111A1 (en) * 2009-01-08 2010-07-14 Thomson Licensing SA Method and apparatus for generating and displaying a video abstract
CN102088597A (en) * 2009-12-04 2011-06-08 成都信息工程学院 Method for estimating video visual salience through dynamic and static combination
CN102695056A (en) * 2012-05-23 2012-09-26 中山大学 Method for extracting compressed video key frames

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7263660B2 (en) * 2002-03-29 2007-08-28 Microsoft Corporation System and method for producing a video skim

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2207111A1 (en) * 2009-01-08 2010-07-14 Thomson Licensing SA Method and apparatus for generating and displaying a video abstract
CN102088597A (en) * 2009-12-04 2011-06-08 成都信息工程学院 Method for estimating video visual salience through dynamic and static combination
CN102695056A (en) * 2012-05-23 2012-09-26 中山大学 Method for extracting compressed video key frames

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Efficient visual attention based framework for extracting key frames from videos; Naveed Ejaz et al.; Signal Processing: Image Communication; 2012-10-17; pp. 34-44 *
Visual attention detection in video sequences using spatiotemporal cues; Yun Zhai et al.; Proceedings of the 14th ACM International Conference on Multimedia; 2006-10-31; pp. 816-821, sections 1.2-4 *
Adaptive video key frame extraction based on a visual attention model (基于视觉注意模型的自适应视频关键帧提取); Jiang Peng et al.; Journal of Image and Graphics (中国图象图形学报); 2009-08-31; Vol. 14, No. 8; pp. 1651-1653, sections 2-4 *

Also Published As

Publication number Publication date
CN103824284A (en) 2014-05-28

Similar Documents

Publication Publication Date Title
CN103824284B (en) Key frame extraction method based on visual attention model and system
CN110111335B (en) Urban traffic scene semantic segmentation method and system for adaptive countermeasure learning
CN104809187B (en) A kind of indoor scene semanteme marking method based on RGB D data
CN106997597B (en) It is a kind of based on have supervision conspicuousness detection method for tracking target
CN108961349A (en) A kind of generation method, device, equipment and the storage medium of stylization image
CN108803617A (en) Trajectory predictions method and device
CN110147743A (en) Real-time online pedestrian analysis and number system and method under a kind of complex scene
CN103578119A (en) Target detection method in Codebook dynamic scene based on superpixels
CN102256065B (en) Automatic video condensing method based on video monitoring network
CN106937120B (en) Object-based monitor video method for concentration
CN109255357B (en) RGBD image collaborative saliency detection method
CN107798313A (en) A kind of human posture recognition method, device, terminal and storage medium
CN103226708A (en) Multi-model fusion video hand division method based on Kinect
Hua et al. Depth estimation with convolutional conditional random field network
CN107767416A (en) The recognition methods of pedestrian's direction in a kind of low-resolution image
CN110222760A (en) A kind of fast image processing method based on winograd algorithm
CN113223042B (en) Intelligent acquisition method and equipment for remote sensing image deep learning sample
CN107506792A (en) A kind of semi-supervised notable method for checking object
CN105574545B (en) The semantic cutting method of street environment image various visual angles and device
Jiang et al. Sparse attention module for optimizing semantic segmentation performance combined with a multi-task feature extraction network
CN107027051A (en) A kind of video key frame extracting method based on linear dynamic system
CN104008374B (en) Miner's detection method based on condition random field in a kind of mine image
CN103336830A (en) Image search method based on structure semantic histogram
CN108961196A (en) A kind of 3D based on figure watches the conspicuousness fusion method of point prediction attentively
CN108986103A (en) A kind of image partition method merged based on super-pixel and more hypergraphs

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 2017-05-10)