CN104113789B - On-line video abstraction generation method based on depth learning - Google Patents
- Publication number: CN104113789B (application CN201410326406.9A)
- Authority
- CN
- China
- Prior art keywords
- video
- frame block
- frame
- parameter
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to an online video summary generation method based on deep learning. The original video is processed as follows: 1) the video is uniformly cut into a group of small frame blocks, statistical features are extracted from each frame image, and a corresponding vectorized representation is formed; 2) a multi-layer deep network is pre-trained on the video frames to obtain a nonlinear representation of each frame; 3) the first m frame blocks are selected as the initial condensed video, which is reconstructed by a group sparse coding algorithm to obtain an initial dictionary and reconstruction coefficients; 4) the deep network parameters are updated according to the next frame block, the block is reconstructed and its reconstruction error computed, and if the error exceeds a set threshold the block is added to the condensed video and the dictionary is updated; 5) new frame blocks are processed online in sequence according to step 4) until the video ends, and the updated condensed video is the generated video summary. The method can mine the latent high-level semantic information of the video, generate the video summary quickly, save users' time, and improve the viewing experience.
Description
Technical field
The invention belongs to the technical field of video summary generation, and in particular relates to an online video summary generation method based on deep learning.
Background technology
In recent years, with the growing popularity of portable devices such as digital cameras, smartphones, and tablets, the volume of video of all kinds has exploded. For example, a medium-sized city may deploy tens of thousands of public-security video capture devices in key areas such as intelligent transportation, safety monitoring, and security checkpoints, and the video data these devices produce reaches the petabyte level. To lock onto a target person or vehicle, traffic police must spend a great deal of time screening tedious surveillance video streams, which greatly reduces work efficiency and hinders the construction of safe cities. Therefore, effectively selecting the video frames that contain key information from lengthy video streams, i.e., video summarization, has received extensive attention from academia and industry.
Traditional video summarization techniques mainly target edited, structured video: a film, for example, can be divided into multiple scenes, each scene consists of multiple shots taking place at the same location, and each shot in turn consists of a series of smooth, continuous video frames. Unlike structured video such as films, TV series, and news reports, surveillance video is usually uncropped, unstructured video, which poses a considerable challenge for the application of video summarization techniques.
At present, the main video summarization approaches include key-frame-based methods, new-image synthesis, video frame blocks, and conversion to natural language. Key-frame-based methods use strategies such as shot boundary detection, video frame clustering, color histograms, and behavior stability. New-image synthesis generates a summary from several consecutive frames containing important content, but is easily affected by blur across different frames. Frame-block methods cut the original video into short thematic clips using techniques such as scene boundary detection and dialogue analysis in structured video. Conversion to natural language refers to techniques that turn the video summary into a text summary using the subtitles and speech information in the video; it is not suitable for surveillance video without subtitles or sound.
Key security fields such as intelligent transportation and surveillance deployments continuously produce large amounts of unstructured video, and traditional video summarization methods cannot meet the requirement of processing streaming video online. Therefore, a video summarization method is urgently needed that can process video streams online and select the key content efficiently and accurately.
Summary of the invention
In order to condense and simplify tedious video streams online, efficiently and accurately, thereby saving users' time and enhancing the visual presentation of video content, the present invention proposes an online video summary generation method based on deep learning, which comprises the following steps.
After the original video data is obtained, the following operations are carried out:
1) uniformly cut the video into a group of small frame blocks, each containing multiple frames; extract the statistical features of each frame image and form the corresponding vectorized representation;
2) pre-train a multi-layer deep network on the video frames to obtain the nonlinear representation of each frame;
3) select the first m frame blocks as the initial condensed video, and reconstruct it with a group sparse coding algorithm to obtain the initial dictionary and reconstruction coefficients;
4) update the deep network parameters according to the next frame block, reconstruct that frame block, and compute its reconstruction error; if the error exceeds a set threshold, add the frame block to the condensed video and update the dictionary;
5) process new frame blocks online in sequence according to step 4) until the video ends; the updated condensed video is the generated video summary.
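The five steps above form an online loop. The sketch below shows that control flow only; the helper names and the toy least-squares "reconstruction" are illustrative stand-ins for the operations the patent describes (feature extraction, deep-network pre-training, and group sparse coding are elided), not an implementation of them.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_into_blocks(frames, t):
    """Step 1: uniformly cut the video into frame blocks of t frames each."""
    return [frames[i:i + t] for i in range(0, len(frames), t)]

def reconstruction_error(block, D):
    """Least-squares reconstruction of a block against dictionary D
    (a simplified stand-in for the group sparse coding objective)."""
    C, *_ = np.linalg.lstsq(D, block.T, rcond=None)
    return float(np.sum((block.T - D @ C) ** 2))

# Toy "video": 12 feature-vector frames of dimension 8, cut into blocks of t = 3.
frames = rng.normal(size=(12, 8))
blocks = split_into_blocks(frames, t=3)

# Step 3: the first m blocks form the initial condensed video;
# the dictionary atoms are initialised from their frames.
m = 2
summary = blocks[:m]
D = np.concatenate(summary).T          # atoms are columns of D

# Steps 4-5: process the remaining blocks online; keep a block only
# when the current dictionary reconstructs it poorly.
theta = 1.0                            # empirical threshold
for block in blocks[m:]:
    if reconstruction_error(block, D) > theta:
        summary.append(block)
        D = np.concatenate(summary).T  # update the dictionary

print(len(blocks), len(summary))
```

The condensed video (`summary`) always contains the first m blocks plus whichever later blocks exceeded the threshold.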
Further, in step 1), extracting the statistical features of each frame image to form the corresponding vectorized representation is specifically:
1) suppose the original video is uniformly divided into n frame blocks, i.e. X = {X1, ..., Xn}; each frame block contains t frame images (e.g. t = 80), and each frame image is scaled to a uniform pixel size while keeping its original aspect ratio;
2) extract global features of each frame image, such as the color histogram, color moments, edge orientation histogram, Gabor wavelet transform, and local binary patterns, together with local features such as the scale-invariant feature transform (SIFT: Scale-Invariant Feature Transform) and speeded-up robust features (SURF: Speeded Up Robust Features);
3) concatenate the above image features of each frame in order to form a vectorized representation of dimension nf.
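As a concrete illustration of the concatenation step, the sketch below joins two simple per-frame statistics, an intensity histogram and a crude edge-orientation histogram, into one vector. It is a pure-NumPy stand-in under stated assumptions (no SIFT/SURF/Gabor, 16 and 8 bins chosen arbitrarily), not the full feature set listed above.

```python
import numpy as np

def frame_features(img):
    """Concatenate per-frame statistics into one vector of dimension n_f."""
    # Global feature 1: 16-bin intensity histogram, normalised to sum to 1.
    hist, _ = np.histogram(img, bins=16, range=(0.0, 1.0))
    hist = hist / img.size
    # Global feature 2: coarse edge-orientation histogram from finite
    # differences (a crude stand-in for the patent's edge histogram).
    gy, gx = np.gradient(img.astype(float))
    angles = np.arctan2(gy, gx).ravel()
    edge_hist, _ = np.histogram(angles, bins=8, range=(-np.pi, np.pi))
    edge_hist = edge_hist / angles.size
    # Order-preserving concatenation -> the vectorized representation.
    return np.concatenate([hist, edge_hist])

img = np.random.default_rng(1).random((32, 32))   # toy grayscale frame
v = frame_features(img)
print(v.shape)   # n_f = 16 + 8 = 24
```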
Further, in step 2), pre-training the multi-layer deep network on video frames to obtain the nonlinear representation of each frame is specifically:
A stacked denoising autoencoder (SDA: Stacked Denoising Autoencoder) is used to pre-train the multi-layer deep network (fewer than 10 layers):
A. each frame image is processed at each layer as follows: first, a noisy version of each frame is generated by adding a small amount of Gaussian noise or by randomly setting input variables to arbitrary values; then, the noisy image is mapped by an autoencoder (AE: Autoencoder) to obtain its nonlinear representation;
B. the parameters of each layer of the deep network are updated using the stochastic gradient descent algorithm.
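A single denoising-autoencoder layer of the kind described in steps A and B can be sketched in a few lines of NumPy. The layer sizes, learning rate, and noise level below are illustrative assumptions; a real SDA stacks several such layers (fewer than 10 in the patent) and trains on many frames.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid = 24, 8
W = rng.normal(scale=0.1, size=(n_hid, n_in))    # encoder weights
b = np.zeros(n_hid)                              # encoder bias
W2 = rng.normal(scale=0.1, size=(n_in, n_hid))   # decoder weights
b2 = np.zeros(n_in)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dae_step(x, lr=0.1, noise=0.1):
    """One SGD step of a denoising autoencoder on one frame vector x."""
    global W, b, W2, b2
    x_noisy = x + rng.normal(scale=noise, size=x.shape)  # corrupt the input
    h = sigmoid(W @ x_noisy + b)                         # nonlinear code
    x_rec = W2 @ h + b2                                  # reconstruction
    err = x_rec - x                                      # target is the CLEAN x
    # Backpropagate the squared-error loss through both layers.
    gW2 = np.outer(err, h)
    gh = (W2.T @ err) * h * (1 - h)
    W2 -= lr * gW2; b2 -= lr * err
    W -= lr * np.outer(gh, x_noisy); b -= lr * gh
    return float(np.sum(err ** 2))

x = rng.random(n_in)                 # one toy feature vector
losses = [dae_step(x) for _ in range(200)]
print(round(losses[0], 3), round(losses[-1], 3))
```

On this toy input the reconstruction loss trends downward over the 200 SGD steps, which is the behaviour the pre-training stage relies on.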
In step 3), the initial condensed video is reconstructed by the group sparse coding algorithm, specifically:
1) the initial condensed video consists of the first m frame blocks of the original video (m is a positive integer less than 50), i.e. Xinit = {X1, ..., Xm}, containing ninit = m × t frame images in total, where Xk is the k-th original frame block; the pre-trained deep network yields the corresponding nonlinear representation Yinit = {Y1, ..., Ym}, where Yk is the nonlinear representation of the k-th frame block;
2) suppose the initial dictionary D consists of nd atoms, i.e. D = {d1, ..., dnd}, where dj is the j-th atom; let the reconstruction coefficients be C, whose number of elements corresponds to the number of frames and whose dimension corresponds to the number of dictionary atoms, i.e. C = {C1, ..., Cn}, where Ck is the coefficient of the k-th frame block and ci corresponds to the i-th frame image;
3) the group sparse coding objective function with a regularized dictionary is optimized using the alternating direction method of multipliers, which yields the initial dictionary D and the reconstruction coefficients C, i.e. solving
where the symbol || · ||2 denotes the l2 norm of a variable, the regularization parameter λ is a real number greater than 0, and the multivariate function F(Yk, Ck, D) is given by
where the parameter γ is a real number greater than 0 and the symbol Dci in the expression denotes the reconstruction of the i-th frame image with dictionary D. The alternating direction method here is specifically: first fix parameter D, which makes the above objective function convex with respect to parameter C; then fix parameter C, which makes the objective convex with respect to parameter D; the two parameters are updated alternately in iterations.
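The two objective functions referenced above appeared as images in the original patent and did not survive extraction. A plausible reconstruction, assuming the standard group sparse coding formulation and the symbols defined in the surrounding text (λ for the dictionary regularizer, γ for the group penalty, Dci for the reconstruction of frame i), would be:

```latex
% Overall objective: joint optimisation of dictionary D and coefficients C
\min_{D,\,C}\; \sum_{k} F(Y_k, C_k, D) \;+\; \lambda \sum_{j=1}^{n_d} \lVert d_j \rVert_2^2

% Per-block term: reconstruction error plus a group-sparsity penalty
F(Y_k, C_k, D) \;=\; \sum_{y_i \in Y_k} \lVert y_i - D c_i \rVert_2^2
  \;+\; \gamma \sum_{j=1}^{n_d} \lVert c^k_j \rVert_2
```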
In step 4), the deep network parameters are updated according to the next frame block, and that frame block is reconstructed and its reconstruction error computed, specifically:
1) each frame image of the frame block is processed in turn as follows:
a. the parameters of the last layer of the deep neural network, i.e. the weights W and the bias b, are updated using an online gradient descent algorithm;
b. the parameters of the other layers of the deep neural network are updated using the back-propagation algorithm;
2) the nonlinear representation of each frame image is updated according to the new parameters;
3) based on the existing dictionary D, the current frame block is reconstructed using group sparse coding and the error ε is computed, i.e. the nonlinear representation Yk of the current frame block Xk is reconstructed; the concrete steps are: first minimize the multivariate function F(Yk, Ck, D) to obtain the optimal reconstruction coefficients, then substitute them into the first term of F and evaluate it; the resulting value is the current reconstruction error ε.
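The error computation in step 3) above, fitting coefficients for the current block against the fixed dictionary and taking the residual of the first term of F as ε, can be sketched as follows. For brevity this sketch drops the group-sparsity penalty and uses a plain least-squares fit, so it is a simplified stand-in for minimising F(Yk, Ck, D).

```python
import numpy as np

def block_reconstruction_error(Y_k, D):
    """Fit coefficients for every frame of a block against dictionary D
    and return the residual (first term of F) as the error epsilon.

    Y_k : (t, d) nonlinear representations of the t frames in the block
    D   : (d, n_atoms) dictionary, atoms as columns
    """
    C, *_ = np.linalg.lstsq(D, Y_k.T, rcond=None)   # optimal coefficients
    residual = Y_k.T - D @ C                        # y_i - D c_i per frame
    return float(np.sum(residual ** 2)), C

rng = np.random.default_rng(2)
D = rng.normal(size=(16, 4))                 # 4 atoms in a 16-dim feature space
in_span = (D @ rng.normal(size=(4, 5))).T    # block lying in span(D)
off_span = rng.normal(size=(5, 16))          # generic block

eps_in, _ = block_reconstruction_error(in_span, D)
eps_off, _ = block_reconstruction_error(off_span, D)
print(eps_in < 1e-9, eps_off > eps_in)
```

A block the dictionary already spans reconstructs with near-zero ε and would be discarded; a generic block yields a larger ε and is a candidate for the condensed video.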
In step 4), if the error exceeds the set threshold, the current frame block is added to the condensed video and the dictionary is updated, specifically:
1) if the reconstruction error ε computed for the nonlinear representation Yk of the current frame block Xk exceeds the set threshold θ (an empirical value), the current frame block is added to the condensed video;
2) if the current condensed video contains q frame blocks, the set of nonlinear representations of their frame images is Y = {Y1, ..., Yq}, and this set is used to update the dictionary D by solving the objective function
where the parameter λ is a real number greater than 0 and adjusts the influence of the regularization term.
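The dictionary update above, re-solving the regularized objective over all q retained blocks, can be approximated by alternating least squares with an l2 (λ) penalty on the atoms. The sketch below is a simplification under those assumptions (no group-sparsity term, fixed iteration count), not the patent's exact solver.

```python
import numpy as np

def update_dictionary(Y, D, lam=0.1, n_iter=20):
    """Alternately refit coefficients C and dictionary D on the retained
    frames Y (rows), with an l2 penalty lam on the atoms."""
    Yt = Y.T                                   # (d, n_frames)
    n_atoms = D.shape[1]
    for _ in range(n_iter):
        # C-step: least-squares codes with D fixed.
        C, *_ = np.linalg.lstsq(D, Yt, rcond=None)
        # D-step: ridge-regularized least squares with C fixed:
        #   D = Y^T C^T (C C^T + lam * I)^(-1)
        D = Yt @ C.T @ np.linalg.inv(C @ C.T + lam * np.eye(n_atoms))
    return D

def fit_error(Y, D):
    C, *_ = np.linalg.lstsq(D, Y.T, rcond=None)
    return float(np.sum((Y.T - D @ C) ** 2))

rng = np.random.default_rng(3)
Y = rng.normal(size=(30, 16))                  # frames of the condensed video
D0 = rng.normal(size=(16, 4))                  # dictionary before the update

D1 = update_dictionary(Y, D0)
print(fit_error(Y, D1) < fit_error(Y, D0))
```

Starting from a random dictionary, a few alternating iterations move the atoms toward a subspace that reconstructs the retained frames better, which is the role the dictionary update plays in the online loop.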
The present invention proposes an online video summary generation method based on deep learning, whose advantages are as follows: deep learning is used to mine the high-level semantic features in the video, so that group sparse coding better reflects how well the dictionary reconstructs the current video frame block, and the most informative frame blocks form a video summary containing regions of interest and key persons and events; the condensed video summary saves users a great deal of time while enhancing the visual experience of the key content.
Description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Specific embodiments
With reference to Fig. 1, the present invention is further illustrated below.
After the original video data is obtained, the following operations are carried out:
1) uniformly cut the video into a group of small frame blocks, each containing multiple frames; extract the statistical features of each frame image and form the corresponding vectorized representation;
2) pre-train a multi-layer deep network on the video frames to obtain the nonlinear representation of each frame;
3) select the first m frame blocks as the initial condensed video, and reconstruct it with a group sparse coding algorithm to obtain the initial dictionary and reconstruction coefficients;
4) update the deep network parameters according to the next frame block, reconstruct that frame block, and compute its reconstruction error; if the error exceeds a set threshold, add the frame block to the condensed video and update the dictionary;
5) process new frame blocks online in sequence according to step 4) until the video ends; the updated condensed video is the generated video summary.
In step 1), extracting the statistical features of each frame image to form the corresponding vectorized representation is specifically:
1) suppose the original video is uniformly divided into n frame blocks, i.e. X = {X1, ..., Xn}; each frame block contains t frame images (e.g. t = 80), and each frame image is scaled to a uniform pixel size while keeping its original aspect ratio;
2) extract global features of each frame image, such as the color histogram, color moments, edge orientation histogram, Gabor wavelet transform, and local binary patterns, together with local features such as the scale-invariant feature transform (SIFT: Scale-Invariant Feature Transform) and speeded-up robust features (SURF: Speeded Up Robust Features);
3) concatenate the above image features of each frame in order to form a vectorized representation of dimension nf.
In step 2), pre-training the multi-layer deep network on video frames to obtain the nonlinear representation of each frame is specifically:
A stacked denoising autoencoder (SDA: Stacked Denoising Autoencoder) is used to pre-train the multi-layer deep network (fewer than 10 layers):
A. each frame image is processed at each layer as follows: first, a noisy version of each frame is generated by adding a small amount of Gaussian noise or by randomly setting input variables to arbitrary values; then, the noisy image is mapped by an autoencoder (AE: Autoencoder) to obtain its nonlinear representation;
B. the parameters of each layer of the deep network are updated using the stochastic gradient descent algorithm.
In step 3), the initial condensed video is reconstructed by the group sparse coding algorithm, specifically:
1) the initial condensed video consists of the first m frame blocks of the original video (m is a positive integer less than 50), i.e. Xinit = {X1, ..., Xm}, containing ninit = m × t frame images in total, where Xk is the k-th original frame block; the pre-trained deep network yields the corresponding nonlinear representation Yinit = {Y1, ..., Ym}, where Yk is the nonlinear representation of the k-th frame block;
2) suppose the initial dictionary D consists of nd atoms, i.e. D = {d1, ..., dnd}, where dj is the j-th atom; let the reconstruction coefficients be C, whose number of elements corresponds to the number of frames and whose dimension corresponds to the number of dictionary atoms, i.e. C = {C1, ..., Cn}, where Ck is the coefficient of the k-th frame block and ci corresponds to the i-th frame image;
3) the group sparse coding objective function with a regularized dictionary is optimized using the alternating direction method of multipliers, which yields the initial dictionary D and the reconstruction coefficients C, i.e. solving
where the symbol || · ||2 denotes the l2 norm of a variable, the regularization parameter λ is a real number greater than 0, and the multivariate function F(Yk, Ck, D) is given by
where the parameter γ is a real number greater than 0 and the symbol Dci in the expression denotes the reconstruction of the i-th frame image with dictionary D. The alternating direction method here is specifically: first fix parameter D, which makes the above objective function convex with respect to parameter C; then fix parameter C, which makes the objective convex with respect to parameter D; the two parameters are updated alternately in iterations.
In step 4), the deep network parameters are updated according to the next frame block, and that frame block is reconstructed and its reconstruction error computed, specifically:
1) each frame image of the frame block is processed in turn as follows:
a. the parameters of the last layer of the deep neural network, i.e. the weights W and the bias b, are updated using an online gradient descent algorithm;
b. the parameters of the other layers of the deep neural network are updated using the back-propagation algorithm;
2) the nonlinear representation of each frame image is updated according to the new parameters;
3) based on the existing dictionary D, the current frame block is reconstructed using group sparse coding and the error ε is computed, i.e. the nonlinear representation Yk of the current frame block Xk is reconstructed; the concrete steps are: first minimize the multivariate function F(Yk, Ck, D) to obtain the optimal reconstruction coefficients, then substitute them into the first term of F and evaluate it; the resulting value is the current reconstruction error ε.
In step 4), if the error exceeds the set threshold, the current frame block is added to the condensed video and the dictionary is updated, specifically:
1) if the reconstruction error ε computed for the nonlinear representation Yk of the current frame block Xk exceeds the set threshold θ (an empirical value), the current frame block is added to the condensed video;
2) if the current condensed video contains q frame blocks, the set of nonlinear representations of their frame images is Y = {Y1, ..., Yq}, and this set is used to update the dictionary D by solving the objective function
where the parameter λ is a real number greater than 0 and adjusts the influence of the regularization term.
Claims (4)
1. An online video summary generation method based on deep learning, characterized in that after the original video is obtained, the following operations are performed:
1) uniformly cutting the video into a group of small frame blocks, each frame block containing t frame images; extracting the statistical features of each frame image to form a vectorized representation of dimension nf;
2) pre-training a multi-layer deep network on the video frames to obtain the nonlinear representation of each frame;
3) selecting the first m frame blocks as the initial condensed video, and reconstructing it with a group sparse coding algorithm to obtain the initial dictionary and reconstruction coefficients;
4) updating the deep network parameters according to the next frame block, reconstructing that frame block, and computing its reconstruction error; if the error exceeds a set threshold, adding the frame block to the condensed video and updating the dictionary;
5) processing new frame blocks online in sequence according to step 4) until the video ends; the updated condensed video being the generated video summary.
2. The online video summary generation method based on deep learning according to claim 1, characterized in that in step 1), extracting the statistical features of each frame image to form the corresponding vectorized representation comprises the concrete steps of:
1.1) supposing the original video is uniformly divided into n frame blocks, i.e. X = {X1, ..., Xn}, each frame block containing t frame images; scaling each frame image to a uniform pixel size while keeping its original aspect ratio;
1.2) extracting the global features and local features of each frame image;
the global features including the color histogram, color moments, edge orientation histogram, Gabor wavelet transform, and local binary patterns;
the local features including: the scale-invariant feature transform SIFT and speeded-up robust features SURF;
1.3) concatenating the above image features of each frame in order to form a vectorized representation of dimension nf.
3. The online video summary generation method based on deep learning according to claim 1, characterized in that in step 2), pre-training the multi-layer deep network on video frames to obtain the nonlinear representation of each frame specifically uses a stacked denoising autoencoder SDA to pre-train the multi-layer deep network, including:
A. processing each frame image at each layer as follows: first, generating a noisy version of each frame by adding Gaussian noise or by randomly setting input variables to arbitrary values; then mapping the noisy image through an autoencoder AE to obtain its nonlinear representation;
B. updating the parameters of each layer of the deep network using the stochastic gradient descent algorithm.
4. The online video summary generation method based on deep learning according to claim 1, characterized in that in step 3), reconstructing the initial condensed video by the group sparse coding algorithm comprises the concrete steps of:
3.1) the initial condensed video consisting of the first m frame blocks of the original video, i.e. Xinit = {X1, ..., Xm}, containing ninit = m × t frame images in total, Xk being the k-th original frame block; the pre-trained deep network yielding the corresponding nonlinear representation Yinit = {Y1, ..., Ym}, Yk being the nonlinear representation of the k-th frame block;
3.2) supposing the initial dictionary D consists of nd atoms, i.e. D = {d1, ..., dnd}, dj being the j-th atom; letting the reconstruction coefficients be C, whose number of elements corresponds to the number of frames and whose dimension corresponds to the number of dictionary atoms, i.e. C = {C1, ..., Cn}, Ck being the reconstruction coefficient of the k-th frame block and ci corresponding to the i-th frame image;
3.3) optimizing the group sparse coding objective function with a regularized dictionary using the alternating direction method of multipliers, which yields the initial dictionary D and the reconstruction coefficients C, i.e. solving:
where the symbol || · ||2 denotes the l2 norm of a variable, the regularization parameter λ is a real number greater than 0, and the multivariate function F(Yk, Ck, D) is given by:
where the parameter γ is a real number greater than 0 and the symbol Dci denotes the reconstruction of the i-th frame image with dictionary D; the alternating direction method here being specifically: first fixing parameter D, which makes the above objective function convex with respect to parameter C; then fixing parameter C, which makes the objective convex with respect to parameter D; and alternately updating the two parameters in iterations;
in step 4), updating the deep network parameters according to the next frame block, reconstructing that frame block, and computing the reconstruction error comprises the concrete steps of:
4.1) processing each frame image of the frame block in turn as follows:
4.1.1) updating the parameters of the last layer of the deep neural network, i.e. the weights W and the bias b, using an online gradient descent algorithm;
4.1.2) updating the parameters of the other layers of the deep neural network using the back-propagation algorithm;
4.2) updating the nonlinear representation of each frame image according to the new parameters;
4.3) based on the existing dictionary D, reconstructing the current frame block using group sparse coding and computing the error ε, i.e. reconstructing the nonlinear representation Yk of the current frame block Xk, specifically: first minimizing the multivariate function F(Yk, Ck, D) to obtain the optimal reconstruction coefficients, then substituting them into the first term of F and evaluating it; the resulting value being the current reconstruction error ε;
in step 4), if the error exceeds the set threshold, adding the current frame block to the condensed video and updating the dictionary, specifically:
(1) if the reconstruction error ε computed for the nonlinear representation Yk of the current frame block Xk exceeds the set threshold θ, adding the current frame block to the condensed video;
(2) if the current condensed video contains q frame blocks, the set of nonlinear representations of their frame images being Y = {Y1, ..., Yq}; using Y to update the dictionary D by solving the objective function:
where the parameter λ is a real number greater than 0 and adjusts the influence of the regularization term.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410326406.9A CN104113789B (en) | 2014-07-10 | 2014-07-10 | On-line video abstraction generation method based on depth learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104113789A CN104113789A (en) | 2014-10-22 |
CN104113789B true CN104113789B (en) | 2017-04-12 |
Family
ID=51710398
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102930518A (en) * | 2012-06-13 | 2013-02-13 | 上海汇纳网络信息科技有限公司 | Improved sparse representation based image super-resolution method |
CN103118220A (en) * | 2012-11-16 | 2013-05-22 | 佳都新太科技股份有限公司 | Keyframe pick-up algorithm based on multi-dimensional feature vectors |
CN103167284A (en) * | 2011-12-19 | 2013-06-19 | 中国电信股份有限公司 | Video streaming transmission method and system based on picture super-resolution |
CN103295242A (en) * | 2013-06-18 | 2013-09-11 | 南京信息工程大学 | Multi-feature united sparse represented target tracking method |
CN103413125A (en) * | 2013-08-26 | 2013-11-27 | 中国科学院自动化研究所 | Horror video identification method based on discriminant instance selection and multi-instance learning |
CN103531199A (en) * | 2013-10-11 | 2014-01-22 | 福州大学 | Ecological sound identification method on basis of rapid sparse decomposition and deep learning |
CN103761531A (en) * | 2014-01-20 | 2014-04-30 | 西安理工大学 | Sparse-coding license plate character recognition method based on shape and contour features |
Non-Patent Citations (1)
Title |
---|
Natural feature extraction and denoising based on sparse coding; Shang Li; Journal of System Simulation; 2005-07-31; pp. 1782-1787 * |
Also Published As
Publication number | Publication date |
---|---|
CN104113789A (en) | 2014-10-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104113789B (en) | On-line video abstraction generation method based on depth learning | |
Qin et al. | Coverless image steganography: a survey | |
CN107704877B (en) | Image privacy perception method based on deep learning | |
Fu et al. | Fast crowd density estimation with convolutional neural networks | |
CN108171701B (en) | Saliency detection method based on U-Net and adversarial learning |
Thai et al. | Image classification using support vector machine and artificial neural network | |
CN112907598B (en) | Method for detecting falsification of document and certificate images based on attention CNN | |
CN106845415A (en) | Fine-grained pedestrian recognition method and device based on deep learning |
CN108564166A (en) | Semi-supervised feature learning method based on convolutional neural networks with symmetric parallel links |
Singh et al. | SiteForge: Detecting and localizing forged images on microblogging platforms using deep convolutional neural network | |
CN110457996B (en) | Video moving object tampering evidence obtaining method based on VGG-11 convolutional neural network | |
CN111597983B (en) | Method for identifying generated fake face images based on a deep convolutional neural network |
CN104661037A (en) | Tampering detection method and system for compressed image quantization table | |
CN109871749A (en) | Pedestrian re-identification method and device based on deep hashing, and computer system |
CN106203628A (en) | Optimization method and system for enhancing the robustness of deep learning algorithms |
CN111382305B (en) | Video deduplication method, video deduplication device, computer equipment and storage medium | |
Zhang et al. | No one can escape: A general approach to detect tampered and generated image | |
CN115600040B (en) | Phishing website identification method and device | |
CN109948639A (en) | Junk image recognition method based on deep learning |
CN110807369A (en) | Efficient short video content intelligent classification method based on deep learning and attention mechanism | |
CN107092935A (en) | Asset change detection method |
Liu et al. | Overview of image inpainting and forensic technology | |
Duan et al. | StegoPNet: Image steganography with generalization ability based on pyramid pooling module | |
Luo et al. | Image universal steganalysis based on best wavelet packet decomposition | |
Abdollahi et al. | Image steganography based on smooth cycle-consistent adversarial learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2022-08-09 |
Address after: Room 406, Building 19, Haichuangyuan, No. 998 Wenyi West Road, Yuhang District, Hangzhou City, Zhejiang Province
Patentee after: HANGZHOU HUICUI INTELLIGENT TECHNOLOGY CO.,LTD.
Address before: 310018 No. 2 Street, Xiasha Higher Education Zone, Hangzhou, Zhejiang
Patentee before: HANGZHOU DIANZI University