CN106686472A - High-frame-rate video generation method and system based on deep learning - Google Patents
High-frame-rate video generation method and system based on deep learning
- Publication number
- CN106686472A CN106686472A CN201611241691.XA CN201611241691A CN106686472A CN 106686472 A CN106686472 A CN 106686472A CN 201611241691 A CN201611241691 A CN 201611241691A CN 106686472 A CN106686472 A CN 106686472A
- Authority
- CN
- China
- Prior art keywords
- frame
- video
- convolutional neural
- rate
- neural networks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/587—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0127—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a high-frame-rate video generation method based on deep learning. The method comprises the steps of: using one or more original high-frame-rate video clips to generate a training sample set; training a dual-channel convolutional neural network model with the plurality of video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network, the dual-channel convolutional neural network model being a convolutional neural network formed by fusing two convolutional channels; and using the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames of a low-frame-rate video, an insertion frame between the two frames, thereby generating a video whose frame rate is higher than that of the low-frame-rate video. The whole process of the method is end to end, no post-processing of the video frames is needed, the frame-rate conversion effect is good, the synthesized video is highly fluent, and the method is highly robust to problems such as camera shake and scene changes during video shooting.
Description
Technical field
The invention belongs to the technical field of computer vision, and more particularly relates to a high-frame-rate video generation method and system based on deep learning.
Background technology
With the development of science and technology, it has become increasingly convenient for people to obtain video. However, for hardware reasons most video is captured by non-professional equipment, and its frame rate is typically only 24 fps to 30 fps. High-frame-rate video is smoother and brings a better visual experience. If high-frame-rate video is uploaded to the network directly, traffic consumption increases and so does the cost to the user. If low-frame-rate video is uploaded instead, frame loss inevitably occurs during network transmission, and the larger the video the more likely this is, so the quality of the video at the remote end cannot be effectively guaranteed, which greatly degrades the user experience. It is therefore necessary to post-process the uploaded video at the remote end in a reasonable way, so that the video quality meets users' needs or even further improves their experience.
The content of the invention
In view of the above defects or improvement requirements of the prior art, the present invention provides a high-frame-rate video generation method based on deep learning, whose object is to convert a low-frame-rate video into a high-frame-rate video, thereby solving the technical problem that frame loss of low-frame-rate video during network transmission degrades video quality and harms the user experience.

To achieve the above object, according to one aspect of the present invention, a high-frame-rate video generation method based on deep learning is provided, comprising the following steps:

(1) generating a training sample set using one or more original high-frame-rate video clips, the training sample set comprising a plurality of video frame subsets, each video frame subset containing two training frames and one control frame, the two training frames being two video frames spaced one or more frames apart in a high-frame-rate video clip, and the control frame being any one frame in the interval between the two training frames; the frame rate of the high-frame-rate video clip is higher than a set frame-rate threshold;

(2) training a dual-channel convolutional neural network model with the plurality of video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames of a video frame subset and convolve them separately, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video frame subset;

(3) using the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames of a low-frame-rate video, an insertion frame between the two video frames, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
In one embodiment of the present invention, each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:

Z_i(Y) = W_i * F_{i-1}(Y) + B_i

where i denotes the index of the convolutional layer, the input video frame is layer 0, * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the convolution operation of layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i.

In one embodiment of the present invention, a ReLU activation layer is connected after each of the first k-1 convolutional layers of the convolutional channel to keep the network sparse; it is described mathematically as:

F_i(Y) = max(0, Z_i(Y)).

In one embodiment of the present invention, the feature response maps obtained for the two video frames after the last convolutional layer are fused by element-wise addition of the values at corresponding positions.

In one embodiment of the present invention, the feature response map obtained by the fusion operation is followed by a Sigmoid activation layer so that the pixel values of the picture are mapped into the range 0 to 1; it is described mathematically as:

F(Y) = 1 / (1 + e^(-Z(Y))).

In one embodiment of the present invention, the convolution kernel parameters are initialized from a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, and the base learning rate is initialized to 1e-6; the base learning rate is reduced by a factor of 10 after m iteration cycles, where m is a preset value.

In one embodiment of the present invention, training the dual-channel convolutional neural network model by regression of the predicted frame against the control frame of the video frame subset is specifically: training the dual-channel convolutional neural network with the error back-propagation algorithm, using the error between the predicted frame and the control frame; the least-squares error is adopted as the optimization objective, described mathematically as:

L = (1/n) * Σ_{i=1..n} || Y_i − Ŷ_i ||²

where i denotes the i-th sample picture, n denotes the size of the training sample set, Y_i denotes the video frame predicted by the network, and Ŷ_i denotes the ground-truth value of the corresponding video frame.

In one embodiment of the present invention, k is 3; the first convolutional layer has 64 convolution kernels of size 9×9, with a stride of 1 pixel and a padding of 4, where the padding is the number of rings of zeros added around the feature map; the second convolutional layer has 32 convolution kernels of size 1×1, with a stride of 1 pixel and a padding of 0; the third convolutional layer has 3 convolution kernels of size 5×5, with a stride of 1 and a padding of 2.
According to another aspect of the present invention, a high-frame-rate video generation system based on deep learning is also provided, comprising a training sample set generation module, a dual-channel convolutional neural network optimization module and a high-frame-rate video generation module, wherein:

the training sample set generation module is configured to generate a training sample set using one or more high-frame-rate video clips, the training sample set comprising a plurality of video frame subsets, each video frame subset containing two training frames and one control frame, the two training frames being two video frames spaced one or more frames apart in a high-frame-rate video clip, and the control frame being any one frame in the interval between the two training frames; the frame rate of the high-frame-rate video clip is higher than a set frame-rate threshold;

the dual-channel convolutional neural network optimization module is configured to train a dual-channel convolutional neural network model with the plurality of video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two channels, the two channels respectively receive the two video frames of a video frame subset and convolve them separately, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video frame subset;

the high-frame-rate video generation module is configured to use the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames of a low-frame-rate video, an insertion frame between the two video frames, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
In one embodiment of the present invention, each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:

Z_i(Y) = W_i * F_{i-1}(Y) + B_i

where i denotes the index of the convolutional layer, the input video frame is layer 0, * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the convolution operation of layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i.
In general, compared with the prior art, the above technical solution conceived by the present invention has the following technical effects:

(1) the feature extraction and frame prediction of the invention are obtained by supervised learning on training samples, without manual intervention, and spatial difference information can be better fitted in large-scale data scenarios;

(2) the whole process of the invention is end to end; by exploiting the self-learning ability of the convolutional neural network, the model parameters are learned automatically, which is concise and efficient and overcomes the drawbacks of conventional video frame-rate conversion techniques, which are time-consuming, labour-intensive and of limited effect.
Description of the drawings
Fig. 1 is a flow chart of the video frame-rate conversion method based on deep learning according to the present invention, in which F_i denotes the output of the i-th layer, Y_{t-1}, Y_t and Y_{t+1} denote three consecutive video frames, Y_t serves as the ground-truth value for computing the error, and Prediction denotes the video frame predicted by the network.
Specific embodiment
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only serve to explain the present invention and are not intended to limit it. In addition, the technical features involved in the embodiments of the invention described below can be combined with each other as long as they do not conflict.
The technical terms used in the present invention are first explained and illustrated below:

Convolutional neural network (CNN): a neural network that can be used for tasks such as image classification and regression. Its particularity lies in two aspects: on the one hand, the connections between neurons are not fully connected; on the other hand, the weights of connections between some neurons in the same layer are shared. The network is generally composed of convolutional layers, pooling layers and fully connected layers. The convolutional and pooling layers are responsible for extracting hierarchical features of the image, and the fully connected layers are responsible for classification or regression on the extracted features. The parameters of the network include the convolution kernels and the parameters and biases of the fully connected layers, and they can be learned from data by the back-propagation algorithm.
Back-propagation algorithm (Backpropagation Algorithm, BP): a common method for training artificial neural networks, used in combination with an optimization method such as gradient descent. The method computes the gradient of the loss function with respect to all weights in the network; this gradient is fed back to the optimization method, which updates the weights so as to minimize the loss function. The algorithm mainly comprises two phases: forward propagation of the excitation, and backward propagation of the error together with the weight update.
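As a point of reference (a standard textbook formulation, not quoted from the patent), the weight update carried out by this combination of back-propagation and gradient descent can be written as:

w ← w − η · ∂L/∂w

where w is a network weight, η is the learning rate and L is the loss function.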
With the arrival of the big data era, the scale of video databases keeps growing, and solving this problem is becoming more and more urgent. Deep neural networks can analyse data in a way that approximates how the human brain works, and in recent years deep learning has been applied successfully in every field of computer vision; the problem of video frame-rate conversion, however, has not yet been studied in depth. Considering that traditional video frame-rate conversion methods involve complex processing and high time and labour costs, the present invention proposes a video frame-rate conversion method based on deep learning. The whole process of the method is end to end, simple and efficient, and it is strongly robust to problems such as camera shake and scene changes in the video.
As shown in Fig. 1, the video frame-rate conversion method based on deep learning of the present invention may comprise the following steps:
(1) generating a training sample set using one or more original high-frame-rate video clips, the training sample set comprising a plurality of video frame subsets, each video frame subset containing two training frames and one control frame, the two training frames being two video frames spaced one or more frames apart in a high-frame-rate video clip, and the control frame being any one frame in the interval between the two training frames; the frame rate of the high-frame-rate video clip is higher than a set frame-rate threshold.

Specifically, frames can be extracted from the high-frame-rate video clips to obtain video frame sets, and the training sample set is then assembled from them at a certain ratio.

The training sample set is composed of a plurality of video frame subsets, each containing two training frames and one control frame. The control frame is chosen as the middle frame between the two training frames, or as the frame nearest the middle. Usually three consecutive frames are taken: the middle frame is the control frame and the other two frames are the training frames. If the frame rate is high enough, two frames several frames apart (depending on the frame rate, the interval must not be too large) can also be taken as the training frames, and any frame in the interval between them can be chosen as the control frame. For example, if the video used for training has a frame rate of 60 and contains N frames, a training sample can be built with an interval of one frame: a frame is chosen at random from the 2nd to the (N-1)-th frame as the ground truth (control frame), and its two adjacent frames are fed into the network as the training sample (the two training frames). In the same way, training samples can also be built with an interval of several frames, which makes the method applicable to videos of lower frame rate, i.e., a video of lower frame rate can be converted into a high-frame-rate video.
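As an illustration of this sampling procedure, the following sketch builds frame triplets from a high-frame-rate clip with OpenCV. The function name, the gap parameter and the use of OpenCV are assumptions made for the example, not part of the patent.

```python
# Illustrative sketch: every frame t (with a chosen interval) yields two
# training frames (t-gap, t+gap) and one control frame (t).
import cv2

def build_triplets(video_path, gap=1):
    """gap=1 takes adjacent frames; a larger gap emulates lower-frame-rate input."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    ok, frame = cap.read()
    while ok:
        frames.append(frame)
        ok, frame = cap.read()
    cap.release()

    samples = []
    for t in range(gap, len(frames) - gap):
        prev_frame = frames[t - gap]   # training frame 1
        next_frame = frames[t + gap]   # training frame 2
        target = frames[t]             # control (ground-truth) frame
        samples.append((prev_frame, next_frame, target))
    return samples
```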
(2) training a dual-channel convolutional neural network model with the plurality of video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames of a video frame subset and convolve them separately, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video frame subset.
A dual-channel convolutional neural network must first be designed and implemented. Specifically:

the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, each containing k convolutional layers, with k > 0 and preferably 3, which convolve the two video frame pictures (the training frames) separately. The first convolutional layer has 64 convolution kernels of size 9×9, with a stride of 1 pixel and a padding of 4, where the padding is the number of rings of zeros added around the feature map. The second convolutional layer has 32 convolution kernels of size 1×1, with a stride of 1 pixel and a padding of 0. The third convolutional layer has 3 convolution kernels of size 5×5, with a stride of 1 and a padding of 2. The convolutional layer is described mathematically as:

Z_i(Y) = W_i * F_{i-1}(Y) + B_i

where i denotes the layer index, the input picture is layer 0, * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the convolution operation of layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i;
of the 3 convolutional layers, a ReLU activation layer is connected after the 1st and the 2nd convolutional layer to keep the network sparse; it is described mathematically as:

F_i(Y) = max(0, Z_i(Y)).
The feature response maps obtained for the two video frame pictures after the 3rd convolutional layer are fused by element-wise addition of the values at corresponding positions.

After the fusion operation, the resulting feature response map is followed by a Sigmoid activation layer so that the pixel values of the picture are mapped into the range 0 to 1; it is described mathematically as:

F(Y) = 1 / (1 + e^(-Z(Y))).
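A minimal sketch of one possible realization of the network just described, written in PyTorch under the assumption of 3-channel RGB inputs; the class names ConvChannel and DualChannelCNN are illustrative and not taken from the patent.

```python
# Each channel: 64 9x9, 32 1x1, 3 5x5 convolutions (ReLU after the first two);
# the two 3-channel responses are fused by element-wise addition and passed
# through a Sigmoid so the predicted frame stays in [0, 1].
import torch.nn as nn

class ConvChannel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=9, stride=1, padding=4),   # layer 1
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1, stride=1, padding=0),  # layer 2
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, kernel_size=5, stride=1, padding=2),   # layer 3
        )

    def forward(self, x):
        return self.layers(x)

class DualChannelCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.channel_a = ConvChannel()   # processes the earlier frame
        self.channel_b = ConvChannel()   # processes the later frame
        self.sigmoid = nn.Sigmoid()

    def forward(self, frame_prev, frame_next):
        fused = self.channel_a(frame_prev) + self.channel_b(frame_next)  # element-wise fusion
        return self.sigmoid(fused)       # predicted intermediate frame in [0, 1]
```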
Before the dual-channel convolutional neural network is trained, each pixel value of the video frames must be normalized by dividing it by 255, so that the normalized pixel values lie between 0 and 1.
Also, before the dual-channel convolutional neural network is trained, its parameters must be initialized: the convolution kernel parameters are initialized from a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, and the base learning rate is initialized to 1e-6; the base learning rate is reduced by a factor of 10 after m iteration cycles, where m is a preset value. For example, with m preferably 2, the learning rate is 1e-6 during the first m iteration cycles; after the m-th iteration cycle the learning rate becomes 1e-7 and then remains constant.
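The initialization and learning-rate schedule described above could be expressed, for example, as follows in PyTorch; the optimizer choice (plain SGD) and the StepLR scheduler are assumptions, and DualChannelCNN refers to the sketch given earlier.

```python
# Gaussian(0, 1) kernel init, zero biases, base learning rate 1e-6, and a
# ten-fold reduction after m iteration cycles (m is a preset value).
import torch.nn as nn
import torch.optim as optim

def init_weights(module):
    if isinstance(module, nn.Conv2d):
        nn.init.normal_(module.weight, mean=0.0, std=1.0)  # Gaussian kernel init
        nn.init.constant_(module.bias, 0.0)                # zero bias

model = DualChannelCNN()          # model class from the sketch above
model.apply(init_weights)

m = 2                             # preset number of cycles before the drop
optimizer = optim.SGD(model.parameters(), lr=1e-6)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=m, gamma=0.1)  # lr /= 10 after m cycles
```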
Specifically, the error between the network prediction and the control frame can be used to train the dual-channel convolutional neural network with the error back-propagation algorithm. The least-squares error is adopted as the optimization objective, described mathematically as:

L = (1/n) * Σ_{i=1..n} || Y_i − Ŷ_i ||²

where i denotes the i-th sample picture, n denotes the size of the training sample set, Y_i denotes the video frame predicted by the network, and Ŷ_i denotes the ground-truth value of the corresponding video frame;
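Under the same assumptions, a training loop minimizing the least-squares error by error back-propagation might look like the following sketch; a data loader yielding (previous frame, next frame, control frame) tensors normalized to [0, 1] is assumed.

```python
# Minimal training-loop sketch: MSE between predicted and control frames,
# optimized by back-propagation with the optimizer defined above.
import torch.nn as nn

criterion = nn.MSELoss()

def train_epoch(model, loader, optimizer, device="cpu"):
    model.train()
    for prev_frame, next_frame, target in loader:   # tensors in [0, 1], shape (N, 3, H, W)
        prev_frame = prev_frame.to(device)
        next_frame = next_frame.to(device)
        target = target.to(device)
        optimizer.zero_grad()
        prediction = model(prev_frame, next_frame)   # predicted intermediate frame
        loss = criterion(prediction, target)         # least-squares error
        loss.backward()                              # back-propagate the error
        optimizer.step()                             # update the weights
```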
(3) using the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames of a low-frame-rate video, an insertion frame between the two video frames, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
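For step (3), the following sketch shows how the trained network could be applied to a low-frame-rate sequence to roughly double its frame rate; the helper name and tensor layout are assumptions for the example.

```python
# Predict an insertion frame for every pair of adjacent frames and interleave
# the predictions with the original frames.
import torch

@torch.no_grad()
def double_frame_rate(model, frames):
    """frames: list of (3, H, W) tensors with values in [0, 1]."""
    model.eval()
    output = [frames[0]]
    for prev_frame, next_frame in zip(frames[:-1], frames[1:]):
        inserted = model(prev_frame.unsqueeze(0), next_frame.unsqueeze(0)).squeeze(0)
        output.extend([inserted, next_frame])   # interleave predicted frames
    return output
```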
Those skilled in the art will readily understand that the foregoing is only a preferred embodiment of the present invention and is not intended to limit it; any modification, equivalent substitution and improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. A high-frame-rate video generation method based on deep learning, characterized in that the method comprises the following steps:
(1) generating a training sample set using one or more original high-frame-rate video clips, the training sample set comprising a plurality of video frame subsets, each video frame subset containing two training frames and one control frame, the two training frames being two video frames spaced one or more frames apart in a high-frame-rate video clip, and the control frame being any one frame in the interval between the two training frames; the frame rate of the high-frame-rate video clip being higher than a set frame-rate threshold;
(2) training a dual-channel convolutional neural network model with the plurality of video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames of a video frame subset and convolve them separately, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video frame subset;
(3) using the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames of a low-frame-rate video, an insertion frame between the two video frames, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
2. The high-frame-rate video generation method based on deep learning according to claim 1, characterized in that each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the index of the convolutional layer, the input video frame is layer 0, * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the convolution operation of layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i.
3. The high-frame-rate video generation method based on deep learning according to claim 2, characterized in that in the convolutional channel a ReLU activation layer is connected after each of the first k-1 convolutional layers to keep the network sparse, described mathematically as:
F_i(Y) = max(0, Z_i(Y)).
4. The high-frame-rate video generation method based on deep learning according to claim 1 or 2, characterized in that the feature response maps obtained for the two video frames after the last convolutional layer are fused by element-wise addition of the values at corresponding positions.
5. The high-frame-rate video generation method based on deep learning according to claim 1 or 2, characterized in that the feature response map obtained by the fusion operation is followed by a Sigmoid activation layer so that the pixel values of the picture are mapped into the range 0 to 1, described mathematically as:
F(Y) = 1 / (1 + e^(-Z(Y))).
6. The high-frame-rate video generation method based on deep learning according to claim 2, characterized in that the convolution kernel parameters are initialized from a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, the base learning rate is initialized to 1e-6, and the base learning rate is reduced by a factor of 10 after m iteration cycles, where m is a preset value.
7. The high-frame-rate video generation method based on deep learning according to claim 1 or 2, characterized in that training the dual-channel convolutional neural network model by regression of the predicted frame against the control frame of the video frame subset is specifically:
training the dual-channel convolutional neural network with the error back-propagation algorithm, using the error between the predicted frame and the control frame; wherein the least-squares error is adopted as the optimization objective, described mathematically as:
L = (1/n) * Σ_{i=1..n} || Y_i − Ŷ_i ||²
where i denotes the i-th sample picture, n denotes the size of the training sample set, Y_i denotes the video frame predicted by the network, and Ŷ_i denotes the ground-truth value of the corresponding video frame.
8. The high-frame-rate video generation method based on deep learning according to claim 2, characterized in that k is 3; the first convolutional layer has 64 convolution kernels of size 9×9, with a stride of 1 pixel and a padding of 4, where the padding is the number of rings of zeros added around the feature map; the second convolutional layer has 32 convolution kernels of size 1×1, with a stride of 1 pixel and a padding of 0; the third convolutional layer has 3 convolution kernels of size 5×5, with a stride of 1 and a padding of 2.
9. A high-frame-rate video generation system based on deep learning, characterized in that it comprises a training sample set generation module, a dual-channel convolutional neural network optimization module and a high-frame-rate video generation module, wherein:
the training sample set generation module is configured to generate a training sample set using one or more high-frame-rate video clips, the training sample set comprising a plurality of video frame subsets, each video frame subset containing two training frames and one control frame, the two training frames being two video frames spaced one or more frames apart in a high-frame-rate video clip, and the control frame being any one frame in the interval between the two training frames; the frame rate of the high-frame-rate video clip being higher than a set frame-rate threshold;
the dual-channel convolutional neural network optimization module is configured to train a dual-channel convolutional neural network model with the plurality of video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two channels, the two channels respectively receive the two video frames of a video frame subset and convolve them separately, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video frame subset;
the high-frame-rate video generation module is configured to use the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames of a low-frame-rate video, an insertion frame between the two video frames, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
10. The high-frame-rate video generation system based on deep learning according to claim 9, characterized in that each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the index of the convolutional layer, the input video frame is layer 0, * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the convolution operation of layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611241691.XA CN106686472B (en) | 2016-12-29 | 2016-12-29 | A kind of high frame-rate video generation method and system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611241691.XA CN106686472B (en) | 2016-12-29 | 2016-12-29 | A kind of high frame-rate video generation method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106686472A true CN106686472A (en) | 2017-05-17 |
CN106686472B CN106686472B (en) | 2019-04-26 |
Family
ID=58872327
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611241691.XA Active CN106686472B (en) | 2016-12-29 | 2016-12-29 | A kind of high frame-rate video generation method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106686472B (en) |
-
2016
- 2016-12-29 CN CN201611241691.XA patent/CN106686472B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN202285412U (en) * | 2011-09-02 | 2012-06-27 | 深圳市华美特科技有限公司 | Low frame rate transmission or motion image twinkling elimination system |
CN104102919A (en) * | 2014-07-14 | 2014-10-15 | 同济大学 | Image classification method capable of effectively preventing convolutional neural network from being overfit |
CN105787510A (en) * | 2016-02-26 | 2016-07-20 | 华东理工大学 | System and method for realizing subway scene classification based on deep learning |
CN106022237A (en) * | 2016-05-13 | 2016-10-12 | 电子科技大学 | Pedestrian detection method based on end-to-end convolutional neural network |
CN106228124A (en) * | 2016-07-17 | 2016-12-14 | 西安电子科技大学 | SAR image object detection method based on convolutional neural networks |
Cited By (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10491941B2 (en) | 2015-01-22 | 2019-11-26 | Microsoft Technology Licensing, Llc | Predictive server-side rendering of scenes |
CN107481209A (en) * | 2017-08-21 | 2017-12-15 | 北京航空航天大学 | A kind of image or video quality Enhancement Method based on convolutional neural networks |
CN107481209B (en) * | 2017-08-21 | 2020-04-21 | 北京航空航天大学 | Image or video quality enhancement method based on convolutional neural network |
CN107613299A (en) * | 2017-09-29 | 2018-01-19 | 杭州电子科技大学 | A kind of method for improving conversion effect in frame rate using network is generated |
CN107886081A (en) * | 2017-11-23 | 2018-04-06 | 武汉理工大学 | Two-way U Net deep neural network mine down-holes hazardous act is intelligently classified discrimination method |
CN108111860A (en) * | 2018-01-11 | 2018-06-01 | 安徽优思天成智能科技有限公司 | Video sequence lost frames prediction restoration methods based on depth residual error network |
CN108111860B (en) * | 2018-01-11 | 2020-04-14 | 安徽优思天成智能科技有限公司 | Video sequence lost frame prediction recovery method based on depth residual error network |
CN108322685A (en) * | 2018-01-12 | 2018-07-24 | 广州华多网络科技有限公司 | Video frame interpolation method, storage medium and terminal |
CN108322685B (en) * | 2018-01-12 | 2020-09-25 | 广州华多网络科技有限公司 | Video frame insertion method, storage medium and terminal |
WO2019137248A1 (en) * | 2018-01-12 | 2019-07-18 | 广州华多网络科技有限公司 | Video frame interpolation method, storage medium and terminal |
CN108600655A (en) * | 2018-04-12 | 2018-09-28 | 视缘(上海)智能科技有限公司 | A kind of video image synthetic method and device |
CN108600762A (en) * | 2018-04-23 | 2018-09-28 | 中国科学技术大学 | In conjunction with the progressive video frame generating method of motion compensation and neural network algorithm |
CN108600762B (en) * | 2018-04-23 | 2020-05-15 | 中国科学技术大学 | Progressive video frame generation method combining motion compensation and neural network algorithm |
CN108830812A (en) * | 2018-06-12 | 2018-11-16 | 福建帝视信息科技有限公司 | A kind of high frame per second of video based on network deep learning remakes method |
CN108830812B (en) * | 2018-06-12 | 2021-08-31 | 福建帝视信息科技有限公司 | Video high frame rate reproduction method based on grid structure deep learning |
CN108810551A (en) * | 2018-06-20 | 2018-11-13 | Oppo(重庆)智能科技有限公司 | A kind of video frame prediction technique, terminal and computer storage media |
CN108961236B (en) * | 2018-06-29 | 2021-02-26 | 国信优易数据股份有限公司 | Circuit board defect detection method and device |
CN108961236A (en) * | 2018-06-29 | 2018-12-07 | 国信优易数据有限公司 | Training method and device, the detection method and device of circuit board defect detection model |
CN110780664A (en) * | 2018-07-25 | 2020-02-11 | 格力电器(武汉)有限公司 | Robot control method and device and sweeping robot |
CN109068174A (en) * | 2018-09-12 | 2018-12-21 | 上海交通大学 | Video frame rate upconversion method and system based on cyclic convolution neural network |
CN109068174B (en) * | 2018-09-12 | 2019-12-27 | 上海交通大学 | Video frame rate up-conversion method and system based on cyclic convolution neural network |
CN109379550A (en) * | 2018-09-12 | 2019-02-22 | 上海交通大学 | Video frame rate upconversion method and system based on convolutional neural networks |
CN109120936A (en) * | 2018-09-27 | 2019-01-01 | 贺禄元 | A kind of coding/decoding method and device of video image |
US10924525B2 (en) | 2018-10-01 | 2021-02-16 | Microsoft Technology Licensing, Llc | Inducing higher input latency in multiplayer programs |
CN109360436A (en) * | 2018-11-02 | 2019-02-19 | Oppo广东移动通信有限公司 | A kind of video generation method, terminal and storage medium |
CN110163061A (en) * | 2018-11-14 | 2019-08-23 | 腾讯科技(深圳)有限公司 | For extracting the method, apparatus, equipment and computer-readable medium of video finger print |
CN110163061B (en) * | 2018-11-14 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Method, apparatus, device and computer readable medium for extracting video fingerprint |
CN111371983A (en) * | 2018-12-26 | 2020-07-03 | 清华大学 | Video online stabilization method and system |
CN113766313B (en) * | 2019-02-26 | 2024-03-05 | 深圳市商汤科技有限公司 | Video data processing method and device, electronic equipment and storage medium |
CN109922372A (en) * | 2019-02-26 | 2019-06-21 | 深圳市商汤科技有限公司 | Video data handling procedure and device, electronic equipment and storage medium |
CN109922372B (en) * | 2019-02-26 | 2021-10-12 | 深圳市商汤科技有限公司 | Video data processing method and device, electronic equipment and storage medium |
CN113766313A (en) * | 2019-02-26 | 2021-12-07 | 深圳市商汤科技有限公司 | Video data processing method and device, electronic equipment and storage medium |
CN113647064A (en) * | 2019-04-01 | 2021-11-12 | 株式会社电装 | Information processing apparatus |
CN113647064B (en) * | 2019-04-01 | 2022-12-27 | 株式会社电装 | Information processing apparatus |
CN110636221A (en) * | 2019-09-23 | 2019-12-31 | 天津天地人和企业管理咨询有限公司 | System and method for super frame rate of sensor based on FPGA |
CN112584158A (en) * | 2019-09-30 | 2021-03-30 | 复旦大学 | Video quality enhancement method and system |
WO2021104381A1 (en) * | 2019-11-27 | 2021-06-03 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method and device for stylizing video and storage medium |
CN113630621A (en) * | 2020-05-08 | 2021-11-09 | 腾讯科技(深圳)有限公司 | Video processing method, related device and storage medium |
US11889227B2 (en) | 2020-10-05 | 2024-01-30 | Samsung Electronics Co., Ltd. | Occlusion processing for frame rate conversion using deep learning |
RU2747965C1 (en) * | 2020-10-05 | 2021-05-18 | Самсунг Электроникс Ко., Лтд. | Frc occlusion processing with deep learning |
CN113516050A (en) * | 2021-05-19 | 2021-10-19 | 江苏奥易克斯汽车电子科技股份有限公司 | Scene change detection method and device based on deep learning |
CN113420771A (en) * | 2021-06-30 | 2021-09-21 | 扬州明晟新能源科技有限公司 | Colored glass detection method based on feature fusion |
CN113420771B (en) * | 2021-06-30 | 2024-04-19 | 扬州明晟新能源科技有限公司 | Colored glass detection method based on feature fusion |
Also Published As
Publication number | Publication date |
---|---|
CN106686472B (en) | 2019-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106686472A (en) | High-frame-rate video generation method and system based on deep learning | |
Kang et al. | Task-oriented image transmission for scene classification in unmanned aerial systems | |
CN109064507B (en) | Multi-motion-stream deep convolution network model method for video prediction | |
CN109271933B (en) | Method for estimating three-dimensional human body posture based on video stream | |
CN106503106B (en) | A kind of image hash index construction method based on deep learning | |
CN108229338A (en) | A kind of video behavior recognition methods based on depth convolution feature | |
CN105072373B (en) | Video super-resolution method and system based on bidirectional circulating convolutional network | |
CN107066445B (en) | The deep learning method of one attribute emotion word vector | |
CN110096950A (en) | A kind of multiple features fusion Activity recognition method based on key frame | |
CN108830252A (en) | A kind of convolutional neural networks human motion recognition method of amalgamation of global space-time characteristic | |
CN110634108A (en) | Composite degraded live webcast video enhancement method based on element-cycle consistency countermeasure network | |
CN110135386B (en) | Human body action recognition method and system based on deep learning | |
TWI226193B (en) | Image segmentation method, image segmentation apparatus, image processing method, and image processing apparatus | |
CN104899921B (en) | Single-view videos human body attitude restoration methods based on multi-modal own coding model | |
CN106952271A (en) | A kind of image partition method handled based on super-pixel segmentation and EM/MPM | |
CN108111860B (en) | Video sequence lost frame prediction recovery method based on depth residual error network | |
CN107590518A (en) | A kind of confrontation network training method of multiple features study | |
CN108986166A (en) | A kind of monocular vision mileage prediction technique and odometer based on semi-supervised learning | |
CN107886169A (en) | A kind of multiple dimensioned convolution kernel method that confrontation network model is generated based on text image | |
CN113807318B (en) | Action recognition method based on double-flow convolutional neural network and bidirectional GRU | |
CN106709933B (en) | Motion estimation method based on unsupervised learning | |
CN111833590A (en) | Traffic signal lamp control method and device and computer readable storage medium | |
CN112257846A (en) | Neuron model, topology, information processing method, and retinal neuron | |
Kim et al. | Dynamic motion estimation and evolution video prediction network | |
Zhang et al. | Accurate and efficient event-based semantic segmentation using adaptive spiking encoder–decoder network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||