CN106686472A - High-frame-rate video generation method and system based on deep learning - Google Patents

High-frame-rate video generation method and system based on deep learning

Info

Publication number
CN106686472A
Authority
CN
China
Prior art keywords
frame
video
convolutional neural
rate
neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611241691.XA
Other languages
Chinese (zh)
Other versions
CN106686472B (en)
Inventor
王兴刚
罗浩
姜玉静
刘文予
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201611241691.XA priority Critical patent/CN106686472B/en
Publication of CN106686472A publication Critical patent/CN106686472A/en
Application granted granted Critical
Publication of CN106686472B publication Critical patent/CN106686472B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N 21/845 Structuring of content, e.g. decomposing content into time segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N 7/0127 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a high-frame-rate video generation method based on deep learning. The method includes the following steps: one or more original high-frame-rate video segments are used to generate a training sample set; multiple video frame subsets in the training sample set are used to train a dual-channel convolutional neural network model to obtain an optimized dual-channel convolutional neural network, where the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels; and the optimized dual-channel convolutional neural network generates, for any two adjacent video frames in a low-frame-rate video, a frame to be inserted between them, thereby producing a video whose frame rate is higher than that of the low-frame-rate video. The whole process is end to end, no post-processing of the video frames is needed, the frame-rate conversion effect is good, the synthesized video is highly fluent, and the method is robust to camera shake, scene changes, and similar problems arising during video shooting.

Description

High-frame-rate video generation method and system based on deep learning
Technical field
The invention belongs to the technical field of computer vision, and more particularly relates to a high-frame-rate video generation method and system based on deep learning.
Background technology
With the development of science and technology, the ways in which people obtain video have become more and more convenient. For hardware reasons, however, most video is captured by non-professional equipment, and its frame rate is typically only 24-30 fps. High-frame-rate video is smoother and provides a better visual experience. If people directly upload high-frame-rate video to the Internet, the increased traffic consumption also increases their costs. If low-frame-rate video is uploaded instead, frame loss inevitably occurs during transmission for network reasons, and the larger the video, the more likely this phenomenon becomes, so the video quality at the remote end cannot be effectively guaranteed, which greatly harms the user experience. It is therefore necessary to post-process the uploaded video at the remote end in a reasonable way, so that the video quality can meet people's needs and even further improve their experience.
Summary of the invention
In view of the above defects or improvement needs of the prior art, the present invention provides a high-frame-rate video generation method based on deep learning, whose object is to convert low-frame-rate video into high-frame-rate video, thereby solving the technical problem that frame loss of low-frame-rate video during network transmission degrades video quality and harms the user experience.
To achieve the above object, according to one aspect of the present invention, there is provided a high-frame-rate video generation method based on deep learning, comprising the following steps:
(1) generating a training sample set from one or more original high-frame-rate video segments, the training sample set comprising multiple video frame subsets, each video frame subset containing two training frames and one reference frame, the two training frames being two video frames separated by one or more frames in a high-frame-rate video segment, and the reference frame being any one of the frames in the interval between the two training frames; the frame rate of the high-frame-rate video segments is higher than a set frame-rate threshold;
(2) training a dual-channel convolutional neural network model with the multiple video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames in a video frame subset as input and separately convolve the input frames, the model fuses the convolution results of the two convolutional channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the reference frame in the video frame subset;
(3) using the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames in a low-frame-rate video, a frame to be inserted between them, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
In one embodiment of the present invention, each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and the mathematical description of each convolutional layer is:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the i-th convolution operation, W_i denotes the convolution kernel parameters of layer i, and B_i denotes the bias parameters of layer i.
In one embodiment of the present invention, a ReLU activation layer is connected after each of the first k-1 convolutional layers of each channel to keep the network sparse; its mathematical description is:
F_i(Y) = max(0, Z_i).
In one embodiment of the present invention, the feature response maps obtained for the two video frames after the last convolutional layer are fused by adding the values at corresponding positions.
In one embodiment of the present invention, the feature response map obtained by the fusion operation is followed by a Sigmoid activation layer to map the pixel values of the picture into the interval 0-1; its mathematical description is:
F_i(Y) = 1 / (1 + exp(-Z_i)).
In one embodiment of the present invention, the convolution kernel parameters are initialized with a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, and the base learning rate is initialized to 1e-6; after m iteration cycles the base learning rate is reduced by a factor of 10, where m is a preset value.
In one embodiment of the present invention, training the dual-channel convolutional neural network model by regression of the predicted frame against the reference frame in the video frame subset is specifically:
training the dual-channel convolutional neural network with the error backpropagation algorithm, using the error between the predicted frame and the reference frame; the least-squares error is adopted as the optimization objective, and its mathematical description is:
L = (1/n) * Σ_{i=1}^{n} || Ŷ_i - Y_i ||^2
where i denotes the i-th sample picture, n denotes the size of the training set, Y_i denotes the video frame predicted by the network, and Ŷ_i denotes the ground-truth value of the corresponding video frame.
In one embodiment of the present invention, k is 3; the first convolutional layer has 64 convolution kernels of size 9*9 with a stride of 1 pixel and a padding of 4, where the padding is the number of rings of zeros added around the feature map; the second convolutional layer has 32 convolution kernels of size 1*1 with a stride of 1 pixel and a padding of 0; and the third convolutional layer has 3 convolution kernels of size 5*5 with a stride of 1 and a padding of 2.
According to another aspect of the present invention, there is also provided a high-frame-rate video generation system based on deep learning, comprising a training sample set generation module, a dual-channel convolutional neural network optimization module, and a high-frame-rate video generation module, wherein:
the training sample set generation module is configured to generate a training sample set from one or more high-frame-rate video segments, the training sample set comprising multiple video frame subsets, each video frame subset containing two training frames and one reference frame, the two training frames being two video frames separated by one or more frames in a high-frame-rate video segment, and the reference frame being any one of the frames in the interval between the two training frames; the frame rate of the high-frame-rate video segments is higher than a set frame-rate threshold;
the dual-channel convolutional neural network optimization module is configured to train a dual-channel convolutional neural network model with the multiple video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two channels, the two channels respectively receive the two video frames in a video frame subset as input and separately convolve the input frames, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the reference frame in the video frame subset;
the high-frame-rate video generation module is configured to use the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames in a low-frame-rate video, a frame to be inserted between them, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
In one embodiment of the present invention, each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and the mathematical description of each convolutional layer is:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the i-th convolution operation, W_i denotes the convolution kernel parameters of layer i, and B_i denotes the bias parameters of layer i.
In general, compared with the prior art, the above technical solutions conceived by the present invention achieve the following technical effects:
(1) the feature extraction and frame prediction of the present invention are obtained by supervised learning on training samples, without manual intervention, and spatial difference information can be fitted well in large-scale-data scenarios;
(2) the whole process of the present invention is end to end; it uses the self-learning ability of convolutional neural networks to learn the model parameters, it is concise and efficient, and it overcomes the drawbacks of conventional frame-rate conversion techniques, which are time-consuming, labor-intensive, and of limited effect.
Description of the drawings
Fig. 1 is a flow chart of the deep-learning-based video frame rate conversion method of the present invention, where F_i denotes the output of layer i, Y_{t-1}, Y_t, and Y_{t+1} denote three consecutive video frames, Y_t is used as the ground truth for computing the error, and Prediction denotes the video frame predicted by the network.
Detailed description of the embodiments
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it. In addition, the technical features involved in the embodiments of the invention described below can be combined with each other as long as they do not conflict.
The technical terms of the present invention are first explained and illustrated below:
Convolutional neural network (Convolutional Neural Network, CNN): a neural network that can be used for tasks such as image classification and regression. Its particularity is embodied in two aspects: on the one hand, the connections between neurons are not fully connected; on the other hand, the weights of connections between some neurons in the same layer are shared. The network generally consists of convolutional layers, pooling layers, and fully connected layers. The convolutional and pooling layers are responsible for extracting hierarchical features of the image, and the fully connected layers classify or regress on the extracted features. The parameters of the network include the convolution kernels and the parameters and biases of the fully connected layers, and they can be learned from data by the backpropagation algorithm.
Backpropagation algorithm (Backpropagation Algorithm, BP): a common method for training artificial neural networks, used in combination with an optimization method (such as gradient descent). The method computes the gradient of the loss function with respect to all the weights in the network, and this gradient is fed back to the optimization method to update the weights so as to minimize the loss function. The algorithm mainly includes two phases: forward propagation of the stimulus, and backpropagation of the error with the corresponding weight update.
With the arrival of the big-data era, the scale of video databases keeps growing, and solving this problem becomes more and more urgent. Deep neural networks can analyze data in a way that simulates the human brain, and in recent years deep learning has been applied successfully in every field of computer vision; the problem of video frame rate conversion, however, has not been studied much. Considering that traditional video frame rate conversion methods involve complex processing and high time and labor costs, the present invention proposes a video frame rate conversion method based on deep learning. The whole process of the method is end to end, simple and efficient, and it is robust to problems such as camera shake and scene changes in the video.
As shown in Fig. 1, the deep-learning-based video frame rate conversion method of the present invention may comprise the following steps:
(1) generating a training sample set from one or more original high-frame-rate video segments, the training sample set comprising multiple video frame subsets, each video frame subset containing two training frames and one reference frame, the two training frames being two video frames separated by one or more frames in a high-frame-rate video segment, and the reference frame being any one of the frames in the interval between the two training frames; the frame rate of the high-frame-rate video segments is higher than a set frame-rate threshold;
Specifically, video frame sets can be extracted from high-frame-rate video segments, and the training sample set is obtained from them in a certain proportion;
The training sample set is composed of multiple video frame subsets, each containing two training frames and one reference frame. The reference frame is chosen as the middle frame of the two training frames, or the frame closest to the middle. In the typical case, three consecutive frames are taken: the middle frame is the reference frame and the other two frames are the training frames. If the frame rate is high enough, two frames separated by several frames (how many depends on the frame rate; the separation cannot be too large) can also be taken as training frames, and any frame in the interval between them can be chosen as the reference frame. For example, if the frame rate of a training video is 60 and the video has N frames, then, sampling with an interval of one frame, a frame is taken at random from the 2nd to the (N-1)-th frame as the ground truth (reference frame), and the two frames adjacent to it are fed into the network as the training sample (the two training frames). In the same way, samples can also be drawn with an interval of several frames, which makes the model usable for videos of lower frame rate, i.e., for converting lower-frame-rate video into high-frame-rate video.
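As an illustration, the following is a minimal Python sketch of the sampling just described; OpenCV is used for frame reading, and the function name and the gap and num_samples parameters are our assumptions rather than part of the patent:

```python
import random
import cv2

def extract_triplets(video_path, gap=1, num_samples=1000):
    """Sample (previous training frame, ground-truth frame, next training
    frame) triplets from a high-frame-rate video. With gap=1 these are
    three consecutive frames; a larger gap emulates training for
    lower-frame-rate input, as described above."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    ok, frame = cap.read()
    while ok:
        frames.append(frame)
        ok, frame = cap.read()
    cap.release()

    triplets = []
    for _ in range(num_samples):
        # t indexes the ground-truth frame; the two training frames sit
        # `gap` frames before and after it. For simplicity the middle
        # frame is always used as ground truth here, although the text
        # above allows any frame in the interval.
        t = random.randint(gap, len(frames) - gap - 1)
        triplets.append((frames[t - gap], frames[t], frames[t + gap]))
    return triplets
```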
(2) training a dual-channel convolutional neural network model with the multiple video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames in a video frame subset as input and separately convolve the input frames, the model fuses the convolution results of the two convolutional channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the reference frame in the video frame subset;
First, a dual-channel convolutional neural network has to be designed and implemented, specifically:
The dual-channel convolutional neural network model that is set up is a convolutional neural network formed by fusing two convolutional channels; each channel contains k convolutional layers, k > 0, preferably 3, and separately convolves one of the two video frame pictures (training frames). The first convolutional layer has 64 convolution kernels of size 9*9 with a stride of 1 pixel and a padding of 4, where the padding is the number of rings of zeros added around the feature map. The second convolutional layer has 32 convolution kernels of size 1*1 with a stride of 1 pixel and a padding of 0. The third convolutional layer has 3 convolution kernels of size 5*5 with a stride of 1 and a padding of 2. The mathematical description of a convolutional layer is:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input picture is layer 0), * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the i-th convolution operation, W_i denotes the convolution kernel parameters of layer i, and B_i denotes the bias parameters of layer i;
Among the 3 convolutional layers, a ReLU activation layer is connected after the 1st and 2nd convolutional layers to keep the network sparse; its mathematical description is:
F_i(Y) = max(0, Z_i).
The feature response maps obtained for the two video frame pictures after the 3rd convolutional layer are fused by adding the values at corresponding positions;
After the fusion operation, the resulting feature response map is followed by a Sigmoid activation layer to map the pixel values of the picture into the interval 0-1; its mathematical description is:
F_i(Y) = 1 / (1 + exp(-Z_i)).
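The following is a minimal sketch of this dual-channel architecture in PyTorch, following the layer sizes given above (64 9*9 kernels with padding 4, 32 1*1 kernels with padding 0, 3 5*5 kernels with padding 2, ReLU after the first two layers, element-wise addition of the two channels' responses, then a sigmoid). The patent names no framework, and the class and variable names are ours:

```python
import torch
import torch.nn as nn

class Channel(nn.Module):
    """One convolutional channel; both channels share this structure."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=9, stride=1, padding=4),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1, stride=1, padding=0),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, kernel_size=5, stride=1, padding=2),
        )

    def forward(self, x):
        return self.body(x)

class DualChannelCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.channel_a = Channel()
        self.channel_b = Channel()

    def forward(self, frame_prev, frame_next):
        # Fuse the two channels' feature responses by adding the values
        # at corresponding positions, then map the result into (0, 1)
        # with a sigmoid, as described above.
        fused = self.channel_a(frame_prev) + self.channel_b(frame_next)
        return torch.sigmoid(fused)
```

With stride 1 and these paddings, every layer preserves the spatial size of the input, so the predicted frame has the same resolution as the input frames.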
Before training the dual-channel convolutional neural network, each pixel value in the video frames needs to be divided by 255 for normalization, so that the normalized pixel values lie between 0 and 1;
Also, before training the dual-channel convolutional neural network, the network parameters need to be initialized: the convolution kernel parameters are initialized with a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, and the base learning rate is initialized to 1e-6; after m iteration cycles the base learning rate is reduced by a factor of 10, where m is a preset value. For example, with m preferably 2, the learning rate is 1e-6 during the first m iteration cycles; after the m-th iteration cycle the learning rate is 1e-7 and remains constant.
Specifically, the error between the prediction of the network and the reference can be used to train the dual-channel convolutional neural network with the error backpropagation algorithm. The least-squares error is adopted as the optimization objective, and its mathematical description is:
L = (1/n) * Σ_{i=1}^{n} || Ŷ_i - Y_i ||^2
where i denotes the i-th sample picture, n denotes the size of the training set, Y_i denotes the video frame predicted by the network, and Ŷ_i denotes the ground-truth value of the corresponding video frame;
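A minimal training sketch under these hyperparameters follows, again in PyTorch; plain SGD and a loader yielding (previous frame, ground-truth frame, next frame) batches are assumptions, since the patent specifies only error backpropagation with a least-squares objective:

```python
import torch
import torch.nn as nn

def init_weights(module):
    # Gaussian initialization (mean 0, std 1) for the convolution
    # kernels and zero biases, as described above.
    if isinstance(module, nn.Conv2d):
        nn.init.normal_(module.weight, mean=0.0, std=1.0)
        nn.init.zeros_(module.bias)

def train(model, loader, m=2, epochs=10):
    model.apply(init_weights)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-6)
    criterion = nn.MSELoss()  # least-squares error between prediction and ground truth
    for epoch in range(epochs):
        for frame_prev, frame_truth, frame_next in loader:
            pred = model(frame_prev, frame_next)
            loss = criterion(pred, frame_truth)
            optimizer.zero_grad()
            loss.backward()   # error backpropagation
            optimizer.step()
        if epoch + 1 == m:
            # Single 10x reduction of the base learning rate after m
            # cycles; it is then held constant.
            for group in optimizer.param_groups:
                group["lr"] *= 0.1
    return model
```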
(3) using the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames in a low-frame-rate video, a frame to be inserted between them, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
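As an illustration, a minimal inference sketch that interleaves predicted frames with the original frames to roughly double the frame rate; the function name and tensor layout are our assumptions, and the frames are assumed normalized to [0, 1] as described earlier:

```python
import torch

@torch.no_grad()
def double_frame_rate(model, frames):
    """frames: list of (1, 3, H, W) tensors in temporal order."""
    model.eval()
    output = [frames[0]]
    for frame_prev, frame_next in zip(frames, frames[1:]):
        inserted = model(frame_prev, frame_next)  # predicted in-between frame
        output.extend([inserted, frame_next])
    return output
```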
It will be readily understood by those skilled in the art that the above description covers only preferred embodiments of the present invention and is not intended to limit the present invention; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (10)

1. A high-frame-rate video generation method based on deep learning, characterized in that the method comprises the following steps:
(1) generating a training sample set from one or more original high-frame-rate video segments, the training sample set comprising multiple video frame subsets, each video frame subset containing two training frames and one reference frame, the two training frames being two video frames separated by one or more frames in a high-frame-rate video segment, and the reference frame being any one of the frames in the interval between the two training frames, the frame rate of the high-frame-rate video segments being higher than a set frame-rate threshold;
(2) training a dual-channel convolutional neural network model with the multiple video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network, wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames in a video frame subset as input and separately convolve the input frames, the model fuses the convolution results of the two convolutional channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the reference frame in the video frame subset;
(3) using the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames in a low-frame-rate video, a frame to be inserted between them, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
2. The high-frame-rate video generation method based on deep learning as claimed in claim 1, characterized in that each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and the mathematical description of each convolutional layer is:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the i-th convolution operation, W_i denotes the convolution kernel parameters of layer i, and B_i denotes the bias parameters of layer i.
3. The high-frame-rate video generation method based on deep learning as claimed in claim 2, characterized in that, in each convolutional channel, a ReLU activation layer is connected after each of the first k-1 convolutional layers to keep the network sparse, and its mathematical description is:
F_i(Y) = max(0, Z_i).
4. The high-frame-rate video generation method based on deep learning as claimed in claim 1 or 2, characterized in that the feature response maps obtained for the two video frames after the last convolutional layer are fused by adding the values at corresponding positions.
5. The high-frame-rate video generation method based on deep learning as claimed in claim 1 or 2, characterized in that the feature response map obtained by the fusion operation is followed by a Sigmoid activation layer to map the pixel values of the picture into the interval 0-1, and its mathematical description is:
F_i(Y) = 1 / (1 + exp(-Z_i)).
6. The high-frame-rate video generation method based on deep learning as claimed in claim 2, characterized in that the convolution kernel parameters are initialized with a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, the base learning rate is initialized to 1e-6, and the base learning rate is reduced by a factor of 10 after m iteration cycles, where m is a preset value.
7. The high-frame-rate video generation method based on deep learning as claimed in claim 1 or 2, characterized in that training the dual-channel convolutional neural network model by regression of the predicted frame against the reference frame in the video frame subset is specifically:
training the dual-channel convolutional neural network with the error backpropagation algorithm, using the error between the predicted frame and the reference frame, where the least-squares error is adopted as the optimization objective and its mathematical description is:
L = (1/n) * Σ_{i=1}^{n} || Ŷ_i - Y_i ||^2
where i denotes the i-th sample picture, n denotes the size of the training set, Y_i denotes the video frame predicted by the network, and Ŷ_i denotes the ground-truth value of the corresponding video frame.
8. The high-frame-rate video generation method based on deep learning as claimed in claim 2, characterized in that k is 3; the first convolutional layer has 64 convolution kernels of size 9*9 with a stride of 1 pixel and a padding of 4, where the padding is the number of rings of zeros added around the feature map; the second convolutional layer has 32 convolution kernels of size 1*1 with a stride of 1 pixel and a padding of 0; and the third convolutional layer has 3 convolution kernels of size 5*5 with a stride of 1 and a padding of 2.
9. A high-frame-rate video generation system based on deep learning, characterized by comprising a training sample set generation module, a dual-channel convolutional neural network optimization module, and a high-frame-rate video generation module, wherein:
the training sample set generation module is configured to generate a training sample set from one or more high-frame-rate video segments, the training sample set comprising multiple video frame subsets, each video frame subset containing two training frames and one reference frame, the two training frames being two video frames separated by one or more frames in a high-frame-rate video segment, and the reference frame being any one of the frames in the interval between the two training frames, the frame rate of the high-frame-rate video segments being higher than a set frame-rate threshold;
the dual-channel convolutional neural network optimization module is configured to train a dual-channel convolutional neural network model with the multiple video frame subsets in the training sample set to obtain an optimized dual-channel convolutional neural network, wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two channels, the two channels respectively receive the two video frames in a video frame subset as input and separately convolve the input frames, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the reference frame in the video frame subset;
the high-frame-rate video generation module is configured to use the optimized dual-channel convolutional neural network to generate, for any two adjacent video frames in a low-frame-rate video, a frame to be inserted between them, thereby generating a video whose frame rate is higher than that of the low-frame-rate video.
10. The high-frame-rate video generation system based on deep learning as claimed in claim 9, characterized in that each convolutional channel in the dual-channel convolutional neural network model comprises k convolutional layers, where k > 0, and the mathematical description of each convolutional layer is:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes the convolution operation, F_{i-1} denotes the output of layer i-1, Z_i(Y) denotes the output of the i-th convolution operation, W_i denotes the convolution kernel parameters of layer i, and B_i denotes the bias parameters of layer i.
CN201611241691.XA 2016-12-29 2016-12-29 High-frame-rate video generation method and system based on deep learning Active CN106686472B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611241691.XA CN106686472B (en) 2016-12-29 2016-12-29 High-frame-rate video generation method and system based on deep learning


Publications (2)

Publication Number Publication Date
CN106686472A 2017-05-17
CN106686472B (en) 2019-04-26

Family

ID=58872327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611241691.XA Active CN106686472B (en) High-frame-rate video generation method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN106686472B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202285412U (en) * 2011-09-02 2012-06-27 深圳市华美特科技有限公司 Low frame rate transmission or motion image twinkling elimination system
CN104102919A (en) * 2014-07-14 2014-10-15 同济大学 Image classification method capable of effectively preventing convolutional neural network from being overfit
CN105787510A (en) * 2016-02-26 2016-07-20 华东理工大学 System and method for realizing subway scene classification based on deep learning
CN106022237A (en) * 2016-05-13 2016-10-12 电子科技大学 Pedestrian detection method based on end-to-end convolutional neural network
CN106228124A (en) * 2016-07-17 2016-12-14 西安电子科技大学 SAR image object detection method based on convolutional neural networks

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10491941B2 (en) 2015-01-22 2019-11-26 Microsoft Technology Licensing, Llc Predictive server-side rendering of scenes
CN107481209A (en) * 2017-08-21 2017-12-15 北京航空航天大学 A kind of image or video quality Enhancement Method based on convolutional neural networks
CN107481209B (en) * 2017-08-21 2020-04-21 北京航空航天大学 Image or video quality enhancement method based on convolutional neural network
CN107613299A (en) * 2017-09-29 2018-01-19 杭州电子科技大学 A kind of method for improving conversion effect in frame rate using network is generated
CN107886081A (en) * 2017-11-23 2018-04-06 武汉理工大学 Two-way U Net deep neural network mine down-holes hazardous act is intelligently classified discrimination method
CN108111860A (en) * 2018-01-11 2018-06-01 安徽优思天成智能科技有限公司 Video sequence lost frames prediction restoration methods based on depth residual error network
CN108111860B (en) * 2018-01-11 2020-04-14 安徽优思天成智能科技有限公司 Video sequence lost frame prediction recovery method based on depth residual error network
CN108322685A (en) * 2018-01-12 2018-07-24 广州华多网络科技有限公司 Video frame interpolation method, storage medium and terminal
CN108322685B (en) * 2018-01-12 2020-09-25 广州华多网络科技有限公司 Video frame insertion method, storage medium and terminal
WO2019137248A1 (en) * 2018-01-12 2019-07-18 广州华多网络科技有限公司 Video frame interpolation method, storage medium and terminal
CN108600655A (en) * 2018-04-12 2018-09-28 视缘(上海)智能科技有限公司 A kind of video image synthetic method and device
CN108600762A (en) * 2018-04-23 2018-09-28 中国科学技术大学 In conjunction with the progressive video frame generating method of motion compensation and neural network algorithm
CN108600762B (en) * 2018-04-23 2020-05-15 中国科学技术大学 Progressive video frame generation method combining motion compensation and neural network algorithm
CN108830812A (en) * 2018-06-12 2018-11-16 福建帝视信息科技有限公司 A kind of high frame per second of video based on network deep learning remakes method
CN108830812B (en) * 2018-06-12 2021-08-31 福建帝视信息科技有限公司 Video high frame rate reproduction method based on grid structure deep learning
CN108810551A (en) * 2018-06-20 2018-11-13 Oppo(重庆)智能科技有限公司 A kind of video frame prediction technique, terminal and computer storage media
CN108961236B (en) * 2018-06-29 2021-02-26 国信优易数据股份有限公司 Circuit board defect detection method and device
CN108961236A (en) * 2018-06-29 2018-12-07 国信优易数据有限公司 Training method and device, the detection method and device of circuit board defect detection model
CN110780664A (en) * 2018-07-25 2020-02-11 格力电器(武汉)有限公司 Robot control method and device and sweeping robot
CN109068174A (en) * 2018-09-12 2018-12-21 上海交通大学 Video frame rate upconversion method and system based on cyclic convolution neural network
CN109068174B (en) * 2018-09-12 2019-12-27 上海交通大学 Video frame rate up-conversion method and system based on cyclic convolution neural network
CN109379550A (en) * 2018-09-12 2019-02-22 上海交通大学 Video frame rate upconversion method and system based on convolutional neural networks
CN109120936A (en) * 2018-09-27 2019-01-01 贺禄元 A kind of coding/decoding method and device of video image
US10924525B2 (en) 2018-10-01 2021-02-16 Microsoft Technology Licensing, Llc Inducing higher input latency in multiplayer programs
CN109360436A (en) * 2018-11-02 2019-02-19 Oppo广东移动通信有限公司 A kind of video generation method, terminal and storage medium
CN110163061A (en) * 2018-11-14 2019-08-23 腾讯科技(深圳)有限公司 For extracting the method, apparatus, equipment and computer-readable medium of video finger print
CN110163061B (en) * 2018-11-14 2023-04-07 腾讯科技(深圳)有限公司 Method, apparatus, device and computer readable medium for extracting video fingerprint
CN111371983A (en) * 2018-12-26 2020-07-03 清华大学 Video online stabilization method and system
CN113766313B (en) * 2019-02-26 2024-03-05 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
CN109922372A (en) * 2019-02-26 2019-06-21 深圳市商汤科技有限公司 Video data handling procedure and device, electronic equipment and storage medium
CN109922372B (en) * 2019-02-26 2021-10-12 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
CN113766313A (en) * 2019-02-26 2021-12-07 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
CN113647064A (en) * 2019-04-01 2021-11-12 株式会社电装 Information processing apparatus
CN113647064B (en) * 2019-04-01 2022-12-27 株式会社电装 Information processing apparatus
CN110636221A (en) * 2019-09-23 2019-12-31 天津天地人和企业管理咨询有限公司 System and method for super frame rate of sensor based on FPGA
CN112584158A (en) * 2019-09-30 2021-03-30 复旦大学 Video quality enhancement method and system
WO2021104381A1 (en) * 2019-11-27 2021-06-03 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and device for stylizing video and storage medium
CN113630621A (en) * 2020-05-08 2021-11-09 腾讯科技(深圳)有限公司 Video processing method, related device and storage medium
US11889227B2 (en) 2020-10-05 2024-01-30 Samsung Electronics Co., Ltd. Occlusion processing for frame rate conversion using deep learning
RU2747965C1 (en) * 2020-10-05 2021-05-18 Самсунг Электроникс Ко., Лтд. Frc occlusion processing with deep learning
CN113516050A (en) * 2021-05-19 2021-10-19 江苏奥易克斯汽车电子科技股份有限公司 Scene change detection method and device based on deep learning
CN113420771A (en) * 2021-06-30 2021-09-21 扬州明晟新能源科技有限公司 Colored glass detection method based on feature fusion
CN113420771B (en) * 2021-06-30 2024-04-19 扬州明晟新能源科技有限公司 Colored glass detection method based on feature fusion

Also Published As

Publication number Publication date
CN106686472B (en) 2019-04-26

Similar Documents

Publication Publication Date Title
CN106686472A (en) High-frame-rate video generation method and system based on depth learning
Kang et al. Task-oriented image transmission for scene classification in unmanned aerial systems
CN109064507B (en) Multi-motion-stream deep convolution network model method for video prediction
CN109271933B (en) Method for estimating three-dimensional human body posture based on video stream
CN106503106B (en) A kind of image hash index construction method based on deep learning
CN108229338A (en) A kind of video behavior recognition methods based on depth convolution feature
CN105072373B (en) Video super-resolution method and system based on bidirectional circulating convolutional network
CN107066445B (en) The deep learning method of one attribute emotion word vector
CN110096950A (en) A kind of multiple features fusion Activity recognition method based on key frame
CN108830252A (en) A kind of convolutional neural networks human motion recognition method of amalgamation of global space-time characteristic
CN110634108A (en) Composite degraded live webcast video enhancement method based on element-cycle consistency countermeasure network
CN110135386B (en) Human body action recognition method and system based on deep learning
TWI226193B (en) Image segmentation method, image segmentation apparatus, image processing method, and image processing apparatus
CN104899921B (en) Single-view videos human body attitude restoration methods based on multi-modal own coding model
CN106952271A (en) A kind of image partition method handled based on super-pixel segmentation and EM/MPM
CN108111860B (en) Video sequence lost frame prediction recovery method based on depth residual error network
CN107590518A (en) A kind of confrontation network training method of multiple features study
CN108986166A (en) A kind of monocular vision mileage prediction technique and odometer based on semi-supervised learning
CN107886169A (en) A kind of multiple dimensioned convolution kernel method that confrontation network model is generated based on text image
CN113807318B (en) Action recognition method based on double-flow convolutional neural network and bidirectional GRU
CN106709933B (en) Motion estimation method based on unsupervised learning
CN111833590A (en) Traffic signal lamp control method and device and computer readable storage medium
CN112257846A (en) Neuron model, topology, information processing method, and retinal neuron
Kim et al. Dynamic motion estimation and evolution video prediction network
Zhang et al. Accurate and efficient event-based semantic segmentation using adaptive spiking encoder–decoder network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant