CN108257105A - Optical flow estimation and denoising joint learning deep network model for video images - Google Patents
Optical flow estimation and denoising joint learning deep network model for video images
- Publication number
- CN108257105A CN108257105A CN201810081519.5A CN201810081519A CN108257105A CN 108257105 A CN108257105 A CN 108257105A CN 201810081519 A CN201810081519 A CN 201810081519A CN 108257105 A CN108257105 A CN 108257105A
- Authority
- CN
- China
- Prior art keywords
- optical flow
- image
- denoising
- module
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The present invention discloses an optical flow estimation and denoising joint learning deep network model for video images, belonging to the field of image processing. The model comprises a preprocessing module, an optical flow estimation module and a denoising module, each using an Encoder-Decoder network structure. The preprocessing module is first trained alone on a sample data set; the relevant parameters of the preprocessing module are then fixed while the preprocessing module and the optical flow estimation module are trained together; finally, the relevant parameters of the preprocessing module and the optical flow estimation module are fixed and the complete deep network model containing all three modules is trained. The trained deep network model can perform optical flow estimation and denoising directly on noisy video images. With the joint learning deep network model proposed by the present invention, optical flow estimation and denoising are fast and accurate, making it convenient to process large volumes of video images quickly in practice.
Description
Technical field
The present invention relates to the field of image processing, and specifically to an optical flow estimation and denoising joint learning deep network model for video images.
Background technology
Video images suffer from noise at every stage of acquisition, compression, storage and transmission. Noise significantly degrades the visual quality of video images and complicates subsequent intelligent analysis tasks such as target recognition and tracking. It is therefore necessary to remove the noise in video images while preserving the video information, so as to improve the signal-to-noise ratio and the visual quality.
Because video images are temporally correlated, optical flow estimation and video denoising can be combined to obtain better denoising results. Existing joint optical flow estimation and video denoising algorithms, however, require a large number of iterative computations, consuming considerable computing resources and time, which makes them inconvenient to apply in practice; moreover, the optical flow estimation itself is easily disturbed by video noise, which in turn degrades the denoising result. Proposing a fast and effective joint optical flow estimation and video denoising algorithm is therefore an urgent problem in the field of video image processing.
Summary of the invention
To overcome the shortcomings described above, the present invention provides an optical flow estimation and denoising joint learning deep network model for video images, which uses a deep network model to jointly learn optical flow estimation and video denoising from a large number of training samples, so as to solve the problems of low optical flow estimation accuracy, poor denoising performance and long processing time in the prior art.
To solve the above technical problems, the technical solution proposed by the present invention is as follows:
An optical flow estimation and denoising joint learning deep network model for video images, characterized in that: the joint deep learning network model comprises three modules, namely a preprocessing module, an optical flow estimation module and a denoising module. The deep network model is first trained with a sample data set. Then, for the input noise images im_n1 and im_n2, preliminary denoising is performed by the preprocessing module to obtain the preprocessed image pair im_p1 and im_p2; the optical flow estimation module performs motion estimation on the image pair im_p1 and im_p2 to obtain the optical flow estimation result flow; the noise image im_n2 is transformed according to the optical flow estimation result flow to obtain the image im_n2'; the image im_n2' and the noise image im_n1 are then used as the input images of the denoising module to obtain the final denoised image im_dn corresponding to the noise image im_n1.
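The data flow just described can be summarized by the following sketch. It is a minimal illustration written with PyTorch, assuming the three modules are generic callable networks and that the transformation of im_n2 according to the flow is a bilinear backward warp; the patent itself is implemented in Caffe, so the function names, the warping routine and the tensor layout shown here are assumptions for illustration, not the patented implementation.

```python
import torch
import torch.nn.functional as F

def warp_by_flow(frame, flow):
    """Warp `frame` (N,C,H,W) towards the reference frame using `flow` (N,2,H,W).

    Bilinear sampling via grid_sample; this is one common way to realize the
    "transform im_n2 according to flow" step and is an assumption here.
    """
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(frame.device)   # (2,H,W) pixel grid
    coords = base.unsqueeze(0) + flow                               # follow the flow vectors
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0                   # normalize to [-1, 1]
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((coords_x, coords_y), dim=-1)                # (N,H,W,2)
    return F.grid_sample(frame, grid, mode="bilinear", align_corners=True)

def joint_forward(preprocess, flow_net, denoise, im_n1, im_n2):
    """Forward pass of the three-module model on two adjacent noisy frames."""
    im_p1, im_p2 = preprocess(im_n1), preprocess(im_n2)   # preliminary denoising
    flow = flow_net(torch.cat((im_p1, im_p2), dim=1))     # optical flow estimate
    im_n2_warped = warp_by_flow(im_n2, flow)              # im_n2' aligned to im_n1
    im_dn = denoise(torch.cat((im_n1, im_n2_warped), dim=1))
    return flow, im_dn
```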
The input noise images im_n1 and im_n2 are two adjacent frames of a noisy video.
The sample data set contains no fewer than 20000 samples, where each sample comprises two adjacent noise image frames n1 and n2 from a video, the corresponding clean reference images p1 and p2 of the noise images n1 and n2, and the optical flow estimation result f corresponding to the image pair p1 and p2.
The specific training method of the deep network model, using the corresponding data in the sample data set, is as follows: the preprocessing module is first trained alone; then the relevant parameters of the preprocessing module are fixed while the preprocessing module and the optical flow estimation module are trained together; finally, the relevant parameters of the preprocessing module and the optical flow estimation module are fixed and the complete deep network model containing all three modules is trained.
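A minimal sketch of this staged training schedule is given below, assuming PyTorch-style parameter freezing with requires_grad. The three modules are lightweight stand-ins here (their real Encoder-Decoder structure is described later), and the loss functions and data loading are omitted; the patent trains in Caffe, so this is only an illustration of the staging.

```python
import torch
import torch.nn as nn

# Stand-in modules for the three Encoder-Decoder networks (illustrative only;
# the real structure is described in the embodiment below).
preprocess = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
flow_net   = nn.Sequential(nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 2, 3, padding=1))
denoise    = nn.Sequential(nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))

def set_trainable(module, flag):
    for p in module.parameters():
        p.requires_grad_(flag)

# Stage 1: train the preprocessing module alone on (noisy frame, clean frame) pairs.
stage1_params = list(preprocess.parameters())

# Stage 2: fix the preprocessing parameters and train the combined
# preprocessing + optical flow network; only the flow module is updated.
set_trainable(preprocess, False)
stage2_params = list(flow_net.parameters())

# Stage 3: fix the preprocessing and flow parameters and train the whole
# three-module model; only the denoising module is updated.
set_trainable(flow_net, False)
stage3_params = list(denoise.parameters())

# Each stage runs an ordinary optimization loop over the sample data set,
# e.g. with the ADAGRAD optimizer mentioned in the embodiment:
#   opt = torch.optim.Adagrad(stage1_params, lr=0.01)
```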
The preprocessing module, the optical flow estimation module and the denoising module all use the Encoder-Decoder network structure.
The Encoder-Decoder network structure consists of an encoding (Encoder) part and a decoding (Decoder) part. The Encoder part contains M convolutional layers; the Decoder part contains M sub-networks, each of which contains 1 deconvolutional layer and N convolutional layers. When each sub-network layer of the Decoder part performs deconvolution, it calls the image features of the corresponding convolutional layer in the Encoder part, and the output of the previous layer serves as the input of the next layer.
The deep network model is trained using the Caffe deep learning framework.
Advantageous effects of the present invention:
1) The optical flow estimation and denoising joint learning deep network model designed by the present invention can simultaneously solve the optical flow estimation and denoising problems of noisy video in practice. Compared with prior-art joint optical flow estimation and video denoising algorithms based on iterative computation, its optical flow estimation accuracy is higher and the denoising result aided by the optical flow estimation is better; moreover, once the joint learning deep network model has been trained, optical flow estimation and denoising are very fast, which makes it convenient to process large volumes of video images quickly in practice.
2) For the joint learning deep network model, the present invention first trains the individual network modules separately and then trains the whole network model, which effectively reduces the number of network parameters involved in training and prevents the network model from overfitting.
3) The method of the present invention uses a deep learning model to automatically learn the image features for optical flow estimation and image denoising, performing optical flow estimation and image denoising end to end without the assistance of motion boundary estimation; the Encoder-Decoder deep network model it adopts can fully exploit the multi-dimensional features of the input images, which improves both optical flow estimation and denoising.
Description of the drawings
Fig. 1 is a structural diagram of the joint learning deep network model of the present invention;
Fig. 2 shows two adjacent noise image frames n1 and n2 from a video in the sample data set;
Fig. 3 shows the clean reference images p1 and p2 corresponding to the noise images n1 and n2;
Fig. 4 shows the optical flow estimation result f corresponding to the image pair p1 and p2 in the sample data set;
Fig. 5 is a structural diagram of the Encoder-Decoder network;
Fig. 6 is a structural diagram of a sub-network in the Encoder-Decoder network;
Fig. 7 shows two adjacent noise image frames im_n1 and im_n2 from the video to be processed;
Fig. 8 shows the optical flow estimation result corresponding to the noise images im_n1 and im_n2;
Fig. 9 shows the denoising result for the noise image im_n1.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
As shown in Fig. 1, the optical flow estimation and denoising joint learning deep network model for video images provided in this embodiment comprises three modules: a preprocessing module, an optical flow estimation module and a denoising module. A sample data set containing 30000 samples is first constructed, where each sample comprises two adjacent noise image frames n1 and n2 from a video, as shown in Fig. 2, the corresponding clean reference images p1 and p2 of the noise images n1 and n2, as shown in Fig. 3, and the optical flow estimation result f corresponding to the image pair p1 and p2, as shown in Fig. 4.
The preprocessing module, the optical flow estimation module and the denoising module all use the Encoder-Decoder network structure shown in Fig. 5. The Encoder-Decoder structure consists of an encoding (Encoder) part and a decoding (Decoder) part. The Encoder part contains 6 convolutional layers c1-c6, whose feature map counts are 64, 64, 128, 128, 256 and 512 respectively. The Decoder part contains 5 sub-networks subnet1-subnet5, whose structure is shown in Fig. 6; each sub-network contains 1 convolutional layer and 4 deconvolutional layers, the convolutional layer having 64 feature maps and the 4 deconvolutional layers having 512, 256, 128 and 64 feature maps respectively. When each sub-network layer of the Decoder part performs deconvolution, it calls the image features of the corresponding convolutional layer in the Encoder part, and the output of the previous layer serves as the input of the next layer.
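The layer configuration described above can be pictured with the following sketch, an illustrative PyTorch reading rather than the patented Caffe network. Only the encoder feature-map counts (64, 64, 128, 128, 256, 512) and the use of five decoder stages that reuse the matching encoder features come from the description; the kernel sizes, strides, decoder channel widths and the exact skip-connection wiring are assumptions.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Sketch of one module: a 6-layer encoder (64, 64, 128, 128, 256, 512
    feature maps, downsampling by 2 per layer) and a 5-stage decoder whose
    upsampling stages reuse ("call") the matching encoder features as skip
    connections. Layer hyperparameters are assumptions."""

    def __init__(self, in_ch=3, out_ch=3):
        super().__init__()
        enc_ch = [64, 64, 128, 128, 256, 512]
        self.enc = nn.ModuleList()
        prev = in_ch
        for ch in enc_ch:
            self.enc.append(nn.Sequential(
                nn.Conv2d(prev, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True)))
            prev = ch
        # Five decoder stages (subnet1..subnet5): deconvolution + fusion convolution.
        dec_ch = [256, 128, 128, 64, 64]
        self.up, self.fuse = nn.ModuleList(), nn.ModuleList()
        for skip_ch, ch in zip(reversed(enc_ch[:-1]), dec_ch):
            self.up.append(nn.ConvTranspose2d(prev, ch, 4, stride=2, padding=1))
            self.fuse.append(nn.Sequential(
                nn.Conv2d(ch + skip_ch, ch, 3, padding=1), nn.ReLU(inplace=True)))
            prev = ch
        self.head = nn.ConvTranspose2d(prev, out_ch, 4, stride=2, padding=1)

    def forward(self, x):
        feats = []
        for layer in self.enc:
            x = layer(x)
            feats.append(x)
        for up, fuse, skip in zip(self.up, self.fuse, reversed(feats[:-1])):
            x = fuse(torch.cat((up(x), skip), dim=1))   # call the encoder features
        return self.head(x)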
The joint learning deep network model is trained on the sample data set. The deep learning model is trained in a Caffe environment under Ubuntu, using the ADAGRAD optimization algorithm. The preprocessing module is trained alone first; then the relevant parameters of the preprocessing module are fixed while the preprocessing module and the optical flow estimation module are trained together; finally, the relevant parameters of the preprocessing module and the optical flow estimation module are fixed and the complete deep network model containing all three modules is trained. When training the preprocessing module alone, and when training the preprocessing module and the optical flow estimation module together, the initial learning rate is 0.01 and the number of training iterations is 600000, with the learning rate divided by 10 at 300000, 400000 and 500000 iterations. When training the complete deep network model containing all three modules, the initial learning rate is 0.02 and the number of training iterations is 500000, with the learning rate divided by 8 at 200000, 300000 and 400000 iterations.
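The learning-rate schedule stated above amounts to a simple step decay, sketched below in plain Python. The helper and its names are illustrative only; in practice this would be expressed in a Caffe solver configuration, which the patent does not reproduce.

```python
def learning_rate(iteration, base_lr, decay_steps, factor):
    """Step decay: divide base_lr by `factor` at each milestone already reached."""
    lr = base_lr
    for step in decay_steps:
        if iteration >= step:
            lr /= factor
    return lr

# Stages 1 and 2 (preprocessing alone; preprocessing + flow): 600000 iterations,
# initial rate 0.01, divided by 10 at 300000, 400000 and 500000 iterations.
lr_stage_1_2 = lambda it: learning_rate(it, 0.01, (300_000, 400_000, 500_000), 10)

# Stage 3 (whole three-module model): 500000 iterations, initial rate 0.02,
# divided by 8 at 200000, 300000 and 400000 iterations.
lr_stage_3 = lambda it: learning_rate(it, 0.02, (200_000, 300_000, 400_000), 8)

assert lr_stage_1_2(0) == 0.01 and lr_stage_3(0) == 0.02
```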
After the joint learning deep network model has been trained, noisy video images can be processed with it directly. Two adjacent noise image frames im_n1 and im_n2 from the video, shown in Fig. 7, are fed into the model, which directly and quickly produces the optical flow estimation result corresponding to the noise images im_n1 and im_n2, shown in Fig. 8, and the denoising result for the noise image im_n1, shown in Fig. 9.
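As a usage illustration, and assuming the EncoderDecoder and joint_forward sketches given earlier (with assumed input and output channel counts), processing one pair of adjacent noisy frames reduces to a single forward pass. The frame size below is only an example.

```python
import torch

# Instantiate the three modules from the EncoderDecoder sketch above
# (the channel counts chosen for the flow and denoising inputs are assumptions).
preprocess = EncoderDecoder(in_ch=3, out_ch=3)   # noisy frame -> pre-denoised frame
flow_net   = EncoderDecoder(in_ch=6, out_ch=2)   # frame pair  -> 2-channel flow
denoise    = EncoderDecoder(in_ch=6, out_ch=3)   # (im_n1, warped im_n2') -> im_dn

im_n1 = torch.rand(1, 3, 192, 320)               # adjacent noisy frames; sizes illustrative
im_n2 = torch.rand(1, 3, 192, 320)
with torch.no_grad():
    flow, im_dn = joint_forward(preprocess, flow_net, denoise, im_n1, im_n2)
print(flow.shape, im_dn.shape)                   # (1, 2, 192, 320), (1, 3, 192, 320)
```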
The above disclosure is only a preferred embodiment of the present invention and certainly cannot be used to limit the scope of the rights of the present invention; equivalent variations made according to the claims of the present invention therefore still fall within the scope of the present invention.
Claims (7)
1. An optical flow estimation and denoising joint learning deep network model for video images, characterized in that: the joint deep learning network model comprises three modules, namely a preprocessing module, an optical flow estimation module and a denoising module; the deep network model is first trained with a sample data set; then, for the input noise images im_n1 and im_n2, preliminary denoising is performed by the preprocessing module to obtain the preprocessed image pair im_p1 and im_p2; the optical flow estimation module performs motion estimation on the image pair im_p1 and im_p2 to obtain the optical flow estimation result flow; the noise image im_n2 is transformed according to the optical flow estimation result flow to obtain the image im_n2'; the image im_n2' and the noise image im_n1 are then used as the input images of the denoising module to obtain the final denoised image im_dn corresponding to the noise image im_n1.
2. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1, characterized in that: the input noise images im_n1 and im_n2 are two adjacent frames of a noisy video.
3. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1, characterized in that: the sample data set contains no fewer than 20000 samples, where each sample comprises two adjacent noise image frames n1 and n2 from a video, the corresponding clean reference images p1 and p2 of the noise images n1 and n2, and the optical flow estimation result f corresponding to the image pair p1 and p2.
4. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1, characterized in that: the specific training method of the deep network model, using the corresponding data in the sample data set, is as follows: the preprocessing module is first trained alone; then the relevant parameters of the preprocessing module are fixed while the preprocessing module and the optical flow estimation module are trained together; finally, the relevant parameters of the preprocessing module and the optical flow estimation module are fixed and the complete deep network model containing all three modules is trained.
5. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1 or 4, characterized in that: the preprocessing module, the optical flow estimation module and the denoising module all use the Encoder-Decoder network structure.
6. The optical flow estimation and denoising joint learning deep network model for video images according to claim 5, characterized in that: the Encoder-Decoder network structure consists of an encoding (Encoder) part and a decoding (Decoder) part; the Encoder part contains M convolutional layers; the Decoder part contains M sub-networks, each of which contains 1 deconvolutional layer and N convolutional layers; when each sub-network layer of the Decoder part performs deconvolution, it calls the image features of the corresponding convolutional layer in the Encoder part, and the output of the previous layer serves as the input of the next layer.
7. The optical flow estimation and denoising joint learning deep network model for video images according to claim 4, characterized in that: the deep network model is trained using the Caffe deep learning framework.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810081519.5A CN108257105B (en) | 2018-01-29 | 2018-01-29 | Optical flow estimation and denoising joint learning depth network model for video image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810081519.5A CN108257105B (en) | 2018-01-29 | 2018-01-29 | Optical flow estimation and denoising joint learning depth network model for video image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108257105A true CN108257105A (en) | 2018-07-06 |
CN108257105B CN108257105B (en) | 2021-04-20 |
Family
ID=62743478
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810081519.5A Active CN108257105B (en) | 2018-01-29 | 2018-01-29 | Optical flow estimation and denoising joint learning depth network model for video image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108257105B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109034398A (en) * | 2018-08-10 | 2018-12-18 | 深圳前海微众银行股份有限公司 | Feature selection approach, device and storage medium based on federation's training |
CN109165683A (en) * | 2018-08-10 | 2019-01-08 | 深圳前海微众银行股份有限公司 | Sample predictions method, apparatus and storage medium based on federation's training |
CN111539879A (en) * | 2020-04-15 | 2020-08-14 | 清华大学深圳国际研究生院 | Video blind denoising method and device based on deep learning |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103686139A (en) * | 2013-12-20 | 2014-03-26 | 华为技术有限公司 | Frame image conversion method, frame video conversion method and frame video conversion device |
CN103700117A (en) * | 2013-11-21 | 2014-04-02 | 北京工业大学 | Robust optical flow field estimating method based on TV-L1 variation model |
CN106331433A (en) * | 2016-08-25 | 2017-01-11 | 上海交通大学 | Video denoising method based on deep recursive neural network |
CN106991646A (en) * | 2017-03-28 | 2017-07-28 | 福建帝视信息科技有限公司 | A kind of image super-resolution method based on intensive connection network |
US9767540B2 (en) * | 2014-05-16 | 2017-09-19 | Adobe Systems Incorporated | Patch partitions and image processing |
-
2018
- 2018-01-29 CN CN201810081519.5A patent/CN108257105B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103700117A (en) * | 2013-11-21 | 2014-04-02 | 北京工业大学 | Robust optical flow field estimating method based on TV-L1 variation model |
CN103686139A (en) * | 2013-12-20 | 2014-03-26 | 华为技术有限公司 | Frame image conversion method, frame video conversion method and frame video conversion device |
US9767540B2 (en) * | 2014-05-16 | 2017-09-19 | Adobe Systems Incorporated | Patch partitions and image processing |
CN106331433A (en) * | 2016-08-25 | 2017-01-11 | 上海交通大学 | Video denoising method based on deep recursive neural network |
CN106991646A (en) * | 2017-03-28 | 2017-07-28 | 福建帝视信息科技有限公司 | A kind of image super-resolution method based on intensive connection network |
Non-Patent Citations (2)
Title |
---|
A. BUADES et al.: "Patch-Based Video Denoising With Optical Flow Estimation", IEEE Transactions on Image Processing *
XU Wendan: "Research on Video Signal Compression and Image Stabilization Algorithms", China Doctoral Dissertations Full-text Database, Information Science and Technology *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109034398A (en) * | 2018-08-10 | 2018-12-18 | 深圳前海微众银行股份有限公司 | Feature selection approach, device and storage medium based on federation's training |
CN109165683A (en) * | 2018-08-10 | 2019-01-08 | 深圳前海微众银行股份有限公司 | Sample predictions method, apparatus and storage medium based on federation's training |
CN109165683B (en) * | 2018-08-10 | 2023-09-12 | 深圳前海微众银行股份有限公司 | Sample prediction method, device and storage medium based on federal training |
CN109034398B (en) * | 2018-08-10 | 2023-09-12 | 深圳前海微众银行股份有限公司 | Gradient lifting tree model construction method and device based on federal training and storage medium |
CN111539879A (en) * | 2020-04-15 | 2020-08-14 | 清华大学深圳国际研究生院 | Video blind denoising method and device based on deep learning |
WO2021208122A1 (en) * | 2020-04-15 | 2021-10-21 | 清华大学深圳国际研究生院 | Blind video denoising method and device based on deep learning |
CN111539879B (en) * | 2020-04-15 | 2023-04-14 | 清华大学深圳国际研究生院 | Video blind denoising method and device based on deep learning |
Also Published As
Publication number | Publication date |
---|---|
CN108257105B (en) | 2021-04-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11928792B2 (en) | Fusion network-based method for image super-resolution and non-uniform motion deblurring | |
US20210390339A1 (en) | Depth estimation and color correction method for monocular underwater images based on deep neural network | |
CN110533044B (en) | Domain adaptive image semantic segmentation method based on GAN | |
CN108257105A (en) | A kind of light stream estimation for video image and denoising combination learning depth network model | |
CN106600632B (en) | A kind of three-dimensional image matching method improving matching cost polymerization | |
CN101009835A (en) | Background-based motion estimation coding method | |
CN108961227B (en) | Image quality evaluation method based on multi-feature fusion of airspace and transform domain | |
CN110930327A (en) | Video denoising method based on cascade depth residual error network | |
CN110458784A (en) | It is a kind of that compression noise method is gone based on image perception quality | |
CN113052764A (en) | Video sequence super-resolution reconstruction method based on residual connection | |
CN103905815B (en) | Based on the video fusion method of evaluating performance of Higher-order Singular value decomposition | |
CN110276777A (en) | A kind of image partition method and device based on depth map study | |
CN117745596B (en) | Cross-modal fusion-based underwater de-blocking method | |
CN113838102B (en) | Optical flow determining method and system based on anisotropic dense convolution | |
CN105049678A (en) | Self-adaptation camera path optimization video stabilization method based on ring winding | |
CN111080533B (en) | Digital zooming method based on self-supervision residual sensing network | |
CN110120009B (en) | Background blurring implementation method based on salient object detection and depth estimation algorithm | |
CN108010061A (en) | A kind of deep learning light stream method of estimation instructed based on moving boundaries | |
CN116071281A (en) | Multi-mode image fusion method based on characteristic information interaction | |
CN115131254A (en) | Constant bit rate compressed video quality enhancement method based on two-domain learning | |
CN115396683A (en) | Video optimization processing method and device, electronic equipment and computer readable medium | |
CN106709873A (en) | Super-resolution method based on cubic spline interpolation and iterative updating | |
CN109068083B (en) | Adaptive motion vector field smoothing method based on square | |
CN106303538A (en) | A kind of video spatial scalable coded method supporting multisource data fusion and framework | |
CN112529815A (en) | Method and system for removing raindrops in real image after rain |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||