CN108257105A - Optical flow estimation and denoising joint learning deep network model for video images - Google Patents

Optical flow estimation and denoising joint learning deep network model for video images

Info

Publication number
CN108257105A
Authority
CN
China
Prior art keywords
optical flow
image
denoising
module
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810081519.5A
Other languages
Chinese (zh)
Other versions
CN108257105B (en)
Inventor
李望秀 (Li Wangxiu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of South China
Original Assignee
University of South China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of South China
Priority to CN201810081519.5A
Publication of CN108257105A
Application granted
Publication of CN108257105B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention discloses an optical flow estimation and denoising joint learning deep network model for video images, belonging to the field of image processing. The model comprises a preprocessing module, an optical flow estimation module and a denoising module, each of which uses an Encoder-Decoder network structure. The preprocessing module is first trained on its own using a sample data set; the parameters of the preprocessing module are then fixed and the preprocessing module and the optical flow estimation module are trained together; finally, the parameters of the preprocessing module and the optical flow estimation module are fixed and the deep network model comprising all three modules is trained as a whole. Once training is complete, the deep network model can perform optical flow estimation and denoising directly on noisy video images. The joint learning deep network model proposed by the present invention performs optical flow estimation and denoising quickly and with high accuracy, which makes it convenient for rapidly processing large volumes of video images in practice.

Description

Optical flow estimation and denoising joint learning deep network model for video images
Technical field
The present invention relates to the field of image processing, and in particular to an optical flow estimation and denoising joint learning deep network model for video images.
Background art
Video images are subject to noise interference during acquisition, compression, storage, transmission and other stages. Noise significantly reduces the visual quality of video images and makes subsequent intelligent analysis such as target recognition and tracking more difficult. It is therefore necessary to remove the noise in video images while preserving the video information, so as to improve the signal-to-noise ratio and the visual effect.
Because video images exhibit temporal correlation, optical flow estimation and video denoising can be combined to obtain a better denoising effect. However, existing joint optical flow estimation and video denoising algorithms require a large number of iterative computations, consuming considerable computing resources and time, which makes them inconvenient to apply in practice; moreover, optical flow estimation is easily disturbed by video noise, which in turn degrades the denoising effect. Proposing a fast and effective joint optical flow estimation and video denoising algorithm is therefore a pressing problem in the field of video image processing.
Summary of the invention
To overcome the above shortcomings, the present invention aims to provide an optical flow estimation and denoising joint learning deep network model for video images, which uses a deep network model to jointly learn optical flow estimation and video denoising from a large number of training samples, so as to solve the problems of low optical flow estimation accuracy, poor denoising effect and long processing time in the prior art.
In order to solve the above technical problems, the technical solution proposed by the present invention is:
An optical flow estimation and denoising joint learning deep network model for video images, characterized in that: the joint deep learning network model comprises three modules: a preprocessing module, an optical flow estimation module and a denoising module. The deep network model is first trained with a sample data set. Then, for the input noisy images im_n1 and im_n2, preliminary denoising is performed by the preprocessing module to obtain the preprocessed image pair im_p1 and im_p2; the optical flow estimation module performs motion estimation on the image pair im_p1 and im_p2 to obtain the optical flow estimation result flow; the noisy image im_n2 is transformed according to the optical flow estimation result flow to obtain the image im_n2'; and the image im_n2' and the noisy image im_n1 are then used as the input images of the denoising module to obtain the final denoised image im_dn corresponding to the noisy image im_n1.
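To make the data flow concrete, the following is a minimal PyTorch sketch of this forward pass (the patent itself uses Caffe, so this is an illustration, not the patent's implementation). The names preprocess, flow_net and denoise_net stand for the three modules; the channel-wise concatenation of the paired inputs and the bilinear warping function are assumptions, and only the ordering of the operations follows the text.

```python
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Warp `image` (N, C, H, W) with a dense flow field `flow` (N, 2, H, W).

    flow[:, 0] is assumed to hold the horizontal and flow[:, 1] the vertical displacement.
    """
    n, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(image.device)   # (2, H, W) pixel grid
    coords = base.unsqueeze(0) + flow                              # displaced sampling positions
    # grid_sample expects sampling coordinates normalised to [-1, 1]
    gx = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(image, torch.stack((gx, gy), dim=-1), align_corners=True)

def joint_forward(preprocess, flow_net, denoise_net, im_n1, im_n2):
    """Preliminary denoising -> optical flow estimation -> warping -> final denoising."""
    im_p1, im_p2 = preprocess(im_n1), preprocess(im_n2)        # preprocessed pair im_p1, im_p2
    flow = flow_net(torch.cat((im_p1, im_p2), dim=1))          # optical flow estimation result
    im_n2_t = warp(im_n2, flow)                                # im_n2 transformed by the flow (im_n2')
    im_dn = denoise_net(torch.cat((im_n1, im_n2_t), dim=1))    # final denoised image im_dn
    return flow, im_dn
```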
The input noisy images im_n1 and im_n2 are two adjacent frames of a video containing noise.
The sample data set contains no fewer than 20000 samples, where each sample comprises two adjacent noisy frames n1 and n2 of a video, the clean (noise-free) reference images p1 and p2 corresponding to the noisy images n1 and n2, and the optical flow estimation result f corresponding to the image pair p1 and p2.
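One sample of this data set can be pictured as the following record (a sketch only; the field names and the NumPy representation are illustrative and not prescribed by the patent):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TrainingSample:
    n1: np.ndarray   # noisy video frame t, e.g. shape (H, W, 3)
    n2: np.ndarray   # adjacent noisy frame t+1
    p1: np.ndarray   # clean reference image corresponding to n1
    p2: np.ndarray   # clean reference image corresponding to n2
    f: np.ndarray    # ground-truth optical flow between p1 and p2, e.g. shape (H, W, 2)
```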
The specific training method of the deep network model is as follows: using the corresponding data in the sample data set, the preprocessing module is first trained on its own; then the parameters of the preprocessing module are fixed and the preprocessing module and the optical flow estimation module are trained together; finally, the parameters of the preprocessing module and the optical flow estimation module are fixed and the deep network model comprising all three modules is trained as a whole.
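Written as PyTorch-style pseudocode, the three stages could look as follows. Which parameters are frozen at each stage follows the text; the optimizer and initial learning rates are taken from the detailed embodiment further below, and the train_stage1/2/3 training loops are placeholders, not part of the patent.

```python
import torch

def freeze(module: torch.nn.Module) -> None:
    """Fix a module's parameters so that later stages do not update them."""
    for p in module.parameters():
        p.requires_grad = False

def trainable(*modules):
    """Collect the parameters that are still being learned."""
    return [p for m in modules for p in m.parameters() if p.requires_grad]

def staged_training(preprocess, flow_net, denoise_net,
                    train_stage1, train_stage2, train_stage3):
    # Stage 1: train the preprocessing module on its own.
    opt = torch.optim.Adagrad(trainable(preprocess), lr=0.01)
    train_stage1(preprocess, opt)

    # Stage 2: fix the preprocessing parameters, then train the preprocessing and
    # optical flow estimation modules together (only the flow module is updated).
    freeze(preprocess)
    opt = torch.optim.Adagrad(trainable(flow_net), lr=0.01)
    train_stage2(preprocess, flow_net, opt)

    # Stage 3: fix the preprocessing and flow parameters, then train the whole
    # three-module network (only the denoising module is updated).
    freeze(flow_net)
    opt = torch.optim.Adagrad(trainable(denoise_net), lr=0.02)
    train_stage3(preprocess, flow_net, denoise_net, opt)
```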
The preprocessing module, the optical flow estimation module and the denoising module all use Encoder-Decoder network structures.
The Encoder-Decoder network structure comprises an encoding (Encoder) part and a decoding (Decoder) part. The Encoder part comprises M convolutional layers, and the Decoder part comprises M sub-networks, each of which comprises 1 deconvolutional layer and N convolutional layers. When each sub-network layer of the Decoder part performs deconvolution, it calls the image features of the corresponding convolutional layer of the Encoder part, and the output of the previous layer is used as the input of the next layer.
The deep network model is trained with the Caffe deep learning framework.
Advantageous effects of the present invention:
1) The optical flow estimation and denoising joint learning deep network model designed by the present invention can solve the optical flow estimation and denoising problems of noisy videos at the same time in practice. Compared with prior-art joint optical flow estimation and video denoising algorithms based on iterative computation, its optical flow estimation accuracy is higher and the denoising effect aided by the optical flow estimation is better; and once training of the joint learning deep network model is complete, optical flow estimation and denoising are very fast, which makes it convenient for rapidly processing large volumes of video images in practice.
2) For the joint learning deep network model, the present invention first trains individual network modules separately and then trains the whole network model, which effectively reduces the number of network parameters to be learned at each stage of training and prevents the network model from overfitting.
3) The method of the present invention uses a deep learning model to automatically learn the image features for optical flow estimation and image denoising, and performs end-to-end optical flow estimation and image denoising without the assistance of motion boundary estimation; moreover, the Encoder-Decoder deep network model it adopts can fully exploit the multi-dimensional features of the input images, which improves the effect of both optical flow estimation and denoising.
Description of the drawings
Fig. 1 is a structural diagram of the joint learning deep network model of the present invention;
Fig. 2 shows two adjacent noisy frames n1 and n2 of a video in the sample data set;
Fig. 3 shows the clean reference images p1 and p2 corresponding to the noisy images n1 and n2 in the sample data set;
Fig. 4 shows the optical flow estimation result f corresponding to the image pair p1 and p2 in the sample data set;
Fig. 5 is a schematic diagram of the Encoder-Decoder network structure;
Fig. 6 is a structural diagram of a sub-network in the Encoder-Decoder network;
Fig. 7 shows two adjacent noisy frames im_n1 and im_n2 of a video to be processed;
Fig. 8 shows the optical flow estimation result corresponding to the noisy images im_n1 and im_n2;
Fig. 9 shows the denoising result of the noisy image im_n1.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
As shown in Fig. 1, the optical flow estimation and denoising joint learning deep network model for video images provided in this embodiment comprises three modules: a preprocessing module, an optical flow estimation module and a denoising module. A sample data set containing 30000 samples is first constructed, where each sample comprises two adjacent noisy frames n1 and n2 of a video, as shown in Fig. 2, the clean reference images p1 and p2 corresponding to the noisy images n1 and n2, as shown in Fig. 3, and the optical flow estimation result f corresponding to the image pair p1 and p2, as shown in Fig. 4.
The preprocessing module, the optical flow estimation module and the denoising module all use Encoder-Decoder network structures. As shown in Fig. 5, the Encoder-Decoder network structure comprises an encoding (Encoder) part and a decoding (Decoder) part. The Encoder part comprises 6 convolutional layers c1-c6, whose numbers of feature maps are 64, 64, 128, 128, 256 and 512 respectively. The Decoder part comprises 5 sub-networks subnet1-subnet5; the structure of a sub-network is shown in Fig. 6. Each sub-network comprises 1 convolutional layer and 4 deconvolutional layers; the convolutional layer has 64 feature maps, and the 4 deconvolutional layers in each sub-network have 512, 256, 128 and 64 feature maps respectively. When each sub-network layer of the Decoder part performs deconvolution, it calls the image features of the corresponding convolutional layer of the Encoder part, and the output of the previous layer is used as the input of the next layer.
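As a concrete illustration, the PyTorch sketch below builds an Encoder-Decoder of this general shape: a 6-layer convolutional encoder with 64/64/128/128/256/512 feature maps and a decoder of 5 sub-networks, each of which upsamples with a deconvolution, fuses the image features of the matching encoder layer (the skip connection described above), and passes its output on to the next sub-network. The per-sub-network layer composition, kernel sizes and strides are assumptions, since the text admits more than one reading; only the feature-map counts and the reuse of encoder features follow the description.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Illustrative Encoder-Decoder; input height and width should be divisible by 32."""

    def __init__(self, in_ch=3, out_ch=3):
        super().__init__()
        enc_ch = [64, 64, 128, 128, 256, 512]            # feature maps of c1..c6
        self.encoder = nn.ModuleList()
        prev = in_ch
        for i, ch in enumerate(enc_ch):
            stride = 1 if i == 0 else 2                  # c2..c6 halve the resolution (assumed)
            self.encoder.append(nn.Sequential(
                nn.Conv2d(prev, ch, kernel_size=3, stride=stride, padding=1),
                nn.ReLU(inplace=True)))
            prev = ch
        dec_ch = [512, 256, 128, 64, 64]                 # widths of subnet1..subnet5 (assumed)
        self.deconvs = nn.ModuleList()
        self.fusions = nn.ModuleList()
        for i, ch in enumerate(dec_ch):
            # deconvolution: upsample the output of the previous sub-network
            self.deconvs.append(nn.ConvTranspose2d(prev, ch, kernel_size=4, stride=2, padding=1))
            skip_ch = enc_ch[len(enc_ch) - 2 - i]        # matching encoder layer c5, c4, ..., c1
            # convolution: fuse the upsampled features with the encoder features
            self.fusions.append(nn.Sequential(
                nn.Conv2d(ch + skip_ch, ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)))
            prev = ch
        self.head = nn.Conv2d(prev, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        feats = []
        for layer in self.encoder:                       # encoding part c1..c6
            x = layer(x)
            feats.append(x)
        for i, (up, fuse) in enumerate(zip(self.deconvs, self.fusions)):
            x = up(x)                                    # deconvolution in sub-network i+1
            skip = feats[len(feats) - 2 - i]             # image features of the matching encoder layer
            x = fuse(torch.cat((x, skip), dim=1))        # output feeds the next sub-network
        return self.head(x)
```

In this interpretation the preprocessing and denoising modules would use out_ch=3 to output an image, while the optical flow estimation module would use out_ch=2 to output a flow field.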
The joint learning deep network model is trained with the sample data set, using the Caffe deep learning environment under Ubuntu and the ADAGRAD optimization algorithm. The preprocessing module is first trained on its own; then the parameters of the preprocessing module are fixed and the preprocessing module and the optical flow estimation module are trained together; finally, the parameters of the preprocessing module and the optical flow estimation module are fixed and the deep network model comprising all three modules is trained as a whole. When the preprocessing module is trained on its own and when the preprocessing module and the optical flow estimation module are trained together, the initial learning rate is 0.01 and the number of training iterations is 600000, with the learning rate divided by 10 at iterations 300000, 400000 and 500000. When the deep network model comprising all three modules is trained as a whole, the initial learning rate is 0.02 and the number of training iterations is 500000, with the learning rate divided by 8 at iterations 200000, 300000 and 400000.
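The step-wise schedule described here can be written down as a small helper (a sketch only; in Caffe this kind of schedule is typically expressed with a multistep learning-rate policy):

```python
def learning_rate(iteration, base_lr, drop_iters, divisor):
    """Learning rate in effect at a given iteration: divided by `divisor` at each drop point."""
    lr = base_lr
    for it in drop_iters:
        if iteration >= it:
            lr /= divisor
    return lr

# Stages 1 and 2: 600000 iterations, initial rate 0.01, divided by 10 at 300k, 400k and 500k.
print(round(learning_rate(450000, 0.01, (300000, 400000, 500000), 10), 6))   # 0.0001
# Stage 3 (whole network): 500000 iterations, initial rate 0.02, divided by 8 at 200k, 300k and 400k.
print(round(learning_rate(450000, 0.02, (200000, 300000, 400000), 8), 8))    # ~3.9e-05
```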
After training of the joint learning deep network model is complete, the model can be used directly to process noisy video images. Two adjacent noisy frames im_n1 and im_n2 of a video, shown in Fig. 7, are input into the model, which quickly produces the optical flow estimation result corresponding to the noisy images im_n1 and im_n2, shown in Fig. 8, and the denoising result of the noisy image im_n1, shown in Fig. 9.
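Continuing the illustrative PyTorch sketch from the summary above (the joint_forward function; the patent's actual implementation is a trained Caffe model, and the file names here are placeholders), processing a pair of adjacent frames would look roughly like this:

```python
import torch
from torchvision.io import read_image

# Load two adjacent noisy frames and scale them to [0, 1].
im_n1 = read_image("frame_0001.png").unsqueeze(0).float() / 255.0
im_n2 = read_image("frame_0002.png").unsqueeze(0).float() / 255.0

with torch.no_grad():                        # inference only, no gradients needed
    flow, im_dn = joint_forward(preprocess, flow_net, denoise_net, im_n1, im_n2)
# `flow` corresponds to the optical flow estimate of Fig. 8,
# `im_dn` to the denoised version of im_n1 shown in Fig. 9.
```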
The above discloses only a preferred embodiment of the present invention, which of course cannot be used to limit the scope of the claims of the present invention. Equivalent variations made in accordance with the claims of the present invention therefore still fall within the scope of the present invention.

Claims (7)

1. An optical flow estimation and denoising joint learning deep network model for video images, characterized in that: the joint deep learning network model comprises three modules: a preprocessing module, an optical flow estimation module and a denoising module; the deep network model is first trained with a sample data set; then, for the input noisy images im_n1 and im_n2, preliminary denoising is performed by the preprocessing module to obtain the preprocessed image pair im_p1 and im_p2; the optical flow estimation module performs motion estimation on the image pair im_p1 and im_p2 to obtain the optical flow estimation result flow; the noisy image im_n2 is transformed according to the optical flow estimation result flow to obtain the image im_n2'; and the image im_n2' and the noisy image im_n1 are used as the input images of the denoising module to obtain the final denoised image im_dn corresponding to the noisy image im_n1.
2. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1, characterized in that: the input noisy images im_n1 and im_n2 are two adjacent frames of a video containing noise.
3. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1, characterized in that: the sample data set contains no fewer than 20000 samples, where each sample comprises two adjacent noisy frames n1 and n2 of a video, the clean reference images p1 and p2 corresponding to the noisy images n1 and n2, and the optical flow estimation result f corresponding to the image pair p1 and p2.
4. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1, characterized in that: the specific training method of the deep network model is as follows: using the corresponding data in the sample data set, the preprocessing module is first trained on its own; then the parameters of the preprocessing module are fixed and the preprocessing module and the optical flow estimation module are trained together; finally, the parameters of the preprocessing module and the optical flow estimation module are fixed and the deep network model comprising all three modules is trained as a whole.
5. The optical flow estimation and denoising joint learning deep network model for video images according to claim 1 or 4, characterized in that: the preprocessing module, the optical flow estimation module and the denoising module all use Encoder-Decoder network structures.
6. The optical flow estimation and denoising joint learning deep network model for video images according to claim 5, characterized in that: the Encoder-Decoder network structure comprises an encoding (Encoder) part and a decoding (Decoder) part; the Encoder part comprises M convolutional layers, and the Decoder part comprises M sub-networks, each of which comprises 1 deconvolutional layer and N convolutional layers; when each sub-network layer of the Decoder part performs deconvolution, it calls the image features of the corresponding convolutional layer of the Encoder part, and the output of the previous layer is used as the input of the next layer.
7. The optical flow estimation and denoising joint learning deep network model for video images according to claim 4, characterized in that: the deep network model is trained with the Caffe deep learning framework.
CN201810081519.5A 2018-01-29 2018-01-29 Optical flow estimation and denoising joint learning depth network model for video image Active CN108257105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810081519.5A CN108257105B (en) 2018-01-29 2018-01-29 Optical flow estimation and denoising joint learning depth network model for video image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810081519.5A CN108257105B (en) 2018-01-29 2018-01-29 Optical flow estimation and denoising joint learning depth network model for video image

Publications (2)

Publication Number Publication Date
CN108257105A true CN108257105A (en) 2018-07-06
CN108257105B CN108257105B (en) 2021-04-20

Family

ID=62743478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810081519.5A Active CN108257105B (en) 2018-01-29 2018-01-29 Optical flow estimation and denoising joint learning depth network model for video image

Country Status (1)

Country Link
CN (1) CN108257105B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034398A (en) * 2018-08-10 2018-12-18 深圳前海微众银行股份有限公司 Feature selection approach, device and storage medium based on federation's training
CN109165683A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Sample predictions method, apparatus and storage medium based on federation's training
CN111539879A (en) * 2020-04-15 2020-08-14 清华大学深圳国际研究生院 Video blind denoising method and device based on deep learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103686139A (en) * 2013-12-20 2014-03-26 华为技术有限公司 Frame image conversion method, frame video conversion method and frame video conversion device
CN103700117A (en) * 2013-11-21 2014-04-02 北京工业大学 Robust optical flow field estimating method based on TV-L1 variation model
CN106331433A (en) * 2016-08-25 2017-01-11 上海交通大学 Video denoising method based on deep recursive neural network
CN106991646A (en) * 2017-03-28 2017-07-28 福建帝视信息科技有限公司 A kind of image super-resolution method based on intensive connection network
US9767540B2 (en) * 2014-05-16 2017-09-19 Adobe Systems Incorporated Patch partitions and image processing

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103700117A (en) * 2013-11-21 2014-04-02 北京工业大学 Robust optical flow field estimating method based on TV-L1 variation model
CN103686139A (en) * 2013-12-20 2014-03-26 华为技术有限公司 Frame image conversion method, frame video conversion method and frame video conversion device
US9767540B2 (en) * 2014-05-16 2017-09-19 Adobe Systems Incorporated Patch partitions and image processing
CN106331433A (en) * 2016-08-25 2017-01-11 上海交通大学 Video denoising method based on deep recursive neural network
CN106991646A (en) * 2017-03-28 2017-07-28 福建帝视信息科技有限公司 A kind of image super-resolution method based on intensive connection network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A. BUADES et al.: "Patch-Based Video Denoising With Optical Flow Estimation", IEEE Transactions on Image Processing *
许文丹 (Xu Wendan): "Research on Video Signal Compression and Image Stability Algorithms", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034398A (en) * 2018-08-10 2018-12-18 深圳前海微众银行股份有限公司 Feature selection approach, device and storage medium based on federation's training
CN109165683A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Sample predictions method, apparatus and storage medium based on federation's training
CN109165683B (en) * 2018-08-10 2023-09-12 深圳前海微众银行股份有限公司 Sample prediction method, device and storage medium based on federal training
CN109034398B (en) * 2018-08-10 2023-09-12 深圳前海微众银行股份有限公司 Gradient lifting tree model construction method and device based on federal training and storage medium
CN111539879A (en) * 2020-04-15 2020-08-14 清华大学深圳国际研究生院 Video blind denoising method and device based on deep learning
WO2021208122A1 (en) * 2020-04-15 2021-10-21 清华大学深圳国际研究生院 Blind video denoising method and device based on deep learning
CN111539879B (en) * 2020-04-15 2023-04-14 清华大学深圳国际研究生院 Video blind denoising method and device based on deep learning

Also Published As

Publication number Publication date
CN108257105B (en) 2021-04-20

Similar Documents

Publication Publication Date Title
US11928792B2 (en) Fusion network-based method for image super-resolution and non-uniform motion deblurring
US20210390339A1 (en) Depth estimation and color correction method for monocular underwater images based on deep neural network
CN110533044B (en) Domain adaptive image semantic segmentation method based on GAN
CN108257105A (en) A kind of light stream estimation for video image and denoising combination learning depth network model
CN106600632B (en) A kind of three-dimensional image matching method improving matching cost polymerization
CN101009835A (en) Background-based motion estimation coding method
CN108961227B (en) Image quality evaluation method based on multi-feature fusion of airspace and transform domain
CN110930327A (en) Video denoising method based on cascade depth residual error network
CN110458784A (en) It is a kind of that compression noise method is gone based on image perception quality
CN113052764A (en) Video sequence super-resolution reconstruction method based on residual connection
CN103905815B (en) Based on the video fusion method of evaluating performance of Higher-order Singular value decomposition
CN110276777A (en) A kind of image partition method and device based on depth map study
CN117745596B (en) Cross-modal fusion-based underwater de-blocking method
CN113838102B (en) Optical flow determining method and system based on anisotropic dense convolution
CN105049678A (en) Self-adaptation camera path optimization video stabilization method based on ring winding
CN111080533B (en) Digital zooming method based on self-supervision residual sensing network
CN110120009B (en) Background blurring implementation method based on salient object detection and depth estimation algorithm
CN108010061A (en) A kind of deep learning light stream method of estimation instructed based on moving boundaries
CN116071281A (en) Multi-mode image fusion method based on characteristic information interaction
CN115131254A (en) Constant bit rate compressed video quality enhancement method based on two-domain learning
CN115396683A (en) Video optimization processing method and device, electronic equipment and computer readable medium
CN106709873A (en) Super-resolution method based on cubic spline interpolation and iterative updating
CN109068083B (en) Adaptive motion vector field smoothing method based on square
CN106303538A (en) A kind of video spatial scalable coded method supporting multisource data fusion and framework
CN112529815A (en) Method and system for removing raindrops in real image after rain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant