CN109389569A - Real-time surveillance-video defogging method based on an improved DehazeNet - Google Patents


Info

Publication number
CN109389569A
Authority
CN
China
Prior art keywords
image
layer
video
defogging
dehazenet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811261910.XA
Other languages
Chinese (zh)
Other versions
CN109389569B (en)
Inventor
陈天悦 (Chen Tianyue)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Elephant Intelligent Technology (nanjing) Co Ltd
Original Assignee
Elephant Intelligent Technology (nanjing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Elephant Intelligent Technology (Nanjing) Co Ltd
Priority to CN201811261910.XA
Publication of CN109389569A
Application granted
Publication of CN109389569B
Legal status: Expired - Fee Related
Anticipated expiration


Classifications

    • G06T5/73
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4038Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance

Abstract

A real-time surveillance-video defogging method based on an improved DehazeNet. The steps include: 1) capture video with a video-capture device, split the video frame by frame into individual images, and feed them to the neural network; 2) using the trained improved DehazeNet neural network (i.e., with every layer's weights known), process the input image block by block to obtain the transmittance t(x) and atmospheric-light constant of each block, and finally assemble the transmittance map and the atmospheric-light-constant map; 3) take the network output, solve for the haze-free image with a defogging algorithm based on the atmospheric scattering model, and re-stitch the haze-free frames into a video. Compared with traditional single-image defogging, the invention realizes real-time video defogging while preserving defogging quality, and solves the over-saturation and blurred-skyline problems of traditional defogging methods.

Description

Real-time surveillance-video defogging method based on an improved DehazeNet
Technical field
The invention belongs to the application of video processing within computer technology; specifically, it is a real-time surveillance-video defogging method based on an improved DehazeNet.
Background technique
Fog is a common meteorological phenomenon. Ubiquitous dust, smoke, and other airborne particles all reduce atmospheric clarity. Because light is scattered by particles in the atmosphere, the contrast of objects is reduced when they are imaged, so fog usually causes many problems for photographic imaging. The presence of fog substantially changes the atmospheric transmittance; the contrast and color of outdoor scene images change accordingly, so that many features contained in the image are masked or blurred. As a result, video surveillance products cannot capture clear live images, which seriously impairs the security protection of important urban sites.
The transmittance through fog is usually related to the scene depth: the farther an object is from the image-capture device, the lower the transmittance. Many defogging methods for single images exist and can be divided into two broad classes, image restoration and image enhancement. Besides methods based on the histogram, contrast, and saturation respectively, methods using multiple images of the same scene under different atmospheric conditions, or using image depth information, have also been proposed. In practical applications, however, depth information and several fog images of strictly the same scene are not easy to obtain.
In recent years more reasonable assumptions and prior knowledge have emerged, and on this basis single-image defogging has made great progress, but problems of image quality and speed remain, for example:
1. Low accuracy. Based on the assumption that local contrast is higher in haze-free images than in images containing fog, the local-contrast-maximization defogging method based on Markov random fields (MRF) achieves a good defogging effect but easily leads to over-saturation of the image.
2. Low speed. Independent component analysis (ICA) with minimal input has been applied to defogging, but the processing time of this method is very long and it therefore cannot be used to handle images with dense fog.
Summary of the invention
To solve the low accuracy and slow speed of traditional single-image algorithms, an overall scheme for a new video defogging algorithm is designed by combining a convolutional neural network with a parallel-computation structure. The invention proposes an improved defogging network based on the convolutional neural network DehazeNet, which uses the atmospheric scattering model and several priors to effectively raise image-defogging accuracy; it analyzes the design and computational cost of each part of the original DehazeNet and adjusts and optimizes the network under the premise of preserving accuracy and defogging quality, thereby reducing the image-restoration time; meanwhile, the pixel-recovery process is parallelized to accelerate the processing of foggy images, realizing a real-time high-definition video defogging function with good results and adaptive defogging.
The specific technical solution of the present invention is described as follows:
A real-time surveillance-video defogging method based on an improved DehazeNet, whose steps include:
1) Capture video with a video-capture device, split the video frame by frame into individual images, and feed them to the neural network for processing;
2) Using the trained neural network (i.e., every layer's weights known), process the input image block by block to obtain each block's transmittance and atmospheric-light constant, and finally assemble the transmittance map and the atmospheric-light-constant map;
3) Take the network output, solve for the haze-free image from the atmospheric scattering model, and re-stitch the haze-free frames into a video.
Specifically:
(1) Theoretical model of the defogging algorithm
1. Atmospheric scattering model
To describe the composition of a fog image, the atmospheric scattering model was proposed in the prior art; it is the basic model of image defogging, and many improvements on it have followed.
The atmospheric scattering model is stated by the following equation (1):
I(x) = J(x)·t(x) + α·(1 − t(x)),   (1)
where I(x) is the observed hazy pixel, J(x) is the restored true pixel, t(x) is the atmospheric transmittance, and α is the global atmospheric-light constant. Formula (1) contains three unknowns, J(x), t(x), and α; once t(x) and α are estimated, the real scene J(x) can be recovered.
The atmospheric transmittance t(x) describes the fraction of the light that reaches the camera without being scattered; it is defined by formula (2):
t(x) = e^(−β·d(x)),   (2)
where d(x) is the distance from the scene point of the pixel to the camera, i.e., the scene depth, and β is the atmospheric scattering coefficient; t(x) tends to 0 as d(x) tends to infinity. Combining (1) and (2) gives
α = I(x),  d(x) → ∞.   (3)
In the actual imaging process the depth d(x) cannot tend to infinity, but a very small transmittance t0 can be defined for distant regions. In this case, instead of obtaining the atmospheric light by the method of formula (3), estimating it by the method of formula (4) is more accurate:
α = max I(y) over the pixels y with t(y) ≤ t0.   (4)
From the above discussion one can conclude: accurately estimating the atmospheric transmittance is the key to recovering a clear image.
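As a concrete illustration of equations (1)-(3), the forward model and its inversion can be sketched per pixel in plain Python (pixel values in [0, 1]; the floor t0 on the transmittance is an illustrative safeguard against division by near-zero values, not a value fixed by the text):

```python
def hazy_pixel(j, t, alpha):
    """Forward model of equation (1): synthesize a hazy pixel
    I = J*t + alpha*(1 - t)."""
    return j * t + alpha * (1.0 - t)

def recover_pixel(i, t, alpha, t0=0.1):
    """Invert the atmospheric scattering model for one pixel:
    J = (I - alpha) / max(t, t0) + alpha."""
    return (i - alpha) / max(t, t0) + alpha

# Round-trip check: hazing then dehazing recovers the original value
# whenever t >= t0.
alpha, j_true, t = 0.9, 0.35, 0.6
i_obs = hazy_pixel(j_true, t, alpha)
assert abs(recover_pixel(i_obs, t, alpha) - j_true) < 1e-12
```

The round trip makes the role of the two estimated quantities explicit: with t(x) and α known, recovery is a single arithmetic expression per pixel.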
2, defogging priori knowledge
Based on empirical observation, existing methods propose various assumptions and prior knowledge for computing fog-related features; image defogging can be realized with these relevant features.
1. Dark channel prior:
The dark channel prior is derived from wide observation of outdoor haze-free images: in most haze-free image blocks, at least one color channel contains pixels of very low intensity, even close to 0. The dark channel is defined as the minimum over all pixels in the image block:
D(x) = min_{y∈Ω_r(x)} min_{c∈{r,g,b}} I^c(y),   (5)
where I^c is an RGB color channel of I and Ω_r(x) is the r×r image block centered at x. The dark-channel feature is closely tied to the fog concentration in the image and can be used directly to estimate the atmospheric transmittance, t(x) ∝ 1 − D(x).
2. Maximum contrast:
According to the atmospheric scattering model, the reduction of transmittance by fog lowers the contrast of the image:
Σ_x ‖∇I(x)‖ = t · Σ_x ‖∇J(x)‖ ≤ Σ_x ‖∇J(x)‖.   (6)
Based on this observation, defogging is performed by maximizing the local contrast (the differences between pixels within an s×s image block Ω_s(x)) over an r×r region Ω_r. The contrast feature is defined as:
C(x) = max_{y∈Ω_r(x)} sqrt( (1/|Ω_s(y)|) · Σ_{z∈Ω_s(y)} ‖I(z) − I(y)‖² ),   (7)
where |Ω_s(y)| is the cardinality of the local neighborhood. The relationship between the contrast feature and the transmittance t is evident, so maximizing the local contrast defined in the formula can enhance the visual effect of the image.
3. Color attenuation:
The saturation I_s(x) of an image block drops sharply under the influence of fog while the brightness I_v(x) rises markedly, so the difference between the two grows. According to the color-attenuation prior, the difference between brightness and saturation can be used to estimate the concentration of the fog:
A(x) = I_v(x) − I_s(x),   (8)
where I_v(x) and I_s(x) are expressed in HSV color space as
I_v(x) = max_{c∈{r,g,b}} I^c(x),   (9)
I_s(x) = ( max_{c∈{r,g,b}} I^c(x) − min_{c∈{r,g,b}} I^c(x) ) / max_{c∈{r,g,b}} I^c(x).   (10)
The color-attenuation feature is proportional to the scene depth d(x), so it is readily used for transmittance estimation.
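The color-attenuation feature of formulas (8)-(10) reduces to a few lines per pixel; the following sketch applies the HSV value and saturation definitions directly:

```python
def color_attenuation(r, g, b):
    """A(x) = I_v(x) - I_s(x) for one RGB pixel with values in [0, 1]."""
    v = max(r, g, b)                                  # brightness I_v, eq. (9)
    s = 0.0 if v == 0 else (v - min(r, g, b)) / v     # saturation I_s, eq. (10)
    return v - s                                      # eq. (8)

# A gray, fog-like pixel has zero saturation, so A equals its brightness;
# a saturated pixel of the same brightness scores much lower.
assert color_attenuation(0.8, 0.8, 0.8) == 0.8
assert color_attenuation(0.8, 0.2, 0.2) < 0.8
```

The comparison shows why the feature tracks fog density: haze pushes pixels toward bright, desaturated values, which is exactly what A(x) rewards.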
4. Hue disparity:
The hue disparity between the original image I(x) and its semi-inverse image I_si(x) can be used to detect fog in the image. The semi-inverse image is defined as:
I_si(x) = max_{c∈{r,g,b}} [ I^c(x), 1 − I^c(x) ],   (11)
For a haze-free image, the pixel values of the three channels are not all inverted in the semi-inverse image, which produces a large hue disparity between I_si(x) and I(x). This hue disparity is defined as:
H(x) = | I_si^h(x) − I^h(x) |,   (12)
where the superscript 'h' denotes the hue channel of the HSV color space. According to formula (12), the atmospheric transmittance t(x) is inversely related to H(x).
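A small sketch of the semi-inverse and hue-disparity features of formulas (11)-(12), using Python's standard colorsys for the HSV hue; taking the plain absolute difference of hues ignores hue wrap-around and is an illustrative simplification:

```python
import colorsys

def semi_inverse(r, g, b):
    """Semi-inverse of eq. (11): per channel, max(I_c, 1 - I_c)."""
    return tuple(max(c, 1.0 - c) for c in (r, g, b))

def hue_disparity(r, g, b):
    """Hue disparity of eq. (12): |hue(semi-inverse) - hue(original)|."""
    h_orig = colorsys.rgb_to_hsv(r, g, b)[0]
    h_semi = colorsys.rgb_to_hsv(*semi_inverse(r, g, b))[0]
    return abs(h_semi - h_orig)

# For a bright, fog-like pixel no channel is inverted, so the hue is
# unchanged and the disparity is 0; a dark colored pixel changes hue.
assert semi_inverse(0.9, 0.8, 0.7) == (0.9, 0.8, 0.7)
assert hue_disparity(0.9, 0.8, 0.7) == 0.0
assert hue_disparity(0.1, 0.3, 0.2) > 0.0
```

This matches the prior's intuition: hazy (bright) regions leave the semi-inverse unchanged and score near zero disparity, while clear (darker) regions flip channels and score high.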
(2) The DehazeNet network structure
The existing DehazeNet network is composed of convolutional layers and pooling layers, with a bilateral rectified linear function as the activation function at the end of the network. Following the general structure of neural networks, the network can be divided into four main parts: feature extraction, multi-scale mapping, local extremum, and nonlinear convergence. The overall structure of the network is shown in Fig. 1.
1. Feature extraction:
According to the defogging priors above, the fog-related features of an image include the dark channel, hue disparity, maximum contrast, and color attenuation. Extracting such features essentially amounts to convolving the image with suitable filters, usually accompanied by a nonlinear mapping. The first part of the network therefore consists of a convolutional layer conv1 and a reshaping layer reshape1: the convolutional layer realizes the filtering function, and the reshaping layer provides a suitable data-entry form for the pooling layer that follows. The invention adds a pooling layer pool1 afterwards for further feature extraction.
In an RGB fog image (each pixel of a color fog image is represented by different proportions of the three RGB primaries), the useful information is often unevenly distributed across the R, G, and B channels, so the invention uses the MAX function in the pooling layer, mainly to extract the image features more fully and to simplify the data reasonably. Meanwhile, to better meet the speed target required for real-time defogging, the pooling layer of the invention does not slide a single step at a time but uses a stride equal to its own size, extracting image features more quickly. Further, to meet the input requirements of the next part, the overall network adds another reshaping layer reshape2 after the pooling layer.
2. Multi-scale mapping:
The feature maps of different scales in the existing DehazeNet network play an important role in the defogging process. At the same time, mapping at different scales helps compress the data structure further, reducing the computational burden of the network to meet the speed requirement. To realize the multi-scale mapping, the invention uses a parallel convolutional structure as the main body of the second part of the neural network. Following common image-processing practice, the convolution-kernel sizes chosen by the invention are 1×1, 3×3, 5×5, and 7×7. Through four convolutional layers of different sizes, conv2/1×1, conv2/3×3, conv2/5×5, and conv2/7×7, the features corresponding to the four scales in the image are extracted. A convolutional concatenation layer conv2/output then realizes the stitching of the four inputs; the concatenation layer's input dimension uses the default setting, i.e., all channel data are concatenated.
3. Local extremum:
Among the evaluation criteria for image restoration, spatial invariance is a very important indicator, and it can be realized by a series of pooling operations. In convolutional neural networks, the local extremum is the classical way to overcome the spatial inconsistency caused by local sensitivity. Moreover, since the targets of the image-acquisition equipment are outdoor highway monitoring and security surveillance, the transmittance of the air medium should not undergo huge mutations within a small region, and the local-extremum operation can effectively eliminate any white noise that may be present. The third part of the overall network therefore consists of a MAX pooling layer pool2. Unlike the pooling layer of the first part, and considering that local sensitivity is often caused by a single pixel, the sliding stride of this pooling layer is set to 1, to better preserve the spatial invariance of the whole image.
4. Nonlinear convergence:
The earliest deep learning used S-shaped (sigmoid) activation functions, as in formula (13):
f(x) = 1 / (1 + e^(−x)).   (13)
Later, to improve the convergence rate in view of the gradient-decay problem of the sigmoid activation, most neural networks began training with the rectified linear unit as the activation function, as in formula (14):
f(x) = max(0, x).   (14)
The problem with the rectified linear function, however, is that its output does not range from 0 to 1 as the sigmoid's does, but from 0 to infinity. Such an output clearly conflicts with the physical fact that the transmittance cannot exceed 1. Therefore, to keep the final output between 0 and 1, the invention selects the bilateral rectified linear unit as the activation function, as in formula (15):
f(x) = min( t_max, max( t_min, x ) ).   (15)
Under the caffe framework (Convolutional Architecture for Fast Feature Embedding), the traditional activation layers include the rectified linear unit (ReLU), the sigmoid function (Sigmoid), the hyperbolic tangent (TanH), the absolute-value function (AbsVal), the power function (Power), and the binomial normal log-likelihood function (BNLL). The bilateral rectified linear unit is not among them, so in the caffe framework a fully connected layer must be combined with a rectified linear unit to realize the bilateral rectification and output the transmittance the invention needs.
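The activations of formulas (14)-(15) can be compared directly; the t_min/t_max bounds of the BReLU below are the illustrative defaults 0 and 1, matching the physical range of the transmittance:

```python
def relu(x):
    """Ordinary ReLU of eq. (14): unbounded above."""
    return max(0.0, x)

def brelu(x, t_min=0.0, t_max=1.0):
    """Bilateral rectified linear unit of eq. (15): clips the response to
    [t_min, t_max] so a predicted transmittance stays physical."""
    return min(t_max, max(t_min, x))

# ReLU lets a transmittance estimate exceed 1; BReLU does not.
assert relu(1.7) == 1.7
assert brelu(1.7) == 1.0
assert brelu(-0.3) == 0.0
assert brelu(0.42) == 0.42
```

The last assertions show the design point: inside [t_min, t_max] the BReLU is the identity (so gradients flow as with ReLU), while outside it saturates to the physical bounds.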
(3) Accelerated image restoration
To realize real-time defogging and accelerate image processing, and since Python can call kernel functions to realize parallel computation on the GPU, the invention uses Python to call library functions for multi-threaded parallel computation, realizing acceleration in the following two respects:
(1) Estimating the global atmospheric-light constant α. In the process of obtaining the atmospheric parameters, after the preliminarily improved DehazeNet network (which cannot yet learn the atmospheric-light constant) yields the transmittance distribution, the global atmospheric-light constant α must still be solved for. Within a single block, the invention finds the minimum transmittance by reduction through shared memory: intermediate results are stored in shared memory, reducing accesses to main memory and thus realizing the optimization. The same holds for the row-to-row extremum realized with a single block: the minimum over all rows is obtained by reduction, and the intensity value of the point with the smallest transmittance is finally taken as the global atmospheric-light constant α.
(2) Image restoration. For the improved multitask network that learns the transmittance and the atmospheric-light constant simultaneously, the invention computes the corresponding J(x) by parallel computation with maximum efficiency. The clear image J(x) can be restored from t(x) and α by the following formula:
J(x) = ( I(x) − α ) / max( t(x), t0 ) + α.
When the parallel computation is realized on the GPU, the number of parallel threads equals the number of pixels; each thread computes the clear-image value J(x) corresponding to its pixel x, realizing the parallel acceleration.
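A CPU sketch of the per-pixel parallel recovery described above, with a standard-library thread pool standing in for the per-pixel GPU threads; the images are flat lists here for brevity, and the t0 floor is an illustrative safeguard:

```python
from concurrent.futures import ThreadPoolExecutor

def recover_image(I, t, alpha, t0=0.1, workers=4):
    """Parallel per-pixel recovery J(x) = (I(x) - alpha)/max(t(x), t0) + alpha.
    I and t are flat lists of equal length; each pool task plays the role
    of one GPU thread handling one pixel."""
    def recover(args):
        i, tx = args
        return (i - alpha) / max(tx, t0) + alpha
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recover, zip(I, t)))

I = [0.57, 0.9, 0.3]
t = [0.6, 0.05, 1.0]      # the second pixel hits the t0 floor
J = recover_image(I, t, alpha=0.9)
assert abs(J[0] - 0.35) < 1e-12
assert abs(J[2] - 0.3) < 1e-12
```

Because every pixel's recovery is independent, the computation is embarrassingly parallel, which is exactly why a one-thread-per-pixel GPU launch suits it.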
(4) Determination of the evaluation criteria
1. Neural-network convergence
The mean square error (MSE) is the performance function for evaluating the network. Suppose there are n input-output data pairs and the network's output after training is ŷ_i. The statistic is the mean of the sum of squared errors between the predictions and the corresponding original data, computed as
MSE = (1/n) · Σ_{i=1}^{n} ( y_i − ŷ_i )²,
where n is the number of samples, y_i is the true value, and ŷ_i is the fitted value. From the definition of the MSE, the closer its value is to 0, the better the model selection and fitting, the more successful the data estimation, and the more convincing the network's output.
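The MSE performance function above is a one-liner to implement and check:

```python
def mse(y_true, y_pred):
    """Mean square error: mean of squared differences between the true
    values and the fitted values."""
    n = len(y_true)
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n

assert mse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]) == 0.0
assert abs(mse([0.0, 0.0], [0.1, 0.3]) - 0.05) < 1e-12
```

A perfect fit gives exactly 0, consistent with the criterion that values closer to 0 indicate a better-trained network.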
2. Image-restoration quality
Information entropy: the richness of color can be quantified with probabilities. If the probability of every pixel value occurring in the picture is nonzero, the picture may be regarded as colorful. The final quantified result of these probabilities is exactly the information entropy: the larger the entropy, the better the fog-image processing.
Average gradient: near the boundaries or hatching of an image, the gray levels on the two sides differ markedly, i.e., the gray-level rate of change is large. The magnitude of this rate of change can express image sharpness; it reflects the rate of contrast change in fine image detail and characterizes the relative clarity of the image.
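Both evaluation metrics can be sketched on small toy images in pure Python: a histogram-based Shannon entropy, and a mean absolute gray-level difference as the average gradient. Real implementations would operate on full 8-bit histograms; the tiny grids here are illustrative.

```python
import math

def information_entropy(gray):
    """Shannon entropy of the gray-level histogram of a flat pixel list."""
    n = len(gray)
    counts = {}
    for v in gray:
        counts[v] = counts.get(v, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def average_gradient(img):
    """Mean magnitude of horizontal and vertical gray-level differences
    of a 2-D image (list of rows); a simple sharpness proxy."""
    h, w = len(img), len(img[0])
    total, cnt = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                total += abs(img[y][x + 1] - img[y][x]); cnt += 1
            if y + 1 < h:
                total += abs(img[y + 1][x] - img[y][x]); cnt += 1
    return total / cnt

flat = [[5, 5], [5, 5]]     # uniform: zero entropy, zero gradient
sharp = [[0, 9], [9, 0]]    # high contrast
assert information_entropy([v for row in flat for v in row]) == 0.0
assert average_gradient(flat) == 0.0
assert average_gradient(sharp) == 9.0
```

The assertions reflect the criteria in the text: a uniform (washed-out) image scores zero on both metrics, while a high-contrast image scores high on average gradient.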
Description of the drawings
Fig. 1 is the DehazeNet network structure;
Fig. 2 is the flow chart of the defogging method;
Fig. 3 is the improved-DehazeNet structure chart of the defogging method;
Fig. 4 is a schematic diagram of the hazy-tile data set;
Fig. 5-1 is a haze image of a certain city (without the processing of the invention);
Fig. 5-2 is the same haze image (with the processing of the invention).
Specific embodiments
The invention is further illustrated below with specific embodiments and the accompanying drawings:
(1) Overall design of the defogging method
Based on the atmospheric scattering model, video defogging means estimating the transmittance and the atmospheric-light constant from the known fog image so as to solve for the original image. The invention divides the whole real-time defogging process into three parts (the general framework is shown in Fig. 2):
The first part captures video with a video-capture device, splits the video frame by frame into individual images, and feeds them to the neural network for processing.
The second part uses the trained neural network (i.e., every layer's weights known) to process the input image block by block, obtaining each block's transmittance and atmospheric-light constant and finally assembling the transmittance map and the atmospheric-light-constant map.
The third part takes the network output, solves for the haze-free image from the atmospheric scattering model, and re-stitches the haze-free frames into a video.
(2) Hardware design and environment construction of a defogging system using the method
1. Image-acquisition-device invocation
The defogging system using this method selects a network camera as the image-capture device.
Traditional analog cameras generally use CMOS sensors, with relatively low resolution and light sensitivity, and are suitable for videophones and conference calls with modest image-quality requirements; an analog camera can only transmit a signal one way, the transmission is a video signal, and a hard-disk recorder or a monitor must be connected for viewing. By contrast, a network camera adds a module that compresses and processes the video, combining the functions of a camera and a video server. A network camera therefore needs only an IP network interface to process data, which meets the requirement of real-time monitoring.
Since the video must be processed in real time, the network camera can bypass the recorder and transmit video directly to the computer, which is faster; meanwhile, a third-party device can pull the stream through the camera's standard RTSP connection, receiving the code-stream packets on the network in real time and decoding them into video.
After installing the PC-side software and the routine calls, the invention connects the camera to the computer with a network cable. In use, first check whether the IP addresses of the local computer and the network camera are in the same subnet, and if necessary change the camera's IP address, subnet mask, and default gateway so that the local computer can recognize the network camera; real-time display of the surveillance video on the computer is the criterion of a successful camera configuration. The invention accesses the IP address through OpenCV library functions to obtain the data stream transmitted by the network camera: assign the network camera an IP address and access that address through OpenCV, and the camera is called indirectly, with the acquired data converted to images and displayed on the computer. In Python, import the cv2 library and call OpenCV's VideoCapture function to obtain the real-time video data transmitted from the network camera, then process and display the video images in real time.
2. Host-computer environment configuration
The host operating system of the hardware implementation of this system is Ubuntu 16.04, with 7.7 GB of memory and 4 GB of video memory. The host CPU is an Intel i7 at 2.80 GHz, and the GPU is a GeForce GTX 1050 Ti. The invention uses the classic Python IDE, PyCharm Community Edition. For convenient matrix operations, the PyCharm interpreter is set to Anaconda3, so that the numpy module bundled with Anaconda is easy to use. To realize a series of video- and image-processing steps, the invention calls the function library of OpenCV 3.4.0, importing the cv2 module in Python accordingly. To train and test the neural network, the invention compiled the caffe module with the CUDA 9.0 interface and the cuDNN network functions, and completed the basic calls in Python.
(3) Construction of the improved DehazeNet
1. Network-structure adjustment and improvement
The invention's adjustment of the network structure is based mainly on two ideas:
First, to achieve speed, the invention improves the structure and parameters of the four parts of the existing DehazeNet;
Second, to reach a better defogging effect, and considering that shadows possibly present in security surveillance cause abrupt changes of the atmospheric light, the invention proposes a multitask structure under the caffe framework that learns the image transmittance and the atmospheric-light constant simultaneously.
The optimized network structure is shown in Fig. 3.
As can be seen, to reduce the computation in the processing of the final picture, the invention adds in the multi-scale mapping part a parallel 1×1 convolutional layer conv2/1×1. Correspondingly, in the reshaping layer the invention marks off a new dimension reshape_a, and in the subsequent convolutional concatenation layer the invention also accepts a new bottom structure, to include this smaller convolution kernel conv2/1x1 in the overall network architecture.
In addition, considering that the size of the available training set is limited and that the existing network weights are known, the invention uses the fine-tuning training method. The invention uses the Xavier initialization method, which keeps the output variance of each layer of the neural network as equal as possible, ensuring that information flows well through the network.
The design of the learning network for the atmospheric light is similar to that of the transmittance network, i.e., neural-network layers imitate the traditional algorithm. The invention again first extracts the basic features of the picture with a convolutional layer conv_a and a reshaping layer reshape_a, then obtains the atmospheric-light constant of the image with a max pooling layer pool_a. Finally, the invention again uses a bilateral rectified linear unit to accelerate convergence.
2. Training-set production and retraining
The training set of the neural network consists of fog images generated from standard haze-free images by randomly adding a haze effect based on the atmospheric scattering model.
This training mode has been proved feasible in the prior art. The haze-free images of the invention come from the 2014 additions to the Middlebury Stereo database. During training the pictures are divided into 16x16 image blocks, and a haze effect is added to these pictures according to formula (1). The transmittance of each picture is generated at random by the Monte Carlo method and recorded as the label. The training set in jpg format is shown in Fig. 4; it is later converted to lmdb format.
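The haze-synthesis step of the training-set production can be sketched as follows; the depth range and β below are illustrative assumptions, with the Monte-Carlo transmittance returned alongside the hazy block so it can serve as the label, as in the text:

```python
import math
import random

def synthesize_hazy_block(block, alpha, beta=1.0, rng=None):
    """Add haze to one clear image block via eq. (1),
    I = J*t + alpha*(1 - t), with transmittance t = exp(-beta * d)
    drawn from a random depth d.  Returns (hazy_block, t)."""
    rng = rng or random.Random()
    d = rng.uniform(0.0, 3.0)          # assumed depth range, illustrative
    t = math.exp(-beta * d)
    hazy = [[j * t + alpha * (1.0 - t) for j in row] for row in block]
    return hazy, t

rng = random.Random(42)
clear = [[0.2, 0.8], [0.5, 0.5]]
hazy, t = synthesize_hazy_block(clear, alpha=0.9, rng=rng)
assert 0.0 < t <= 1.0
# Every hazy value is a convex combination of the clear value and the
# airlight, so it lies between the two.
assert all(min(j, 0.9) <= hv <= max(j, 0.9)
           for row_j, row_h in zip(clear, hazy)
           for j, hv in zip(row_j, row_h))
```

Seeding the generator makes each (hazy block, t) pair reproducible, which is what allows the recorded t to act as a reliable regression label.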
For neural-network training, the invention fine-tunes under the existing weights with the newly added training set, i.e., during training the existing weight path is supplied with the -weights option along with the new training set, making the weights more complete. The invention trains at a scale of 50 pictures per iteration, with the validation interval set to 4. In the end the network's image-processing results show definite progress, with gains in all three evaluation criteria chosen by the invention.
3. Analysis of the processing effect
After the invention fine-tuned the network structure and retrained with the new data set, the mean square error of the network fell from the initial 0.0127 to 0.0089. As can be seen, the larger data set and the fine-tuned training mode compensate for the possible non-convergence caused by the structural change. To learn the transmittance and the atmospheric-light constant simultaneously, the invention retrained in multitask mode; with the further complication of the network structure the mean square error rose, but under the information-entropy and average-gradient criteria proposed here the processing of single images still improved. The specific results are given in Table 1.
Table 1: Defogging effect of single-image DehazeNet, fine-tuned DehazeNet, and the multitask network
Beyond the quantitative analysis, the defogging effect is also plainly visible; take a certain city's haze image as an example. As Figs. 5-1 and 5-2 show, the tower crane in Fig. 5-2 is much clearer.
Single images of different sizes were then processed with the network-testing function and the processing times recorded in Table 2, together with the times required by dark-channel defogging. As can be seen, defogging with the trained DehazeNet is significantly faster, and the improved DehazeNet with the added atmospheric-light learning structure, although slightly slower, obtains a better defogging effect (as shown in Table 1).
Table 2: Defogging times of dark channel prior (DCP), DehazeNet, and the multitask method for pictures of different sizes

Picture size | Dark channel prior | DehazeNet | Multitask
800×600      | 0.154 s            | 0.067 s   | 0.089 s
1280×720     | 0.298 s            | 0.138 s   | 0.177 s
1600×900     | 0.815 s            | 0.421 s   | 0.589 s
1920×1080    | 0.926 s            | 0.679 s   | 0.790 s
(4) Parallelizing image restoration
The GPU is called to solve the global atmospheric light constant α efficiently. Computing the global atmospheric light constant requires comparing the transmittance t(x) of every point in the image to find the point of minimum transmittance, and this comparison can be accelerated in parallel.
First, a kernel function is defined that finds the point of minimum transmittance in the image by comparison. The image to be processed, acquired by the camera, is of size 1920 × 1080, i.e. 1080 rows. 36 blocks are allocated, each block handling 30 rows of the image; 30 threads are allocated within each block, each thread handling one row. Every thread executes the same kernel function and finds by comparison the minimum-transmittance point within its row of data.
Because the rows are processed in parallel, obtaining the minimum transmittance of the whole image requires a further comparison across rows, so the minimum obtained for each row must be stored temporarily. Since allocating global memory would consume substantial resources and waste time, the present invention uses shared memory instead, reducing overhead while speeding up the computation.
A 33 × 33 shared-memory region is requested, 1089 banks in total, enough to store the minimum-transmittance data of the 1080 rows. After the in-row comparisons finish, one block with 30 threads is allocated for the comparison across rows, each thread being responsible for comparing and obtaining the extremum among 36 values; similarly, a 3 × 10 shared-memory region is requested to record the intermediate results. After this pass, only 30 values remain; the amount of data is now greatly reduced, and a single thread can finish the operation quickly.
In this way, the minimum transmittance of an image is obtained through three merge-reduction passes, which greatly reduces the amount of computation and speeds up the operation. With the transmittance t(x) and the global atmospheric light constant α available, the restored image can then be computed.
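The merge reduction described above can be sketched serially in Python as a stand-in for the CUDA kernels (the 36-block/30-thread partition and the shared-memory staging are modeled here as plain list operations; function names are illustrative):

```python
def min_transmittance(t_map):
    """Find the global minimum of a transmittance map.

    Pass 1 (one 'thread' per row): minimum within each row.
    Passes 2-3 (merge across rows): minimum of the stored row minima.
    In the CUDA design each pass runs in parallel with the row minima
    staged in shared memory; run serially, the result is the same.
    """
    row_minima = [min(row) for row in t_map]   # pass 1
    return min(row_minima)                     # passes 2-3


def global_airlight(i_map, t_map):
    """Intensity at the minimum-transmittance pixel, used as global α."""
    t_min = min_transmittance(t_map)
    for y, row in enumerate(t_map):
        for x, t in enumerate(row):
            if t == t_min:
                return i_map[y][x]
```

The serial version makes the data flow explicit: the per-row minima are exactly the intermediate values the text stores in shared memory before the cross-row comparison.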
If this were computed serially on the CPU, it would require 1920 × 1080 = 2073600 loop iterations, which wastes both time and resources. Moreover, the processing of each pixel is simple and pixels do not depend on one another, so the computation can be accelerated on the GPU.
From formula (16) it follows that a single operation per pixel yields the output image, so no additional memory needs to be allocated for the threads; the result is written out directly. 8 blocks are allocated, each block handling 259200 pixels; 12 × 20 = 240 threads are allocated to each block, each thread handling 1080 pixels. Every thread executes the same kernel function, and the device schedules the warps to compute the corresponding output image J(x) at peak efficiency. With this allocation the computation is fast enough for real-time processing.
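The per-pixel restoration just described is embarrassingly parallel; a minimal Python sketch follows (the lower bound t0 = 0.1 is an assumed value here, following the small-transmittance floor mentioned alongside formula (4), and is not specified in the original):

```python
def restore_pixel(i_val, t, alpha, t0=0.1):
    # J(x) = (I(x) - alpha) / t(x) + alpha, with t clamped below by t0
    # so that near-zero transmittance does not blow up the result.
    return (i_val - alpha) / max(t, t0) + alpha


def restore_image(i_map, t_map, alpha, t0=0.1):
    # Pixels are independent: on the GPU each thread evaluates this
    # formula for its own pixels with no synchronization required.
    return [[restore_pixel(i, t, alpha, t0) for i, t in zip(ir, tr)]
            for ir, tr in zip(i_map, t_map)]
```

Because every output pixel depends only on its own I(x) and t(x), the GPU mapping of threads to fixed pixel ranges involves no shared state at all, which is why no extra per-thread memory is needed.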
However, because solving for the transmittance itself is too complex to reach the required speed even with GPU acceleration, and considering that under windless conditions the transmittance is nearly unchanged over a short time, the atmospheric transmittance is set to be computed only once every 10 seconds. This eliminates unnecessary computation and achieves real-time dehazing.
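The 10-second recomputation policy can be sketched as a small cache wrapper around the expensive transmittance computation (class and parameter names, and the monotonic-clock choice, are illustrative assumptions, not from the original):

```python
import time


class TransmittanceCache:
    """Reuse the expensive transmittance map across video frames,
    recomputing it at most once every `interval` seconds (10 s in the
    text, on the assumption that t(x) is nearly static without wind)."""

    def __init__(self, compute_fn, interval=10.0):
        self.compute_fn = compute_fn          # e.g. the network forward pass
        self.interval = interval
        self._map = None
        self._stamp = float("-inf")

    def get(self, frame):
        now = time.monotonic()
        if now - self._stamp >= self.interval:
            self._map = self.compute_fn(frame)  # expensive: run rarely
            self._stamp = now
        return self._map                        # cheap: reuse for most frames
```

Per-frame restoration then always runs, but only one frame in many triggers the slow transmittance solve.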
The technical difficulties of the invention lie mainly in three aspects: first, the choice of dehazing algorithm; second, the construction and training of the neural network; third, building the system framework. The monitoring-video dehazing system of the present invention combines a traditional dehazing algorithm with a new dehazing network. The data-processing algorithm can restore only limited detail under thick fog: once the fog is too dense, the result of the fog-penetration algorithm is no longer credible. By comparing a large number of pictures, a suitable transmittance threshold is obtained using binary classification in the neural network, and this threshold serves as the final basis for judging whether the result of the fog-penetration algorithm is credible.
Advantageous effects:
The present invention improves and retrains DehazeNet, a system based on convolutional neural networks, to obtain the transmittance map of the original image, and uses parallel computation to achieve real-time video dehazing. In constructing and training the neural network, the present invention uses the bilateral rectified linear unit as the activation function and the Xavier distribution for weight initialization, so as to reduce the search space and raise the convergence speed. Information entropy and average gradient are proposed as dehazing-quality indices; on this basis, using the multi-task mode of the caffe framework, a neural network that learns the transmittance and the atmospheric light constant simultaneously is designed, so that single-image processing quality is further improved under the evaluation system of the present invention. After the transmittance map is obtained, kernel functions are invoked through CUDA to perform parallel computation on the GPU, accelerating image restoration and meeting the requirement of real-time video dehazing.
Compared with traditional single-image dehazing, the present invention achieves real-time dehazing of video while guaranteeing the dehazing quality, solving the oversaturation and skyline-blur problems of traditional dehazing methods. The invention uses Python to read the camera's video data directly and then performs software dehazing, so that, compared with traditional optical fog penetration, no additional modification or upgrade of the image-acquisition equipment is needed, greatly saving capital investment.

Claims (5)

1. A monitoring-video real-time dehazing method based on an improved DehazeNet, characterized in that the steps of the dehazing method comprise:
1) acquiring video with a video capture device, then cutting the video into single pictures frame by frame, and supplying them to the neural network for processing;
2) using the trained improved DehazeNet neural network, in which the weights of each layer are known, to process the input picture block by block so as to obtain the transmittance t(x) and the atmospheric light constant of each section, finally forming a transmittance distribution map and an atmospheric-light-constant distribution map;
3) taking the output of the neural network, solving for the fog-free picture according to the dehazing algorithm based on the atmospheric scattering model, and then re-splicing the fog-free pictures into video;
wherein step 2) comprises:
2.1) performing feature extraction on the picture obtained in step 1):
the improved DehazeNet network adds a new convolutional layer conv_a and a new reshaping layer reshape_a to the DehazeNet network; the convolutional layer conv_a is parallel to the convolutional layer conv1 of the DehazeNet network; the reshaping layer reshape_a is split off from the reshaping layer reshape1 of the DehazeNet network;
the basic features of the picture are extracted through the convolutional layer conv_a and the reshaping layer reshape_a, and the atmospheric light constant of the sampled image is then extracted by the max-pooling layer pool_a;
the basic features of the picture are extracted through the convolutional layer conv1 and the reshaping layer reshape1, and the atmospheric transmittance of the image is then further extracted by the pooling layer pool_1; a reshaping layer reshape2 is set after the pooling layer pool_1;
2.2) multi-scale mapping:
a parallel convolution structure is applied to the output of the reshaping layer reshape2, with convolution kernel sizes of 1 × 1, 3 × 3, 5 × 5 and 7 × 7; features at the four corresponding scales are extracted from the image through the four convolutional layers conv2/1 × 1, conv2/3 × 3, conv2/5 × 5 and conv2/7 × 7; the four outputs are then spliced together by the concatenation layer conv2/output;
2.3) local extremum:
the result of step 2.2) is passed through a pooling layer pool2 set to maximum (MAX) extraction, with the sliding stride of the pooling layer pool2 being 1;
2.4) non-linear regression:
the result of step 2.3) is passed through the bilateral rectified linear unit as the activation function, as in the following formula:
wherein in step 3):
the global atmospheric light constant α: the transmittance t(x) of each pixel x in the image is first compared to find the point with the smallest transmittance value, and the intensity value of that point is taken as the global atmospheric light constant α;
then the fog-free picture is solved according to the dehazing algorithm based on the atmospheric scattering model: the clear image J(x) is restored from t(x) and α by the formula J(x) = (I(x) − α)/t(x) + α, where I(x) is the observed foggy-image pixel.
2. The monitoring-video real-time dehazing method according to claim 1, characterized in that in step 3), the fog-free-image algorithm is selected as follows:
the atmospheric scattering model is stated by the following equation (1):
I(x) = J(x)t(x) + α(1 − t(x)),   (1)
wherein I(x) is the observed foggy-image pixel, J(x) is the recovered dehazed-image pixel, t(x) is the atmospheric transmittance, and α is the global atmospheric light constant; equation (1) has three unknown parameters, J(x), t(x) and α; once t(x) and α are estimated, the real scene J(x) is recovered;
the atmospheric transmittance t(x) describes the proportion of light that reaches the camera without being scattered, and is defined by formula (2):
t(x) = e^(−βd(x)),   (2)
wherein d(x) is the distance from the scene point corresponding to the pixel to the camera;
combining (1) and (2) yields
α = I(x), d(x) → ∞,   (3)
in the actual imaging process the scene depth d(x) cannot tend to infinity, but a very small transmittance t0 can be defined for the far-distance case; in this case the atmospheric light is no longer obtained by the method of formula (3), and estimating it according to the following formula (4) is more accurate:
based on the above analysis, it can be concluded that accurately estimating the atmospheric transmittance is the key to recovering a clear image.
3. The monitoring-video real-time dehazing method according to claim 1, characterized in that in step 3), the method of solving the global atmospheric light constant α is: a single block performs the merge reduction of the transmittance minima, implemented with shared memory, the intermediate results being stored in shared memory; the extremum across rows is likewise obtained with a single block, the extremum over all rows being found by merge reduction; finally the intensity value of the point with the smallest transmittance is taken as the global atmospheric light constant α, specifically as follows:
when computing the global atmospheric light constant, the transmittance t(x) of every point in the image must be compared to find the point of minimum transmittance, and the comparison is accelerated in parallel:
first, a kernel function is defined that finds the minimum-transmittance point in the image by comparison; the image to be processed, acquired by the camera, is of size 1920 × 1080, i.e. 1080 rows;
36 blocks are allocated, each block handling 30 rows of the image, with 30 threads in each block and one thread handling one row of the image; every thread executes the same kernel function and finds by comparison the minimum-transmittance point within its row of data;
because the rows are processed in parallel, obtaining the minimum transmittance of the whole image requires a further comparison across rows, so the minimum obtained for each row must be stored temporarily;
a 33 × 33 shared-memory region is requested, 1089 banks in total, storing the minimum-transmittance data of the 1080 rows; after the in-row comparisons finish, one block with 30 threads is allocated for the comparison across rows, each thread being responsible for comparing and obtaining the extremum among 36 values; similarly, a 3 × 10 shared-memory region is requested to record the intermediate results; after this pass only 30 values remain;
with the transmittance t(x) and the global atmospheric light constant α obtained, the image is then restored.
4. The monitoring-video real-time dehazing method according to claim 3, characterized in that in step 3), when restoring the image, the corresponding J(x) is obtained by parallel computation; when the parallel computation is realized on the GPU, the parallel threads cover all pixels, each thread computing the clear-image values J(x) corresponding to its pixels, realizing parallel acceleration;
from J(x) = (I(x) − α)/t(x) + α it is known that only a single operation per pixel is needed to obtain the output image, so no additional memory needs to be allocated for the threads, and the result is written out directly;
8 blocks are allocated, each block handling 259200 pixels; 12 × 20 = 240 threads are allocated to each block, each thread handling 1080 pixels; every thread executes the same kernel function, and the device schedules the warps to compute the corresponding output image J(x) at peak efficiency.
5. The monitoring-video real-time dehazing method according to claim 3, characterized in that in step 3), the atmospheric transmittance is set to be computed once every 10 seconds.
CN201811261910.XA 2018-10-26 2018-10-26 Monitoring video real-time defogging method based on improved DehazeNet Expired - Fee Related CN109389569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811261910.XA CN109389569B (en) 2018-10-26 2018-10-26 Monitoring video real-time defogging method based on improved DehazeNet


Publications (2)

Publication Number Publication Date
CN109389569A true CN109389569A (en) 2019-02-26
CN109389569B CN109389569B (en) 2021-06-04

Family

ID=65427927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811261910.XA Expired - Fee Related CN109389569B (en) 2018-10-26 2018-10-26 Monitoring video real-time defogging method based on improved DehazeNet

Country Status (1)

Country Link
CN (1) CN109389569B (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105741246A (en) * 2016-01-29 2016-07-06 南京航空航天大学 Method for eliminating influence on optical picture by dust and mist
WO2017086296A1 (en) * 2015-11-16 2017-05-26 デクセリアルズ株式会社 Optical body, master, and method for manufacturing optical body
WO2017175231A1 (en) * 2016-04-07 2017-10-12 Carmel Haifa University Economic Corporation Ltd. Image dehazing and restoration
CN107749052A (en) * 2017-10-24 2018-03-02 中国科学院长春光学精密机械与物理研究所 Image defogging method and system based on deep learning neutral net
CN107967671A (en) * 2017-10-30 2018-04-27 大连理工大学 With reference to data study and the image defogging method of physics priori
CN108564535A (en) * 2017-12-15 2018-09-21 四川大学 A kind of image defogging method based on deep learning
CN108564077A (en) * 2018-04-03 2018-09-21 哈尔滨哈船智控科技有限责任公司 It is a kind of based on deep learning to detection and recognition methods digital in video or picture
CN108665432A (en) * 2018-05-18 2018-10-16 百年金海科技有限公司 A kind of single image to the fog method based on generation confrontation network


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
WANG W: "Fast image dehazing method based on linear transformation", 《JOURNAL OF TECHNOLOGY & SCIENCE》 *
ZHU QINGSONG: "A fast single image haze removal algorithm using color attenuation prior", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 *
徐岩: "基于多特征融合的卷积神经网络图像去雾算法", 《激光与光电子学进展》 *
李策: "生成对抗映射网络下的图像多层感知去雾算法", 《计算机辅助设计与图形学学报》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059699A (en) * 2019-03-18 2019-07-26 中南大学 Skyline automatic testing method in a kind of image based on convolutional neural networks
CN110059699B (en) * 2019-03-18 2021-01-29 中南大学 Automatic detection method for skyline in image based on convolutional neural network
CN109961070A (en) * 2019-03-22 2019-07-02 国网河北省电力有限公司电力科学研究院 The method of mist body concentration is distinguished in a kind of power transmission line intelligent image monitoring
CN113632091A (en) * 2019-03-22 2021-11-09 辉达公司 Iterative spatial graph generation
CN110807744A (en) * 2019-10-25 2020-02-18 山东工商学院 Image defogging method based on convolutional neural network
CN110807744B (en) * 2019-10-25 2023-09-08 山东工商学院 Image defogging method based on convolutional neural network
CN114387187A (en) * 2022-01-13 2022-04-22 河海大学 Real-time visual enhancement embedded system and visual enhancement method for rainy, snowy and foggy days
CN114387187B (en) * 2022-01-13 2024-04-26 河海大学 Real-time visual enhancement embedded system and visual enhancement method for rainy, snowy and foggy days
CN115032817A (en) * 2022-05-27 2022-09-09 北京理工大学 Real-time video defogging intelligent glasses for use in severe weather and control method

Also Published As

Publication number Publication date
CN109389569B (en) 2021-06-04

Similar Documents

Publication Publication Date Title
Li et al. Image dehazing using residual-based deep CNN
CN109389569A (en) Based on the real-time defogging method of monitor video for improving DehazeNet
Ma et al. Perceptual quality assessment for multi-exposure image fusion
Wang et al. Biologically inspired image enhancement based on Retinex
Wang et al. A fast single-image dehazing method based on a physical model and gray projection
CN111127416A (en) Computer vision-based automatic detection method for surface defects of concrete structure
CN111292264A (en) Image high dynamic range reconstruction method based on deep learning
CN111160202A (en) AR equipment-based identity verification method, AR equipment-based identity verification device, AR equipment-based identity verification equipment and storage medium
Li et al. Insulator defect detection for power grid based on light correction enhancement and YOLOv5 model
Wang et al. Exploiting local degradation characteristics and global statistical properties for blind quality assessment of tone-mapped HDR images
CN111445496B (en) Underwater image recognition tracking system and method
CN114627269A (en) Virtual reality security protection monitoring platform based on degree of depth learning target detection
Fu et al. Multi-feature-based bilinear CNN for single image dehazing
Feng et al. Low-light image enhancement algorithm based on an atmospheric physical model
Khan et al. Recent advancement in haze removal approaches
Zhao et al. A multi-scale U-shaped attention network-based GAN method for single image dehazing
CN117152182B (en) Ultralow-illumination network camera image processing method and device and electronic equipment
CN113538304B (en) Training method and device for image enhancement model, and image enhancement method and device
Kim Image enhancement using patch-based principal energy analysis
Liu et al. Enhanced image no‐reference quality assessment based on colour space distribution
CN113705380A (en) Target detection method and device in foggy days, electronic equipment and storage medium
CN113705381A (en) Target detection method and device in foggy days, electronic equipment and storage medium
Huang et al. Image dehazing based on robust sparse representation
CN110135274A (en) A kind of people flow rate statistical method based on recognition of face
Guo et al. Single Image Dehazing Using Adaptive Sky Segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210604

Termination date: 20211026