CN111951191A - Video image snow removing method and device and storage medium

Info

Publication number: CN111951191A
Application number: CN202010820587.6A
Authority: CN
Inventors: 贾振红; 杨斌
Original assignee: Xinjiang University
Current assignee: Xinjiang University
Priority date: 2020-08-14
Filing date: 2020-08-14
Publication date: 2020-11-17
Anticipated expiration: 2040-08-14
Also published as: CN111951191B

Abstract

The invention discloses a method, a device and a storage medium for removing snow from a video image. The method comprises the following steps: detecting sparse snowflakes by applying, in sequence, an optical flow field estimation algorithm, a support vector machine and a first adaptive threshold judgment, to obtain a final sparse snowflake detection map; performing background modeling on an input video with a block-based Gaussian mixture model to remove dense snowflakes and obtain a low-rank background; detecting the motion foreground in the input video with a Markov random field to obtain a binary image, removing the dense snowflakes within the binary image with a second adaptive threshold, and pasting the processed binary image back onto the low-rank background to generate a new input video; and performing similarity matching between the final sparse snowflake detection map and the corresponding frames of the new input video to remove the sparse snowflakes. The processed video recovers the pixel information occluded by snowflakes and retains more detail than the original image, which effectively solves the loss of pixel information in snow-occluded regions of the video.

Description

Video image snow removing method and device and storage medium

Technical Field

The invention relates to the field of video image processing, in particular to a method and a device for removing snow from a video image and a storage medium.

Background

With the rapid development of computer vision technology, outdoor vision systems are increasingly widely used in military, traffic, security and other fields. However, the image blurring and information occlusion caused by bad weather directly degrade the performance of outdoor vision systems and seriously affect target detection, recognition, tracking, segmentation and monitoring. It is therefore necessary to build outdoor vision systems that remove the effects of bad weather on video images. Snowflakes are common in severe weather; they are bright, randomly distributed, variable in size and fall at different speeds, and they can seriously occlude important information in a video image, so removing snowflakes from video images is of significant research value.

Although existing video image snow removal algorithms have made some progress, problems remain when they process different snowing scenes:

Firstly, in a heavy-snow scene, the same pixel positions may be covered by snow pixels in several consecutive frames, so existing video snow removal algorithms struggle to achieve a good result on such videos.

Secondly, in a scene containing a moving object, existing algorithms can lose the moving object, and large snowflakes close to the camera can be mistaken for moving objects, so the snowflakes are not completely removed.

Thirdly, existing algorithms set only a single fixed pixel size or weight for snow detection and removal, which makes them generalize poorly and makes it difficult to obtain a good snow removal result on videos whose resolution is very high or very low.

Disclosure of Invention

To address the occlusion of information by snowflakes in real snowing videos, the invention provides an effective video image snow removing method, device and storage medium. The processed video recovers the pixel information occluded by snowflakes and retains more detail than the original image, which effectively solves the loss of pixel information in snow-occluded regions of the video. The invention is described in detail as follows:

in a first aspect, a method of snow removal from a video image, the method comprising:

detecting sparse snowflakes by applying, in sequence, an optical flow field estimation algorithm, a support vector machine and a first adaptive threshold judgment, to obtain a final sparse snowflake detection map;

performing background modeling on an input video based on a block-based Gaussian mixture model, removing dense snowflakes, and obtaining a low-rank background;

detecting a motion foreground in an input video by using a Markov random field to obtain a binary image, removing dense snowflakes in the area of the binary image by using a second self-adaptive threshold, and pasting the processed binary image back to a low-rank background to generate a new input video;

and performing similarity matching on the final sparse snowflake detection image and a frame picture corresponding to the new input video, and removing the sparse snowflakes.

In one implementation, the detecting with the first adaptive threshold judgment specifically includes:

when the area of a connected domain of the dilated snowflake detection map is larger than the minimum-area threshold of connected domains and smaller than the maximum-area threshold, the adaptive threshold judgment outputs 1; otherwise, it outputs 0.

In one implementation, the removing of the dense snowflakes in the area in the binary image by using the second adaptive threshold, and attaching the processed binary image back to the low-rank background to generate a new input video specifically includes:

when the area of a connected domain of the dilated snowflake detection map is larger than the minimum-area threshold of connected domains, the pixel values of the connected domain are 1; otherwise, they are 0;

the new input video is obtained by multiplying the processed binary image by the corresponding pixels of the input video, taking the inverse of the binary motion foreground detection map and multiplying it by the corresponding pixels of the video background, and summing the two products.

In one implementation, the performing similarity matching on the final sparse snowflake detection map and the new frame picture corresponding to the input video specifically includes:

representing similar blocks of the image block in the corresponding frame as column vectors, thereby constituting a matrix; acquiring a binary snowflake mask matrix corresponding to the matrix;

and acquiring a filling matrix from the matrix by using a low-rank matrix completion technique, wherein the filling matrix is the solution of a nuclear-norm minimization problem subject to constraint conditions.

In one implementation, the constraint is:

constraint 1: the snow-free mask matrix, i.e. the inverse of the snowflake mask matrix Θ, is multiplied element by element with the matrix, which retains the snow-free pixels of the matrix; these pixels are placed at the corresponding positions of the filling matrix X so that the snow-free pixels are preserved;

constraint 2: the snowflake mask matrix Θ is multiplied element by element with the filling matrix X, which ensures that each element of the filling matrix X located at the position of a snow element in the matrix has a pixel value lower than that snow element.

The snow-free mask matrix and the snowflake mask matrix Θ sum to 1. The snowflake mask matrix Θ is a binary matrix whose elements are 1 at snowflake positions and 0 at snow-free positions.

In a second aspect, a video image snow removal apparatus, the apparatus comprising:

the first acquisition module is used for detecting the sparse snowflakes processed by the optical flow field estimation algorithm and the support vector machine in sequence by adopting a first self-adaptive threshold value to judge and acquire a final sparse snowflake detection map;

the second acquisition module is used for carrying out background modeling on the input video based on the block-based Gaussian mixture model, removing dense snowflakes and acquiring a low-rank background;

the generating module is used for detecting a motion foreground in the input video by using a Markov random field to obtain a binary image, removing dense snowflakes in the area of the binary image by using a second self-adaptive threshold value, and pasting the processed binary image back to a low-rank background to generate a new input video;

and the similarity matching module is used for performing similarity matching on the final sparse snowflake detection image and a frame picture corresponding to the new input video and removing the sparse snowflakes.

In one implementation, the similarity matching module includes:

a first acquisition unit configured to represent similar blocks of the image block in a corresponding frame as column vectors, thereby constituting a matrix; acquiring a binary snowflake mask matrix corresponding to the matrix;

the second acquisition unit is used for acquiring a filling matrix from the matrix by using a low-rank matrix completion technique, wherein the filling matrix is the solution of a nuclear-norm minimization problem subject to constraint conditions;

and the removing unit is used for removing the sparse snowflakes.

In a third aspect, a video image snow removal apparatus, the apparatus comprising: a processor and a memory, the memory having stored therein program instructions, the processor calling the program instructions stored in the memory to cause the apparatus to perform the method steps of the first aspect.

In a fourth aspect, a computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method steps of the first aspect.

The technical scheme provided by the invention has the beneficial effects that:

1. the method can effectively remove the snowflakes in the snowing video, and has certain practical application value;

2. the image processed by the method can well restore the pixel information blocked by the snowflakes in the video, and has more detailed information than the original video;

3. the video image processed by the invention can be well applied to computer vision application, and has higher accuracy in the fields of moving target detection, target identification and tracking and the like compared with the original video.

Drawings

FIG. 1 is a flow chart of a video image snow removal method;

FIG. 2 is a detailed flow chart of a video image snow removal method;

FIG. 3 is a schematic view of a snow removal target image of a snowing video;

FIG. 4 is a schematic view of the target image after the snow removal process of FIG. 3;

FIG. 5 is another schematic view of a snow removal target image of the snowing video;

FIG. 6 is a schematic view of the target image after the snow removal process of FIG. 5;

FIG. 7 is a comparison graph of experimental results of video images of the method of the present invention and other methods;

wherein (a) is a frame of the input snowing video; (b) is the snow removal result of a deep learning method; (c) is the result of a low-rank matrix completion method; (d) is the result of a hierarchically modeled Gaussian mixture model method; (e) is the result of a multi-scale convolutional sparse coding method; and (f) is the snow removal result of the method of the invention.

FIG. 8 is another comparison graph of experimental results of video images of the method of the present invention and other methods;

FIG. 9 is a schematic diagram of an effective video image snow removal apparatus;

FIG. 10 is a schematic diagram of the structure of the similarity matching module;

FIG. 11 is another schematic diagram of an effective video image snow removal apparatus.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.

Removing snow from video images has been an important direction in computer vision research in recent years. Existing video snow removal algorithms struggle to remove most of the snow when processing videos of heavy-snow scenes, and they deform moving objects when processing videos that contain them. They also generalize poorly: snow cannot be completely removed from snow videos of different resolutions shot by different devices. The prior art also includes several methods for removing rain from video images. Although rain and snow are similar in that both are fast-falling objects, rain removal methods cannot be used directly for snow removal, for the following reasons:

1) rain causes blurring of pixel information, while snow causes shading of the information, and the influence of the two on video information is different;

2) rain streaks fall in an almost uniform direction, and even in wind the deviation of the falling angle is small, whereas snowflakes blown by wind fall at widely differing angles.

The existing rain removing algorithm is completed based on the characteristics of rain lines, and the difference between the rain lines and snow flakes can cause that the rain removing method is difficult to be used for removing snow. For example, the conventional method uses the falling angle of the rain line to remove rain, which is obviously not suitable for removing snow. The existing deep learning method for removing rain is completed by using a synthesized raining data set, the used training sets are all synthesized raining scenes, the trained network can only be used for identifying the existence of rain lines, and obvious misjudgment and missed judgment can be caused when the network is used in snowflake scenes.

Conversely, the approach disclosed by the invention can also be applied to removing rain from video images, with a clear improvement.

In order to solve the problems in the background art, referring to fig. 1, an embodiment of the present invention provides an effective video image snow removing method, including the following steps:

Step 101: the snowflakes in the video are divided into sparse snowflakes, which are close to the camera and large, and dense snowflakes, which are far from the camera and small; the sparse snowflakes in the video are detected by applying, in sequence, an optical flow field estimation algorithm, a support vector machine and an adaptive threshold judgment, to obtain a final sparse snowflake detection map;

The final sparse snowflake detection map is used together with the target frame to remove the sparse snowflakes in the subsequent step 104.

In step 101, sparse and dense snowflakes are distinguished as follows: a snowflake whose area is larger than 0.004% of the total pixels of a single-frame image is judged to be a sparse snowflake; otherwise it is judged to be a dense snowflake. For a 1920 × 1080 frame (2,073,600 pixels), for example, 0.004% corresponds to about 83 pixels.

Further, the value of 0.004% can be set according to the requirements in practical application, which is not limited in the embodiment of the present invention.

The step 101 is specifically:

The current frame image $J_z$ of the input video is taken as the target frame image. The previous frame $J_{z-1}$ of the target frame image $J_z$ is warped into the current frame, and the next frame $J_{z+1}$ is likewise warped into the current frame, giving two warped frames. For each pixel of the current frame $J_z$, the more similar of the two pixels at the corresponding position in the two warped frames is selected, and the selected pixels form the final warped frame. The final warped frame is then subtracted from the target frame image $J_z$ to obtain an initial snowflake detection map.

The initial snowflake detection map is represented using sparse representations, with pixels in the initial snowflake detection map being separated into snowflake components and other noise components.

A trained support vector machine separates the snowflake component from the other noise components, and the snowflake component is converted into a new snowflake detection map. A dilation operation is then applied to the new snowflake detection map, and adaptive threshold processing with a minimum size threshold for sparse snowflakes removes the dense snowflakes from it, giving the final sparse snowflake detection map.

The minimum size threshold is 0.004% of a single-frame pixel, and specific values can be set according to requirements in practical application, which is not limited in the embodiment of the invention.

The optical flow field estimation algorithm, the support vector machine, and the adaptive threshold judgment are well-known techniques in the field of image processing and are not described in detail in the embodiments of the present invention.

Step 102: performing background modeling on an input video by using a block-based Gaussian mixture model, removing dense snowflakes in the video and acquiring a clear low-rank background;

The processing of step 102 recovers a clear low-rank background free of moving objects, sparse snowflakes and dense snowflakes.

The block-based gaussian mixture model in step 102 is a known technique in the field of image processing, and is not described in detail in the embodiments of the present invention.

Step 103: sparse snowflakes and moving objects in the video are extracted using a Markov random field (MRF) and adaptive threshold processing, where the area threshold of the sparse snowflakes is set to 0.0013% of the total pixels of a single-frame image and snowflake regions smaller than this area are removed; the detected motion foreground (comprising the moving objects and the sparse snowflakes) is pasted back onto the clear low-rank background to generate a new input video without dense snowflakes;

That is, the MRF is used to detect the motion foreground in the video; adaptive threshold processing then removes the dense snowflakes in the motion foreground while retaining the moving objects and the sparse snowflakes, which are pasted onto the clear low-rank background obtained in step 102 to give the new input video without dense snowflakes.

In specific implementation, the embodiment of the present invention takes an area threshold of 0.0013% of the total pixels of a single-frame image only as an example, and is not limited to this value.

Step 104: similarity matching is performed between the snowflake detection maps of the six neighboring frames in the final sparse snowflake detection map and the corresponding six neighboring frames of the new input video, removing the larger sparse snowflakes in the foreground (a snowflake is judged to be sparse when its pixel size exceeds 0.004% of a single-frame image) and, at the same time, the snowflakes occluding the front of moving objects.

The step 104 is specifically: dividing a target frame of a new input video needing snow removal (hereinafter referred to as a snow removal target frame for short) into disjoint square blocks, and for each block, finding similar blocks in adjacent frames; and then filling snowflake pixels of the target frame by utilizing the pixel information of the similar blocks according to the final sparse snowflake detection image of the target frame, and removing sparse snowflakes in the video.

For example, to remove the snowflakes in the 6th frame, the 6th frame is taken as the snow removal target frame; the snowflakes are removed from every frame of the video in turn in this way.

Since the six frames before and after the target frame are most similar to the information of the target frame, and the information in the six frames before and after can be well used for filling the snowflake pixels in the target frame, the embodiment of the present invention is described by taking the six frames before and after as an example, and is not limited in the specific implementation.

The video image snow removing method of the above embodiment is refined and expanded below with reference to FIG. 2 and specific calculation formulas. The method includes the following steps:

step 201: obtaining a final sparse snowflake detection image by adopting an optical flow field estimation algorithm, a support vector machine and self-adaptive threshold judgment;

wherein, this step 201 includes:

1.1) a dense motion field can be found in two continuous frames of images through an optical flow field estimation algorithm, and a motion vector is determined for each pixel in a reference frame so as to find the most similar pixel in a target frame, and meanwhile, the similarity of the motion vectors between adjacent pixels is kept.

The optical flow field estimation algorithm is an energy minimization problem, and the energy function is as follows:

E(O)＝E_d(O)+λE_s(O) (1)

where $O$ is the optical flow field and $\lambda$ is the regularization parameter. $E_d$ is the data term, which measures the similarity between corresponding pixels of the target frame $J_1$ and the reference frame $J_2$. $E_s$ is the smoothing term, which constrains neighboring flow vectors to be similar.

The data term is computed from the optical flow vector at each pixel $x$ through a penalty function. The smoothing term $E_s$ is computed from the gradient of the flow field, where $\nabla$ denotes the gradient.
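The expressions for the data and smoothing terms are not reproduced in the text. A form commonly used in optical flow estimation and consistent with the definitions given here is sketched below; the symbols $o(x)$ for the flow vector at pixel $x$ and $\psi$ for the penalty function are assumed notation, and the exact form used by the invention may differ.

```latex
E_d(O) = \sum_{x} \psi\big(J_1(x) - J_2(x + o(x))\big), \qquad
E_s(O) = \sum_{x} \psi\big(\nabla o(x)\big)
```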

1.2) To detect snowflakes, the optical flow field estimation algorithm is used to warp the previous frame and the next frame into the current frame. The previous frame $J_{z-1}$ and the next frame $J_{z+1}$ of the target frame $J_z$ are each used to synthesize a warped frame: one warped frame is the target frame synthesized from the previous frame, using the optical flow of each pixel $x$ from the target frame to the previous frame; the other is the target frame synthesized from the next frame, using the optical flow of each pixel $x$ from the target frame to the next frame.

1.3) To determine which of the two warped pixels is more similar to the pixel $J_z(x)$ in the current frame (i.e. the target frame), each pixel is labeled with a binary label $t(x)$, and the most suitable pixels are selected to form the final warped frame (equation (6)).

$T$ denotes the label map composed of the labels of all pixels in the target frame. When $t(x) = 0$, the pixel of the target frame is more similar to the pixel from the previous frame; when $t(x) = 1$, it is more similar to the pixel from the next frame. According to the value of the binary label $t(x)$ of each pixel, pixels from the corresponding previous and next warped frames are combined into the warped frame with the highest similarity to the target frame. The values of the label map $T$ are determined by minimizing a label cost function:

C(T)＝C_d(T)+τC_s(T) (7)

where $\tau = 50$ is a regularization parameter. $C_d(T)$ is the data cost function, which measures the similarity between the target frame and the warped frame. $C_s(T)$ is the smoothing cost, which constrains neighboring pixel labels to be consistent. In its definition, $N_x$ denotes the set of four-neighborhood pixels of pixel $x$, $\oplus$ denotes the exclusive-or operation, $y$ ranges over the pixels in the four-neighborhood of $x$, and $t(y)$ is the label of pixel $y$.
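A smoothing cost consistent with this description (a sum, over every pixel $x$ and each of its four neighbors $y$, of the exclusive-or of the two labels) can be written as below. This is a reconstruction from the surrounding definitions rather than a verbatim formula, so any weighting or normalization in the original is not captured.

```latex
C_s(T) = \sum_{x} \sum_{y \in N_x} t(x) \oplus t(y)
```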

The initial snowflake detection map $S_0$ obtained by the optical flow field estimation algorithm is the difference between each pixel of the target frame $J_z$ and the corresponding pixel of the final warped frame obtained by equation (6).
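The selection and subtraction in steps 1.2) and 1.3) can be illustrated with a minimal NumPy sketch. It assumes the two warped frames have already been computed by some optical flow method, chooses the label of each pixel independently (so it ignores the smoothing cost weighted by τ), and clips negative differences on the assumption that snow is brighter than the background. The function name `initial_snow_map` is hypothetical.

```python
import numpy as np

def initial_snow_map(target, warped_prev, warped_next):
    """Pick, per pixel, the warped value closest to the target and subtract it."""
    target = target.astype(np.float64)
    warped_prev = warped_prev.astype(np.float64)
    warped_next = warped_next.astype(np.float64)
    # Binary label map: 0 -> previous-frame warp is closer, 1 -> next-frame warp is closer.
    labels = (np.abs(target - warped_next) < np.abs(target - warped_prev)).astype(np.uint8)
    warped = np.where(labels == 1, warped_next, warped_prev)
    # Assumption: snow is brighter than the background, so only positive differences are kept.
    snow = np.clip(target - warped, 0.0, None)
    return labels, snow
```

A full implementation would replace the per-pixel choice with a minimization of the label cost function, for example by graph cuts, so that neighboring labels stay consistent.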

1.4) the initial snowflake detection image obtained by the optical flow field estimation method also has a lot of information of background noise and information of other moving objects. To separate the snowflake components from it, an initial snowflake detection map is first represented using sparse representation techniques.

First, assume that the initial snowflake detection map $S_0$ has $h \times w$ pixels. A block of size $r$ is selected around each pixel, and a two-dimensional matrix $\Psi$ of size $r \times (h \times w)$ is constructed, in which each column represents one block. Then, through sparse representation, an over-complete dictionary $Q$ consisting of $e$ basis vectors of dimension $r$ is constructed; $Q$ is a two-dimensional matrix of size $r \times e$. By representing each block as a linear combination of the basis vectors in the dictionary $Q$, the two-dimensional matrix $\Psi$ can be reconstructed:

$$\|\Psi - QA\| \qquad (10)$$

where $A$ is a coefficient matrix of size $e \times (h \times w)$. Under the assumption that $A$ is sparse, the optimal coefficient matrix $A^*$ is found by solving an optimization problem in which $\rho$ is a parameter. In the embodiment of the present invention $\rho$ is set to 0.15; the value may be set as required in practical applications and is not limited in the embodiment of the present invention.
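The optimization problem itself is not reproduced in the text. A minimal sketch follows, assuming the usual ℓ1-regularized least-squares objective $\tfrac{1}{2}\|\Psi - QA\|_F^2 + \rho\|A\|_1$ and a fixed dictionary $Q$ (learning the dictionary is not covered). It uses the iterative shrinkage-thresholding algorithm (ISTA), which is a swapped-in generic solver and not necessarily the one used by the invention; the function names are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise shrinkage operator used by ISTA."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(psi, Q, rho=0.15, n_iter=200):
    """ISTA for min_A 0.5*||psi - Q A||_F^2 + rho*||A||_1 with a fixed dictionary Q."""
    # Step size 1/L, where L is the squared spectral norm of Q.
    step = 1.0 / (np.linalg.norm(Q, 2) ** 2)
    A = np.zeros((Q.shape[1], psi.shape[1]))
    for _ in range(n_iter):
        grad = Q.T @ (Q @ A - psi)          # gradient of the quadratic term
        A = soft_threshold(A - step * grad, step * rho)
    return A

def objective(psi, Q, A, rho=0.15):
    """Value of the assumed objective, useful for checking convergence."""
    return 0.5 * np.sum((psi - Q @ A) ** 2) + rho * np.sum(np.abs(A))
```

Here `psi` has one column per block and `Q` one column per basis vector, matching the sizes $r \times (h \times w)$ and $r \times e$ given in the text.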

1.5) After the initial snowflake detection map has been sparsely represented, the basis vectors in the dictionary $Q$ are classified into snowflake components and remaining components. First, a kernel regression algorithm is used to analyze the structure of each block in the image, and the best-matching kernel of the intensity distribution in each block is found through singular value decomposition. The shape and direction of each block are then analyzed through singular value decomposition, and the falling angle of the snowflake is calculated. Finally, a support vector machine classifier separates out the snowflake vectors. The feature vector of the angle component of the falling snowflake helps the support vector machine separate snowflake vectors from noise vectors.

To give the support vector machine a good classification performance, 3072 effective basis vectors extracted from synthesized snowing images are used as positive samples and 3072 snow-free noise vectors are used as negative samples for training. Training only needs to be done offline once; the trained classifier can then be called directly whenever it is needed. After classification with the support vector machine, most of the noise vectors in the dictionary $Q$ are replaced by zero vectors, which gives the classified new dictionary. The product of the new dictionary and the coefficient matrix $A^*$ gives a new matrix. The column vectors of this new matrix are restored to blocks, the blocks are overlapped and arranged according to the positions of their central pixels in the initial snowflake detection map, and the value of each pixel is set to the average of the overlapping pixels, which gives the new snowflake detection map.

Then, by setting the threshold $\varphi = 3$, the new snowflake detection map is converted into a binary sparse snowflake detection map $U$:

1.6) The new snowflake detection map obtained at this point still contains dense snowflakes, background noise and some moving-object information. To remove this useless information, a dilation operation is applied to the new snowflake detection map, from which the final sparse snowflake detection map is obtained.

After dilation, sparse snowflakes, dense snowflakes and background noise can be clearly distinguished in the map. The embodiment of the invention removes unnecessary information from the detection map by setting area thresholds on connected domains, and uses an adaptive threshold so that snowflakes can be judged in both high-resolution and low-resolution videos: every pixel of a connected domain is set to 1 when the area of that domain is larger than $v_{min}$ and smaller than $v_{max}$, and to 0 otherwise.

Here $C^{\alpha}$ ranges over the connected domains of the dilated snowflake detection map, and the rule is applied to each pixel in the connected domain $C^{\alpha}$. $v_{max}$ is the maximum-area threshold of snowflake pixels and equals 0.25% of the total pixels of the image; $v_{min}$ is the minimum-area threshold of a snowflake connected domain and equals 0.004% of the total pixels of the image.
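The adaptive area threshold can be sketched in plain NumPy as follows. The example labels connected domains with a breadth-first search and keeps only those whose area lies between 0.004% and 0.25% of the frame. The function name `filter_by_area` and the choice of 4-connectivity are assumptions; the text does not state which connectivity is used.

```python
import numpy as np
from collections import deque

def filter_by_area(mask, min_frac=0.00004, max_frac=0.0025):
    """Keep connected regions whose area lies strictly between the two area thresholds."""
    h, w = mask.shape
    v_min, v_max = min_frac * h * w, max_frac * h * w   # thresholds scale with resolution
    seen = np.zeros((h, w), dtype=bool)
    out = np.zeros((h, w), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            if not mask[i, j] or seen[i, j]:
                continue
            # Breadth-first search collects one 4-connected region (connectivity is an assumption).
            region, queue = [], deque([(i, j)])
            seen[i, j] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if v_min < len(region) < v_max:
                ys, xs = zip(*region)
                out[list(ys), list(xs)] = 1
    return out
```

Because both thresholds are fractions of the frame size, the same call works unchanged on high-resolution and low-resolution videos, which is the point of making the threshold adaptive.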

Step 202: restoring a clear background image by using the block-based Gaussian mixture model;

wherein the step 202 comprises:

2.1) Background modeling is performed on the input snowing video. Each frame of the video is linearly correlated with the other frames, so the background of the video can be regarded as a low-rank matrix, and background modeling can be regarded as a low-rank matrix factorization problem:

$$B = UV^{T} \qquad (15)$$

where $T$ denotes the matrix transpose. The following constraint is added to the background $B$:

rank(B)≤q (16)

where q is a constant used to constrain the complexity of the background modeling, rank represents the rank of the matrix.
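To make the rank constraint concrete, the sketch below computes the best rank-$q$ approximation $B = UV^{T}$ of a video matrix with a truncated singular value decomposition. This is a plain SVD baseline swapped in for illustration, not the block-based Gaussian mixture model of the invention; the function name `low_rank_background` and the layout of one column per frame are assumptions.

```python
import numpy as np

def low_rank_background(frames, q=1):
    """Best rank-q approximation B = U V^T of the video matrix (one column per frame)."""
    n, h, w = frames.shape
    L = frames.reshape(n, h * w).T.astype(np.float64)   # pixels x frames
    left, s, right_t = np.linalg.svd(L, full_matrices=False)
    U = left[:, :q] * s[:q]          # (h*w) x q, singular values folded into U
    V = right_t[:q, :].T             # n x q
    B = U @ V.T                      # rank(B) <= q by construction
    return B.T.reshape(n, h, w), U, V
```

For a perfectly static scene the video matrix has rank 1, so `q=1` reproduces it exactly; snow and moving objects show up as the residual between the input and `B`.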

2.2) To solve the above background modeling problem, the background layer is modeled using a block-based Gaussian mixture model (GPMM). A matrix $f$ is defined to represent the entire video, where $p$ denotes the block size, $m_p$ the total number of blocks in the video, and $n$ the number of frames in the video. $f$ is a $p^2 \times m_p$ matrix of pixels that makes up the entire video. Each block captures local video information, and $f(B)_m$ denotes the $m$-th column of the matrix $f$.

After the video is decomposed into this matrix, the block-based Gaussian mixture model is defined with the following quantities: $K$ is the number of Gaussian components in the mixture; $\pi_k > 0$ are the mixing coefficients; and $G(\cdot \mid \mu, \Sigma_k)$ is a Gaussian distribution with mean $\mu$ and covariance matrix $\Sigma_k$.

Finally, background layer modeling is defined as fitting a block-based Gaussian mixture model (GPMM) with parameters $\Theta = \{U, V, \pi, \Sigma\}$.

A clear background image can be obtained by the above process.

Step 203: detecting a motion foreground in the video;

3.1) The motion foreground in the video is detected using a Markov random field, and the detection result is represented by a binary tensor $D$:

$$D_{i,j,z} = \begin{cases} 1, & \text{if the pixel } (i,j) \text{ in frame } z \text{ is the moving foreground} \\ 0, & \text{otherwise} \end{cases}$$

where $i$ and $j$ are the horizontal and vertical coordinates of a pixel in a single-frame image, $z$ is the index of the current video frame, and $D_{i,j,z}$ is the value at coordinates $(i, j)$ in the $z$-th frame of the video; $D_{i,j,z} = 1$ indicates that the $(i, j)$-th pixel in the detected $z$-th frame belongs to a moving object.

3.2) adding an adaptive threshold value to remove the dense snowflakes with smaller areas in the binary image:

wherein v is_minIs the threshold value of the minimum area of the connected domain of the snowflake, v_minThe value of (a) is 0.004% of the total pixels of the image;

the threshold is applied to each pixel of a connected domain C^α in the snowflake detection map. Finally, the resulting moving foreground is pasted back onto the static background to obtain a new input video. After moving-foreground detection, the new input video L₁ can be expressed as:

L₁ = D ⊙ L + D̄ ⊙ B

where L denotes the original input video, B denotes the video background, D̄ denotes the inverse of the binary moving-foreground detection map D and satisfies D + D̄ = 1, and the operator ⊙ denotes multiplication of corresponding pixels.
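Steps 3.1) and 3.2) end with a cleaned binary map and a recomposed frame; a minimal sketch of those two operations follows. It assumes 4-connectivity, a strict "area smaller than v_min" test, and a composition that takes foreground pixels from the original frame and all other pixels from the background. These are readings of the description rather than details confirmed by the source, and the function names are hypothetical.

```python
from collections import deque
import numpy as np

def remove_small_components(mask: np.ndarray, ratio: float = 0.004 / 100) -> np.ndarray:
    """Clear 4-connected foreground regions whose area is below v_min."""
    h, w = mask.shape
    v_min = ratio * h * w  # 0.004% of the pixels in the frame
    out = mask.astype(bool).copy()
    seen = np.zeros((h, w), dtype=bool)
    for si in range(h):
        for sj in range(w):
            if not out[si, sj] or seen[si, sj]:
                continue
            # Breadth-first search collects one connected domain.
            region, queue = [], deque([(si, sj)])
            seen[si, sj] = True
            while queue:
                i, j = queue.popleft()
                region.append((i, j))
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if 0 <= ni < h and 0 <= nj < w and out[ni, nj] and not seen[ni, nj]:
                        seen[ni, nj] = True
                        queue.append((ni, nj))
            # Assumption: regions strictly smaller than v_min count as snow.
            if len(region) < v_min:
                for i, j in region:
                    out[i, j] = False
    return out

def composite(frame: np.ndarray, background: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Foreground pixels come from the frame, all others from the background."""
    d = mask.astype(frame.dtype)
    return d * frame + (1 - d) * background

frame = np.full((200, 250), 9.0)
background = np.zeros((200, 250))
mask = np.zeros((200, 250), dtype=bool)
mask[10, 10] = True          # isolated pixel, area 1
mask[50:53, 60:63] = True    # 3 x 3 region, area 9
clean = remove_small_components(mask)
new_frame = composite(frame, background, clean)
```

For a 200 × 250 frame, v_min is about 2 pixels, so the isolated pixel is discarded as snow while the 3 × 3 region survives as moving foreground.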

Step 204: removing the sparse snowflakes in the new input video according to the final sparse snowflake detection map.

Step 204 specifically comprises:

4.1) removing the sparse snowflakes in the new input video according to the final sparse snowflake detection map and the information in adjacent frames, to obtain the final snow-removed video.

First, the target frame to be desnowed in the new input video is divided into disjoint square blocks; so that videos of different resolutions can be processed, the area of each square block is set to 1.6% of the number of pixels in a single video frame. The snow-covered pixels in the target frame are then filled in using information from the preceding and following frames. For each block of the target frame J_z, the l most similar blocks are found in each of the six surrounding frames J_{z−3}, J_{z−2}, J_{z−1}, J_{z+1}, J_{z+2}, J_{z+3}. The similarity between square blocks is measured by the mean absolute error over their snow-free pixels. The square block γ and its 6l similar blocks from the six frames are then represented as column vectors, forming a matrix:

＝[γ,γ₁,γ₂,…,γ_6l] (23)

where γ is the selected block in the target frame, and γ₁, γ₂, …, γ_{6l} are the blocks similar to γ found in the six preceding and following frames.

Likewise, a snowflake mask matrix Θ holding the corresponding binary values is defined:

Θ＝[o,o₁,o₂,…,o_6l] (24)

where each column of Θ consists of the binary snowflake detection mask values of the corresponding square block in the matrix above.
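The block matching and the two matrices of equations (23) and (24) can be sketched as follows. The exhaustive search, the `stride` parameter, the choice to compare only pixels that are snow-free in both blocks, and all function names are assumptions made for illustration; the source specifies only the 1.6% block area, the mean-absolute-error criterion, and the l matches per neighbouring frame.

```python
import numpy as np

def block_side(h: int, w: int, ratio: float = 0.016) -> int:
    """Side of a square block whose area is about 1.6% of the frame."""
    return max(1, int(round(np.sqrt(ratio * h * w))))

def masked_mae(a, b, snow_a, snow_b) -> float:
    """Mean absolute error over pixels that are snow-free in both blocks."""
    valid = ~(snow_a.astype(bool) | snow_b.astype(bool))
    if not valid.any():
        return float("inf")
    return float(np.abs(a[valid] - b[valid]).mean())

def most_similar_blocks(target, target_snow, frame, frame_snow, l, stride=1):
    """Return the l blocks of `frame` closest to `target`, with their masks."""
    s = target.shape[0]
    h, w = frame.shape
    scored = []
    for i in range(0, h - s + 1, stride):
        for j in range(0, w - s + 1, stride):
            cand = frame[i:i + s, j:j + s]
            cand_snow = frame_snow[i:i + s, j:j + s]
            scored.append((masked_mae(target, cand, target_snow, cand_snow), i, j))
    scored.sort(key=lambda t: t[0])
    return [(frame[i:i + s, j:j + s], frame_snow[i:i + s, j:j + s])
            for _, i, j in scored[:l]]

def build_matrices(target, target_snow, neighbours, l):
    """Stack the target block and its matches as columns, plus their masks."""
    blocks, masks = [target.ravel()], [target_snow.ravel()]
    for frame, frame_snow in neighbours:  # the six adjacent frames
        for blk, msk in most_similar_blocks(target, target_snow, frame, frame_snow, l):
            blocks.append(blk.ravel())
            masks.append(msk.ravel())
    return np.column_stack(blocks), np.column_stack(masks).astype(int)

rng = np.random.default_rng(0)
frame = rng.random((12, 12))
no_snow = np.zeros((12, 12), dtype=bool)
target = frame[4:8, 4:8]
M, T = build_matrices(target, no_snow[4:8, 4:8], [(frame, no_snow)] * 6, l=2)
print(M.shape, T.shape)  # (16, 13) (16, 13)
```

With six neighbouring frames and l = 2, the block matrix has 1 + 6l = 13 columns, and the mask matrix has the same shape.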

4.2) A filling matrix X is recovered from this matrix by a low-rank completion technique; X is obtained by minimizing the nuclear norm ‖X‖_* subject to the following constraints:

where the complement of Θ is the matrix whose elements, when added to the corresponding elements of Θ, all equal 1, and the product sign denotes the element-wise product of two matrices.

Equation (25) requires that snow-free pixels in the matrix be preserved. Equation (26) reflects that a snow-free pixel is darker than a snow-covered one. To solve this constrained optimization problem, an expectation-maximization (EM) algorithm is run iteratively. After the iterations are completed, the filled matrix is obtained. The snowflake elements of each block in each frame are then replaced with the elements at the corresponding positions in the filled matrix, yielding the final snow-removed video.
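The filling step can be approximated with a short sketch. Note that it swaps in iterative singular-value shrinkage with projection onto the constraints, not the EM iteration named in the text. It also assumes the two constraints mean "snow-free entries stay as observed" and "a filled value may not exceed the observed snowy value"; the shrinkage amount `tau`, the iteration count, and the function name are hypothetical.

```python
import numpy as np

def complete_low_rank(m: np.ndarray, snow: np.ndarray,
                      tau: float = 1.0, iters: int = 100) -> np.ndarray:
    """Fill snow-marked entries of `m` with a low-rank estimate.

    m:    (d, c) matrix whose columns are similar blocks.
    snow: (d, c) binary mask, 1 where a pixel is flagged as snow.
    """
    snow = snow.astype(bool)
    m = m.astype(float)
    x = m.copy()
    for _ in range(iters):
        # Shrinking singular values is the proximal step for the nuclear norm.
        u, s, vt = np.linalg.svd(x, full_matrices=False)
        x = (u * np.maximum(s - tau, 0.0)) @ vt
        # Snow-free entries are kept exactly as observed.
        x[~snow] = m[~snow]
        # Assumed reading: a filled value is not brighter than the snowy one.
        x[snow] = np.minimum(x[snow], m[snow])
    return x

a = np.linspace(1.0, 2.0, 8)
b = np.linspace(1.0, 2.0, 10)
base = np.outer(a, b)                      # rank-1 "clean" block matrix
snow = np.zeros(base.shape, dtype=int)
snow[[0, 2, 4, 6, 7], [1, 3, 5, 7, 9]] = 1
m = base + 3.0 * snow                      # snow makes pixels brighter
x = complete_low_rank(m, snow)
```

Because the shrinkage biases the filled values slightly downward, a practical implementation would tune `tau` to the pixel scale or follow the EM procedure described above.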

The feasibility of the video image snow removal method provided by an embodiment of the present invention is verified below using videos of snow scenes captured in real-life settings, as described in detail in the following:

As shown in fig. 3 and fig. 5, the single-frame images of the original video contain many snowflakes. These snowflakes cover useful information in the image, hinder recognition of the image content, and seriously degrade image quality.

As shown in fig. 4 and fig. 6, the method provided by the present invention effectively removes the snow in the video images and recovers the pixel information occluded by snowflakes in the video.

To verify the effectiveness and reliability of the method of the invention, its snow removal results were compared with those of the methods of Qian et al., Kim et al., Wei et al., and Li et al. As shown in fig. 7 and fig. 8, the labeled parts of the figures show that the method of Qian can remove some small snowflakes but cannot handle large sparse snowflakes, and it introduces a certain degree of blurring during snow removal. The algorithm of Kim can handle larger snowflakes at close range, but some snowflake residue remains in the image, and the algorithm cannot handle dense snowflakes in the distance. The algorithm of Wei effectively removes distant dense snowflakes but does not remove sparse snowflakes cleanly, and it causes a certain degree of color change. The algorithm of Li can remove dense snowflakes in the video but is inferior to the previous three algorithms at removing sparse snowflakes. Compared with these four algorithms, the method provided by the invention achieves a good snow removal effect, removes both sparse and dense snowflakes, and does not change the color of moving objects.

Based on the same inventive concept, as an implementation of the above method, referring to fig. 9, an embodiment of the present invention further provides a video image snow removing device, which is described in detail in the following description:

a first acquisition module 1, configured to apply a first adaptive threshold judgment to the sparse snowflakes processed in sequence by the optical flow field estimation algorithm and the support vector machine, to obtain a final sparse snowflake detection map;

a second acquisition module 2, configured to perform background modeling on the input video based on the block-based Gaussian mixture model, remove dense snowflakes, and obtain a low-rank background;

a generating module 3, configured to detect the moving foreground in the input video using a Markov random field to obtain a binary image, remove dense snowflakes in the binary image using a second adaptive threshold, and paste the processed binary image back onto the low-rank background to generate a new input video;

a similarity matching module 4, configured to perform similarity matching between the final sparse snowflake detection map and the corresponding frames of the new input video, and to remove the sparse snowflakes.

For descriptions of the optical flow field estimation algorithm, the support vector machine, the first adaptive threshold, the removal of dense snowflakes from the binary image using the second adaptive threshold, and the similarity matching, refer to the method embodiment above; they are not repeated in this embodiment of the present invention.
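The way the four modules hand data to one another can be sketched as a small pipeline object. The class name, field names, and the placeholder stages in the usage example are hypothetical and stand in for the actual algorithms; only the order of the stages and their inputs follow the module descriptions.

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

Array = np.ndarray

@dataclass
class SnowRemovalPipeline:
    detect_sparse_snow: Callable[[Array], Array]       # module 1
    model_background: Callable[[Array], Array]         # module 2
    rebuild_video: Callable[[Array, Array], Array]     # module 3
    fill_sparse_snow: Callable[[Array, Array], Array]  # module 4

    def run(self, video: Array) -> Array:
        snow_map = self.detect_sparse_snow(video)           # sparse snow detection map
        background = self.model_background(video)           # low-rank background
        new_video = self.rebuild_video(video, background)   # new input video
        return self.fill_sparse_snow(new_video, snow_map)   # snow-removed video

# Placeholder stages for illustration only, not the patented algorithms.
pipeline = SnowRemovalPipeline(
    detect_sparse_snow=lambda v: v > 0.9,
    model_background=lambda v: np.broadcast_to(np.median(v, axis=0), v.shape),
    rebuild_video=lambda v, b: v,
    fill_sparse_snow=lambda v, m: np.where(m, np.median(v, axis=0), v),
)
video = np.zeros((5, 4, 4))
video[2, 1, 1] = 1.0  # one bright "snow" pixel in the middle frame
result = pipeline.run(video)
```

Keeping each stage as an injected callable mirrors the statement that the modules may run on any device with computing capability.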

Referring to fig. 10, in one implementation, the similarity matching module 4 includes:

a first acquisition unit 41, configured to represent the similar blocks of an image block in the corresponding frames as column vectors, thereby forming a matrix, and to acquire the binary snowflake mask matrix corresponding to that matrix;

a second acquisition unit 42, configured to obtain a filling matrix from the matrix by using a low-rank completion technique, where the filling matrix is the solution of a nuclear norm minimization problem subject to constraint conditions;

a removal unit 43, configured to remove the sparse snowflakes.

It should be noted that the device description in the above embodiments corresponds to the description of the method embodiments, and the embodiments of the present invention are not described herein again.

The modules and units above may be executed by any device with computing capability, such as a computer, a single-chip microcomputer, or a microcontroller. The embodiments of the present invention do not limit the executing device, which is selected according to the requirements of the practical application.

Based on the same inventive concept, an embodiment of the present invention further provides a video image snow removing apparatus, referring to fig. 11, including: a processor 5 and a memory 6, the memory 6 having stored therein program instructions, the processor 5 calling upon the program instructions stored in the memory 6 to cause the apparatus to perform the following method steps in an embodiment:

In one implementation, when the processor 5 performs similarity matching between the final sparse snowflake detection map and the corresponding frames of the new input video and removes the sparse snowflakes, it may specifically perform the following operations:

acquiring a filling matrix from the matrix by using a low-rank completion technique, where the filling matrix is the solution of a nuclear norm minimization problem subject to constraint conditions; and removing the sparse snowflakes.

It should be noted that the device description in the above embodiments corresponds to the method description in the embodiments, and the embodiments of the present invention are not described herein again.

The processor 5 and the memory 6 may be provided by any device with computing capability, such as a computer, a single-chip microcomputer, or a microcontroller. The embodiments of the present invention do not limit the executing device, which is selected according to the requirements of the practical application.

The memory 6 and the processor 5 transmit data signals through the bus 7, which is not described in detail in the embodiment of the present invention.

Based on the same inventive concept, an embodiment of the present invention further provides a computer-readable storage medium, where the storage medium includes a stored program, and when the program runs, the apparatus on which the storage medium is located is controlled to execute the method steps in the foregoing embodiments.

The computer readable storage medium includes, but is not limited to, flash memory, hard disk, solid state disk, and the like.

It should be noted that the descriptions of the readable storage medium in the above embodiments correspond to the descriptions of the method in the embodiments, and the descriptions of the embodiments of the present invention are not repeated here.

The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the invention are produced in whole or in part.

The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium or a semiconductor medium, etc.

In the embodiment of the present invention, except for the specific description of the model of each device, the model of other devices is not limited, as long as the device can perform the above functions.

Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-described embodiments of the present invention are merely provided for description and do not represent the merits of the embodiments.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A method for removing snow from a video image, the method comprising:

2. A method as claimed in claim 1, wherein said detecting with the first adaptive threshold judgment is specifically:

3. The method for removing snow from a video image according to claim 1, wherein the removing of the dense snow in the binary image by using the second adaptive threshold and the attaching of the processed binary image back to the low-rank background to generate the new input video are specifically:

4. The video image snow removal method according to claim 1, wherein the step of performing similarity matching on the final sparse snow detection image and the corresponding frame picture of the new input video specifically comprises the following steps:

5. A method for removing snow from a video image according to claim 4, wherein said constraints are:

6. A video image snow removal apparatus, the apparatus comprising:

7. A video image snow removal apparatus as claimed in claim 6, wherein said similarity matching module comprises:

and the removing unit is used for removing the sparse snowflakes.

8. A video image snow removal apparatus, the apparatus comprising: a processor and a memory, the memory having stored therein program instructions, the processor calling upon the program instructions stored in the memory to cause the apparatus to perform the method steps of any of claims 1-5.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method steps of any of claims 1-5.