CN114926348B - Device and method for removing low-illumination video noise - Google Patents
Device and method for removing low-illumination video noise
- Publication number: CN114926348B
- Application number: CN202111583678.3A
- Authority
- CN
- China
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention provides a device and a method for removing low-illumination video noise based on broad learning (also rendered "width learning") and generative adversarial network (GAN) technology. The method comprises the following steps: acquiring a low-illumination picture to be denoised; inputting the obtained low-illumination picture as input data into a trained broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and inputting the obtained feature vectors into the generator network of a trained GAN to generate a denoised picture, wherein the GAN is obtained by alternately training the generator network and a discriminator network.
Description
Technical Field
The present invention relates to the field of computer vision, and more particularly to an apparatus and method for removing low-illumination video noise based on broad learning (also rendered "width learning") and generative adversarial network (GAN) technology.
Background
With the development of 5G and video technologies, camera-based home-surveillance products have grown rapidly, and the storage footprint of video files keeps increasing. Taking the China Telecom Tianyi home-surveillance product as an example, the newly added video files require roughly 25 PB of storage space every day. Most directly, such mass storage brings high hardware procurement cost; in addition, it brings high management cost for capacity expansion, operation and maintenance, disaster recovery, and the like. Compressing video files as much as possible is therefore a technical challenge.
In particular, in a low-illumination environment the exposure time of the photosensitive element increases and its temperature rises; long exposure produces a large amount of white noise and dark current, which in turn cause the output of many random noise points. These noise points present three problems for production operations:
1. imaging quality is low and user experience is poor: the video may be contaminated by one or more of Gaussian noise, salt-and-pepper noise, Rayleigh noise, exponential noise, and the like;
2. noise introduces dense high-frequency signals into the video; if these are fed directly into the encoder without processing, the required storage grows. For example, under the H.265 standard the stream of a single-channel camera compresses to about 700 kbps at normal brightness, but in a low-illumination environment such as a rainy night the storage required can increase by 30%-50%; and
3. the recognition success rate is low when such images are used for AI recognition (such as face recognition and license-plate recognition), which poses a serious hidden danger to intelligent security.
Conventional codec techniques use low-pass filtering and median filtering to handle noise. However, low-pass filtering blurs the picture while removing noise. Median filtering is a nonlinear method that protects sharp image edges while replacing the values of contaminated points with suitable neighboring values, so its output is not blurred like that of low-pass filtering; but by its nature the median-filtering algorithm works well on salt-and-pepper noise and poorly on other noise such as Gaussian noise. With the development of artificial intelligence in recent years, machine learning and deep learning algorithms have been applied to image denoising with notable results, but training an AI model depends on labeling and learning over a large sample set, the inference stage relies on hardware compute power, and performing both ISP processing and AI processing on the end side increases hardware cost.
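The edge-preserving behavior of median filtering described above can be illustrated with a minimal sketch. This is not the codec's implementation; it is a plain sliding-window median over a small synthetic patch carrying one "salt" (255) and one "pepper" (0) impulse:

```python
import numpy as np

def median_filter(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Replace each pixel with the median of its k x k neighborhood (edge-padded)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

# A flat gray patch hit by one salt and one pepper impulse:
img = np.full((5, 5), 100, dtype=np.uint8)
img[1, 1] = 255   # salt
img[3, 3] = 0     # pepper
```

Running `median_filter(img)` removes both impulses and returns the flat value 100 everywhere, since every 3x3 window contains at most two outliers; Gaussian noise, in contrast, perturbs every pixel and survives the median.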
At present, one approach combines median filtering with a nonlinear mapping to suppress image noise, so that the noise of a low-illumination image is suppressed while the visibility of the original image is maintained, improving the quality of night-time monitoring pictures. Another proposed approach enhances images using high-quality similar images of similar scenes, but it matches directly by histogram matching, so its accuracy is low and the histogram is strongly affected by illumination. Compared with these methods, deep-learning-based methods achieve better results, but deep learning requires training millions of parameters such as weights and biases, so model training is slow, time-consuming, and computationally expensive. Accordingly, to improve image-denoising performance and reduce the dependence on compute power, an improved method of removing low-illumination picture noise is desirable.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Aiming at the defects of the prior art, and in light of the actual application scenario of the camera service, the present invention differs from the blind-denoising schemes prevalent in the industry: it introduces well-illuminated daytime video frames as reference frames and uses broad learning and generative adversarial network (GAN) technology to denoise low-illumination images purposefully and quantitatively. For example, the method can be used in ISP processing before H.264/H.265 encoding on a terminal to denoise I-frames and P-frames, saving storage cost and improving user experience; it can also serve as a preprocessing module for deep-learning image recognition, improving recognition accuracy.
According to one aspect of the present invention, there is provided a method for removing low-illumination picture noise, the method comprising: acquiring a low-illumination picture to be denoised; inputting the obtained low-illumination picture as input data into a trained broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and inputting the obtained feature vectors into the generator network of a trained generative adversarial network to generate a denoised picture, wherein the generative adversarial network is obtained by alternately training the generator network and a discriminator network.
According to one embodiment of the invention, the method further comprises: extracting features of the obtained low-illumination picture to obtain a corresponding low-illumination feature vector; and when the cosine similarity between the low-illumination feature vector and one or more high-illumination reference feature vectors in the high-illumination reference feature library is greater than a predetermined threshold, returning, as the input data fed to the broad learning network, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the greatest cosine similarity to the low-illumination feature vector among the one or more high-illumination reference feature vectors.
According to a further embodiment of the invention, the high-illumination reference feature library is constructed by: periodically acquiring high-illumination reference pictures whose brightness is above a threshold; extracting features of each high-illumination reference picture to obtain a corresponding high-illumination reference feature vector; comparing the high-illumination reference feature vector with the feature vectors already in the library; adding the high-illumination reference feature vector to the library when its cosine similarity to every existing feature vector in the library is less than a predetermined threshold; converting the corresponding high-illumination reference picture to grayscale; and applying a Laplacian transform to the grayscale picture and storing the result in serialized form.
According to a further embodiment of the invention, training of the generative adversarial network is achieved by repeating the following steps, with noise-free clear pictures and the corresponding low-illumination pictures as the training data set: feeding the feature vectors that the broad learning network extracts from the low-illumination pictures in the training data set into the generator network of the generative adversarial network to generate denoised pictures; inputting the generated denoised pictures and the corresponding noise-free clear pictures into the discriminator network of the generative adversarial network, which learns to tell the real pictures apart; and optimizing the generator network and the discriminator network based on a loss calculation.
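The alternation described above (a discriminator step, then a generator step, each driven by a loss calculation) can be sketched on toy 1-D data. The linear "generator" (a learned shift `b`), the logistic "discriminator" `D(x) = sigmoid(v*x + c)`, and all hyperparameters below are illustrative assumptions, not the patent's networks:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

b = 0.0            # generator parameter: shift applied to input noise
v, c = 0.1, 0.0    # discriminator parameters
lr, batch = 0.05, 64

for step in range(500):
    real = rng.normal(1.0, 0.1, batch)        # "clear" samples centered at 1
    fake = rng.normal(0.0, 0.1, batch) + b    # generator output

    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    d_real, d_fake = sigmoid(v * real + c), sigmoid(v * fake + c)
    grad_v = np.mean((1 - d_real) * real) - np.mean(d_fake * fake)
    grad_c = np.mean(1 - d_real) - np.mean(d_fake)
    v, c = v + lr * grad_v, c + lr * grad_c

    # Generator step: ascend log D(fake), pulling fakes toward the real data.
    d_fake = sigmoid(v * fake + c)
    b += lr * np.mean((1 - d_fake) * v)
```

After alternating updates the generated samples should drift toward the real mean; in the patent's setting the generator is instead a picture-producing network and the "real" inputs are the noise-free clear pictures.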
According to another aspect of the present invention, there is provided a system for removing low-illumination picture noise, the system comprising: a feature extraction module configured to acquire a low-illumination picture to be denoised and extract features of it to obtain a corresponding low-illumination feature vector; a similar-picture comparison module configured to compare the low-illumination feature vector one by one with one or more high-illumination reference feature vectors in a high-illumination reference feature library, and, when the cosine similarity between the low-illumination feature vector and the one or more high-illumination reference feature vectors is greater than a predetermined threshold, determine that a similar picture exists and return, as input data, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the highest cosine similarity to the low-illumination feature vector, and otherwise determine that no similar picture exists and take the acquired low-illumination picture directly as the input data; a broad learning feature extraction module configured to input the input data into a trained broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and a generative-adversarial-network denoising module configured to input the resulting feature vectors into the generator network of a trained generative adversarial network to generate a denoised picture, wherein the generative adversarial network is obtained by alternately training the generator network and a discriminator network.
According to one embodiment of the invention, the high-illumination reference feature library is constructed by: periodically acquiring high-illumination reference pictures whose brightness is above a threshold; extracting features of each high-illumination reference picture to obtain a corresponding high-illumination reference feature vector; comparing the high-illumination reference feature vector with the feature vectors already in the library; adding the high-illumination reference feature vector to the library when its cosine similarity to every existing feature vector in the library is less than a predetermined threshold; converting the corresponding high-illumination reference picture to grayscale; and applying a Laplacian transform to the grayscale picture and storing the result in serialized form.
According to a further embodiment of the invention, feature extraction of the low-illumination picture and the high-illumination reference pictures is performed using the scale-invariant feature transform (SIFT) operator.
According to a further embodiment of the invention, training of the generative adversarial network is achieved by repeating the following steps, with noise-free clear pictures and the corresponding low-illumination pictures as the training data set: feeding the feature vectors that the broad learning network extracts from the low-illumination pictures in the training data set into the generator network of the generative adversarial network to generate denoised pictures; inputting the generated denoised pictures and the corresponding noise-free clear pictures into the discriminator network of the generative adversarial network, which learns to tell the real pictures apart; and optimizing the generator network and the discriminator network based on a loss calculation.
According to still another aspect of the present invention, there is provided a system for removing low-illumination picture noise, the system comprising: a memory storing a trained broad learning network, a trained generative adversarial network, and computer-executable instructions; and at least one processor, wherein the computer-executable instructions, when executed, cause the at least one processor to: acquire a low-illumination picture to be denoised; input the obtained low-illumination picture as input data into the broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and input the obtained feature vectors into the generator network of the generative adversarial network to generate a denoised picture, wherein the generative adversarial network is obtained by alternately training the generator network and a discriminator network.
According to one embodiment of the invention, the memory further stores a high-illumination reference feature library, and the computer-executable instructions, when executed, cause the at least one processor to further: extract features of the obtained low-illumination picture to obtain a corresponding low-illumination feature vector; and when the cosine similarity between the low-illumination feature vector and one or more high-illumination reference feature vectors in the high-illumination reference feature library is greater than a predetermined threshold, return, as the input data fed to the broad learning network, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the greatest cosine similarity to the low-illumination feature vector.
Compared with prior-art schemes, the method and system of the present invention for removing low-illumination picture noise have at least the following advantages:
(1) Good denoising effect: with broad learning and a generative adversarial network, a cleaner picture is obtained and the blurring caused by low-pass-filter denoising is reduced;
(2) Low compute requirements: compared with the large number of nonlinear computations required by deep learning, most computations in the broad-learning-based method are linear, so the compute requirement is low and the method can run on low-power devices such as cameras, video doorbells, and mobile phones; and
(3) Flexible use: the denoising method of the invention can be implemented either as a preprocessing stage for a neural network or as an ISP processing stage for an H.264/H.265 encoder.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
FIG. 1 illustrates an example architecture diagram of a system for removing low-light picture noise according to one embodiment of this disclosure;
FIG. 2 illustrates an example flow chart of a method for constructing a high-luminance reference feature library according to one embodiment of this disclosure;
FIG. 3 illustrates an example flow chart of a method for removing low-light picture noise according to one embodiment of this disclosure;
FIG. 4 illustrates an example block diagram of a broad learning network in accordance with one embodiment of the present invention;
FIG. 5 illustrates an example block diagram of a generative adversarial network according to one embodiment of the invention;
FIG. 6 illustrates an example block diagram of the discriminator network in a generative adversarial network according to one embodiment of the invention; and
FIG. 7 illustrates an example architecture diagram of a system for removing low-illumination picture noise according to one embodiment of this disclosure.
Detailed Description
The features of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings.
Fig. 1 illustrates an example architecture diagram of a system 100 for removing low-illumination picture noise according to one embodiment of this disclosure. As shown in fig. 1, the system 100 of the present invention includes at least: a feature extraction module 101, a similar-picture comparison module 102, a broad learning feature extraction module 103, and a generative-adversarial-network denoising module 104.
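How the four modules of system 100 hand data to one another can be sketched as follows. The function names, the stub signatures, and the pipeline wrapper are illustrative assumptions; only the module order and the similarity-threshold branch come from the text:

```python
import numpy as np

SIM_THRESHOLD = 0.85  # predetermined cosine-similarity threshold from the text

def denoise_pipeline(low_light_picture, reference_library,
                     extract_sift, cosine_sim, broad_learning_net, generator):
    # Module 101: feature extraction on the low-illumination picture.
    feat = extract_sift(low_light_picture)

    # Module 102: compare against the high-illumination reference library;
    # reference_library is a list of (reference_feature, reference_picture).
    best_pic, best_sim = None, -1.0
    for ref_feat, ref_pic in reference_library:
        s = cosine_sim(feat, ref_feat)
        if s > best_sim:
            best_pic, best_sim = ref_pic, s
    input_data = best_pic if best_sim > SIM_THRESHOLD else low_light_picture

    # Module 103: spatial feature extraction with the broad learning network.
    vec = broad_learning_net(input_data)

    # Module 104: the GAN generator network produces the denoised picture.
    return generator(vec)
```

With identity stubs for the networks, a reference picture is returned when a sufficiently similar feature exists, and the low-illumination picture passes through unchanged otherwise.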
The feature extraction module 101 may be configured to obtain a low-illumination picture to be denoised and perform feature extraction on it to obtain a corresponding low-illumination feature vector. In some cases, the SIFT operator may be used to extract a corresponding 128-dimensional feature vector from the obtained low-illumination picture; SIFT is a very stable local feature extraction operator that is invariant to rotation, scaling, brightness changes, and the like.
The similar-picture comparison module 102 may be configured to compare the low-illumination feature vector obtained via the feature extraction module 101 one by one with the high-illumination reference feature vectors in the high-illumination reference feature library. When the cosine similarity between the low-illumination feature vector and one or more high-illumination reference feature vectors is greater than a predetermined threshold (e.g., 0.85), the module determines that a similar picture exists and returns, for further image processing, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the greatest cosine similarity to the low-illumination feature vector; otherwise it determines that no similar picture exists. The construction of the high-illumination reference feature library is described in further detail below.
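The comparison performed by module 102 can be sketched as follows, where the 128-dimensional vectors stand in for the SIFT-derived feature vectors (the function names and the aggregation into a single vector per picture are assumptions for illustration):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two (nonzero) feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_similar(low_feat, reference_feats, threshold=0.85):
    """Return the index of the most similar reference vector, or None
    when no reference exceeds the predetermined threshold."""
    sims = [cosine_similarity(low_feat, r) for r in reference_feats]
    best = int(np.argmax(sims))
    return best if sims[best] > threshold else None
```

Cosine similarity depends only on vector direction, which is why it tolerates the global brightness gap between a low-illumination query and its daytime reference better than, say, Euclidean distance on raw histograms.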
The broad learning feature extraction module 103 may be configured to extract spatial features of the input data to obtain corresponding feature vectors using a trained broad learning network, which is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; the training process of the broad learning network is described in further detail below.
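The one-shot, pseudo-inverse-based training of the width-learning (broad learning) network can be sketched in a few lines of linear algebra. The random feature and enhancement mappings, the layer sizes, and the toy targets below are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

rng = np.random.default_rng(42)

X = rng.normal(size=(200, 16))          # 200 training inputs
Y = X @ rng.normal(size=(16, 4))        # given target values to fit

Wf = rng.normal(size=(16, 32))          # random feature mapping
Z = X @ Wf                              # feature nodes

We = rng.normal(size=(32, 24))          # enhancement mapping
H = np.tanh(Z @ We * 0.05)              # enhancement nodes (scaled to avoid tanh saturation)

A = np.hstack([Z, H])                   # input layer: [feature | enhancement] nodes
W = np.linalg.pinv(A) @ Y               # output weights via pseudo-inverse, no iterative training

Y_hat = A @ W                           # network output on the training inputs
```

Because `W` is obtained by a single pseudo-inverse rather than iterative backpropagation, the bulk of the computation is linear, which is the basis of the low-compute claim made for this method.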
The generative-adversarial-network denoising module 104 may be configured to input the features extracted via the broad learning network into the generator network of a trained generative adversarial network to generate a denoised picture; the generative adversarial network is a network with a denoising function obtained by alternately training the generator network and the discriminator network. Further, the near-original picture data denoised by the generator network may be input to an H.264/H.265 encoder for video encoding.
Those skilled in the art will appreciate that the system of the present invention and its various modules may be implemented in either hardware or software, and that the various modules may be combined or combined in any suitable manner.
FIG. 2 illustrates an example flow diagram of a method 200 for building a high-illumination reference feature library according to one embodiment of this disclosure. The method 200 starts in step 201 with periodically acquiring a high-illumination reference picture: when the picture's luminance is below a certain threshold it is discarded, and when its luminance is above the threshold it proceeds to the next processing step. As an example, a picture may be captured from the camera every hour between 9:00 and 15:00, discarded if its brightness is below the threshold, and passed to the next step otherwise.
In step 202, feature extraction is performed on the collected high-illumination reference picture to obtain a corresponding high-illumination reference feature vector. In a preferred embodiment, the SIFT operator may be employed to extract a 128-dimensional feature vector from the acquired high-illumination reference picture.
In step 203, the obtained high-illumination reference feature vector is compared with the feature vectors already in the high-illumination reference feature library. When the cosine similarity between the high-illumination reference feature vector and some existing feature vector in the library is greater than a predetermined threshold (e.g., 0.9), a similar reference picture already exists and the new vector is discarded; otherwise the high-illumination reference feature vector is stored in the library and processing proceeds to the next step.
In step 204, the high-illumination reference picture corresponding to the high-illumination reference feature vector is converted to grayscale, mainly to eliminate the influence of color on the picture edges.
In step 205, the grayscale picture is passed through a Laplacian transform, which extracts the edge contours of the picture, and the result is stored in serialized form.
Thus, by the method 200 shown in fig. 2, a high-illumination reference feature library can be constructed and continuously updated, so that the present invention can perform image enhancement by drawing on this library.
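Steps 204-205 can be sketched as follows, assuming an RGB picture stored as an H x W x 3 array. The BT.601 luma weights and the 4-neighbor Laplacian kernel are standard choices assumed here; the patent does not specify them:

```python
import numpy as np

def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Step 204: grayscale conversion, removing color's influence on edges."""
    return rgb @ np.array([0.299, 0.587, 0.114])   # ITU-R BT.601 luma weights

def laplacian(gray: np.ndarray) -> np.ndarray:
    """Step 205: 4-neighbor Laplacian; responds only where intensity changes,
    i.e. along the edge contours that the library serializes."""
    out = np.zeros_like(gray, dtype=float)
    out[1:-1, 1:-1] = (gray[:-2, 1:-1] + gray[2:, 1:-1] +
                       gray[1:-1, :-2] + gray[1:-1, 2:] -
                       4.0 * gray[1:-1, 1:-1])
    return out
```

On a flat region the Laplacian is zero everywhere, while any brightness step produces a nonzero response on both sides of it, which is what makes it useful for storing illumination-independent edge structure.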
Fig. 3 illustrates an example flow chart of a method 300 for removing low-illumination picture noise according to one embodiment of this disclosure. The method 300 starts in step 301: the feature extraction module 101 obtains a low-illumination picture to be denoised and performs SIFT feature extraction on it to obtain a corresponding low-illumination feature vector.
In step 302, the similar-picture comparison module 102 may compare the SIFT features extracted in step 301 one by one with the SIFT features in a high-illumination reference feature library (e.g., one constructed using the method 200 described with reference to fig. 2).
If, in step 303, the cosine similarity between the extracted SIFT feature and some SIFT feature in the high-illuminance reference feature library is greater than a predetermined threshold (e.g., 0.85), then in step 304 it is determined that a similar picture exists, and the grayscale-processed and Laplace-transformed high-illuminance reference picture corresponding to that SIFT feature is returned as input data. Preferably, if multiple SIFT features in the high-illuminance reference feature library have cosine similarity to the extracted SIFT feature greater than the predetermined threshold, it is determined that a similar picture exists, and in step 304 the grayscale-processed and Laplace-transformed high-illuminance reference picture corresponding to the SIFT feature with the greatest cosine similarity among them is returned as the input data.
Otherwise, in step 305, it is determined that there is no similar picture and the acquired low-light picture is directly taken as input data.
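The decision of steps 303-305 — return the best-matching preprocessed reference if any library entry clears the threshold, otherwise fall back to the low-illumination picture itself — can be sketched as follows. The 0.85 threshold follows the text; the list-of-pairs library layout is an assumption for illustration:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_input(low_feat, low_picture, library, threshold=0.85):
    """library: list of (feature_vector, preprocessed_reference_picture) pairs.

    Returns the grayscale/Laplacian-processed reference whose feature has
    the greatest cosine similarity above `threshold`; otherwise returns the
    low-illumination picture itself as the input data.
    """
    best_sim, best_pic = threshold, None
    for feat, ref_pic in library:
        sim = cosine(low_feat, feat)
        if sim > best_sim:
            best_sim, best_pic = sim, ref_pic
    return best_pic if best_pic is not None else low_picture
```
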
In step 306, the width learning feature extraction module may input the input data into a width learning network for spatial feature extraction to obtain a corresponding feature vector. The core of width learning is to compute the pseudo-inverse that maps the feature nodes and enhancement nodes to the target values: the width learning network first extracts picture features with feature nodes, then enhances those feature nodes through an enhancement mapping function to form corresponding enhancement nodes; the feature nodes and enhancement nodes together serve as the input layer of the width learning network, and the resulting pseudo-inverse matrix plays the role of the weights of a neural network. An example block diagram 400 of a width learning network is shown in fig. 4. The specific steps of a method for training the width learning network are as follows:
s1: generating characteristic nodes and establishing a mapping from input data to the characteristic nodes.
Let the training set be H_1 of dimension s×f, where s represents the number of samples and f represents the number of features. First, the training set is z-score standardized, so that each feature has zero mean and unit variance. Then, so that bias terms can be added directly by matrix operations after the feature nodes are generated, the standardized data is augmented: a column of ones is appended to the training set, converting it into H_1 of dimension s×(f+1). The feature nodes are then generated for each window as follows:
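The standardization and augmentation described above can be sketched in a few lines of numpy (a minimal illustration; function names are not from the patent):

```python
import numpy as np

def zscore(X):
    # Standardize each feature (column) to zero mean and unit variance.
    return (X - X.mean(axis=0)) / X.std(axis=0)

def augment(X):
    # Append a column of ones so a bias term can be applied by matrix product.
    s = X.shape[0]
    return np.hstack([X, np.ones((s, 1))])

def prepare(H):
    """Training set H of shape (s, f) -> augmented H_1 of shape (s, f + 1)."""
    return augment(zscore(H))
```
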
1) Generate a random weight matrix w_e of dimension (f+1)×N_1 that follows a Gaussian distribution, where N_1 represents the number of feature nodes per window;
2) Store w_e as w_e{i}, where i denotes the window index; the number of windows (iterations) is denoted N_2;
3) Compute A_1 = H_1 × w_e;
4) Normalize A_1;
5) To effectively reduce the linear correlation of the newly generated feature nodes, obtain a sparse matrix W such that H_1 × W = A_1 by computing a sparse representation of A_1, where the optimization problem in the sparse representation is solved using, for example, the lasso method. Thus, it is possible to obtain:

W = argmin_W ||H_1 × W − A_1||_2^2 + λ||W||_1
6) Generate the feature nodes of the window:

T_1 = normal(H_1 × W)

where normal denotes normalization, and the normalization parameters of each window's feature nodes are recorded as p_s(i). For each of the N_2 feature windows, N_1 feature nodes are generated, each of which is an s-dimensional vector. Thus, for the whole network, the feature node matrix y is a matrix of dimension s×(N_2×N_1).
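Steps 1)-6) above can be sketched end to end. For brevity this sketch solves the sparse-representation step with an ordinary least-squares pseudo-inverse instead of the lasso the text suggests, so the resulting W is a stand-in for the same H_1 × W ≈ A_1 fit rather than a truly sparse matrix:

```python
import numpy as np

def feature_nodes(H1, n1, n2, rng=None):
    """Generate N2 windows of N1 feature nodes each from augmented input H1.

    Returns (T, params): T has shape (s, n2 * n1); params stores each
    window's weights and normalization, playing the role of p_s(i).
    """
    rng = rng or np.random.default_rng(0)
    windows, params = [], []
    for i in range(n2):
        w_e = rng.normal(size=(H1.shape[1], n1))      # Gaussian random weights
        A1 = H1 @ w_e
        A1 = (A1 - A1.min()) / (A1.max() - A1.min() + 1e-12)  # normalize A1
        # Least-squares stand-in for the lasso sparse representation:
        W = np.linalg.pinv(H1) @ A1                   # H1 @ W ~= A1
        T1 = H1 @ W
        lo, hi = T1.min(), T1.max()
        T1 = (T1 - lo) / (hi - lo + 1e-12)            # per-window normalization
        windows.append(T1)
        params.append((w_e, W, lo, hi))
    return np.hstack(windows), params
```
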
S2: an enhanced node is generated.
Another feature of the width learning network is that the random feature nodes can be supplemented with enhancement nodes. The feature nodes generated by the above steps are all linear; the purpose of introducing enhancement nodes is to introduce nonlinearity into the network.
1) First, standardize and augment the feature node matrix y to obtain H_2. Unlike the feature nodes, the coefficient matrix of the enhancement nodes is not a plain random matrix but a random matrix subjected to orthogonal normalization. Assuming (N_2×N_1) > N_3, the coefficient matrix w_h of the enhancement nodes can be expressed as an orthonormalized random matrix of dimension (N_2×N_1)×N_3. The purpose is to map the feature nodes nonlinearly into a high-dimensional subspace so that the expressive capacity of the network becomes stronger, achieving the "enhancement";
2) Activate the enhancement nodes:

T_2 = tansig((H_2 × w_h) × s)
where s represents the scale of the enhancement nodes, one of the adjustable parameters of the network, and tansig is an activation function commonly used in BP neural networks that can maximally activate the features expressed by the enhancement nodes;
3) Finally generate the input T_3 of the network: compared with the feature nodes, the enhancement nodes require neither sparse representation nor window iteration. Although the orthogonalization also takes some computation time, the time required to add an enhancement node is typically less than that required to add a feature node. Thus, the final input to the network can be expressed as

T_3 = [T_1 | T_2]

and the feature dimension of each sample is (N_1×N_2)+N_3.
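The enhancement-node construction of S2 can be sketched as follows. Note that tansig(x) is mathematically identical to tanh(x), and the QR decomposition used here is one common way to orthonormalize a random matrix; the patent does not fix a specific orthonormalization method:

```python
import numpy as np

def enhancement_nodes(T_feat, n3, scale=1.0, rng=None):
    """Map the feature-node matrix nonlinearly into N3 enhancement nodes."""
    rng = rng or np.random.default_rng(0)
    s = T_feat.shape[0]
    # Standardize and augment the feature nodes to obtain H2.
    H2 = (T_feat - T_feat.mean(axis=0)) / (T_feat.std(axis=0) + 1e-12)
    H2 = np.hstack([H2, np.ones((s, 1))])
    # Orthonormalized random coefficient matrix w_h via QR decomposition.
    raw = rng.normal(size=(H2.shape[1], n3))
    w_h, _ = np.linalg.qr(raw)
    # tansig activation (equivalent to tanh), with adjustable scale.
    return np.tanh(H2 @ w_h * scale)
```
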
S3: the pseudo-inverse, i.e., the mapping of inputs to outputs in the breadth network, is found.
Let Y_x be the output value of the network, i.e., Y_x = T_3 × W. Then

W = T_3^+ × Y

where T_3^+ denotes the pseudo-inverse of T_3 and Y is the label of the training set. Thus the inputs and weights of the whole network are trained; the only parameters that need to be saved after training are W and p_s, so compared with deep learning the number of network parameters is small and the required computational effort is low.
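The closing step S3 — computing the output weights as the pseudo-inverse of the stacked input T_3 against the labels Y — is a single linear solve, which is what makes width-learning training cheap compared with gradient-based deep learning. A minimal sketch:

```python
import numpy as np

def train_output_weights(T3, Y):
    """W = T3^+ Y, the least-squares fit of T3 @ W to the labels Y."""
    return np.linalg.pinv(T3) @ Y

def predict(T3, W):
    return T3 @ W
```

When T_3 has full column rank and the labels are exactly representable, the pseudo-inverse recovers the generating weights exactly.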
Subsequently, in step 307, the generative adversarial network denoising module 104 may input the feature vector extracted in step 306 into the generation network of a trained generative adversarial network, which is obtained by alternately training the generation network and the discrimination network, to generate a denoised picture. An example structural diagram 500 of the generative adversarial network is shown in fig. 5. The specific steps of the method for training the generative adversarial network are as follows:
step S1: and taking the clear picture without noise and the corresponding low-illumination picture as a training data set.
Step S2: the low-light pictures in the training dataset are input into a trained breadth-learning network to extract corresponding spatial features and into a generating network to generate denoised pictures (false samples).
Step S3: and inputting the generated de-noised picture (false sample) and the corresponding clear picture (true sample) without noise in the training data set into a discrimination network for real picture discrimination, wherein the label of the generated de-noised picture is 0, and the label of the clear picture without noise is 1. The discrimination network is designed to discriminate whether an input picture is a true picture (true sample) or a generated picture (false sample), and the discrimination capability of the discrimination network can be improved by continuous iterative training. The discrimination network may have a different structure, in one example, the structure 600 of the discrimination network is shown in fig. 6, where the image feature may be well preserved by performing a downsampling operation on the feature by means of averaging pooling, and high-dimensional texture information may be extracted by performing convolution with a step size of 2 through a plurality of 3*3 convolution kernels. In addition, in the discrimination network shown in fig. 6, two residual modules are added in the discrimination network by referring to the residual thought in the Resnet network, the characteristics are extracted and fused, and the residual modules can accelerate the extraction of the characteristics and the convergence of training and obtain better characteristics.
In the example of fig. 6, the output of a residual block may be expressed as:

x_{l+1} = x_l + F(x_l, W_l)

where x_l is the input of the residual structure, F is the convolution branch of the residual network, and W_l is a convolution kernel parameter.
The stacked residual may be expressed as:

x_L = x_l + Σ_{i=l}^{L−1} F(x_i, W_i)

where Σ_{i=l}^{L−1} F(x_i, W_i) represents the convolution branches of multiple residual structures and x_L is the output of the stacked residual blocks.
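The two residual equations can be checked numerically: applying the blocks one at a time and summing the branch outputs onto the original input give the same result. Here the branch functions F_i are small random linear maps, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Convolution branches F(x_i, W_i), modeled here as random linear maps.
weights = [rng.normal(scale=0.1, size=(5, 5)) for _ in range(3)]
branches = [lambda x, W=W: x @ W for W in weights]

def stack_residual(x_l, branches):
    """Apply x_{i+1} = x_i + F(x_i, W_i) block by block.

    Returns (x_L, x_l + sum of branch outputs); the two are equal,
    matching the stacked-residual identity.
    """
    x, total = x_l, np.zeros_like(x_l)
    for F in branches:
        fx = F(x)
        total += fx        # running sum of the branch outputs
        x = x + fx
    return x, x_l + total
```
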
According to the back-propagation result, the gradient through the stacked residual structure is

∂loss/∂x_l = (∂loss/∂x_L) · (1 + ∂(Σ_{i=l}^{L−1} F(x_i, W_i))/∂x_l)

where the identity term 1 lets gradients propagate directly to shallow layers; the optimal residual network structure is output according to this back-propagation result.
step S4: and carrying out loss calculation on the generating network and the judging network, and continuously optimizing parameters of the generating network and the judging network according to calculation results, wherein in order to ensure that the judging network can correctly guide the training of the generating network, pre-training is needed, and the pre-training adopts a mean square error as a loss function for preliminary optimization. After training is completed, the generated network is a network with a denoising function.
The generated denoised picture can then be further processed as input data for an H.264/H.265 encoder, thereby reducing the required storage capacity and improving the user experience.
Fig. 7 illustrates an example architecture diagram of a system 700 for removing low-illumination picture noise according to one embodiment of this disclosure. The system 700 may be implemented on a mobile phone or in the cloud. As shown in fig. 7, the system 700 may include a memory 701 and at least one processor 702. The memory 701 may store a trained width learning network and a trained generative adversarial network. The memory 701 may include RAM, ROM, or a combination thereof. The memory 701 may store computer-executable instructions that, when executed by the at least one processor 702, cause the at least one processor to perform the various functions described herein, including: obtaining a low-illumination picture to be denoised; inputting the obtained low-illumination picture as input data into the trained width learning network for spatial feature extraction to obtain a corresponding feature vector; and inputting the resulting feature vector into the generation network of the trained generative adversarial network to generate the denoised picture. In some cases, memory 701 may include, among other things, a BIOS that may control basic hardware or software operations, such as interactions with peripheral components or devices. The processor 702 may include an intelligent hardware device (e.g., a general-purpose processor, DSP, CPU, microcontroller, ASIC, FPGA, programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof).
In a preferred embodiment, the memory 701 may also store a high-illuminance reference feature library, which may be constructed, for example, by the method shown in fig. 2. The computer-executable instructions stored in memory 701, when executed by at least one processor 702, further cause the at least one processor to perform additional functions including feature extraction of the acquired low-light pictures to obtain corresponding low-light feature vectors; and returning the high-illuminance reference picture corresponding to the high-illuminance reference feature vector with the maximum cosine similarity with the low-illuminance feature vector in the one or more high-illuminance reference feature vectors as input data to the width learning network when the cosine similarity between the low-illuminance feature vector and one or more high-illuminance reference feature vectors in the high-illuminance reference feature library is greater than a predetermined threshold.
The various illustrative blocks and modules described in connection with the disclosure herein may be implemented or performed with a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof, designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software for execution by a processor, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Other examples and implementations are within the scope of the disclosure and the appended claims. For example, due to the nature of software, the functions described herein may be implemented using software executed by a processor, hardware, firmware, hardwired or any combination thereof. Features that implement the functions may also be physically located in various places including being distributed such that parts of the functions are implemented at different physical locations.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the claimed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.
Claims (8)
1. A method for removing low-light picture noise, the method comprising:
acquiring a low-illumination picture to be denoised;
extracting features of the obtained low-illumination pictures to obtain corresponding low-illumination feature vectors;
when cosine similarity between the low-illumination characteristic vector and one or more high-illumination reference characteristic vectors in a high-illumination reference characteristic library is larger than a preset threshold value, returning a high-illumination reference picture corresponding to the high-illumination reference characteristic vector with the maximum cosine similarity with the low-illumination characteristic vector in the one or more high-illumination reference characteristic vectors as input data;
otherwise, directly taking the acquired low-illumination picture as input data;
inputting the input data into a trained breadth learning network for spatial feature extraction to obtain a corresponding feature vector, wherein the breadth learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes of the input data to a given target value; and
the resulting feature vector is input to a generating network of a trained generating countermeasure network to generate a denoised picture, wherein the generating countermeasure network is obtained by alternately training the generating network and a discriminating network.
2. The method of claim 1, wherein the high-luminance reference feature library is constructed by:
periodically acquiring a high-illumination reference picture, wherein the brightness of the high-illumination reference picture is higher than a threshold value;
extracting features of the high-illuminance reference pictures to obtain corresponding high-illuminance reference feature vectors;
comparing the high-illumination reference feature vector with the existing feature vectors in the high-illumination reference feature library;
when cosine similarity between the high-luminance reference feature vector and each of the existing feature vectors in the high-luminance reference feature library is less than a predetermined threshold,
adding the high-illumination reference feature vector into the high-illumination reference feature library;
gray processing is carried out on the high-illumination reference picture corresponding to the high-illumination reference feature vector;
and carrying out Laplace transformation on the gray-scale processed picture and carrying out serialization storage.
3. The method of claim 1, wherein the training to generate the countermeasure network is achieved by repeating the steps of:
inputting the low-illumination pictures in the training data set into a generation network in the generation countermeasure network through the feature vectors extracted by the width learning network to generate denoising pictures;
inputting the generated denoising picture and the corresponding clear picture without noise into a discrimination network in the generating countermeasure network so as to be used for the discrimination network to discriminate the real picture;
the generation network and the discrimination network are optimized based on a loss calculation.
4. A system for removing low-light picture noise, the system comprising:
a feature extraction module configured to:
acquiring a low-illumination picture to be denoised;
extracting features of the obtained low-illumination pictures to obtain corresponding low-illumination feature vectors;
a similar picture contrast module configured to:
comparing the low-illumination feature vector one by one with one or more high-illumination reference feature vectors in a high-illumination reference feature library; and
When cosine similarity between the low-illuminance feature vector and the one or more high-illuminance reference feature vectors is greater than a predetermined threshold, determining that a similar picture exists and returning, as input data, a high-illuminance reference picture corresponding to a high-illuminance reference feature vector having the highest cosine similarity to the low-illuminance feature vector among the one or more high-illuminance reference feature vectors;
otherwise, determining that no similar picture exists and directly taking the acquired low-illumination picture as input data;
a breadth-learning feature extraction module configured to input the input data into a trained breadth-learning network for spatial feature extraction to obtain a corresponding feature vector, wherein the breadth-learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes of the input data to a given target value; and
a generating countermeasure network denoising module configured to input the resulting feature vector into a generating network in a trained generating countermeasure network to generate a denoised picture, wherein the generating countermeasure network is obtained by alternately training the generating network and a discriminating network.
5. The system of claim 4, wherein the high-light reference feature library is constructed by:
periodically acquiring a high-illumination reference picture, wherein the brightness of the high-illumination reference picture is higher than a threshold value;
extracting features of the high-illuminance reference pictures to obtain corresponding high-illuminance reference feature vectors;
comparing the high-illumination reference feature vector with the existing feature vectors in the high-illumination reference feature library;
when cosine similarity between the high-luminance reference feature vector and each of the existing feature vectors in the high-luminance reference feature library is less than a predetermined threshold,
adding the high-illumination reference feature vector into the high-illumination reference feature library;
gray processing is carried out on the high-illumination reference picture corresponding to the high-illumination reference feature vector;
and carrying out Laplace transformation on the gray-scale processed picture and carrying out serialization storage.
6. The system of claim 5, wherein feature extraction of the low-illumination picture and the high-illumination reference picture is performed using a scale-invariant feature transform (SIFT) operator.
7. The system of claim 4, wherein the training to generate the countermeasure network is accomplished by repeating the steps of:
inputting the low-illumination pictures in the training data set into a generation network in the generation countermeasure network through the feature vectors extracted by the width learning network to generate denoising pictures;
inputting the generated denoising picture and the corresponding clear picture without noise into a discrimination network in the generating countermeasure network so as to be used for the discrimination network to discriminate the real picture;
the generation network and the discrimination network are optimized based on a loss calculation.
8. A system for removing low-light picture noise, the system comprising:
a memory storing a trained breadth learning network and a generating countermeasure network, as well as computer-executable instructions; and
at least one processor, wherein the computer-executable instructions, when executed, cause the at least one processor to:
acquiring a low-illumination picture to be denoised;
extracting features of the obtained low-illumination pictures to obtain corresponding low-illumination feature vectors;
when cosine similarity between the low-illumination characteristic vector and one or more high-illumination reference characteristic vectors in a high-illumination reference characteristic library is larger than a preset threshold value, returning a high-illumination reference picture corresponding to the high-illumination reference characteristic vector with the maximum cosine similarity with the low-illumination characteristic vector in the one or more high-illumination reference characteristic vectors as input data;
otherwise, directly taking the acquired low-illumination picture as input data;
inputting the input data into the width learning network for spatial feature extraction to obtain a corresponding feature vector, wherein the width learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes of the input data to a given target value; and
inputting the obtained feature vector into a generating network in the generating countermeasure network to generate a denoising picture, wherein the generating countermeasure network is obtained by alternately training the generating network and a judging network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111583678.3A CN114926348B (en) | 2021-12-22 | 2021-12-22 | Device and method for removing low-illumination video noise |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114926348A CN114926348A (en) | 2022-08-19 |
CN114926348B true CN114926348B (en) | 2024-03-01 |
Family
ID=82804285
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111583678.3A Active CN114926348B (en) | 2021-12-22 | 2021-12-22 | Device and method for removing low-illumination video noise |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114926348B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108765319A (en) * | 2018-05-09 | 2018-11-06 | 大连理工大学 | A kind of image de-noising method based on generation confrontation network |
CN110675328A (en) * | 2019-08-02 | 2020-01-10 | 北京巨数数字技术开发有限公司 | Low-illumination image enhancement method and device based on condition generation countermeasure network |
CN111325671A (en) * | 2018-12-13 | 2020-06-23 | 北京嘀嘀无限科技发展有限公司 | Network training method and device, image processing method and electronic equipment |
KR102134405B1 (en) * | 2019-06-27 | 2020-07-15 | 중앙대학교 산학협력단 | System and Method for Improving Low Light Level Image Using Generative Adversarial Network |
CN111915525A (en) * | 2020-08-05 | 2020-11-10 | 湖北工业大学 | Low-illumination image enhancement method based on improved depth separable generation countermeasure network |
CN111915526A (en) * | 2020-08-05 | 2020-11-10 | 湖北工业大学 | Photographing method based on brightness attention mechanism low-illumination image enhancement algorithm |
KR20210048100A (en) * | 2019-10-23 | 2021-05-03 | 서울대학교산학협력단 | Condition monitoring data generating apparatus and method using generative adversarial network |
CN113313657A (en) * | 2021-07-29 | 2021-08-27 | 北京航空航天大学杭州创新研究院 | Unsupervised learning method and system for low-illumination image enhancement |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10803347B2 (en) * | 2017-12-01 | 2020-10-13 | The University Of Chicago | Image transformation with a hybrid autoencoder and generative adversarial network machine learning architecture |
US11790489B2 (en) * | 2020-04-07 | 2023-10-17 | Samsung Electronics Co., Ltd. | Systems and method of training networks for real-world super resolution with unknown degradations |
Non-Patent Citations (1)
Title |
---|
Jiang Zetao; Qin Lulu. A low-illumination image enhancement method based on a U-Net generative adversarial network. Acta Electronica Sinica. 2020, (02), 52-58. *
Also Published As
Publication number | Publication date |
---|---|
CN114926348A (en) | 2022-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tian et al. | Deep learning on image denoising: An overview | |
CN112233038B (en) | True image denoising method based on multi-scale fusion and edge enhancement | |
Li et al. | Blind image quality assessment using statistical structural and luminance features | |
CN107529650B (en) | Closed loop detection method and device and computer equipment | |
CN111079764B (en) | Low-illumination license plate image recognition method and device based on deep learning | |
US20190294931A1 (en) | Systems and Methods for Generative Ensemble Networks | |
CN112164011B (en) | Motion image deblurring method based on self-adaptive residual error and recursive cross attention | |
CN110148088B (en) | Image processing method, image rain removing method, device, terminal and medium | |
US20140126808A1 (en) | Recursive conditional means image denoising | |
CN111861925A (en) | Image rain removing method based on attention mechanism and gate control circulation unit | |
CN111612741B (en) | Accurate reference-free image quality evaluation method based on distortion recognition | |
CN112465727A (en) | Low-illumination image enhancement method without normal illumination reference based on HSV color space and Retinex theory | |
CN114972107A (en) | Low-illumination image enhancement method based on multi-scale stacked attention network | |
CN113065645A (en) | Twin attention network, image processing method and device | |
CN114140346A (en) | Image processing method and device | |
CN114627034A (en) | Image enhancement method, training method of image enhancement model and related equipment | |
Anwar et al. | Attention-based real image restoration | |
CN111027564A (en) | Low-illumination imaging license plate recognition method and device based on deep learning integration | |
Jeon et al. | Low-light image enhancement using inverted image normalized by atmospheric light | |
CN115131229A (en) | Image noise reduction and filtering data processing method and device and computer equipment | |
Zin et al. | Local image denoising using RAISR | |
Soumya et al. | Self-organized night video enhancement for surveillance systems | |
CN114926348B (en) | Device and method for removing low-illumination video noise | |
CN116862809A (en) | Image enhancement method under low exposure condition | |
CN114648467B (en) | Image defogging method and device, terminal equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |