CN114926348B - Device and method for removing low-illumination video noise - Google Patents
Device and method for removing low-illumination video noise
- Publication number: CN114926348B
- Application number: CN202111583678.3A
- Authority
- CN
- China
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention provides a device and a method for removing low-illumination video noise based on broad learning (also rendered "width learning") and generative adversarial network (GAN) technology. The method comprises the following steps: acquiring a low-illumination picture to be denoised; inputting the obtained low-illumination picture as input data into a trained broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and inputting the obtained feature vectors into the generator network of a trained GAN to generate a denoised picture, wherein the GAN is obtained by alternately training the generator network and a discriminator network.
Description
Technical Field
The present invention relates to the field of computer vision, and more particularly to an apparatus and method for removing low-illumination video noise based on broad learning (also rendered "width learning") and generative adversarial network (GAN) technology.
Background
With the development of 5G and video technologies, camera-based home-surveillance products have grown rapidly, and the storage footprint of video files keeps increasing. Taking the China Telecom Tianyi home-surveillance product as an example, the newly added video files require roughly 25 PB of storage space every day. Most directly, such mass storage brings high hardware procurement cost; in addition, it brings high management cost for capacity expansion, operation and maintenance, disaster recovery, and the like. Compressing video files as much as possible is therefore a technical challenge.
In particular, in a low-illumination environment the exposure time of the photosensitive element increases and its temperature rises; long exposure produces a large amount of white noise and dark current, which in turn cause the output of many random noise points. These noise points present three problems for production operations:
1. imaging quality is low and user experience is poor: the video may be contaminated by one or more of Gaussian noise, salt-and-pepper noise, Rayleigh noise, exponential noise, and the like;
2. noise introduces dense high-frequency signals into the video; if these are fed directly into the encoder without processing, the required storage grows. For example, under the H.265 standard the stream of a single-channel camera compresses to about 700 kbps at normal brightness, but in a low-illumination environment such as a rainy night the storage required can increase by 30%-50%; and
3. the recognition success rate is low when such images are used for AI recognition (such as face recognition and license-plate recognition), which poses a serious hidden danger to intelligent security.
Conventional codec techniques use low-pass filtering and median filtering to handle noise. However, low-pass filtering blurs the picture while removing noise. Median filtering is a nonlinear method that protects sharp image edges while replacing the values of contaminated points with suitable neighboring values, so its output is not blurred like that of low-pass filtering; but by its nature the median-filtering algorithm works well on salt-and-pepper noise and poorly on other noise such as Gaussian noise. With the development of artificial intelligence in recent years, machine learning and deep learning algorithms have been applied to image denoising with notable results, but training an AI model depends on labeling and learning over a large sample set, the inference stage relies on hardware compute power, and performing both ISP processing and AI processing on the end side increases hardware cost.
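The edge-preserving behavior of median filtering described above can be illustrated with a minimal sketch. This is not the codec's implementation; it is a plain sliding-window median over a small synthetic patch carrying one "salt" (255) and one "pepper" (0) impulse:

```python
import numpy as np

def median_filter(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Replace each pixel with the median of its k x k neighborhood (edge-padded)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

# A flat gray patch hit by one salt and one pepper impulse:
img = np.full((5, 5), 100, dtype=np.uint8)
img[1, 1] = 255   # salt
img[3, 3] = 0     # pepper
```

Running `median_filter(img)` removes both impulses and returns the flat value 100 everywhere, since every 3x3 window contains at most two outliers; Gaussian noise, in contrast, perturbs every pixel and survives the median.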
At present, one approach combines median filtering with a nonlinear mapping to suppress image noise, so that the noise of a low-illumination image is suppressed while the visibility of the original image is maintained, improving the quality of night-time monitoring pictures. Another proposed approach enhances images using high-quality similar images of similar scenes, but it matches directly by histogram matching, so its accuracy is low and the histogram is strongly affected by illumination. Compared with these methods, deep-learning-based methods achieve better results, but deep learning requires training millions of parameters such as weights and biases, so model training is slow, time-consuming, and computationally expensive. Accordingly, to improve image-denoising performance and reduce the dependence on compute power, an improved method of removing low-illumination picture noise is desirable.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Aiming at the defects of the prior art, and in light of the actual application scenario of the camera service, the present invention differs from the blind-denoising schemes prevalent in the industry: it introduces well-illuminated daytime video frames as reference frames and uses broad learning and generative adversarial network (GAN) technology to denoise low-illumination images purposefully and quantitatively. For example, the method can be used in ISP processing before H.264/H.265 encoding on a terminal to denoise I-frames and P-frames, saving storage cost and improving user experience; it can also serve as a preprocessing module for deep-learning image recognition, improving recognition accuracy.
According to one aspect of the present invention, there is provided a method for removing low-illumination picture noise, the method comprising: acquiring a low-illumination picture to be denoised; inputting the obtained low-illumination picture as input data into a trained broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and inputting the obtained feature vectors into the generator network of a trained generative adversarial network to generate a denoised picture, wherein the generative adversarial network is obtained by alternately training the generator network and a discriminator network.
According to one embodiment of the invention, the method further comprises: extracting features of the obtained low-illumination picture to obtain a corresponding low-illumination feature vector; and when the cosine similarity between the low-illumination feature vector and one or more high-illumination reference feature vectors in the high-illumination reference feature library is greater than a predetermined threshold, returning, as the input data fed to the broad learning network, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the greatest cosine similarity to the low-illumination feature vector among the one or more high-illumination reference feature vectors.
According to a further embodiment of the invention, the high-illumination reference feature library is constructed by: periodically acquiring high-illumination reference pictures whose brightness is above a threshold; extracting features of each high-illumination reference picture to obtain a corresponding high-illumination reference feature vector; comparing the high-illumination reference feature vector with the feature vectors already in the library; adding the high-illumination reference feature vector to the library when its cosine similarity to every existing feature vector in the library is less than a predetermined threshold; converting the corresponding high-illumination reference picture to grayscale; and applying a Laplacian transform to the grayscale picture and storing the result in serialized form.
According to a further embodiment of the invention, training of the generative adversarial network is achieved by repeating the following steps, with noise-free clear pictures and the corresponding low-illumination pictures as the training data set: feeding the feature vectors that the broad learning network extracts from the low-illumination pictures in the training data set into the generator network of the generative adversarial network to generate denoised pictures; inputting the generated denoised pictures and the corresponding noise-free clear pictures into the discriminator network of the generative adversarial network, which learns to tell the real pictures apart; and optimizing the generator network and the discriminator network based on a loss calculation.
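The alternation described above (a discriminator step, then a generator step, each driven by a loss calculation) can be sketched on toy 1-D data. The linear "generator" (a learned shift `b`), the logistic "discriminator" `D(x) = sigmoid(v*x + c)`, and all hyperparameters below are illustrative assumptions, not the patent's networks:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

b = 0.0            # generator parameter: shift applied to input noise
v, c = 0.1, 0.0    # discriminator parameters
lr, batch = 0.05, 64

for step in range(500):
    real = rng.normal(1.0, 0.1, batch)        # "clear" samples centered at 1
    fake = rng.normal(0.0, 0.1, batch) + b    # generator output

    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    d_real, d_fake = sigmoid(v * real + c), sigmoid(v * fake + c)
    grad_v = np.mean((1 - d_real) * real) - np.mean(d_fake * fake)
    grad_c = np.mean(1 - d_real) - np.mean(d_fake)
    v, c = v + lr * grad_v, c + lr * grad_c

    # Generator step: ascend log D(fake), pulling fakes toward the real data.
    d_fake = sigmoid(v * fake + c)
    b += lr * np.mean((1 - d_fake) * v)
```

After alternating updates the generated samples should drift toward the real mean; in the patent's setting the generator is instead a picture-producing network and the "real" inputs are the noise-free clear pictures.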
According to another aspect of the present invention, there is provided a system for removing low-illumination picture noise, the system comprising: a feature extraction module configured to acquire a low-illumination picture to be denoised and extract features of it to obtain a corresponding low-illumination feature vector; a similar-picture comparison module configured to compare the low-illumination feature vector one by one with one or more high-illumination reference feature vectors in a high-illumination reference feature library, and, when the cosine similarity between the low-illumination feature vector and the one or more high-illumination reference feature vectors is greater than a predetermined threshold, determine that a similar picture exists and return, as input data, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the highest cosine similarity to the low-illumination feature vector, and otherwise determine that no similar picture exists and take the acquired low-illumination picture directly as the input data; a broad learning feature extraction module configured to input the input data into a trained broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and a generative-adversarial-network denoising module configured to input the resulting feature vectors into the generator network of a trained generative adversarial network to generate a denoised picture, wherein the generative adversarial network is obtained by alternately training the generator network and a discriminator network.
According to one embodiment of the invention, the high-illumination reference feature library is constructed by: periodically acquiring high-illumination reference pictures whose brightness is above a threshold; extracting features of each high-illumination reference picture to obtain a corresponding high-illumination reference feature vector; comparing the high-illumination reference feature vector with the feature vectors already in the library; adding the high-illumination reference feature vector to the library when its cosine similarity to every existing feature vector in the library is less than a predetermined threshold; converting the corresponding high-illumination reference picture to grayscale; and applying a Laplacian transform to the grayscale picture and storing the result in serialized form.
According to a further embodiment of the invention, feature extraction of the low-illumination picture and the high-illumination reference pictures is performed using the scale-invariant feature transform (SIFT) operator.
According to a further embodiment of the invention, training of the generative adversarial network is achieved by repeating the following steps, with noise-free clear pictures and the corresponding low-illumination pictures as the training data set: feeding the feature vectors that the broad learning network extracts from the low-illumination pictures in the training data set into the generator network of the generative adversarial network to generate denoised pictures; inputting the generated denoised pictures and the corresponding noise-free clear pictures into the discriminator network of the generative adversarial network, which learns to tell the real pictures apart; and optimizing the generator network and the discriminator network based on a loss calculation.
According to still another aspect of the present invention, there is provided a system for removing low-illumination picture noise, the system comprising: a memory storing a trained broad learning network, a trained generative adversarial network, and computer-executable instructions; and at least one processor, wherein the computer-executable instructions, when executed, cause the at least one processor to: acquire a low-illumination picture to be denoised; input the obtained low-illumination picture as input data into the broad learning network for spatial feature extraction to obtain corresponding feature vectors, wherein the broad learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; and input the obtained feature vectors into the generator network of the generative adversarial network to generate a denoised picture, wherein the generative adversarial network is obtained by alternately training the generator network and a discriminator network.
According to one embodiment of the invention, the memory further stores a high-illumination reference feature library, and the computer-executable instructions, when executed, cause the at least one processor to further: extract features of the obtained low-illumination picture to obtain a corresponding low-illumination feature vector; and when the cosine similarity between the low-illumination feature vector and one or more high-illumination reference feature vectors in the high-illumination reference feature library is greater than a predetermined threshold, return, as the input data fed to the broad learning network, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the greatest cosine similarity to the low-illumination feature vector.
Compared with prior-art schemes, the method and system of the present invention for removing low-illumination picture noise have at least the following advantages:
(1) Good denoising effect: with broad learning and a generative adversarial network, a cleaner picture is obtained and the blurring caused by low-pass-filter denoising is reduced;
(2) Low compute requirements: compared with the large number of nonlinear computations required by deep learning, most computations in the broad-learning-based method are linear, so the compute requirement is low and the method can run on low-power devices such as cameras, video doorbells, and mobile phones; and
(3) Flexible use: the denoising method of the invention can be implemented either as a preprocessing stage for a neural network or as an ISP processing stage for an H.264/H.265 encoder.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
FIG. 1 illustrates an example architecture diagram of a system for removing low-light picture noise according to one embodiment of this disclosure;
FIG. 2 illustrates an example flow chart of a method for constructing a high-luminance reference feature library according to one embodiment of this disclosure;
FIG. 3 illustrates an example flow chart of a method for removing low-light picture noise according to one embodiment of this disclosure;
FIG. 4 illustrates an example block diagram of a broad learning network in accordance with one embodiment of the present invention;
FIG. 5 illustrates an example block diagram of a generative adversarial network according to one embodiment of the invention;
FIG. 6 illustrates an example block diagram of the discriminator network in a generative adversarial network according to one embodiment of the invention; and
FIG. 7 illustrates an example architecture diagram of a system for removing low-illumination picture noise according to one embodiment of this disclosure.
Detailed Description
The features of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings.
Fig. 1 illustrates an example architecture diagram of a system 100 for removing low-illumination picture noise according to one embodiment of this disclosure. As shown in fig. 1, the system 100 of the present invention includes at least: a feature extraction module 101, a similar-picture comparison module 102, a broad learning feature extraction module 103, and a generative-adversarial-network denoising module 104.
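How the four modules of system 100 hand data to one another can be sketched as follows. The function names, the stub signatures, and the pipeline wrapper are illustrative assumptions; only the module order and the similarity-threshold branch come from the text:

```python
import numpy as np

SIM_THRESHOLD = 0.85  # predetermined cosine-similarity threshold from the text

def denoise_pipeline(low_light_picture, reference_library,
                     extract_sift, cosine_sim, broad_learning_net, generator):
    # Module 101: feature extraction on the low-illumination picture.
    feat = extract_sift(low_light_picture)

    # Module 102: compare against the high-illumination reference library;
    # reference_library is a list of (reference_feature, reference_picture).
    best_pic, best_sim = None, -1.0
    for ref_feat, ref_pic in reference_library:
        s = cosine_sim(feat, ref_feat)
        if s > best_sim:
            best_pic, best_sim = ref_pic, s
    input_data = best_pic if best_sim > SIM_THRESHOLD else low_light_picture

    # Module 103: spatial feature extraction with the broad learning network.
    vec = broad_learning_net(input_data)

    # Module 104: the GAN generator network produces the denoised picture.
    return generator(vec)
```

With identity stubs for the networks, a reference picture is returned when a sufficiently similar feature exists, and the low-illumination picture passes through unchanged otherwise.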
The feature extraction module 101 may be configured to obtain a low-illumination picture to be denoised and perform feature extraction on it to obtain a corresponding low-illumination feature vector. In some cases, the SIFT operator may be used to extract a corresponding 128-dimensional feature vector from the obtained low-illumination picture; SIFT is a very stable local feature extraction operator that is invariant to rotation, scaling, brightness changes, and the like.
The similar-picture comparison module 102 may be configured to compare the low-illumination feature vector obtained via the feature extraction module 101 one by one with the high-illumination reference feature vectors in the high-illumination reference feature library. When the cosine similarity between the low-illumination feature vector and one or more high-illumination reference feature vectors is greater than a predetermined threshold (e.g., 0.85), the module determines that a similar picture exists and returns, for further image processing, the high-illumination reference picture corresponding to the high-illumination reference feature vector with the greatest cosine similarity to the low-illumination feature vector; otherwise it determines that no similar picture exists. The construction of the high-illumination reference feature library is described in further detail below.
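The comparison performed by module 102 can be sketched as follows, where the 128-dimensional vectors stand in for the SIFT-derived feature vectors (the function names and the aggregation into a single vector per picture are assumptions for illustration):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two (nonzero) feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_similar(low_feat, reference_feats, threshold=0.85):
    """Return the index of the most similar reference vector, or None
    when no reference exceeds the predetermined threshold."""
    sims = [cosine_similarity(low_feat, r) for r in reference_feats]
    best = int(np.argmax(sims))
    return best if sims[best] > threshold else None
```

Cosine similarity depends only on vector direction, which is why it tolerates the global brightness gap between a low-illumination query and its daytime reference better than, say, Euclidean distance on raw histograms.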
The broad learning feature extraction module 103 may be configured to extract spatial features of the input data to obtain corresponding feature vectors using a trained broad learning network, which is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes derived from the input data to a given target value; the training process of the broad learning network is described in further detail below.
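The one-shot, pseudo-inverse-based training of the width-learning (broad learning) network can be sketched in a few lines of linear algebra. The random feature and enhancement mappings, the layer sizes, and the toy targets below are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

rng = np.random.default_rng(42)

X = rng.normal(size=(200, 16))          # 200 training inputs
Y = X @ rng.normal(size=(16, 4))        # given target values to fit

Wf = rng.normal(size=(16, 32))          # random feature mapping
Z = X @ Wf                              # feature nodes

We = rng.normal(size=(32, 24))          # enhancement mapping
H = np.tanh(Z @ We * 0.05)              # enhancement nodes (scaled to avoid tanh saturation)

A = np.hstack([Z, H])                   # input layer: [feature | enhancement] nodes
W = np.linalg.pinv(A) @ Y               # output weights via pseudo-inverse, no iterative training

Y_hat = A @ W                           # network output on the training inputs
```

Because `W` is obtained by a single pseudo-inverse rather than iterative backpropagation, the bulk of the computation is linear, which is the basis of the low-compute claim made for this method.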
The generative-adversarial-network denoising module 104 may be configured to input the features extracted via the broad learning network into the generator network of a trained generative adversarial network to generate a denoised picture; the generative adversarial network is a network with a denoising function obtained by alternately training the generator network and the discriminator network. Further, the near-original picture data denoised by the generator network may be input to an H.264/H.265 encoder for video encoding.
Those skilled in the art will appreciate that the system of the present invention and its various modules may be implemented in either hardware or software, and that the various modules may be combined or combined in any suitable manner.
FIG. 2 illustrates an example flow diagram of a method 200 for building a high-illumination reference feature library according to one embodiment of this disclosure. The method 200 starts in step 201 with periodically acquiring a high-illumination reference picture: when the picture's luminance is below a certain threshold it is discarded, and when its luminance is above the threshold it proceeds to the next processing step. As an example, a picture may be captured from the camera every hour between 9:00 and 15:00, discarded if its brightness is below the threshold, and passed to the next step otherwise.
In step 202, feature extraction is performed on the collected high-illumination reference picture to obtain a corresponding high-illumination reference feature vector. In a preferred embodiment, the SIFT operator may be employed to extract a 128-dimensional feature vector from the acquired high-illumination reference picture.
In step 203, the obtained high-illumination reference feature vector is compared with the feature vectors already in the high-illumination reference feature library. When the cosine similarity between the high-illumination reference feature vector and some existing feature vector in the library is greater than a predetermined threshold (e.g., 0.9), a similar reference picture already exists and the new vector is discarded; otherwise the high-illumination reference feature vector is stored in the library and processing proceeds to the next step.
In step 204, the high-illumination reference picture corresponding to the high-illumination reference feature vector is converted to grayscale, mainly to eliminate the influence of color on the picture edges.
In step 205, the grayscale picture is passed through a Laplacian transform, which extracts the edge contours of the picture, and the result is stored in serialized form.
Thus, by the method 200 shown in fig. 2, a high-illumination reference feature library can be constructed and continuously updated, so that the present invention can perform image enhancement by drawing on this library.
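Steps 204-205 can be sketched as follows, assuming an RGB picture stored as an H x W x 3 array. The BT.601 luma weights and the 4-neighbor Laplacian kernel are standard choices assumed here; the patent does not specify them:

```python
import numpy as np

def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Step 204: grayscale conversion, removing color's influence on edges."""
    return rgb @ np.array([0.299, 0.587, 0.114])   # ITU-R BT.601 luma weights

def laplacian(gray: np.ndarray) -> np.ndarray:
    """Step 205: 4-neighbor Laplacian; responds only where intensity changes,
    i.e. along the edge contours that the library serializes."""
    out = np.zeros_like(gray, dtype=float)
    out[1:-1, 1:-1] = (gray[:-2, 1:-1] + gray[2:, 1:-1] +
                       gray[1:-1, :-2] + gray[1:-1, 2:] -
                       4.0 * gray[1:-1, 1:-1])
    return out
```

On a flat region the Laplacian is zero everywhere, while any brightness step produces a nonzero response on both sides of it, which is what makes it useful for storing illumination-independent edge structure.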
Fig. 3 illustrates an example flow chart of a method 300 for removing low-illumination picture noise according to one embodiment of this disclosure. The method 300 starts in step 301: the feature extraction module 101 obtains a low-illumination picture to be denoised and performs SIFT feature extraction on it to obtain a corresponding low-illumination feature vector.
In step 302, the similar-picture comparison module 102 may compare the SIFT features extracted in step 301 one by one with the SIFT features in a high-illumination reference feature library (e.g., one constructed using the method 200 described with reference to fig. 2).
If, in step 303, the cosine similarity between the extracted SIFT feature and some SIFT feature in the high-illuminance reference feature library is greater than a predetermined threshold (e.g., 0.85), then in step 304 it is determined that a similar picture exists, and the grayscale-processed and Laplace-transformed high-illuminance reference picture corresponding to that SIFT feature is returned as input data. Preferably, if multiple SIFT features in the high-illuminance reference feature library have cosine similarity to the extracted SIFT feature greater than the predetermined threshold, it is determined that a similar picture exists, and in step 304 the grayscale-processed and Laplace-transformed high-illuminance reference picture corresponding to the SIFT feature with the greatest cosine similarity among them is returned as the input data.
Otherwise, in step 305, it is determined that there is no similar picture and the acquired low-light picture is directly taken as input data.
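The decision of steps 303-305 — return the best-matching preprocessed reference if any library entry clears the threshold, otherwise fall back to the low-illumination picture itself — can be sketched as follows. The 0.85 threshold follows the text; the list-of-pairs library layout is an assumption for illustration:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_input(low_feat, low_picture, library, threshold=0.85):
    """library: list of (feature_vector, preprocessed_reference_picture) pairs.

    Returns the grayscale/Laplacian-processed reference whose feature has
    the greatest cosine similarity above `threshold`; otherwise returns the
    low-illumination picture itself as the input data.
    """
    best_sim, best_pic = threshold, None
    for feat, ref_pic in library:
        sim = cosine(low_feat, feat)
        if sim > best_sim:
            best_sim, best_pic = sim, ref_pic
    return best_pic if best_pic is not None else low_picture
```
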
In step 306, the width learning feature extraction module may input the input data into a width learning network for spatial feature extraction to obtain a corresponding feature vector. The core of width learning is to compute the pseudo-inverse that maps the feature nodes and enhancement nodes to the target values: the width learning network first extracts picture features with feature nodes, then enhances those feature nodes through an enhancement mapping function to form corresponding enhancement nodes; the feature nodes and enhancement nodes together serve as the input layer of the width learning network, and the resulting pseudo-inverse matrix plays the role of the weights of a neural network. An example block diagram 400 of a width learning network is shown in fig. 4. The specific steps of a method for training the width learning network are as follows:
s1: generating characteristic nodes and establishing a mapping from input data to the characteristic nodes.
Let the training set be H_1 of dimension s×f, where s represents the number of samples and f represents the number of features. First, the training set is z-score standardized, so that each feature has zero mean and unit variance. Then, so that bias terms can be added directly by matrix operations after the feature nodes are generated, the standardized data is augmented: a column of ones is appended to the training set, converting it into H_1 of dimension s×(f+1). The feature nodes are then generated for each window as follows:
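The standardization and augmentation described above can be sketched in a few lines of numpy (a minimal illustration; function names are not from the patent):

```python
import numpy as np

def zscore(X):
    # Standardize each feature (column) to zero mean and unit variance.
    return (X - X.mean(axis=0)) / X.std(axis=0)

def augment(X):
    # Append a column of ones so a bias term can be applied by matrix product.
    s = X.shape[0]
    return np.hstack([X, np.ones((s, 1))])

def prepare(H):
    """Training set H of shape (s, f) -> augmented H_1 of shape (s, f + 1)."""
    return augment(zscore(H))
```
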
1) Generate a random weight matrix w_e of dimension (f+1)×N_1 that follows a Gaussian distribution, where N_1 represents the number of feature nodes per window;
2) Store w_e as w_e{i}, where i denotes the window index; the number of windows (iterations) is denoted N_2;
3) Compute A_1 = H_1 × w_e;
4) Normalize A_1;
5) To effectively reduce the linear correlation of the newly generated feature nodes, obtain a sparse matrix W such that H_1 × W = A_1 by computing a sparse representation of A_1, where the optimization problem in the sparse representation is solved using, for example, the lasso method. Thus, it is possible to obtain:

W = argmin_W ||H_1 × W − A_1||_2^2 + λ||W||_1
6) Generate the feature nodes of the window:

T_1 = normal(H_1 × W)

where normal denotes normalization, and the normalization parameters of each window's feature nodes are recorded as p_s(i). For each of the N_2 feature windows, N_1 feature nodes are generated, each of which is an s-dimensional vector. Thus, for the whole network, the feature node matrix y is a matrix of dimension s×(N_2×N_1).
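Steps 1)-6) above can be sketched end to end. For brevity this sketch solves the sparse-representation step with an ordinary least-squares pseudo-inverse instead of the lasso the text suggests, so the resulting W is a stand-in for the same H_1 × W ≈ A_1 fit rather than a truly sparse matrix:

```python
import numpy as np

def feature_nodes(H1, n1, n2, rng=None):
    """Generate N2 windows of N1 feature nodes each from augmented input H1.

    Returns (T, params): T has shape (s, n2 * n1); params stores each
    window's weights and normalization, playing the role of p_s(i).
    """
    rng = rng or np.random.default_rng(0)
    windows, params = [], []
    for i in range(n2):
        w_e = rng.normal(size=(H1.shape[1], n1))      # Gaussian random weights
        A1 = H1 @ w_e
        A1 = (A1 - A1.min()) / (A1.max() - A1.min() + 1e-12)  # normalize A1
        # Least-squares stand-in for the lasso sparse representation:
        W = np.linalg.pinv(H1) @ A1                   # H1 @ W ~= A1
        T1 = H1 @ W
        lo, hi = T1.min(), T1.max()
        T1 = (T1 - lo) / (hi - lo + 1e-12)            # per-window normalization
        windows.append(T1)
        params.append((w_e, W, lo, hi))
    return np.hstack(windows), params
```
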
S2: an enhanced node is generated.
Another feature of the width learning network is that the random feature nodes can be supplemented with enhancement nodes. The feature nodes generated by the above steps are all linear; the purpose of introducing enhancement nodes is to introduce nonlinearity into the network.
1) First, standardize and augment the feature node matrix y to obtain H_2. Unlike the feature nodes, the coefficient matrix of the enhancement nodes is not a plain random matrix but a random matrix subjected to orthogonal normalization. Assuming (N_2×N_1) > N_3, the coefficient matrix w_h of the enhancement nodes can be expressed as an orthonormalized random matrix of dimension (N_2×N_1)×N_3. The purpose is to map the feature nodes nonlinearly into a high-dimensional subspace so that the expressive capacity of the network becomes stronger, achieving the "enhancement";
2) Activate the enhancement nodes:

T_2 = tansig((H_2 × w_h) × s)
where s represents the scale of the enhancement nodes, one of the adjustable parameters of the network, and tansig is an activation function commonly used in BP neural networks that can maximally activate the features expressed by the enhancement nodes;
3) Finally generate the input T_3 of the network: compared with the feature nodes, the enhancement nodes require neither sparse representation nor window iteration. Although the orthogonalization also takes some computation time, the time required to add an enhancement node is typically less than that required to add a feature node. Thus, the final input to the network can be expressed as

T_3 = [T_1 | T_2]

and the feature dimension of each sample is (N_1×N_2)+N_3.
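The enhancement-node construction of S2 can be sketched as follows. Note that tansig(x) is mathematically identical to tanh(x), and the QR decomposition used here is one common way to orthonormalize a random matrix; the patent does not fix a specific orthonormalization method:

```python
import numpy as np

def enhancement_nodes(T_feat, n3, scale=1.0, rng=None):
    """Map the feature-node matrix nonlinearly into N3 enhancement nodes."""
    rng = rng or np.random.default_rng(0)
    s = T_feat.shape[0]
    # Standardize and augment the feature nodes to obtain H2.
    H2 = (T_feat - T_feat.mean(axis=0)) / (T_feat.std(axis=0) + 1e-12)
    H2 = np.hstack([H2, np.ones((s, 1))])
    # Orthonormalized random coefficient matrix w_h via QR decomposition.
    raw = rng.normal(size=(H2.shape[1], n3))
    w_h, _ = np.linalg.qr(raw)
    # tansig activation (equivalent to tanh), with adjustable scale.
    return np.tanh(H2 @ w_h * scale)
```
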
S3: the pseudo-inverse, i.e., the mapping of inputs to outputs in the breadth network, is found.
Let Y_x be the output value of the network, i.e., Y_x = T_3 × W. Then

W = T_3^+ × Y

where T_3^+ denotes the pseudo-inverse of T_3 and Y is the label of the training set. Thus the inputs and weights of the whole network are trained; the only parameters that need to be saved after training are W and p_s, so compared with deep learning the number of network parameters is small and the required computational effort is low.
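The closing step S3 — computing the output weights as the pseudo-inverse of the stacked input T_3 against the labels Y — is a single linear solve, which is what makes width-learning training cheap compared with gradient-based deep learning. A minimal sketch:

```python
import numpy as np

def train_output_weights(T3, Y):
    """W = T3^+ Y, the least-squares fit of T3 @ W to the labels Y."""
    return np.linalg.pinv(T3) @ Y

def predict(T3, W):
    return T3 @ W
```

When T_3 has full column rank and the labels are exactly representable, the pseudo-inverse recovers the generating weights exactly.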
Subsequently, in step 307, the generative adversarial network denoising module 104 may input the feature vector extracted in step 306 into the generation network of a trained generative adversarial network, which is obtained by alternately training the generation network and the discrimination network, to generate a denoised picture. An example structural diagram 500 of the generative adversarial network is shown in fig. 5. The specific steps of the method for training the generative adversarial network are as follows:
step S1: and taking the clear picture without noise and the corresponding low-illumination picture as a training data set.
Step S2: the low-light pictures in the training dataset are input into a trained breadth-learning network to extract corresponding spatial features and into a generating network to generate denoised pictures (false samples).
Step S3: and inputting the generated de-noised picture (false sample) and the corresponding clear picture (true sample) without noise in the training data set into a discrimination network for real picture discrimination, wherein the label of the generated de-noised picture is 0, and the label of the clear picture without noise is 1. The discrimination network is designed to discriminate whether an input picture is a true picture (true sample) or a generated picture (false sample), and the discrimination capability of the discrimination network can be improved by continuous iterative training. The discrimination network may have a different structure, in one example, the structure 600 of the discrimination network is shown in fig. 6, where the image feature may be well preserved by performing a downsampling operation on the feature by means of averaging pooling, and high-dimensional texture information may be extracted by performing convolution with a step size of 2 through a plurality of 3*3 convolution kernels. In addition, in the discrimination network shown in fig. 6, two residual modules are added in the discrimination network by referring to the residual thought in the Resnet network, the characteristics are extracted and fused, and the residual modules can accelerate the extraction of the characteristics and the convergence of training and obtain better characteristics.
In the example of fig. 6, the output of a residual block may be expressed as:

x_{l+1} = x_l + F(x_l, W_l)

where x_l is the input of the residual structure, F is the convolution branch of the residual network, and W_l is a convolution kernel parameter.
The stacked residual may be expressed as:

x_L = x_l + Σ_{i=l}^{L−1} F(x_i, W_i)

where Σ_{i=l}^{L−1} F(x_i, W_i) represents the convolution branches of multiple residual structures and x_L is the output of the stacked residual blocks.
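The two residual equations can be checked numerically: applying the blocks one at a time and summing the branch outputs onto the original input give the same result. Here the branch functions F_i are small random linear maps, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Convolution branches F(x_i, W_i), modeled here as random linear maps.
weights = [rng.normal(scale=0.1, size=(5, 5)) for _ in range(3)]
branches = [lambda x, W=W: x @ W for W in weights]

def stack_residual(x_l, branches):
    """Apply x_{i+1} = x_i + F(x_i, W_i) block by block.

    Returns (x_L, x_l + sum of branch outputs); the two are equal,
    matching the stacked-residual identity.
    """
    x, total = x_l, np.zeros_like(x_l)
    for F in branches:
        fx = F(x)
        total += fx        # running sum of the branch outputs
        x = x + fx
    return x, x_l + total
```
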
According to the back-propagation result, the gradient through the stacked residual structure is

∂loss/∂x_l = (∂loss/∂x_L) · (1 + ∂(Σ_{i=l}^{L−1} F(x_i, W_i))/∂x_l)

where the identity term 1 lets gradients propagate directly to shallow layers; the optimal residual network structure is output according to this back-propagation result.
step S4: and carrying out loss calculation on the generating network and the judging network, and continuously optimizing parameters of the generating network and the judging network according to calculation results, wherein in order to ensure that the judging network can correctly guide the training of the generating network, pre-training is needed, and the pre-training adopts a mean square error as a loss function for preliminary optimization. After training is completed, the generated network is a network with a denoising function.
The generated denoised picture can then be further processed as input data for an H.264/H.265 encoder, thereby reducing the required storage capacity and improving the user experience.
Fig. 7 illustrates an example architecture diagram of a system 700 for removing low-illumination picture noise according to one embodiment of this disclosure. The system 700 may be implemented on a mobile phone or in the cloud. As shown in fig. 7, the system 700 may include a memory 701 and at least one processor 702. The memory 701 may store a trained width learning network and a trained generative adversarial network. The memory 701 may include RAM, ROM, or a combination thereof. The memory 701 may store computer-executable instructions that, when executed by the at least one processor 702, cause the at least one processor to perform the various functions described herein, including: obtaining a low-illumination picture to be denoised; inputting the obtained low-illumination picture as input data into the trained width learning network for spatial feature extraction to obtain a corresponding feature vector; and inputting the resulting feature vector into the generation network of the trained generative adversarial network to generate the denoised picture. In some cases, memory 701 may include, among other things, a BIOS that may control basic hardware or software operations, such as interactions with peripheral components or devices. The processor 702 may include an intelligent hardware device (e.g., a general-purpose processor, DSP, CPU, microcontroller, ASIC, FPGA, programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof).
In a preferred embodiment, the memory 701 may also store a high-illuminance reference feature library, which may be constructed, for example, by the method shown in fig. 2. The computer-executable instructions stored in memory 701, when executed by at least one processor 702, further cause the at least one processor to perform additional functions including feature extraction of the acquired low-light pictures to obtain corresponding low-light feature vectors; and returning the high-illuminance reference picture corresponding to the high-illuminance reference feature vector with the maximum cosine similarity with the low-illuminance feature vector in the one or more high-illuminance reference feature vectors as input data to the width learning network when the cosine similarity between the low-illuminance feature vector and one or more high-illuminance reference feature vectors in the high-illuminance reference feature library is greater than a predetermined threshold.
The various illustrative blocks and modules described in connection with the disclosure herein may be implemented or performed with a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof, designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software for execution by a processor, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Other examples and implementations are within the scope of the disclosure and the appended claims. For example, due to the nature of software, the functions described herein may be implemented using software executed by a processor, hardware, firmware, hardwired or any combination thereof. Features that implement the functions may also be physically located in various places including being distributed such that parts of the functions are implemented at different physical locations.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the claimed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.
Claims (8)
1. A method for removing low-light picture noise, the method comprising:
acquiring a low-illumination picture to be denoised;
extracting features of the obtained low-illumination pictures to obtain corresponding low-illumination feature vectors;
when cosine similarity between the low-illumination characteristic vector and one or more high-illumination reference characteristic vectors in a high-illumination reference characteristic library is larger than a preset threshold value, returning a high-illumination reference picture corresponding to the high-illumination reference characteristic vector with the maximum cosine similarity with the low-illumination characteristic vector in the one or more high-illumination reference characteristic vectors as input data;
otherwise, directly taking the acquired low-illumination picture as input data;
inputting the input data into a trained breadth learning network for spatial feature extraction to obtain a corresponding feature vector, wherein the breadth learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes of the input data to a given target value; and
the resulting feature vector is input to a generating network of a trained generating countermeasure network to generate a denoised picture, wherein the generating countermeasure network is obtained by alternately training the generating network and a discriminating network.
2. The method of claim 1, wherein the high-luminance reference feature library is constructed by:
periodically acquiring a high-illumination reference picture, wherein the brightness of the high-illumination reference picture is higher than a threshold value;
extracting features of the high-illuminance reference pictures to obtain corresponding high-illuminance reference feature vectors;
comparing the high-illumination reference feature vector with the existing feature vectors in the high-illumination reference feature library;
when cosine similarity between the high-luminance reference feature vector and each of the existing feature vectors in the high-luminance reference feature library is less than a predetermined threshold,
adding the high-illumination reference feature vector into the high-illumination reference feature library;
gray processing is carried out on the high-illumination reference picture corresponding to the high-illumination reference feature vector;
and carrying out Laplace transformation on the gray-scale processed picture and carrying out serialization storage.
3. The method of claim 1, wherein the training to generate the countermeasure network is achieved by repeating the steps of:
inputting the low-illumination pictures in the training data set into a generation network in the generation countermeasure network through the feature vectors extracted by the width learning network to generate denoising pictures;
inputting the generated denoising picture and the corresponding clear picture without noise into a discrimination network in the generating countermeasure network so as to be used for the discrimination network to discriminate the real picture;
the generation network and the discrimination network are optimized based on a loss calculation.
4. A system for removing low-light picture noise, the system comprising:
a feature extraction module configured to:
acquiring a low-illumination picture to be denoised;
extracting features of the obtained low-illumination pictures to obtain corresponding low-illumination feature vectors;
a similar picture contrast module configured to:
comparing the low-illumination feature vector one by one with one or more high-illumination reference feature vectors in a high-illumination reference feature library; and
When cosine similarity between the low-illuminance feature vector and the one or more high-illuminance reference feature vectors is greater than a predetermined threshold, determining that a similar picture exists and returning, as input data, a high-illuminance reference picture corresponding to a high-illuminance reference feature vector having the highest cosine similarity to the low-illuminance feature vector among the one or more high-illuminance reference feature vectors;
otherwise, determining that no similar picture exists and directly taking the acquired low-illumination picture as input data;
a breadth-learning feature extraction module configured to input the input data into a trained breadth-learning network for spatial feature extraction to obtain a corresponding feature vector, wherein the breadth-learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes of the input data to a given target value; and
a generating countermeasure network denoising module configured to input the resulting feature vector into a generating network in a trained generating countermeasure network to generate a denoised picture, wherein the generating countermeasure network is obtained by alternately training the generating network and a discriminating network.
5. The system of claim 4, wherein the high-light reference feature library is constructed by:
periodically acquiring a high-illumination reference picture, wherein the brightness of the high-illumination reference picture is higher than a threshold value;
extracting features of the high-illuminance reference pictures to obtain corresponding high-illuminance reference feature vectors;
comparing the high-illumination reference feature vector with the existing feature vectors in the high-illumination reference feature library;
when cosine similarity between the high-luminance reference feature vector and each of the existing feature vectors in the high-luminance reference feature library is less than a predetermined threshold,
adding the high-illumination reference feature vector into the high-illumination reference feature library;
gray processing is carried out on the high-illumination reference picture corresponding to the high-illumination reference feature vector;
and carrying out Laplace transformation on the gray-scale processed picture and carrying out serialization storage.
6. The system of claim 5, wherein feature extraction of the low-illumination picture and the high-illumination reference picture is performed using a scale-invariant feature transform (SIFT) operator.
7. The system of claim 4, wherein the training to generate the countermeasure network is accomplished by repeating the steps of:
inputting the low-illumination pictures in the training data set into a generation network in the generation countermeasure network through the feature vectors extracted by the width learning network to generate denoising pictures;
inputting the generated denoising picture and the corresponding clear picture without noise into a discrimination network in the generating countermeasure network so as to be used for the discrimination network to discriminate the real picture;
the generation network and the discrimination network are optimized based on a loss calculation.
8. A system for removing low-light picture noise, the system comprising:
a memory storing a trained breadth learning network and a generating countermeasure network, as well as computer-executable instructions; and
at least one processor, wherein the computer-executable instructions, when executed, cause the at least one processor to:
acquiring a low-illumination picture to be denoised;
extracting features of the obtained low-illumination pictures to obtain corresponding low-illumination feature vectors;
when cosine similarity between the low-illumination characteristic vector and one or more high-illumination reference characteristic vectors in a high-illumination reference characteristic library is larger than a preset threshold value, returning a high-illumination reference picture corresponding to the high-illumination reference characteristic vector with the maximum cosine similarity with the low-illumination characteristic vector in the one or more high-illumination reference characteristic vectors as input data;
otherwise, directly taking the acquired low-illumination picture as input data;
inputting the input data into the width learning network for spatial feature extraction to obtain a corresponding feature vector, wherein the width learning network is trained by determining the pseudo-inverse that maps the feature nodes and enhancement nodes of the input data to a given target value; and
inputting the obtained feature vector into a generating network in the generating countermeasure network to generate a denoising picture, wherein the generating countermeasure network is obtained by alternately training the generating network and a judging network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111583678.3A CN114926348B (en) | 2021-12-22 | 2021-12-22 | Device and method for removing low-illumination video noise |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114926348A CN114926348A (en) | 2022-08-19 |
CN114926348B true CN114926348B (en) | 2024-03-01 |
Family
ID=82804285
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111583678.3A Active CN114926348B (en) | 2021-12-22 | 2021-12-22 | Device and method for removing low-illumination video noise |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114926348B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108765319A (en) * | 2018-05-09 | 2018-11-06 | 大连理工大学 | A kind of image de-noising method based on generation confrontation network |
CN110675328A (en) * | 2019-08-02 | 2020-01-10 | 北京巨数数字技术开发有限公司 | Low-illumination image enhancement method and device based on condition generation countermeasure network |
CN111325671A (en) * | 2018-12-13 | 2020-06-23 | 北京嘀嘀无限科技发展有限公司 | Network training method and device, image processing method and electronic equipment |
KR102134405B1 (en) * | 2019-06-27 | 2020-07-15 | 중앙대학교 산학협력단 | System and Method for Improving Low Light Level Image Using Generative Adversarial Network |
CN111915525A (en) * | 2020-08-05 | 2020-11-10 | 湖北工业大学 | Low-illumination image enhancement method based on improved depth separable generation countermeasure network |
CN111915526A (en) * | 2020-08-05 | 2020-11-10 | 湖北工业大学 | Photographing method based on brightness attention mechanism low-illumination image enhancement algorithm |
KR20210048100A (en) * | 2019-10-23 | 2021-05-03 | 서울대학교산학협력단 | Condition monitoring data generating apparatus and method using generative adversarial network |
CN113313657A (en) * | 2021-07-29 | 2021-08-27 | 北京航空航天大学杭州创新研究院 | Unsupervised learning method and system for low-illumination image enhancement |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10803347B2 (en) * | 2017-12-01 | 2020-10-13 | The University Of Chicago | Image transformation with a hybrid autoencoder and generative adversarial network machine learning architecture |
US11790489B2 (en) * | 2020-04-07 | 2023-10-17 | Samsung Electronics Co., Ltd. | Systems and method of training networks for real-world super resolution with unknown degradations |
Non-Patent Citations (1)
Title |
---|
Jiang Zetao; Qin Lulu. A low-illumination image enhancement method based on a U-Net generative adversarial network. Acta Electronica Sinica. 2020, (02), 52-58. *
Also Published As
Publication number | Publication date |
---|---|
CN114926348A (en) | 2022-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tian et al. | Deep learning on image denoising: An overview | |
CN112233038B (en) | True image denoising method based on multi-scale fusion and edge enhancement | |
Li et al. | Blind image quality assessment using statistical structural and luminance features | |
CN107529650B (en) | Closed loop detection method and device and computer equipment | |
CN111079764B (en) | Low-illumination license plate image recognition method and device based on deep learning | |
US20190294931A1 (en) | Systems and Methods for Generative Ensemble Networks | |
CN112164011B (en) | Motion image deblurring method based on self-adaptive residual error and recursive cross attention | |
CN110148088B (en) | Image processing method, image rain removing method, device, terminal and medium | |
US20140126808A1 (en) | Recursive conditional means image denoising | |
CN111861925A (en) | Image rain removing method based on attention mechanism and gate control circulation unit | |
CN111612741B (en) | Accurate reference-free image quality evaluation method based on distortion recognition | |
CN112465727A (en) | Low-illumination image enhancement method without normal illumination reference based on HSV color space and Retinex theory | |
CN114972107A (en) | Low-illumination image enhancement method based on multi-scale stacked attention network | |
CN113065645A (en) | Twin attention network, image processing method and device | |
CN114140346A (en) | Image processing method and device | |
CN114627034A (en) | Image enhancement method, training method of image enhancement model and related equipment | |
Anwar et al. | Attention-based real image restoration | |
CN111027564A (en) | Low-illumination imaging license plate recognition method and device based on deep learning integration | |
Jeon et al. | Low-light image enhancement using inverted image normalized by atmospheric light | |
CN115131229A (en) | Image noise reduction and filtering data processing method and device and computer equipment | |
Zin et al. | Local image denoising using RAISR | |
Soumya et al. | Self-organized night video enhancement for surveillance systems | |
CN114926348B (en) | Device and method for removing low-illumination video noise | |
CN116862809A (en) | Image enhancement method under low exposure condition | |
CN114648467B (en) | Image defogging method and device, terminal equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |