CN116433498A - Training method and device for image denoising model, and image denoising method and device


Info

Publication number: CN116433498A
Application number: CN202111680455.9A
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Prior art keywords: data, noise, image, image data, sample
Other languages: Chinese (zh)
Inventors: 赵自然, 张经纬, 顾建平
Current assignees: Beijing Shenmu Technology Co ltd, Tsinghua University, Nuctech Co Ltd
Original assignees: Beijing Shenmu Technology Co ltd, Tsinghua University, Nuctech Co Ltd
Application filed by Beijing Shenmu Technology Co ltd, Tsinghua University, and Nuctech Co Ltd
Priority: CN202111680455.9A
Publication: CN116433498A

Classifications

    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

A training method and device for an image denoising model, and an image denoising method and device are provided. The training method comprises: acquiring a training sample set, wherein the training sample set comprises a first training sample and a second training sample, the first training sample comprises an original image data sample and a noise data sample, and the second training sample comprises a clean image data sample; inputting the original image data sample into a first convolutional neural network, which processes the original image data sample to obtain image characteristic data; inputting the noise data sample into a second convolutional neural network, which processes the noise data sample to obtain noise characteristic data; fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data; convolving the fused characteristic data to obtain enhanced characteristic data representing image characteristics of the object to be identified; and adjusting the parameters of the image denoising model according to the difference between the clean image data sample and the enhanced characteristic data.

Description

Training method and device for image denoising model, and image denoising method and device
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a training method and apparatus for an image denoising model, an image denoising method and apparatus, an electronic device, a computer readable storage medium, and a program product.
Background
The terahertz passive imaging mode has the advantages of real-time imaging, no radiation, no need of personnel cooperation, strong penetrability, and the like, and is widely applied in the field of security inspection. The main principle of terahertz passive imaging is to use a terahertz sensor to receive the radiation of the human body and the surrounding environment and output corresponding voltage values. In general, the temperature of the human body is higher than the ambient temperature, while the outward-facing side of a suspected object carried on the body is close to the ambient temperature; the terahertz sensor senses the radiant brightness temperature of surrounding objects and, after basic processing, highlights the temperature difference between the human body and the surrounding environment to form the final image. The main factor affecting terahertz image quality is the signal-to-noise ratio of the image, i.e., the ratio of the power of the effective signal in the image to the power of the noise in the image. From the above principle of terahertz passive imaging, the amount of effective information in a terahertz image depends on the temperature difference between the human body and the surrounding environment. Because the human body temperature can be considered fixed, when the ambient temperature is low, the temperature difference between the human body and the environment is large and the effective input signal obtained by the terahertz sensor is large; the noise in a terahertz passive image mainly comes from the noise floor of the terahertz sensor itself.
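Although the patent gives no code, the signal-to-noise ratio described above can be illustrated with a short sketch. The temperature-difference and noise-floor values below are made-up numbers for illustration, not measurements from the patent:

```python
import numpy as np

def snr_db(signal: np.ndarray, noise: np.ndarray) -> float:
    """Signal-to-noise ratio in decibels: power of the effective
    signal divided by power of the noise."""
    p_signal = np.mean(np.asarray(signal, dtype=float) ** 2)
    p_noise = np.mean(np.asarray(noise, dtype=float) ** 2)
    return 10.0 * np.log10(p_signal / p_noise)

# Synthetic example: a fixed body-to-environment temperature difference
# of 2 units against a sensor noise floor with standard deviation 0.5.
rng = np.random.default_rng(0)
signal = np.full((64, 64), 2.0)
noise = rng.normal(0.0, 0.5, size=(64, 64))
print(snr_db(signal, noise))  # roughly 12 dB
```

Lowering the ambient temperature raises the effective signal (the numerator), which is exactly why the first idea below aims to cool the security inspection area.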
Therefore, in order to obtain a terahertz image with high signal-to-noise ratio, the first idea is to increase the input of effective signals, and a common method is to reduce the radiation temperature of the surrounding environment, for example, to build a special house or channel to fix the security inspection area and to reduce the radiation temperature of the area by using air conditioning equipment or low-emissivity materials; the second idea is to directly reduce noise in the image.
For the first concept, it is necessary to provide a specific area and keep the imaging area at a low temperature, which is disadvantageous for the deployment of the terahertz passive imaging apparatus, and increases the cost of the entire imaging system. For the second idea, the existing image denoising method cannot well denoise terahertz images.
The above information disclosed in this section is only for understanding the background of the inventive concept of the present disclosure, and thus, the above information may contain information that does not constitute prior art.
Disclosure of Invention
In view of at least one aspect of the above technical problems, a training method and apparatus for an image denoising model, an image denoising method and apparatus, an electronic device, a computer-readable storage medium, and a program product are provided.
In one aspect, there is provided a training method of an image denoising model including a first convolutional neural network and a second convolutional neural network, the method comprising:
acquiring a training sample set, wherein the training sample set comprises a first training sample and a second training sample, the first training sample and the second training sample are used for representing the input and the output of the image denoising model respectively, the first training sample comprises an original image data sample and a noise data sample, the second training sample comprises a clean image data sample, and the original image data sample and the clean image data sample both comprise image data of an object to be identified;
inputting the original image data sample into the first convolutional neural network;
processing the original image data sample by the first convolutional neural network to obtain image characteristic data;
inputting the noise data sample into the second convolutional neural network;
processing the noise data sample by the second convolutional neural network to obtain noise characteristic data;
fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data;
performing a convolution operation on the fused characteristic data to obtain enhanced characteristic data, wherein the enhanced characteristic data is used for representing image characteristics of the object to be identified; and
adjusting the parameters of the image denoising model according to the difference between the clean image data sample and the enhanced characteristic data.
According to some exemplary embodiments, the fusing the image feature data and the noise feature data includes: the noise feature data is removed from the image feature data.
According to some exemplary embodiments, the first convolutional neural network comprises a first convolution layer and a second convolution layer connected in sequence, the first convolution layer comprising a one-dimensional convolution layer and the second convolution layer comprising a two-dimensional convolution layer.
According to some exemplary embodiments, the second convolutional neural network comprises a fifth convolution layer and a sixth convolution layer connected in sequence, the fifth convolution layer comprising a one-dimensional convolution layer and the sixth convolution layer comprising a two-dimensional convolution layer.
According to some exemplary embodiments, the first convolutional neural network processing the raw image data samples comprises:
inputting the original image data sample into the first convolution layer, and obtaining a first image feature through a first convolution operation; and
inputting the first image feature into the second convolution layer, and obtaining a second image feature through a second convolution operation.
According to some example embodiments, the second convolutional neural network processing the noise data samples comprises:
inputting the noise data sample into the fifth convolution layer, and obtaining a first noise feature through a fifth convolution operation; and
inputting the first noise feature into the sixth convolution layer, and obtaining a second noise feature through a sixth convolution operation.
According to some exemplary embodiments, fusing the image feature data and the noise feature data includes:
performing a first fusion on the second image feature and the second noise feature to obtain a third image feature.
According to some exemplary embodiments, the first convolutional neural network further comprises a third convolutional layer, the third convolutional layer comprising a two-dimensional convolutional layer,
the method further comprises: inputting the third image feature into the third convolution layer, and obtaining a fourth image feature through a third convolution operation.
According to some exemplary embodiments, the second convolutional neural network further comprises a seventh convolutional layer, the seventh convolutional layer comprising a two-dimensional convolutional layer,
the method further comprises: inputting the second noise feature into the seventh convolution layer, and obtaining a third noise feature through a seventh convolution operation.
According to some exemplary embodiments, fusing the image feature data and the noise feature data further comprises:
performing a second fusion on the fourth image feature and the third noise feature to obtain a fifth image feature.
According to some exemplary embodiments, the first convolutional neural network further comprises a fourth convolutional layer, the fourth convolutional layer comprising a two-dimensional convolutional layer,
the performing a convolution operation on the fused characteristic data to obtain enhanced characteristic data specifically comprises: inputting the fifth image feature into the fourth convolution layer, and obtaining the enhanced characteristic data through a fourth convolution operation.
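Putting the exemplary layers above together (first through fourth convolution layers on the image branch, fifth through seventh on the noise branch, and two fusion steps, implemented here as feature subtraction per the "removed from the image feature data" wording), a minimal single-channel forward pass might be sketched as follows. The 1x3 and 3x3 kernel sizes, the random weights, and the subtraction-based fusion are illustrative assumptions, not details specified in the patent:

```python
import numpy as np

def conv2d(x, kernel):
    """Naive single-channel 'same'-padded convolution (stride 1)."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(1)
# 1-D kernels (1 x 3) for the first and fifth convolution layers,
# 2-D kernels (3 x 3) for the remaining layers; sizes are assumptions.
k1 = rng.normal(size=(1, 3)) * 0.1
k5 = rng.normal(size=(1, 3)) * 0.1
k2, k3, k4 = [rng.normal(size=(3, 3)) * 0.1 for _ in range(3)]
k6, k7 = [rng.normal(size=(3, 3)) * 0.1 for _ in range(2)]

def forward(raw, noise):
    f1 = conv2d(raw, k1)    # first convolution layer  (image branch, 1-D)
    f2 = conv2d(f1, k2)     # second convolution layer (image branch, 2-D)
    n1 = conv2d(noise, k5)  # fifth convolution layer  (noise branch, 1-D)
    n2 = conv2d(n1, k6)     # sixth convolution layer  (noise branch, 2-D)
    f3 = f2 - n2            # first fusion: remove noise features
    f4 = conv2d(f3, k3)     # third convolution layer
    n3 = conv2d(n2, k7)     # seventh convolution layer
    f5 = f4 - n3            # second fusion
    return conv2d(f5, k4)   # fourth convolution layer -> enhanced features

enhanced = forward(rng.normal(size=(16, 16)), rng.normal(size=(16, 16)))
print(enhanced.shape)  # (16, 16)
```

During training, the difference between `enhanced` and the clean image data sample would drive the parameter adjustment described above; at inference time the same forward pass is run on field data.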
According to some exemplary embodiments, the acquiring a training sample set includes:
acquiring noisy data samples within a specified time period;
averaging the noisy data samples within the specified time period to obtain expected data;
obtaining noise data within the specified time period according to the noisy data samples and the expected data;
establishing a noise signal model according to the noise data within the specified time period; and
obtaining noise data samples according to the noise signal model.
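The five steps above can be sketched as follows. Treating the "noise signal model" as zero-mean Gaussian noise with an empirically estimated standard deviation is an assumption made for illustration; the patent does not specify the model's form, and the synthetic readings below are stand-in numbers:

```python
import numpy as np

rng = np.random.default_rng(2)

# Step 1: noisy data samples acquired within a specified time period
# (synthetic stand-in: 200 frames of 32 background sensor readings).
noisy_samples = 1.5 + rng.normal(0.0, 0.3, size=(200, 32))

# Step 2: average the noisy samples to obtain the expected data.
expected = noisy_samples.mean(axis=0)

# Step 3: noise data = noisy samples minus the expected data.
noise_data = noisy_samples - expected

# Step 4: a minimal noise signal model (zero mean, empirical std).
sigma = float(noise_data.std())

# Step 5: draw fresh noise data samples from the model.
noise_sample = rng.normal(0.0, sigma, size=(32,))
print(round(sigma, 2))  # close to the true 0.3
```

In practice the model could equally be fit in the frequency domain (compare the power spectrum of Fig. 10); the averaging-and-subtraction steps are the same either way.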
According to some exemplary embodiments, the acquiring a training sample set further comprises:
under a first condition, acquiring first original image data samples;
obtaining a second image data sample according to the first original image data sample and the noise data sample, wherein the second image data sample comprises clean image data from which noise is removed;
generating a third image data sample under at least one second condition from the second image data sample, wherein the third image data sample comprises clean image data from which noise was removed;
obtaining the clean image data sample based on the second image data sample and the third image data sample,
wherein the signal-to-noise ratio of the original image data samples acquired under the first condition is higher than that of original image data samples acquired under the second condition.
According to some exemplary embodiments, the acquiring a training sample set further comprises:
obtaining the original image data sample according to the obtained noise data sample and the clean image data sample.
According to some exemplary embodiments, the acquiring the first original image data sample under the first condition specifically includes:
scanning the imaging region by a terahertz imaging device under the condition that the ambient temperature is lower than a threshold temperature to acquire a first original image data sample,
wherein the object to be identified is located in the imaging region.
According to some exemplary embodiments, the acquiring noisy data samples within a specified time period comprises:
scanning the background area in the imaging area by the terahertz imaging device for the specified time period to acquire the noisy data samples.
According to some exemplary embodiments, the generating a third image data sample under at least one second condition from the second image data sample specifically includes:
determining a scaling factor according to the temperature value in the first condition and the temperature value in the second condition; and
scaling the second image data sample according to the scaling factor to generate at least one third image data sample under the second condition.
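The two steps above might be sketched as a simple amplitude scaling. Defining the scaling factor as the ratio of body-to-ambient temperature differences under the two conditions is an assumption made for illustration, as are the body-temperature constant and the Kelvin values below:

```python
import numpy as np

T_BODY = 310.0  # assumed fixed human body temperature (K)

def scaling_factor(t_first: float, t_second: float) -> float:
    """Ratio of body-environment temperature differences: second
    condition (warmer ambient) over first condition (cooler ambient)."""
    return (T_BODY - t_second) / (T_BODY - t_first)

def make_third_sample(second_sample: np.ndarray,
                      t_first: float, t_second: float) -> np.ndarray:
    """Scale the clean sample from the first (high-SNR) condition to
    simulate a clean sample acquired under the second condition."""
    return second_sample * scaling_factor(t_first, t_second)

clean_cold = np.full((4, 4), 2.0)  # clean sample at 280 K ambient
clean_warm = make_third_sample(clean_cold, 280.0, 295.0)
print(clean_warm[0, 0])  # 1.0, since the factor is 15/30 = 0.5
```

This reflects the earlier observation that the effective signal shrinks as the ambient temperature approaches body temperature, letting one low-temperature acquisition yield clean samples for several simulated conditions.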
According to some exemplary embodiments, the first convolution layer includes a first sub-convolution layer and a second sub-convolution layer connected in sequence, each of the first sub-convolution layer and the second sub-convolution layer being a one-dimensional convolution layer; and/or the second convolution layer comprises a third sub-convolution layer and a fourth sub-convolution layer which are sequentially connected, and the third sub-convolution layer and the fourth sub-convolution layer are two-dimensional convolution layers.
According to some exemplary embodiments, the fifth convolution layer includes a fifth sub-convolution layer and a sixth sub-convolution layer connected in sequence, each of the fifth sub-convolution layer and the sixth sub-convolution layer being a one-dimensional convolution layer; and/or the sixth convolution layer comprises a seventh sub-convolution layer and an eighth sub-convolution layer which are sequentially connected, wherein the seventh sub-convolution layer and the eighth sub-convolution layer are two-dimensional convolution layers.
In another aspect, there is provided an image denoising method, comprising:
acquiring original image data, wherein the original image data comprises image data of an object to be identified;
acquiring noise data, the noise data being associated with the original image data;
inputting the original image data into the first convolutional neural network;
processing the original image data by the first convolutional neural network to obtain image characteristic data;
inputting the noise data into the second convolutional neural network;
processing the noise data by the second convolutional neural network to obtain noise characteristic data;
fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data; and
performing a convolution operation on the fused characteristic data to obtain enhanced characteristic data, wherein the enhanced characteristic data is used for representing the image characteristics of the object to be identified.
According to some exemplary embodiments, the fusing the image feature data and the noise feature data includes: the noise feature data is removed from the image feature data.
According to some exemplary embodiments, the first convolutional neural network comprises a first convolution layer and a second convolution layer connected in sequence, the first convolution layer comprising a one-dimensional convolution layer and the second convolution layer comprising a two-dimensional convolution layer.
According to some exemplary embodiments, the second convolutional neural network comprises a fifth convolution layer and a sixth convolution layer connected in sequence, the fifth convolution layer comprising a one-dimensional convolution layer and the sixth convolution layer comprising a two-dimensional convolution layer.
According to some exemplary embodiments, the first convolutional neural network processing the raw image data comprises:
inputting the original image data into the first convolution layer, and obtaining a first image feature through a first convolution operation; and
inputting the first image feature into the second convolution layer, and obtaining a second image feature through a second convolution operation.
According to some example embodiments, the second convolutional neural network processing the noise data comprises:
inputting the noise data into the fifth convolution layer, and obtaining a first noise feature through a fifth convolution operation; and
inputting the first noise feature into the sixth convolution layer, and obtaining a second noise feature through a sixth convolution operation.
According to some exemplary embodiments, fusing the image feature data and the noise feature data includes:
performing a first fusion on the second image feature and the second noise feature to obtain a third image feature.
According to some exemplary embodiments, the first convolutional neural network further comprises a third convolutional layer, the third convolutional layer comprising a two-dimensional convolutional layer,
the method further comprises: inputting the third image feature into the third convolution layer, and obtaining a fourth image feature through a third convolution operation.
According to some exemplary embodiments, the second convolutional neural network further comprises a seventh convolutional layer, the seventh convolutional layer comprising a two-dimensional convolutional layer,
the method further comprises: inputting the second noise feature into the seventh convolution layer, and obtaining a third noise feature through a seventh convolution operation.
According to some exemplary embodiments, fusing the image feature data and the noise feature data further comprises:
performing a second fusion on the fourth image feature and the third noise feature to obtain a fifth image feature.
According to some exemplary embodiments, the first convolutional neural network further comprises a fourth convolutional layer, the fourth convolutional layer comprising a two-dimensional convolutional layer,
the performing a convolution operation on the fused characteristic data to obtain enhanced characteristic data specifically comprises: inputting the fifth image feature into the fourth convolution layer, and obtaining the enhanced characteristic data through a fourth convolution operation.
According to some exemplary embodiments, the method further comprises: interpolating the enhanced characteristic data to obtain an interpolated image.
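One plausible form of the interpolation step above is bilinear upsampling of the enhanced feature map to display resolution. This is a generic sketch under that assumption, not the patent's specific interpolation scheme:

```python
import numpy as np

def bilinear_upsample(feat: np.ndarray, factor: int) -> np.ndarray:
    """Upsample a 2-D feature map by linear interpolation per axis."""
    h, w = feat.shape
    rows = np.linspace(0, h - 1, h * factor)
    cols = np.linspace(0, w - 1, w * factor)
    # interpolate along columns first, then along rows
    tmp = np.array([np.interp(cols, np.arange(w), feat[i]) for i in range(h)])
    out = np.array([np.interp(rows, np.arange(h), tmp[:, j])
                    for j in range(w * factor)]).T
    return out

enhanced = np.arange(16.0).reshape(4, 4)  # stand-in enhanced feature data
image = bilinear_upsample(enhanced, 2)    # interpolated display image
print(image.shape)  # (8, 8)
```

Corner values are preserved exactly, while intermediate pixels are linear blends of their neighbors, giving a smoother display image than nearest-neighbor replication.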
According to some exemplary embodiments, the acquiring the original image data includes: scanning an imaging area by a terahertz imaging device to acquire the original image data, wherein the object to be identified is located in the imaging area.
According to some exemplary embodiments, the acquiring noise data includes: the background area in the imaging area is scanned by a terahertz imaging device to acquire noise data.
In still another aspect, there is provided a training apparatus of an image denoising model, comprising:
the training sample acquisition module is used for acquiring a training sample set, wherein the training sample set comprises a first training sample and a second training sample, the first training sample and the second training sample are used for respectively representing the input and the output of the image denoising model, the first training sample comprises an original image data sample and a noise data sample, the second training sample comprises a clean image data sample, and the original image data sample and the clean image data sample both comprise image data of an object to be identified;
a first convolutional neural network module for receiving input of the original image data sample and processing the original image data sample to obtain image characteristic data;
a second convolutional neural network module for receiving input of the noise data sample and processing the noise data sample to obtain noise characteristic data;
a fusion module for fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data;
a convolution operation module for performing a convolution operation on the fused characteristic data to obtain enhanced characteristic data, wherein the enhanced characteristic data is used for representing the image characteristics of the object to be identified; and
a parameter adjustment module for adjusting the parameters of the image denoising model according to the difference between the clean image data sample and the enhanced characteristic data.
In still another aspect, there is provided an image denoising apparatus including:
an original image data acquisition module for acquiring original image data, wherein the original image data comprises image data of an object to be identified;
a noise data acquisition module for acquiring noise data, the noise data being associated with the original image data;
a first convolutional neural network module for receiving input of the original image data and processing the original image data to obtain image characteristic data;
a second convolutional neural network module for receiving input of the noise data and processing the noise data to obtain noise characteristic data;
a fusion module for fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data; and
a convolution operation module for performing a convolution operation on the fused characteristic data to obtain enhanced characteristic data, wherein the enhanced characteristic data is used for representing the image characteristics of the object to be identified.
In yet another aspect, there is provided an electronic device comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method as described above.
According to some exemplary embodiments, the electronic device is a passive terahertz imaging apparatus.
In yet another aspect, a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the method as described above is provided.
In yet another aspect, a computer program product is provided, comprising a computer program which, when executed by a processor, implements a method as described above.
According to the embodiments of the present disclosure, real signals can be recovered under different signal-to-noise ratios by using a dual-branch-input neural network for denoising, thereby achieving a good and stable denoising effect.
Drawings
For a better understanding of the present invention, it will be described in detail with reference to the following drawings:
fig. 1 is a schematic structural view of a passive terahertz imaging device according to an exemplary embodiment of the present disclosure.
Fig. 2A and 2B are schematic diagrams of images of different signal-to-noise ratios acquired by a passive terahertz imaging device according to an exemplary embodiment of the present disclosure, respectively.
Fig. 3 is a schematic flow chart of a training method of an image denoising model according to an exemplary embodiment of the present disclosure.
Fig. 4 is a schematic diagram of a dual-branch neural network included in an image denoising model according to an exemplary embodiment of the present disclosure.
Fig. 5 schematically shows a schematic flow chart of image denoising according to the dual-branch neural network shown in fig. 4.
Fig. 6 to 8 are schematic flowcharts of acquiring a training sample set in a training method of an image denoising model according to an exemplary embodiment of the present disclosure, respectively.
Fig. 9 is a schematic diagram of one noise data according to an exemplary embodiment of the present disclosure.
Fig. 10 is a schematic diagram of a power spectrum of the noise data shown in fig. 9.
Fig. 11 is a schematic diagram of raw image data acquired by a passive terahertz imaging device according to an exemplary embodiment of the present disclosure.
Fig. 12 is a schematic flow chart of an image denoising method according to an exemplary embodiment of the present disclosure.
Fig. 13A to 13C are schematic diagrams of original images with different signal-to-noise ratios, respectively.
Fig. 14A to 14C are schematic diagrams of images obtained by denoising the images of fig. 13A to 13C using a conventional BM3D denoising algorithm, respectively.
Fig. 15A to 15C are schematic diagrams of images obtained after denoising the images of fig. 13A to 13C, respectively, using an image denoising method according to an embodiment of the present disclosure.
Fig. 16 is a block diagram of a training apparatus of an image denoising model according to an exemplary embodiment of the present disclosure.
Fig. 17 is a block diagram of a structure of an image denoising apparatus according to an exemplary embodiment of the present disclosure.
Fig. 18 schematically illustrates a block diagram of an electronic device adapted to implement a training method of an image denoising model or an image denoising method according to an exemplary embodiment of the present disclosure.
Detailed Description
Specific embodiments of the invention will be described in detail below. It should be noted that the embodiments described herein are for illustration only and are not intended to limit the invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known structures, materials, or methods have not been described in detail in order to avoid obscuring the present invention.
Throughout the specification, references to "one embodiment," "an embodiment," "one example," or "an example" mean: a particular feature, structure, or characteristic described in connection with the embodiment or example is included within at least one embodiment of the invention. Thus, the appearances of the phrases "in one embodiment," "in an embodiment," "one example," or "an example" in various places throughout this specification are not necessarily all referring to the same embodiment or example. Furthermore, the particular features, structures, or characteristics may be combined in any suitable combination and/or sub-combination in one or more embodiments or examples. Furthermore, it will be understood by those of ordinary skill in the art that the term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
In this context, the expressions "raw image", "raw image data", "raw data", and the like refer to noisy images or image data, i.e., images or data as acquired by a sensor or imaging device, before any noise removal. The expressions "clean image", "clean image data", "clean data", and the like refer to denoised images or image data, i.e., images or data from which noise has been removed after acquisition by a sensor or imaging device.
The inventors found that denoising technical routes for terahertz images can be divided into two main categories. In the first category, denoising is performed on the display image of the terahertz imaging device. The image of the terahertz imaging device is a standard image with each pixel value between 0 and 255; for such an image, a traditional denoising algorithm such as bilateral filtering or BM3D can be adopted to improve image quality, or an end-to-end image denoising neural network can be trained to achieve the denoising effect. Training an end-to-end denoising neural network requires a large number of noisy and noise-free image pairs, and the noise-free images are generally obtained by exposing the same scene multiple times and then superimposing the exposed images. One difficulty with standard image denoising neural networks is the high training cost, mainly reflected in the high cost of obtaining training samples and in the poor stability of the model caused by the diversity of field data in actual scenes. The traditional BM3D method performs only moderately on images with a low signal-to-noise ratio, and the BM3D algorithm cannot recover the object when the contrast between the object and the background in the original image is low. In the second category, the raw data generated by the terahertz imaging device is denoised directly, and an image reconstruction operation is then performed on the denoised data. The difficulty in processing the raw data is that its range is not fixed, so a certain degree of knowledge of the characteristics of the raw data is required.
To this end, embodiments of the present disclosure provide a training method of an image denoising model including a first convolutional neural network and a second convolutional neural network, the method comprising: acquiring a training sample set, wherein the training sample set comprises a first training sample and a second training sample, the first training sample and the second training sample are used for representing the input and the output of the image denoising model respectively, the first training sample comprises an original image data sample and a noise data sample, the second training sample comprises a clean image data sample, and the original image data sample and the clean image data sample both comprise image data of an object to be identified; inputting the original image data sample into a first convolutional neural network; the first convolution neural network processes the original image data sample to obtain image characteristic data; inputting the noise data samples into a second convolutional neural network; the second convolutional neural network processes the noise data sample to obtain noise characteristic data; fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data; performing convolution operation on the fusion characteristic data to obtain enhanced characteristic data, wherein the enhanced characteristic data is used for representing image characteristics of the object to be identified; and adjusting various parameters of the image denoising model according to the difference between the clean image data sample and the enhancement characteristic data.
The embodiment of the disclosure also provides an image denoising method, which comprises the following steps: acquiring original image data, wherein the original image data comprises image data of an object to be identified; acquiring noise data, the noise data being associated with the original image data; inputting the original image data into a first convolutional neural network; the first convolution neural network processes the original image data to obtain image characteristic data; inputting the noise data into a second convolutional neural network; the second convolutional neural network processes the noise data to obtain noise characteristic data; fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data; and carrying out convolution operation on the fusion characteristic data to obtain enhanced characteristic data, wherein the enhanced characteristic data is used for representing the image characteristics of the object to be identified.
In the method according to the embodiments of the present disclosure, denoising is performed by a neural network with two input branches: one branch receives the terahertz raw image data to be denoised, and the other branch receives noise data. The method can recover the real signal under different signal-to-noise ratios and achieves a good, stable denoising effect.
Fig. 1 is a schematic structural view of a passive terahertz imaging device according to an exemplary embodiment of the present disclosure.
As shown in fig. 1, a passive terahertz imaging device according to an exemplary embodiment of the present disclosure may include a reflecting plate 2 and its servo system, a lens 3, a detector array 4, a data acquisition and processing device 6, a display device 7, and a distribution box 5. Terahertz waves (or millimeter waves) spontaneously radiated by the inspected object, together with those reflected from the background environment, enter through the window 1 in the housing, strike the reflecting plate 2, are reflected to the lens 3, and after being converged by the lens 3 are received by the detector array 4, which converts them into electrical signals. The data acquisition and processing device 6 is connected to the detector array 4 to receive the electrical signals from the detector array 4 and generate millimeter wave/terahertz wave images. The display device 7 is connected to the data acquisition and processing device 6 and is configured to receive and display the terahertz wave images (or millimeter wave images) generated by the data acquisition and processing device 6. The distribution box 5 is configured to supply power to the entire passive terahertz imaging apparatus.
In actual operation, the servo system of the reflecting plate 2 drives the reflecting plate 2 to reciprocate, and the reciprocal of the motion period T is the imaging frame rate s. When the reflecting plate 2 swings from the maximum elevation angle to the minimum elevation angle, the swing angle is θ, completing a scan of a field of view of 2θ in the height direction within the depth of field; each swing of the reflecting plate 2 from the maximum to the minimum elevation angle produces one frame. The data acquisition and processing device 6 acquires data throughout this process. The control system of the reflecting plate 2 may be equipped with a position encoder, for example, to feed back the scanning position of the reflecting plate with high accuracy. When acquiring data, the data acquisition and processing device 6 first marks the acquired data according to the position encoder information, so as to distinguish the data belonging to each frame; it then processes and reconstructs the acquired data to generate terahertz/millimeter wave images, and can transmit the image data to the display device 7 so as to display the images, mark suspicious objects and raise an automatic alarm.
Because a passive terahertz sensor is relatively large and costly, in the passive terahertz imaging device the scanning mechanism changes the optical path of sensors arranged in a linear array so that, over time, they are equivalent to an area array, thereby performing two-dimensional imaging of a region. Because the number of terahertz sensors is small, a usable terahertz image is generally obtained in the related art by two-dimensional interpolation of the sensor data. When the signal-to-noise ratio of the raw sensor data is low, the image obtained by two-dimensional interpolation contains many new interference features, and the larger noise during interpolation destroys the detail features of some real targets.
Further, the inventors have found that the two-dimensionally interpolated data needs to be normalized to 0-1 or 0-255 before it can be displayed as an image. One major factor affecting the visual quality of terahertz images is the contrast between the human body parts in the image and the suspect objects and background carried on the body. When the signal-to-noise ratio of the raw data is low, the contrast between the human body and the object or background after the normalization step is low, and denoising an image formed in this way can hardly recover a high image contrast.
Fig. 2A and 2B are schematic diagrams of images of different signal-to-noise ratios acquired by a passive terahertz imaging device according to an exemplary embodiment of the present disclosure, respectively. When the signal-to-noise ratios of the original signals are different, the normalized data will have different contrasts. Referring to fig. 2A, when the signal-to-noise ratio of the original signal is low, the contrast between the obtained region of interest (as indicated by the dotted line box labeled ROI in fig. 2A) and the human body is low. Referring to fig. 2B, when the signal-to-noise ratio of the original signal is high, the contrast between the obtained region of interest (as indicated by the dotted line box labeled ROI in fig. 2B) and the human body is high.
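The effect described above can be illustrated with a small NumPy sketch on synthetic data (the ROI geometry and noise levels are arbitrary stand-ins): min-max normalization maps the same true signal to a lower ROI-background contrast when the raw data is noisier.

```python
import numpy as np

def normalize_to_display_range(data):
    """Min-max normalize raw data to the 0-255 display range."""
    lo, hi = float(data.min()), float(data.max())
    return (data - lo) / (hi - lo) * 255.0

rng = np.random.default_rng(0)
signal = np.zeros((64, 64))
signal[24:40, 24:40] = 100.0            # warmer region of interest on a flat background

contrast = {}
for noise_std in (5.0, 50.0):           # high-SNR vs. low-SNR raw data
    noisy = signal + rng.normal(0.0, noise_std, signal.shape)
    img = normalize_to_display_range(noisy)
    # contrast proxy: mean ROI level minus mean background level after normalization
    contrast[noise_std] = img[24:40, 24:40].mean() - img[:16, :16].mean()
    print(f"noise_std={noise_std}: ROI-background contrast ~ {contrast[noise_std]:.1f}")
```

Stronger noise widens the min-max range, so the same physical temperature difference occupies a smaller share of the 0-255 scale, matching the comparison of figs. 2A and 2B.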
In the embodiment of the disclosure, an image denoising model is provided, and the image denoising model comprises a double-branch input neural network, so that good denoising effects can be realized on image data with different signal to noise ratios.
Fig. 3 is a schematic flow chart of a training method of an image denoising model according to an exemplary embodiment of the present disclosure.
As shown in fig. 3, a training method of an image denoising model according to an exemplary embodiment of the present disclosure may include operations S310 to S380, and the training method may be performed by a processor or by any electronic device including a processor. The image denoising model may include a first convolutional neural network and a second convolutional neural network.
In operation S310, a training sample set is obtained, the training sample set including a first training sample and a second training sample, the first training sample and the second training sample being used to characterize an input and an output of the image denoising model, respectively, the first training sample including an original image data sample and a noise data sample, the second training sample including a clean image data sample, the original image data sample and the clean image data sample both including image data of an object to be identified.
For example, the actual sampled data of the terahertz imaging device may be directly taken as the input signal of the data input branch to be denoised (i.e. as the original image data sample), and the noise sampling branch is input with an approximately pure noise sequence.
In operation S320, the original image data sample is input to a first convolutional neural network.
In operation S330, the first convolutional neural network processes the original image data sample to obtain image feature data.
In operation S340, the noise data samples are input to a second convolutional neural network.
In operation S350, the second convolutional neural network processes the noise data samples to obtain noise characteristic data.
In operation S360, the image feature data and the noise feature data are fused to obtain fused feature data.
In an embodiment of the disclosure, the fusing the image feature data and the noise feature data includes: the noise feature data is removed from the image feature data.
In operation S370, a convolution operation is performed on the fusion feature data to obtain enhancement feature data, where the enhancement feature data is used to characterize the image feature of the object to be identified.
In operation S380, respective parameters of the image denoising model are adjusted according to differences between the clean image data samples and the enhanced feature data.
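As an illustration only, operations S310 to S380 can be sketched as a single training step in PyTorch. The stand-in branch modules, channel widths, and the use of plain subtraction for the fusion of S360 are assumptions of this sketch, not the patented architecture of fig. 4:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Minimal stand-ins for the two branches (hypothetical layer widths).
image_branch = nn.Conv2d(1, 8, kernel_size=3, padding=1)   # first convolutional neural network
noise_branch = nn.Conv2d(1, 8, kernel_size=3, padding=1)   # second convolutional neural network
head = nn.Conv2d(8, 1, kernel_size=3, padding=1)           # convolution producing enhanced features

params = list(image_branch.parameters()) + list(noise_branch.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.MSELoss()

# One training step over a synthetic batch (shapes follow the 40-sensor x 512-line example).
raw_sample = torch.randn(4, 1, 40, 512)     # original image data sample (S320)
noise_sample = torch.randn(4, 1, 40, 512)   # noise data sample (S340)
clean_sample = torch.randn(4, 1, 40, 512)   # clean image data sample (target)

image_feat = image_branch(raw_sample)       # S330: image feature data
noise_feat = noise_branch(noise_sample)     # S350: noise feature data
fused = image_feat - noise_feat             # S360: remove noise features (one possible fusion)
enhanced = head(fused)                      # S370: enhanced feature data
loss = loss_fn(enhanced, clean_sample)      # S380: difference to the clean sample
optimizer.zero_grad()
loss.backward()
optimizer.step()                            # adjust model parameters
print(f"training loss: {loss.item():.4f}")
```

In practice the loop would repeat over the whole training sample set; this sketch only shows how the two inputs, the fusion, and the loss fit together.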
It is to be appreciated that in embodiments of the present disclosure, the neural network model may include portions of pooling layers, loss functions, etc., which may utilize various pooling layers, loss functions in known convolutional neural networks, and are not described in detail herein.
Fig. 4 is a schematic diagram of a dual-branch neural network included in an image denoising model according to an exemplary embodiment of the present disclosure. Fig. 5 schematically shows a schematic flow chart of image denoising according to the dual-branch neural network shown in fig. 4.
Referring to fig. 3, 4 and 5 in combination, the first convolutional neural network 10 may include a first convolution layer 11 and a second convolution layer 12 connected in sequence. In an embodiment of the present disclosure, the first convolution layer 11 comprises a one-dimensional convolution layer and the second convolution layer 12 comprises a two-dimensional convolution layer. For example, the one-dimensional convolution layer may be at least one of a 7×1 convolution layer and a 5×1 convolution layer, and the two-dimensional convolution layer may be a 3×3 convolution layer.
For example, the first convolution layer 11 includes a first sub-convolution layer 111 and a second sub-convolution layer 112 that are sequentially connected, where each of the first sub-convolution layer 111 and the second sub-convolution layer 112 is a one-dimensional convolution layer. For example, the first sub-convolution layer 111 is a 7×1 convolution layer and the second sub-convolution layer 112 is a 5×1 convolution layer.
For example, the second convolution layer 12 includes a third sub-convolution layer 121 and a fourth sub-convolution layer 122 that are sequentially connected, where the third sub-convolution layer 121 and the fourth sub-convolution layer 122 are two-dimensional convolution layers. For example, the third sub-convolution layer 121 and the fourth sub-convolution layer 122 are each 3×3 convolution layers.
The second convolutional neural network 20 may include a fifth convolution layer 21 and a sixth convolution layer 22 connected in sequence. In an embodiment of the present disclosure, the fifth convolution layer 21 comprises a one-dimensional convolution layer and the sixth convolution layer 22 comprises a two-dimensional convolution layer. For example, the one-dimensional convolution layer may be at least one of a 7×1 convolution layer and a 5×1 convolution layer, and the two-dimensional convolution layer may be a 3×3 convolution layer.
For example, the fifth convolution layer 21 includes a fifth sub-convolution layer 211 and a sixth sub-convolution layer 212 that are sequentially connected, where each of the fifth sub-convolution layer 211 and the sixth sub-convolution layer 212 is a one-dimensional convolution layer. For example, the fifth sub-convolution layer 211 is a 7×1 convolution layer and the sixth sub-convolution layer 212 is a 5×1 convolution layer.
For example, the sixth convolution layer 22 includes a seventh sub-convolution layer 221 and an eighth sub-convolution layer 222 that are sequentially connected, where each of the seventh sub-convolution layer 221 and the eighth sub-convolution layer 222 is a two-dimensional convolution layer. For example, the seventh sub-convolution layer 221 and the eighth sub-convolution layer 222 are each 3×3 convolution layers.
In embodiments of the present disclosure, since each sensor is independent, the noise can be considered uncorrelated across sensors for both the branch receiving the signal to be denoised (i.e., the raw image data branch) and the noise sampling branch. A one-dimensional convolution kernel is therefore used for feature extraction in the early stage of the network; for example, in the first convolutional neural network described above, a one-dimensional kernel of length 7 is used in the first layer and a one-dimensional kernel of length 5 in the second layer. The one-dimensional convolution extracts signal features within a single sensor. This operation reduces the dimensionality of the data relatively quickly while preserving the signal characteristics, and reduces the amount of computation.
Further, the terahertz imaging device uses multiple identical sensors, and during imaging the input signals of adjacent sensors are correlated to a certain degree. Therefore, from the third layer onward, the convolutional neural network uses a two-dimensional convolution kernel for feature extraction. This makes it possible to exploit the correlation between channels and extract richer signal features.
With continued reference to fig. 4, the first convolutional neural network 10 may further include a third convolution layer 13 and a fourth convolution layer 14. The third convolution layer 13 comprises a two-dimensional convolution layer. For example, the third convolution layer 13 includes a 3×3 convolution layer. The fourth convolution layer 14 comprises a two-dimensional convolution layer. For example, the fourth convolution layer 14 includes a 3×3 convolution layer.
The second convolutional neural network 20 may further include a seventh convolution layer 23. The seventh convolution layer 23 comprises a two-dimensional convolution layer. For example, the seventh convolution layer 23 includes one 3×3 convolution layer.
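The layer sequence described above (figs. 4 and 5) can be sketched in PyTorch as follows. The channel widths, ReLU activations, and the subtraction used for the two fusions are assumptions of this sketch; the patent text does not fix them.

```python
import torch
import torch.nn as nn

class DualBranchDenoiser(nn.Module):
    """Sketch of the dual-branch network of figs. 4-5 (widths are illustrative)."""
    def __init__(self, ch=16):
        super().__init__()
        def conv1d(cin, cout, k):  # k x 1 one-dimensional kernel, height preserved
            return nn.Sequential(nn.Conv2d(cin, cout, (k, 1), padding=(k // 2, 0)), nn.ReLU())
        def conv3x3(cin, cout):
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())
        # image branch: 7x1 -> 5x1 (layer 11), then 3x3 -> 3x3 (layer 12)
        self.img_c1 = nn.Sequential(conv1d(1, ch, 7), conv1d(ch, ch, 5))
        self.img_c2 = nn.Sequential(conv3x3(ch, ch), conv3x3(ch, ch))
        self.img_c3 = conv3x3(ch, ch)                                   # third conv layer 13
        self.img_c4 = nn.Sequential(conv3x3(ch, ch),                    # fourth conv layer(s) 14
                                    nn.Conv2d(ch, 1, 3, padding=1))
        # noise branch: 7x1 -> 5x1 (layer 21), then 3x3 -> 3x3 (layer 22)
        self.noi_c5 = nn.Sequential(conv1d(1, ch, 7), conv1d(ch, ch, 5))
        self.noi_c6 = nn.Sequential(conv3x3(ch, ch), conv3x3(ch, ch))
        self.noi_c7 = conv3x3(ch, ch)                                   # seventh conv layer 23

    def forward(self, raw, noise):
        f2_img = self.img_c2(self.img_c1(raw))     # second image feature
        f2_noi = self.noi_c6(self.noi_c5(noise))   # second noise feature
        f3_img = f2_img - f2_noi                   # first fusion -> third image feature
        f4_img = self.img_c3(f3_img)               # fourth image feature
        f3_noi = self.noi_c7(f2_noi)               # third noise feature
        f5_img = f4_img - f3_noi                   # second fusion -> fifth image feature
        return self.img_c4(f5_img)                 # enhanced feature data

model = DualBranchDenoiser()
out = model(torch.randn(1, 1, 40, 512), torch.randn(1, 1, 40, 512))
print(tuple(out.shape))
```

The asymmetric `(k, 1)` kernels with `(k // 2, 0)` padding operate along a single sensor line, matching the stated rationale that early layers treat sensors independently before the 3×3 layers mix adjacent channels.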
Next, a training method of an image denoising model according to an embodiment of the present disclosure will be further described with reference to fig. 3 to 5.
In operation S330, the first convolutional neural network 10 processes the raw image data sample, which may specifically include the following sub-operations.
In sub-operation S3301, the original image data sample is input into a first convolution layer, and a first image feature is obtained through a first convolution operation.
For example, the original image data samples may be input to the first sub-convolution layer 111 for the 7×1 convolution operation; then, the output of the first sub-convolution layer 111 is input to the second sub-convolution layer 112 for the 5×1 convolution operation, and the output of the second sub-convolution layer 112 is the first image feature. That is, the first convolution operation includes a 7×1 convolution operation and a 5×1 convolution operation.
In sub-operation S3302, the first image feature is input into a second convolution layer, and a second image feature is obtained through a second convolution operation.
For example, the first image feature may be input to the third sub-convolution layer 121 for a 3×3 convolution operation; then, the output of the third sub-convolution layer 121 is input to the fourth sub-convolution layer 122 for a 3×3 convolution operation, where the output of the fourth sub-convolution layer 122 is the second image feature.
In operation S350, the second convolutional neural network processes the noise data samples, which may specifically include the following sub-operations.
In sub-operation S3501, the noise data samples are input into a fifth convolution layer, and undergo a fifth convolution operation to obtain a first noise characteristic.
For example, the noise data samples may be input to the fifth sub-convolution layer 211 for the 7×1 convolution operation; then, the output of the fifth sub-convolution layer 211 is input to the sixth sub-convolution layer 212 for the 5×1 convolution operation, where the output of the sixth sub-convolution layer 212 is the first noise characteristic. That is, the fifth convolution operation includes a 7×1 convolution operation and a 5×1 convolution operation.
In sub-operation S3502, the first noise characteristic is input to the sixth convolution layer, and the second noise characteristic is obtained through the sixth convolution operation.
For example, the first noise characteristic may be input to the seventh sub-convolution layer 221 for a 3×3 convolution operation; then, the output of the seventh sub-convolution layer 221 is input to the eighth sub-convolution layer 222 for a 3×3 convolution operation, where the output of the eighth sub-convolution layer 222 is the second noise characteristic.
In operation S360, fusing the image feature data and the noise feature data may specifically include sub-operation S3601. In sub-operation S3601, the second image feature and the second noise feature undergo a first fusion to obtain a third image feature. The second image feature is a noisy image feature, and the second noise feature is a feature of the noise sample. For example, the first fusion may comprise removing the features of the noise sample from the noisy image feature, i.e., a first denoising is performed in the first fusion.
In operation S330, the first convolutional neural network 10 processing the raw image data sample may further include sub-operation S3303. In sub-operation S3303, the third image feature is input to the third convolution layer 13, and a fourth image feature is obtained through a third convolution operation. For example, the third convolution operation may be a 3×3 convolution operation.
In operation S350, the second convolutional neural network processing the noise data samples may further include sub-operation S3503. In sub-operation S3503, the second noise characteristic is input to the seventh convolution layer 23, and a third noise characteristic is obtained through a seventh convolution operation. For example, the seventh convolution operation may be a 3×3 convolution operation.
In operation S360, fusing the image feature data and the noise feature data may specifically include sub-operation S3602. In sub-operation S3602, the fourth image feature and the third noise feature undergo a second fusion to obtain a fifth image feature. The fourth image feature is a noisy image feature, and the third noise feature is a feature of the noise sample. For example, the second fusion may comprise further removing the features of the noise sample from the noisy image feature, i.e., a second denoising is performed in the second fusion.
Specifically, in operation S370, the fifth image feature may be input to the fourth convolution layer 14, and the enhanced feature data may be obtained through a fourth convolution operation. For example, the fourth convolution operation may be a 3×3 convolution operation.
Alternatively, in embodiments of the present disclosure, a plurality of fourth convolution layers 14 may be provided; for example, 2 fourth convolution layers 14 may be provided, each fourth convolution layer 14 being a 3×3 convolution layer. That is, in operation S370, the fifth image feature may be subjected to a 3×3 convolution operation twice to obtain the enhanced feature data.
In the embodiments of the present disclosure, at deeper levels of the neural network model, the features of the two input branches at various levels of abstraction are fused by stacking feature layers, which facilitates better noise removal.
Further, the embodiments of the present disclosure provide a method of denoising using a neural network with two branches, where one branch receives the terahertz raw image data to be denoised and the other branch receives noise sampling data. Each frame of data lasts only a short time, and the noise in terahertz data mainly originates from the thermal noise of the sensor, so the noise characteristics can be considered relatively stable over a short period. Consequently, if the noise branch can acquire some noise samples belonging to the same frame of data, the noise features extracted from these samples can be fed into the denoising network together with the image features, making the neural network model more robust in practical use.
It should be appreciated that training a denoising neural network requires a large number of noisy and non-noisy pairs of samples, and the cost of the manner in which non-noisy data (with extremely high signal-to-noise ratio) is obtained by superimposing the data from multiple exposures of the same scene is relatively high. To this end, embodiments of the present disclosure propose a method of acquiring a training sample set.
Fig. 6 to 8 are schematic flowcharts of acquiring a training sample set in a training method of an image denoising model according to an exemplary embodiment of the present disclosure, respectively.
Referring to fig. 6, in a training method of an image denoising model according to an exemplary embodiment of the present disclosure, the following operations may be performed to acquire a training sample set, for example, to acquire noise data samples in the training sample set.
In operation S610, noisy data samples within a prescribed period of time are acquired.
For example, the terahertz imaging apparatus may be placed in a stable background environment, i.e., the observed scene is stationary with no changing targets, and a wave-absorbing material may be arranged in the scene to provide a clean background. In this case, the terahertz imaging apparatus acquires the raw sensor data for a prescribed period of time, yielding noisy data samples for that period. For example, data may be collected for 5 minutes every hour.
In operation S620, the noisy data samples within the prescribed period of time are averaged to obtain desired data.
Since the passive terahertz imaging apparatus is similar to a camera, there will be a clean background area within its imaging range, such as an empty area at the top of the image where only the background is present and no changing targets appear. The terahertz imaging device measures radiation brightness temperature; when the emissivity and temperature of an object are unchanged, its radiation brightness temperature is unchanged, so the expected value can be obtained by tracking this simple background data. That is, after the imaging device has been operating for different lengths of time, data are collected for a period (for example, 5 minutes every hour) and the noisy data samples acquired in that period are averaged; owing to the random character of the noise, the averaged data approximate the desired noise-free signal, i.e., the expected value.
In operation S630, noise data within a prescribed period of time is obtained from the noisy data samples and the desired data.
For example, the expected value is subtracted from each frame of noisy data samples to obtain the noise data. Fig. 9 schematically shows noise data according to some exemplary embodiments of the present disclosure. In fig. 9, the abscissa may represent the index of the sampling points, and the ordinate may represent the strength of the signal, for example, a voltage value.
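Operations S610 to S630 can be illustrated with a small NumPy sketch on synthetic background data (the constant brightness level, frame count, and noise level below are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for noisy background samples: a static brightness-temperature
# level plus zero-mean sensor noise, collected over many frames (S610).
true_level = 6000.0
frames = true_level + rng.normal(0.0, 50.0, size=(300, 512))   # 300 frames x 512 samples

# S620: averaging over the collection window estimates the expected value,
# since the background is static and the noise is (approximately) zero-mean.
expected = frames.mean(axis=0)

# S630: subtracting the expected value from each frame leaves the noise data.
noise = frames - expected
print(f"estimated level: {expected.mean():.1f}, residual noise mean: {noise.mean():.2e}")
```

The residual `noise` array is what the subsequent autoregressive modeling step (S640) would be fitted to.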
Because the noise of the passive terahertz imaging device mainly comes from the thermal noise of the sensor, which is relatively stable at room temperature, the noise can be modeled as a second-order (wide-sense) stationary signal.
In operation S640, a noise signal model is established according to the noise data within the prescribed period of time.
Since a small portion of the imaging area of the terahertz imaging apparatus is a pure background area, each frame of the image provides a sample of a small portion of the noise signal. Moreover, over a short period the noise can be considered stationary, so the characteristics of this small portion of the noise signal are statistically the same as those of the remaining portion.
For example, the noise signal model M_{i,k}(n) is obtained from the noise data collected within the prescribed period by autoregressive modeling based on power spectrum estimation, where the subscript i denotes the model obtained from the i-th round of sampling data, k denotes the k-th sensor, and n is the index of the discrete sequence. Experimental observation shows that after the sensor has operated for some time, the obtained noise signal models are nearly stable with only small differences; therefore, at least one model selected from the generated noise signal models can represent most of the power characteristics of the noise. Fig. 10 schematically illustrates a noise power model estimated using the collected noise data; in fig. 10, the abscissa may represent normalized angular frequency and the ordinate the power spectral density of the signal.
In operation S650, a noise data sample is obtained according to the noise signal model.
For example, noise data samples may be generated according to the noise signal model M_{i,k}(n).
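A minimal sketch of the autoregressive route of S640-S650, using a Yule-Walker fit as one concrete choice of estimator (the patent does not specify the exact power-spectrum method); the AR order and the stand-in "measured" noise sequence are illustrative assumptions:

```python
import numpy as np

def fit_ar(x, order=4):
    """Estimate AR coefficients via the Yule-Walker (autocorrelation) equations."""
    x = x - x.mean()
    r = np.correlate(x, x, mode="full")[len(x) - 1:] / len(x)   # autocorrelation r[0..]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])                      # AR coefficients
    sigma2 = r[0] - a @ r[1:order + 1]                          # innovation variance
    return a, float(np.sqrt(sigma2))

def generate_ar(a, sigma, n, rng):
    """Generate a synthetic noise sequence from a fitted AR model (S650)."""
    p = len(a)
    x = np.zeros(n + p)
    e = rng.normal(0.0, sigma, n + p)
    for t in range(p, n + p):
        x[t] = a @ x[t - p:t][::-1] + e[t]                      # x[t-1], ..., x[t-p]
    return x[p:]

rng = np.random.default_rng(0)
# Stand-in "measured" noise: an AR(1) process x[t] = 0.8 x[t-1] + e[t].
measured = generate_ar(np.array([0.8]), 1.0, 5000, rng)
a, sigma = fit_ar(measured, order=1)                            # S640: fit the model
sample = generate_ar(a, sigma, 512, rng)                        # S650: draw a noise sample
print(f"fitted AR(1) coefficient: {a[0]:.2f}")
```

In the terminology of the text, each fitted `(a, sigma)` pair plays the role of one M_{i,k}(n); new noise data samples for training are then drawn from it instead of being re-measured.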
Referring to fig. 7, in a training method of an image denoising model according to an exemplary embodiment of the present disclosure, the following operations may be performed to acquire a training sample set, for example, to acquire clean image data samples in the training sample set.
In operation S710, under a first condition, a first raw image data sample is acquired.
Specifically, the imaging region may be scanned by the terahertz imaging apparatus under the condition that the ambient temperature is lower than a threshold temperature to obtain the first raw image data sample. Fig. 11 schematically illustrates raw image data obtained by the passive terahertz imaging apparatus of an exemplary embodiment of the present disclosure under the first condition; in fig. 11, the abscissa may represent the index of the sampling points, and the ordinate may represent the strength of the signal, for example, a voltage value.
For example, the object to be identified is located in the imaging region. For example, the threshold temperature may be 20 ℃.
For example, to acquire clean signals, the imaging device may be placed in a scene with a low ambient temperature, e.g., 15 °C, and different people may then walk around freely with different targets within the imaging range while raw image data are acquired. In this embodiment, the original image may be 220 pixels wide and 440 pixels high. The imaging device may comprise 40 sensors, each acquiring 512 lines of data, i.e., the raw image data form a 40 × 512 matrix.
In an embodiment of the present disclosure, the first condition is a condition under which the sensor or imaging device can acquire image data with a high signal-to-noise ratio.
In operation S720, a second image data sample is obtained according to the first original image data sample and the noise data sample, wherein the second image data sample includes clean image data from which noise is removed.
For example, the noise data samples may be subtracted from the first raw image data samples to obtain second image data samples. Because the first raw image data sample acquired under the first condition already has a high signal-to-noise ratio, subtracting the noise data from it achieves a good denoising effect and yields cleaner image data. That is, the second image data sample is a good representation of the clean image obtainable under the first condition.
In operation S730, a third image data sample under at least one second condition is generated from the second image data sample, wherein the third image data sample includes clean image data from which noise is removed.
Specifically, in operation S730, a scaling factor may be determined according to the temperature value in the first condition and the temperature value in the second condition; the second image data samples are then scaled according to the scaling factor to generate at least one third image data sample under a second condition.
For example, referring to fig. 11, at 15 °C the difference between the human body and the ambient temperature is large, and the resulting voltage value is around 6000. To obtain clean image data at other temperatures, no new acquisition is required: the second image data sample merely needs to be scaled by a factor determined by the proportional relation between the other temperature and the temperature of the first condition (for example, 15 °C), so as to simulate at least one third image data sample under a second condition.
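The scaling of operation S730 can be sketched as follows. The assumption that the signal amplitude scales linearly with the body-to-ambient temperature difference, and the nominal body temperature of 36 °C, are illustrative choices of this sketch, not values stated in the text:

```python
import numpy as np

def scale_clean_sample(clean, t_first, t_second, t_body=36.0):
    """Rescale clean image data from the first-condition ambient temperature
    to another ambient temperature, assuming the signal is proportional to
    the body-to-environment temperature difference (illustrative assumption)."""
    factor = (t_body - t_second) / (t_body - t_first)
    return clean * factor

clean_15c = np.full((40, 512), 6000.0)          # second image data sample (acquired at 15 C)
clean_25c = scale_clean_sample(clean_15c, 15.0, 25.0)   # simulated third sample at 25 C
print(f"scaling factor: {clean_25c[0, 0] / clean_15c[0, 0]:.3f}")
```

Varying `t_second` over a range of ambient temperatures yields a family of third image data samples from a single acquisition, which is the point of S730.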
In the embodiments of the present disclosure, when the signal-to-noise ratio of the original image is high, a denoising algorithm such as BM3D can also be used to improve it further; the denoised image is then sampled and scaled, and clean image data at different ambient temperatures can be simulated by varying the scaling factor.
In operation S740, the clean image data sample is obtained from the second image data sample and the third image data sample.
For example, the second image data sample and the third image data sample are combined to obtain the clean image data sample.
In an embodiment of the present disclosure, the signal-to-noise ratio of the raw image data samples acquired under the first condition is higher than the signal-to-noise ratio of the raw image data samples acquired under the second condition.
Since the signal of the terahertz imaging apparatus is mainly formed by the temperature difference between the human body and the environment, in the embodiment of the present disclosure the imaging device is placed in a scene where the ambient temperature is low, so that the signal-to-noise ratio of the acquired image itself is high.
Referring to fig. 8, in a training method of an image denoising model according to an exemplary embodiment of the present disclosure, the following operations may be performed to acquire a training sample set, for example, to acquire raw image data samples in the training sample set.
In operation S810, the original image data sample is obtained from the obtained noise data sample and the clean image data sample.
On the basis of the acquired noise data sample and clean image data sample, the two may, for example, be added together to synthesize a noisy original image data sample.
As described above, in embodiments of the present disclosure, clean image data samples under different conditions may be obtained. Further, the original image data under different conditions can be synthesized based on the clean image data samples under different conditions. In this way, noisy raw image data samples can be obtained efficiently and at low cost.
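The synthesis of a noisy original image data sample from a clean sample, as in operation S810, can be sketched as follows. The zero-mean Gaussian draw is a hypothetical stand-in for the noise signal model fitted to measured noise data; values and names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_noise(shape, sigma=50.0):
    """Hypothetical noise draw: zero-mean Gaussian stands in for the noise
    signal model that would be fitted to measured noise data."""
    return rng.normal(0.0, sigma, shape)

clean = rng.uniform(0.0, 6000.0, (8, 8))        # a clean image data sample
noisy_raw = clean + sample_noise(clean.shape)   # synthesized raw image sample

training_pair = (noisy_raw, clean)  # model input and training target
```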
In embodiments of the present disclosure, different people carrying different targets may be arranged to walk around at will in the imaging scene, to promote the diversity of the training signals, embodied as the diversity of human body postures and of the carried targets. In this way, each acquired terahertz image can be used to generate raw image data (i.e., noisy image data) and clean image data (i.e., noise-removed image data) at different ambient temperatures, so that a large number of training sample pairs can be generated at low cost and with high efficiency.
In contrast, in the related art, training data pairs are generated by overlapping exposure data to synthesize a clean image. If, for example, each scene is overlapped 50 times, data acquisition takes 50 times longer than with the embodiment of the disclosure, and this conventional acquisition mode cannot conveniently simulate different ambient temperatures. The manner of obtaining the training sample set set forth in the embodiments of the present disclosure therefore provides significant advantages.
The embodiment of the disclosure also provides an image denoising method, which can denoise an image using the image denoising model obtained by the above training method. It should be noted that the image denoising method described below corresponds to the training method of the image denoising model described above; for brevity, some exemplary descriptions of the image denoising method are omitted below, and, where no conflict arises, reference may be made to the corresponding parts of the training method described above.
Fig. 12 is a schematic flow chart of an image denoising method according to an exemplary embodiment of the present disclosure.
As shown in fig. 12, an image denoising method according to an exemplary embodiment of the present disclosure may include operations S1210 to S1280, which may be performed by a processor or by any electronic device including a processor. For example, it may be performed by the above-described imaging apparatus.
In operation S1210, raw image data including image data of an object to be identified is acquired.
In operation S1220, noise data associated with the original image data is acquired.
In operation S1230, the raw image data is input into a first convolutional neural network.
In operation S1240, the first convolutional neural network processes the raw image data to obtain image feature data.
In operation S1250, the noise data is input to a second convolutional neural network.
In operation S1260, the second convolutional neural network processes the noise data to obtain noise characteristic data.
In operation S1270, the image feature data and the noise feature data are fused to obtain fused feature data.
In operation S1280, a convolution operation is performed on the fusion feature data to obtain enhancement feature data, where the enhancement feature data is used to characterize an image feature of the object to be identified.
In an embodiment of the disclosure, the fusing the image feature data and the noise feature data includes: the noise feature data is removed from the image feature data.
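With fusion implemented as removal of the noise feature data from the image feature data, operations S1210 to S1280 can be sketched as follows. The fixed 3x3 mean-filter kernel is a hypothetical stand-in for the learned convolution weights of the trained model; it only illustrates the data flow, not the actual network.

```python
import numpy as np

def conv3x3_mean(x):
    """Hypothetical stand-in for a learned 2D convolution: a 3x3 mean filter
    with edge padding (illustrative only, not the trained model's weights)."""
    p = np.pad(x, 1, mode="edge")
    h, w = x.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

rng = np.random.default_rng(2)
raw = rng.normal(6000.0, 50.0, (16, 16))   # S1210: raw image data
noise = rng.normal(0.0, 50.0, (16, 16))    # S1220: associated noise data

image_feat = conv3x3_mean(raw)             # S1230-S1240: first branch
noise_feat = conv3x3_mean(noise)           # S1250-S1260: second branch
fused = image_feat - noise_feat            # S1270: fusion = remove noise features
enhanced = conv3x3_mean(fused)             # S1280: final convolution operation
```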
In an embodiment of the present disclosure, the method may further include: and interpolating the enhancement characteristic data to obtain an interpolation image. For example, the interpolation may be a two-dimensional interpolation.
In the embodiment of the disclosure, noise is removed before the original image data is interpolated. This effectively avoids destroying real features or introducing new interference features due to heavy noise, so that a higher signal-to-noise ratio is naturally obtained when the data is interpolated.
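A minimal sketch of the two-dimensional interpolation of the enhancement feature data, assuming bilinear interpolation and an illustrative 2x upsampling factor (the patent does not specify the interpolation kernel or factor):

```python
import numpy as np

def bilinear_upsample(x, factor=2):
    """Two-dimensional (bilinear) interpolation by an integer factor."""
    h, w = x.shape
    rows = np.linspace(0.0, h - 1, h * factor)
    cols = np.linspace(0.0, w - 1, w * factor)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    fr = (rows - r0)[:, None]   # fractional row offsets
    fc = (cols - c0)[None, :]   # fractional column offsets
    top = x[np.ix_(r0, c0)] * (1 - fc) + x[np.ix_(r0, c1)] * fc
    bottom = x[np.ix_(r1, c0)] * (1 - fc) + x[np.ix_(r1, c1)] * fc
    return top * (1 - fr) + bottom * fr

rng = np.random.default_rng(3)
enhanced = rng.normal(6000.0, 10.0, (16, 16))  # denoised enhancement feature data
interpolated = bilinear_upsample(enhanced, 2)  # interpolation image
```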
For example, the acquiring the original image data includes: and scanning an imaging area through a terahertz imaging device to acquire original image data, wherein the object to be identified is located in the imaging area.
For example, the acquiring noise data includes: the background area in the imaging area is scanned by a terahertz imaging device to acquire noise data.
Fig. 13A to 13C are schematic diagrams of original images with different signal-to-noise ratios. Fig. 14A to 14C are schematic diagrams of the images of fig. 13A to 13C after denoising with the conventional BM3D algorithm. Fig. 15A to 15C are schematic diagrams of the images of fig. 13A to 13C after denoising with an image denoising method according to an embodiment of the present disclosure. The signal-to-noise ratios of the original images in fig. 13A to 13C increase progressively.
As can be seen by comparing fig. 13A, fig. 14A and fig. 15A, for an original image with a low signal-to-noise ratio, the denoised image obtained by the conventional BM3D algorithm still has a low signal-to-noise ratio, and suspect objects cannot be well distinguished from the human body; the denoised image obtained by the image denoising method of the embodiment of the disclosure has a higher signal-to-noise ratio and better recovers the real signal.
As can be seen from comparing fig. 13A to 15C, for original images with low, medium and high signal-to-noise ratios alike, the denoised images obtained by the image denoising method according to the embodiment of the disclosure all have higher signal-to-noise ratios. That is, the method recovers the real signal to the greatest extent under different signal-to-noise ratios and exhibits high stability.
Based on the training method of the image denoising model, the disclosure also provides a training device of the image denoising model. The device will be described in detail below in connection with fig. 16.
Fig. 16 is a block diagram of a training apparatus of an image denoising model according to an exemplary embodiment of the present disclosure.
As shown in fig. 16, the training apparatus 800 of the image denoising model includes a training sample acquisition module 810, a first convolutional neural network module 820, a second convolutional neural network module 830, a fusion module 840, a convolutional operation module 850, and a parameter adjustment module 860.
The training sample obtaining module 810 is configured to obtain a training sample set, where the training sample set includes a first training sample and a second training sample, the first training sample and the second training sample are used to characterize input and output of the image denoising model, the first training sample includes an original image data sample and a noise data sample, the second training sample includes a clean image data sample, and the original image data sample and the clean image data sample both include image data of an object to be identified. In some exemplary embodiments, the training sample acquiring module 810 may be configured to perform the operation S310 and its sub-operations described above, which are not described herein.
A first convolutional neural network module 820 for: and receiving input of the original image data sample, and processing the original image data sample to obtain image characteristic data. In some exemplary embodiments, the first convolutional neural network module 820 may be used to perform operations S320 and S330 and its sub-operations described above, which are not described herein.
A second convolutional neural network module 830 for: and receiving an input of the noise data sample and processing the noise data sample to obtain noise characteristic data. In some exemplary embodiments, the second convolutional neural network module 830 may be configured to perform operations S340 and S350 and sub-operations thereof described above, which are not described herein.
And a fusion module 840, configured to fuse the image feature data and the noise feature data to obtain fusion feature data. In some exemplary embodiments, the fusion module 840 may be configured to perform the operation S360 and its sub-operations described above, which are not described herein.
The convolution operation module 850 is configured to perform a convolution operation on the fused feature data to obtain enhanced feature data, where the enhanced feature data is used to characterize an image feature of the object to be identified. In some exemplary embodiments, the convolution operation module 850 may be configured to perform the operation S370 and its sub-operations described above, which are not described herein.
And a parameter adjustment module 860, configured to adjust each parameter of the image denoising model according to a difference between the clean image data sample and the enhancement feature data. In some exemplary embodiments, the parameter adjustment module 860 may be configured to perform the operation S380 and its sub-operations described above, which are not described herein.
Fig. 17 is a block diagram of a structure of an image denoising apparatus according to an exemplary embodiment of the present disclosure.
As shown in fig. 17, the image denoising apparatus 900 includes an original image data acquisition module 910, a noise data acquisition module 920, a first convolutional neural network module 930, a second convolutional neural network module 940, a fusion module 950, and a convolutional operation module 960.
The raw image data acquisition module 910 is configured to acquire raw image data, where the raw image data includes image data of an object to be identified. In some exemplary embodiments, the original image data acquisition module 910 may be configured to perform the operation S1210 and its sub-operations described above, which are not described herein.
A noise data acquisition module 920, configured to acquire noise data, where the noise data is associated with the original image data. In some exemplary embodiments, the noise data obtaining module 920 may be configured to perform the operation S1220 and its sub-operations described above, which are not described herein.
A first convolutional neural network module 930 for: and receiving input of the original image data, and processing the original image data to obtain image characteristic data. In some exemplary embodiments, the first convolutional neural network module 930 may be used to perform operations S1230 and S1240 and their sub-operations described above, which are not described herein.
A second convolutional neural network module 940 for: and receiving the input of the noise data and processing the noise data to obtain noise characteristic data. In some exemplary embodiments, the second convolutional neural network module 940 may be used to perform operations S1250, S1260 and sub-operations thereof described above, which are not described herein.
And a fusion module 950, configured to fuse the image feature data and the noise feature data to obtain fused feature data. In some exemplary embodiments, the fusion module 950 may be configured to perform the operation S1270 and sub-operations thereof described above, which are not described herein.
And a convolution operation module 960, configured to perform convolution operation on the fused feature data to obtain enhanced feature data, where the enhanced feature data is used to characterize an image feature of the object to be identified. In some exemplary embodiments, the convolution operation module 960 may be used to perform the operation S1280 and its sub-operations described above, which are not described herein.
According to an embodiment of the present disclosure, any plurality of the training sample acquiring module 810, the first convolutional neural network module 820, the second convolutional neural network module 830, the fusion module 840, the convolution operation module 850 and the parameter adjustment module 860 of the training apparatus 800, and of the raw image data acquiring module 910, the noise data acquiring module 920, the first convolutional neural network module 930, the second convolutional neural network module 940, the fusion module 950 and the convolution operation module 960 of the image denoising apparatus 900, may be combined into one module for implementation, or any one of these modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the above modules may be at least partially implemented as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on a chip, a system on a substrate, a system on a package, or an application specific integrated circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or by any one of, or a suitable combination of, the three implementation manners of software, hardware and firmware. Alternatively, at least one of the above modules may be at least partially implemented as a computer program module that performs the corresponding functions when executed.
Fig. 18 schematically illustrates a block diagram of an electronic device adapted to implement a training method of an image denoising model or an image denoising method according to an exemplary embodiment of the present disclosure.
As shown in fig. 18, an electronic device 1000 according to an embodiment of the present disclosure includes a processor 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. The processor 1001 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 1001 may also include on-board memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions of the method flows according to embodiments of the present disclosure.
For example, the electronic device may be a passive terahertz imaging apparatus.
In the RAM 1003, various programs and data necessary for the operation of the electronic apparatus 1000 are stored. The processor 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. The processor 1001 performs various operations of the method flow according to the embodiment of the present disclosure by executing programs in the ROM 1002 and/or the RAM 1003. Note that the program may be stored in one or more memories other than the ROM 1002 and the RAM 1003. The processor 1001 may also perform various operations of the method flow according to the embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the disclosure, the electronic device 1000 may also include an input/output (I/O) interface 1005, the input/output (I/O) interface 1005 also being connected to the bus 1004. The electronic device 1000 may also include one or more of the following components connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output portion 1007 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), etc., and a speaker, etc.; a storage portion 1008 including a hard disk or the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the internet. The drive 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is installed as needed in the drive 1010, so that a computer program read out therefrom is installed as needed in the storage section 1008.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 1002 and/or RAM 1003 and/or one or more memories other than ROM 1002 and RAM 1003 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. When the program code is executed in a computer system, it causes the computer system to implement the methods provided by embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 1001. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted in the form of signals on a network medium, distributed, and downloaded and installed via the communication section 1009, and/or installed from the removable medium 1011. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
According to embodiments of the present disclosure, the program code for carrying out the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, "C" or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be provided in a variety of combinations and/or combinations, even if such combinations or combinations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or combined without departing from the spirit and teachings of the present disclosure. All such combinations and/or combinations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (39)

1. A method of training an image denoising model comprising a first convolutional neural network and a second convolutional neural network, the method comprising:
Acquiring a training sample set, wherein the training sample set comprises a first training sample and a second training sample, the first training sample and the second training sample are used for representing the input and the output of the image denoising model respectively, the first training sample comprises an original image data sample and a noise data sample, the second training sample comprises a clean image data sample, and the original image data sample and the clean image data sample both comprise image data of an object to be identified;
inputting the original image data sample into a first convolutional neural network;
the first convolution neural network processes the original image data sample to obtain image characteristic data;
inputting the noise data samples into a second convolutional neural network;
the second convolutional neural network processes the noise data sample to obtain noise characteristic data;
fusing the image characteristic data and the noise characteristic data to obtain fused characteristic data;
performing convolution operation on the fusion characteristic data to obtain enhanced characteristic data, wherein the enhanced characteristic data is used for representing image characteristics of the object to be identified; and
And adjusting each parameter of the image denoising model according to the difference between the clean image data sample and the enhancement characteristic data.
2. The method of claim 1, wherein the fusing the image feature data and the noise feature data comprises: the noise feature data is removed from the image feature data.
3. The method of claim 1 or 2, wherein the first convolutional neural network comprises a first convolutional layer and a second convolutional layer connected in sequence, the first convolutional layer comprising a one-dimensional convolutional layer, the second convolutional layer comprising a two-dimensional convolutional layer.
4. A method according to claim 3, wherein the second convolutional neural network comprises a fifth convolutional layer comprising a one-dimensional convolutional layer and a sixth convolutional layer comprising a two-dimensional convolutional layer, connected in sequence.
5. The method of claim 4, wherein the first convolutional neural network processing the raw image data samples comprises:
inputting the original image data sample into a first convolution layer, and obtaining a first image characteristic through a first convolution operation; and
and inputting the first image features into a second convolution layer, and obtaining second image features through a second convolution operation.
6. The method of claim 5, wherein the second convolutional neural network processing the noise data samples comprises:
inputting the noise data sample into a fifth convolution layer, and obtaining a first noise characteristic through fifth convolution operation; and
and inputting the first noise characteristic into a sixth convolution layer, and obtaining a second noise characteristic through sixth convolution operation.
7. The method of claim 6, wherein fusing the image feature data and the noise feature data comprises:
and carrying out first fusion on the second image feature and the second noise feature to obtain a third image feature.
8. The method of claim 7, wherein the first convolutional neural network further comprises a third convolutional layer comprising a two-dimensional convolutional layer,
the method further comprises the steps of: and inputting the third image feature into a third convolution layer, and obtaining a fourth image feature through third convolution operation.
9. The method of claim 8, wherein the second convolutional neural network further comprises a seventh convolutional layer comprising a two-dimensional convolutional layer,
the method further comprises the steps of: and inputting the second noise characteristic into a seventh convolution layer, and obtaining a third noise characteristic through seventh convolution operation.
10. The method of claim 9, wherein fusing the image feature data and the noise feature data further comprises:
and performing second fusion on the fourth image feature and the third noise feature to obtain a fifth image feature.
11. The method of claim 10, wherein the first convolutional neural network further comprises a fourth convolutional layer comprising a two-dimensional convolutional layer,
the step of carrying out convolution operation on the fusion characteristic data to obtain enhanced characteristic data specifically comprises the following steps: and inputting the fifth image feature into a fourth convolution layer, and obtaining enhanced feature data through fourth convolution operation.
12. The method of claim 1 or 2, wherein the acquiring a training sample set comprises:
acquiring noisy data samples within a specified time period;
carrying out averaging treatment on noisy data samples within the specified time period to obtain expected data;
obtaining noise data in a specified time period according to the noisy data samples and the expected data;
establishing a noise signal model according to the noise data in the specified time period;
and obtaining noise data samples according to the noise signal model.
13. The method of claim 12, wherein the acquiring a training sample set further comprises:
under a first condition, acquiring first original image data samples;
obtaining a second image data sample according to the first original image data sample and the noise data sample, wherein the second image data sample comprises clean image data from which noise is removed;
generating a third image data sample under at least one second condition from the second image data sample, wherein the third image data sample comprises clean image data from which noise was removed;
obtaining the clean image data sample based on the second image data sample and the third image data sample,
wherein the signal-to-noise ratio of original image data acquired under the first condition is higher than the signal-to-noise ratio of original image data acquired under the second condition.
14. The method of claim 13, wherein acquiring the training sample set further comprises:
obtaining the original image data sample from the noise data sample and the clean image data sample.
15. The method of claim 13, wherein acquiring the first original image data sample under the first condition comprises:
scanning an imaging region with a terahertz imaging device while the ambient temperature is below a threshold temperature to acquire the first original image data sample,
wherein the object to be identified is located in the imaging region.
16. The method of claim 12, wherein acquiring the noisy data samples within the specified time period comprises:
scanning a background area within an imaging region with a terahertz imaging device for the specified time period to acquire the noisy data samples.
17. The method of claim 15, wherein generating the third image data sample under the at least one second condition from the second image data sample comprises:
determining a scaling factor according to the temperature value in the first condition and the temperature value in the second condition; and
scaling the second image data sample according to the scaling factor to generate the third image data sample under the second condition.
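A minimal sketch of the temperature-scaling step of claim 17. The claim only states that the scaling factor is determined from the two temperature values; the simple ratio used here is an assumption, and `scale_to_condition` is a hypothetical helper (a real system would calibrate the factor against the sensor response).

```python
def scale_to_condition(second_sample, t_first, t_second):
    """Generate a third image data sample at a second ambient temperature.

    Assumed form: the scaling factor is the ratio of the second-condition
    temperature value to the first-condition temperature value.
    """
    factor = t_second / t_first  # assumed form of the scaling factor
    return [[v * factor for v in row] for row in second_sample]
```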
18. The method of claim 3, wherein the first convolution layer comprises a first sub-convolution layer and a second sub-convolution layer connected in sequence, each being a one-dimensional convolution layer; and/or the second convolution layer comprises a third sub-convolution layer and a fourth sub-convolution layer connected in sequence, each being a two-dimensional convolution layer.
19. The method of claim 4, wherein the fifth convolution layer comprises a fifth sub-convolution layer and a sixth sub-convolution layer connected in sequence, each being a one-dimensional convolution layer; and/or the sixth convolution layer comprises a seventh sub-convolution layer and an eighth sub-convolution layer connected in sequence, each being a two-dimensional convolution layer.
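Claims 18 and 19 describe convolution layers built from two one-dimensional sub-convolution layers connected in sequence. One common motivation for such a structure is separable filtering: a row-wise 1-D pass followed by a column-wise 1-D pass reproduces a 2-D convolution with a separable kernel at lower cost. The plain-Python sketch below illustrates that idea; it is an interpretation, not the patent's own implementation.

```python
def conv1d_rows(img, k):
    """Apply a 1-D kernel along each row (valid padding)."""
    kw = len(k)
    return [[sum(row[j + t] * k[t] for t in range(kw))
             for j in range(len(row) - kw + 1)] for row in img]

def conv1d_cols(img, k):
    """Apply the same 1-D kernel along each column."""
    cols = [list(c) for c in zip(*img)]   # transpose
    out = conv1d_rows(cols, k)
    return [list(r) for r in zip(*out)]   # transpose back

def separable_conv(img, k_row, k_col):
    """Two 1-D sub-convolutions in sequence, as in claims 18/19.

    For a separable kernel K = outer(k_col, k_row) this equals a single
    2-D convolution with K, at O(k) instead of O(k^2) cost per pixel.
    """
    return conv1d_cols(conv1d_rows(img, k_row), k_col)
```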
20. An image denoising method, comprising:
acquiring original image data, wherein the original image data comprises image data of an object to be identified;
acquiring noise data, the noise data being associated with the original image data;
inputting the original image data into a first convolutional neural network;
processing, by the first convolutional neural network, the original image data to obtain image feature data;
inputting the noise data into a second convolutional neural network;
processing, by the second convolutional neural network, the noise data to obtain noise feature data;
fusing the image feature data and the noise feature data to obtain fused feature data; and
performing a convolution operation on the fused feature data to obtain enhanced feature data, wherein the enhanced feature data represents image features of the object to be identified.
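The steps of claim 20 amount to a two-branch pipeline: the image and the noise are encoded separately, their features are fused, and a final convolution produces the enhanced features. A minimal sketch, with the four stages passed in as callables (these stand in for the claimed networks and are not the patent's actual layer stacks):

```python
def denoise(image, noise, image_branch, noise_branch, fuse, head):
    """Sketch of the claimed two-branch denoising pipeline."""
    img_feat = image_branch(image)       # image feature data
    noise_feat = noise_branch(noise)     # noise feature data
    fused = fuse(img_feat, noise_feat)   # fused feature data
    return head(fused)                   # enhanced feature data
```

With `fuse` implemented as element-wise subtraction, this also illustrates the fusion of claim 21, where the noise feature data is removed from the image feature data.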
21. The method of claim 20, wherein fusing the image feature data and the noise feature data comprises: removing the noise feature data from the image feature data.
22. The method of claim 20 or 21, wherein the first convolutional neural network comprises a first convolutional layer comprising a one-dimensional convolutional layer and a second convolutional layer comprising a two-dimensional convolutional layer, connected in sequence.
23. The method of claim 22, wherein the second convolutional neural network comprises a fifth convolutional layer comprising a one-dimensional convolutional layer and a sixth convolutional layer comprising a two-dimensional convolutional layer, connected in sequence.
24. The method of claim 23, wherein processing the original image data by the first convolutional neural network comprises:
inputting the original image data into the first convolution layer, and obtaining a first image feature through a first convolution operation; and
inputting the first image feature into the second convolution layer, and obtaining a second image feature through a second convolution operation.
25. The method of claim 24, wherein processing the noise data by the second convolutional neural network comprises:
inputting the noise data into the fifth convolution layer, and obtaining a first noise feature through a fifth convolution operation; and
inputting the first noise feature into the sixth convolution layer, and obtaining a second noise feature through a sixth convolution operation.
26. The method of claim 25, wherein fusing the image feature data and the noise feature data comprises:
performing a first fusion of the second image feature and the second noise feature to obtain a third image feature.
27. The method of claim 26, wherein the first convolutional neural network further comprises a third convolutional layer comprising a two-dimensional convolutional layer,
and the method further comprises: inputting the third image feature into the third convolution layer, and obtaining a fourth image feature through a third convolution operation.
28. The method of claim 27, wherein the second convolutional neural network further comprises a seventh convolutional layer comprising a two-dimensional convolutional layer,
and the method further comprises: inputting the second noise feature into the seventh convolution layer, and obtaining a third noise feature through a seventh convolution operation.
29. The method of claim 28, wherein fusing the image feature data and the noise feature data further comprises:
performing a second fusion of the fourth image feature and the third noise feature to obtain a fifth image feature.
30. The method of claim 29, wherein the first convolutional neural network further comprises a fourth convolutional layer comprising a two-dimensional convolutional layer,
and performing the convolution operation on the fused feature data to obtain the enhanced feature data comprises: inputting the fifth image feature into the fourth convolution layer, and obtaining the enhanced feature data through a fourth convolution operation.
31. The method of claim 30, further comprising: interpolating the enhanced feature data to obtain an interpolated image.
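Claim 31 interpolates the enhanced feature data to obtain an interpolated image. A minimal one-dimensional linear-interpolation sketch; real systems would typically interpolate the full 2-D feature map (e.g. bilinearly), and `interpolate_2x` is a hypothetical helper, not the claimed implementation.

```python
def interpolate_2x(row):
    """Linearly interpolate a 1-D feature row to roughly double length.

    Inserts the midpoint between each pair of neighbouring samples,
    keeping the original samples in place.
    """
    out = []
    for a, b in zip(row, row[1:]):
        out.append(a)
        out.append((a + b) / 2)  # interpolated midpoint
    out.append(row[-1])
    return out
```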
32. The method of claim 20 or 21, wherein acquiring the original image data comprises: scanning an imaging region with a terahertz imaging device to acquire the original image data, wherein the object to be identified is located in the imaging region.
33. The method of claim 32, wherein acquiring the noise data comprises: scanning a background area within the imaging region with the terahertz imaging device to acquire the noise data.
34. An image denoising model training apparatus, comprising:
a training sample acquisition module configured to acquire a training sample set, wherein the training sample set comprises a first training sample and a second training sample respectively representing the input and the output of the image denoising model, the first training sample comprises an original image data sample and a noise data sample, the second training sample comprises a clean image data sample, and the original image data sample and the clean image data sample each comprise image data of an object to be identified;
a first convolutional neural network module configured to receive the original image data sample and process it to obtain image feature data;
a second convolutional neural network module configured to receive the noise data sample and process it to obtain noise feature data;
a fusion module configured to fuse the image feature data and the noise feature data to obtain fused feature data;
a convolution operation module configured to perform a convolution operation on the fused feature data to obtain enhanced feature data, wherein the enhanced feature data represents image features of the object to be identified; and
a parameter adjustment module configured to adjust the parameters of the image denoising model according to the difference between the clean image data sample and the enhanced feature data.
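The parameter adjustment module of claim 34 adjusts the model parameters according to the difference between the clean sample and the enhanced feature data. A minimal sketch of one such adjustment step, using a mean-squared-error loss and a finite-difference gradient in place of backpropagation (both are assumptions; the claim does not fix the loss or the optimiser, and `training_step` is a hypothetical helper):

```python
def training_step(params, forward, clean, lr=0.1):
    """One parameter-adjustment step of the denoising model (sketch).

    forward(params) is assumed to return the enhanced feature data;
    the loss is the mean squared difference to the clean sample.
    """
    def loss(p):
        out = forward(p)
        return sum((o - c) ** 2 for o, c in zip(out, clean)) / len(clean)

    eps = 1e-5
    grads = []
    for i in range(len(params)):
        bumped = list(params)
        bumped[i] += eps
        # Finite-difference estimate of d(loss)/d(params[i]).
        grads.append((loss(bumped) - loss(params)) / eps)
    return [p - lr * g for p, g in zip(params, grads)], loss(params)
```

Repeating the step drives the model output toward the clean sample, which is the adjustment behaviour the claim describes.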
35. An image denoising apparatus, comprising:
an original image data acquisition module configured to acquire original image data, wherein the original image data comprises image data of an object to be identified;
a noise data acquisition module configured to acquire noise data associated with the original image data;
a first convolutional neural network module configured to receive the original image data and process it to obtain image feature data;
a second convolutional neural network module configured to receive the noise data and process it to obtain noise feature data;
a fusion module configured to fuse the image feature data and the noise feature data to obtain fused feature data; and
a convolution operation module configured to perform a convolution operation on the fused feature data to obtain enhanced feature data, wherein the enhanced feature data represents image features of the object to be identified.
36. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-33.
37. The electronic device of claim 36, wherein the electronic device is a passive terahertz imaging apparatus.
38. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 33.
39. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 33.
CN202111680455.9A 2021-12-30 2021-12-30 Training method and device for image denoising model, and image denoising method and device Pending CN116433498A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111680455.9A CN116433498A (en) 2021-12-30 2021-12-30 Training method and device for image denoising model, and image denoising method and device


Publications (1)

Publication Number Publication Date
CN116433498A true CN116433498A (en) 2023-07-14

Family

ID=87085959


Country Status (1)

Country Link
CN (1) CN116433498A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination