WO2022226886A1

WO2022226886A1 - Image processing method based on transform domain denoising autoencoder as a priori

Info

Publication number: WO2022226886A1
Application number: PCT/CN2021/090956
Authority: WO
Inventors: 李彦明; 郑海荣; 刘新; 万丽雯; 胡战利; 周瑾洁
Original assignee: 深圳高性能医疗器械国家研究院有限公司
Priority date: 2021-04-29
Filing date: 2021-04-29
Publication date: 2022-11-03

Abstract

Disclosed is an image processing method that uses a transform domain denoising autoencoder as a prior. The method comprises: using an original image and multi-channel transform features to construct a multi-channel tensor space having multi-scale and multi-view characteristics, and constructing a training data set; training a denoising autoencoder network on the basis of the training data set so as to combine the image transform domain with the original pixel domain, obtain an image in the transform domain, and learn prior information in the multi-channel tensor space by using the image in the transform domain; and introducing the prior information learned from the multi-channel tensor space into the iterative process for solving the image restoration problem, so as to obtain an optimized denoising autoencoder network. The reconstructed image obtained with the present invention has improved quality while retaining more texture detail, and better satisfies diagnostic requirements.

Description

Image processing method based on transform domain denoising autoencoder as prior

Technical field

The present invention relates to the technical field of medical image processing, and more particularly, to an image processing method based on a transform domain denoising autoencoder as a prior.

Background art

In recent years, medical image processing has been widely used in clinical guidance. For example, X-ray computed tomography (CT) is used for diagnosis and intervention in hospitals and clinics. X-ray CT images have the advantages of high density resolution of tissue structures and little damage to the human body, and are very important for the study of pathology and anatomy. However, exposure to radiation during X-ray CT may pose a potential risk of cancer or genetic disease, so it is necessary to reduce the X-ray dose. In addition, during scanning and image transmission, images may become blurred or their borders indistinct, which lowers the readability of X-ray CT images and prevents doctors from making an accurate diagnosis. State-of-the-art techniques generally attempt to reduce the dose in two ways: by reducing the operating current and exposure time of the X-ray tube, or by reducing the number of sampled views. The former lowers the signal-to-noise ratio (SNR) of the projections and therefore introduces noise. The latter is generally safer; however, it produces insufficient projection data, i.e. the views are sparse and noisy. X-ray dose is a key indicator in X-ray CT imaging: the higher the dose, the clearer the image, but the greater the harm to the human body. At present, the equipment in many hospitals has reached the minimum dose requirements, but minimum-dose CT is accompanied by low image quality and noise. Obtaining high-quality CT images at low dose (minimum harm to the human body) therefore has important scientific significance and broad application prospects in the field of medical diagnosis.

Chen Hu et al. published the article "Low-dose CT denoising with convolutional neural network" at the IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017) in 2017, successfully applying deep neural networks to low-dose CT denoising. The scheme denoises low-dose CT images with a deep neural network without accessing the raw projection data: a deep convolutional neural network is trained to gradually convert low-dose CT images into normal-dose CT images. However, the data set used in this scheme consists of paired low-dose and high-dose CT images, in which each low-dose image is simulated by applying Poisson noise to each detector element of a normal-dose sinogram with a blank-scan flux.

Eunhee Kang et al. published an article "Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network" in IEEE Transactions on Medical Imaging in 2018, and proposed a denoising scheme based on wavelet residual network. This scheme synergizes the expressive power of deep learning with the performance guarantee of denoising algorithms based on the wavelet framework. However, the low-dose CT images of this scheme are also simulated based on high-dose images.

Zhanli Hu et al. published the article "A feature refinement approach for statistical interior CT reconstruction" in Physics in Medicine and Biology in 2016, proposing a statistical interior tomography method for computed tomography. The scheme takes into account the statistical properties of local projection data and recovers the fine structure lost in traditional total variation (TV) minimization reconstruction. The proposed method uses a compressed sensing framework that only assumes the interior region of interest (ROI) to be piecewise constant or polynomial and requires no other prior knowledge. To integrate the statistical distribution properties of the projection data, an objective function is established under the penalized weighted least squares criterion (PWLS-TV). In the proposed method, an FBP reconstruction based on extrapolation of the interior projections is first used as an initial guess to mitigate truncation artifacts and provide an expanded field of view.

After analysis, the main defects of existing CT image processing are as follows. Considering the potential risk that X-ray exposure poses to patients, low-dose CT is commonly used as diagnostic evidence in clinical medicine, but low-dose CT imaging causes the reconstructed images to contain a large amount of quantum noise and blurred morphological features. In existing deep-learning-based image reconstruction schemes, the data set consists of paired low-dose and high-dose CT images, but in practice clean CT images with one-to-one correspondence are rare. In the prior art, low-dose images are generated by applying Poisson noise to each detector element of a normal-dose sinogram with a blank-scan flux, which is complicated and inefficient.

SUMMARY OF THE INVENTION

The purpose of the present invention is to overcome the above-mentioned defects of the prior art and to provide an image processing method based on a transform domain denoising autoencoder as a prior, which is a new technical solution for denoising low-dose images using prior information obtained by unsupervised learning.

According to a first aspect of the present invention, there is provided an image processing method based on a transform domain denoising autoencoder as a prior. The method includes the following steps:

Step S1: construct a multi-channel tensor space with multi-scale and multi-view characteristics using the original image and multi-channel transform features, and construct a training data set;

Step S2: train a denoising autoencoder network based on the training data set so as to combine the image transform domain with the original pixel domain, obtain an image in the transform domain, and use the image in the transform domain to learn the prior information in the multi-channel tensor space;

Step S3: introduce the prior information learned from the multi-channel tensor space into the iterative process for solving the image restoration problem, and obtain an optimized denoising autoencoder network.

According to a second aspect of the present invention, an image processing method is provided. The method includes: transforming the image to be processed to obtain a transform domain image; and

combining the image to be processed with the transform domain image, inputting the result to the optimized denoising autoencoder network obtained according to the present invention, and outputting the reconstructed image.

Compared with the prior art, the present invention has the advantage of providing an image processing method based on a transform domain denoising autoencoder as a prior. The core idea is to enhance the classical denoising autoencoder (DAE) through the transform domain, so that the autoencoder captures complementary information from multiple views, improving image quality while preserving more texture detail and making the processed images sharper and better suited to diagnostic needs.

Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments of the present invention with reference to the accompanying drawings.

Description of drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention.

FIG. 1 is a flowchart of an image processing method based on a transform domain denoising autoencoder as a prior according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the overall process of an image processing method based on a transform domain denoising autoencoder as a prior according to an embodiment of the present invention;

FIG. 3 is a flowchart of network learning based on a transform domain denoising autoencoder as a prior according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of experimental results according to an embodiment of the present invention.

Detailed description

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that the relative arrangement of components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the invention unless specifically stated otherwise.

The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and apparatus should be considered part of the specification.

In all examples shown and discussed herein, any specific values should be construed as illustrative only and not limiting. Accordingly, other instances of the exemplary embodiment may have different values.

It should be noted that like numerals and letters refer to like items in the following figures, so once an item is defined in one figure, it does not require further discussion in subsequent figures.

The image processing method based on a transform domain denoising autoencoder as a prior provided by the present invention can be applied to various types of image reconstruction, such as magnetic resonance imaging, computed tomography and positron emission computed tomography, as well as to tasks such as deblocking and demosaicing. For the sake of clarity, the following description takes CT image denoising as an example.

Inspired by the strong performance of priors that jointly use the pixel domain and the wavelet domain, the present invention proposes a CT denoising algorithm based on a transform domain denoising autoencoder as a prior (TDAEP-CT). It enhances the classic denoising autoencoder (DAE) so that it captures complementary information from multiple views. In short, the method includes: using non-orthogonal wavelet coefficients to form a multi-channel feature image (such as a 4-channel feature image); obtaining a multi-channel tensor (such as a 5-channel tensor) by stacking the original image in the pixel domain with the multi-channel feature image in the wavelet domain; using this multi-channel tensor as network input to train the transformed DAE (TDAE); and obtaining an optimized image prior based on the trained autoencoder and incorporating it into the iterative recovery process with the help of an auxiliary variable technique.

Specifically, referring to FIG. 1 , taking CT image denoising as an example, the provided image processing method based on transform domain denoising autoencoder as a priori includes the following steps.

Step S110: register the CT images that do not correspond one to one, generate a multi-channel CT tensor, and construct a training data set.

For example, as shown in Fig. 2, this step includes: first, normalizing the CT images that do not correspond one to one so that their size stays consistent in the training phase; then performing a wavelet transform on the CT images (for example, using a 1-level transform); and finally forming a 5-channel CT image tensor by stacking the four wavelet images with the original image. The CT images without one-to-one correspondence are shown in Figure 2(a), the wavelet transform of the CT images is shown in Figure 2(b), and the formation of the 5-channel CT image tensor is shown in Figure 2(c).

As shown in Figure 2(b), this embodiment uses a wavelet transform (WT) to generate the transform domain. The wavelet transform can effectively analyze image features, especially image details. Despite the success of wavelet transforms in information retrieval tasks, there is still room for improvement. In the traditional discrete wavelet transform, the pseudo-Gibbs phenomenon occurs near discontinuities of the signal: it causes alternating undershoots and overshoots near singularities of the reconstructed signal and produces blocky artifacts in the processed image. These practical pitfalls can be alleviated by using the TIWT (translation-invariant wavelet transform, also known as cycle spinning), the core idea of which is to "average out" the translation dependence. The TIWT computes the inner products between all (circularly) translated versions of the image and the wavelet basis functions. Restoration can then be achieved by applying thresholding and averaging operators in sequence. Using the TIWT avoids the pseudo-Gibbs phenomenon in the denoising process and achieves a better gain than the DWT (discrete wavelet transform) in removing noise and restoring attenuated high-frequency components.
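The thresholding-and-averaging idea behind cycle spinning can be sketched in one dimension. The snippet below is only a minimal illustration: the level-1 Haar basis, the soft-thresholding rule, the two circular shifts and all function names are assumptions made for the example and are not taken from the source.

```python
import numpy as np

def haar_forward(x):
    """Level-1 orthonormal Haar DWT of an even-length 1-D signal."""
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def soft_threshold(c, t):
    """Shrink coefficients towards zero by t (assumed thresholding rule)."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def cycle_spin_denoise(x, t, shifts=(0, 1)):
    """Average the results of shift -> DWT -> threshold -> inverse DWT -> unshift."""
    out = np.zeros(x.size)
    for s in shifts:
        approx, detail = haar_forward(np.roll(x, s))
        out += np.roll(haar_inverse(approx, soft_threshold(detail, t)), -s)
    return out / len(shifts)

rng = np.random.default_rng(0)
signal = np.repeat([0.0, 1.0, 0.0, 2.0], 8)            # piecewise-constant test signal
noisy = signal + 0.1 * rng.standard_normal(signal.size)
denoised = cycle_spin_denoise(noisy, t=0.2)
```

With a zero threshold the procedure returns its input unchanged, and for a level-1 transform the two shifts make the result commute with circular translation of the input, which is the property the decimated DWT lacks.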

In one embodiment, the overcomplete wavelet transform consists of N orthogonal wavelet transforms, each of which is obtained by a cyclic shift of the wavelet basis functions. Let $\Phi$ be the basic orthogonal wavelet transform matrix, and let $\Phi_i$ ($i = 1, \dots, N$) denote the possible wavelet transform matrices obtained by applying a circular shift to the basis functions. The TIWT matrix and its inverse process are expressed as:

$$W = \frac{1}{\sqrt{N}}\,[\Phi_1, \Phi_2, \dots, \Phi_N], \qquad W^T = \frac{1}{\sqrt{N}}\,[\Phi_1^T, \Phi_2^T, \dots, \Phi_N^T]^T \qquad (1)$$

Therefore,

$$W W^T x = x, \qquad W W^T = I \qquad (2)$$

It is worth noting that $W^T W \neq I$, so $W$ is not orthogonal.
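Identity (2) and the accompanying remark can be checked numerically on a small example. The sketch below assumes a level-1 Haar basis of length 8 and N = 2 circular shifts; the source does not name the wavelet, and `haar_matrix` is a hypothetical helper written for this illustration.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal level-1 Haar transform matrix for even n (rows are basis functions)."""
    phi = np.zeros((n, n))
    s = 1.0 / np.sqrt(2.0)
    for k in range(n // 2):
        phi[k, 2 * k], phi[k, 2 * k + 1] = s, s                      # local averages
        phi[n // 2 + k, 2 * k], phi[n // 2 + k, 2 * k + 1] = s, -s   # local differences
    return phi

n = 8
phi = haar_matrix(n)

# Circularly shift the basis functions; each shifted matrix is still orthogonal.
shifted = [np.roll(phi, shift, axis=1) for shift in (0, 1)]
N = len(shifted)

# Concatenate the N orthogonal transforms side by side and normalise by sqrt(N).
W = np.hstack(shifted) / np.sqrt(N)

forward_identity = np.allclose(W @ W.T, np.eye(n))        # W W^T = I
reverse_identity = np.allclose(W.T @ W, np.eye(N * n))    # W^T W = I ?
print(forward_identity, reverse_identity)                 # prints: True False
```

The second product is a 16 × 16 matrix of rank 8, so it cannot be the identity; this is the sense in which the overcomplete transform is a tight frame rather than an orthogonal basis.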

After a level-1 2D-TIWT, the original image is decomposed into four subband images: the approximation part LL and the detail part, which includes the horizontal component HL, the vertical component LH and the diagonal component HH. The low-frequency subband LL contains most of the information of the original image. The subbands HL, LH and HH contain the finest-scale detail wavelet coefficients, corresponding to the higher-frequency detail information of the original image. It should be noted that, whereas each subband of a conventional decimated transform has 1/4 the size of the original image, after 2D-TIWT decomposition each subband image always has almost the same size as the original input image. A 2D inverse translation-invariant wavelet transform applied to the four subbands can completely reconstruct the original image. In this embodiment, image priors with multi-scale and multi-view characteristics are learned by means of the TIWT.

The multi-faceted data obtained from the transform domain provides more contour prior information, which is of great help in restoration tasks. The embodiment of the present invention constructs multi-faceted data composed of elements in the wavelet domain and the pixel domain, and forms a tensor as the network input. Figure 2(c) depicts the formation of a 5-channel tensor in the transform domain. In one embodiment, the final training data is

$$X = [Ix,\; Wx]$$

where the former component $Ix$ is the original image, and the latter component $Wx$ represents the combination of the four subband images.
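The formation of the 5-channel tensor can be sketched as follows. This is a minimal illustration under stated assumptions: the wavelet is taken to be a level-1 undecimated Haar transform with circular boundaries, the subband labels follow one common convention, the random array merely stands in for a normalized CT slice, and all function names are hypothetical.

```python
import numpy as np

def _low(a, axis):
    return 0.5 * (a + np.roll(a, 1, axis=axis))

def _high(a, axis):
    return 0.5 * (a - np.roll(a, 1, axis=axis))

def _low_adj(a, axis):
    return 0.5 * (a + np.roll(a, -1, axis=axis))

def _high_adj(a, axis):
    return 0.5 * (a - np.roll(a, -1, axis=axis))

def tiwt2(image):
    """Level-1 undecimated Haar transform; returns LL, HL, LH, HH, each image-sized."""
    lo, hi = _low(image, 1), _high(image, 1)        # filter along the horizontal axis
    return _low(lo, 0), _low(hi, 0), _high(lo, 0), _high(hi, 0)

def itiwt2(ll, hl, lh, hh):
    """Inverse transform: apply the adjoint filters and sum the branches."""
    lo = _low_adj(ll, 0) + _high_adj(lh, 0)
    hi = _low_adj(hl, 0) + _high_adj(hh, 0)
    return _low_adj(lo, 1) + _high_adj(hi, 1)

def to_tensor(image):
    """Stack the pixel-domain image with its four subbands: shape (5, H, W)."""
    return np.stack([image, *tiwt2(image)], axis=0)

rng = np.random.default_rng(0)
image = rng.random((64, 64))            # stand-in for a normalised CT slice
tensor = to_tensor(image)
restored = itiwt2(*tensor[1:])          # the four subbands alone recover the image
```

Every channel keeps the spatial size of the input, so the tensor can be fed to a convolutional network without resampling, and the four wavelet channels reconstruct the image exactly.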

Step S120, using the training data set to train the denoising auto-encoder to learn the prior in the transform domain.

In this step, the network design process is illustrated by taking CT Image-based Enhanced Classical Denoising Autoencoder (TDAEP-CT) as an example.

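Specifically, based on the DAE, Bigdeli et al. proposed the denoising autoencoder prior (DAEP), which uses the magnitude of the DAE error as the prior information for image restoration. Suppose the DAE is $A_{\sigma_\eta}$ and its output is $A_{\sigma_\eta}(x)$. Its optimal value is trained with Gaussian noise and an expected quadratic loss, expressed as:

$$A_{\sigma_\eta} = \arg\min_{A}\; \mathbb{E}_{x,\eta}\big[\, \| x - A(x + \eta) \|^2 \,\big]$$

where the expectation $\mathbb{E}_{x,\eta}$ is taken over all images $x$ and Gaussian noise $\eta$ with standard deviation $\sigma_\eta$. It can be deduced that:

$$A_{\sigma_\eta}(x) = \frac{\int g_{\sigma_\eta}(\eta)\, p(x - \eta)\, (x - \eta)\, d\eta}{\int g_{\sigma_\eta}(\eta)\, p(x - \eta)\, d\eta} \qquad (3)$$

where $p(x)$ is the true data density and $g_{\sigma_\eta}(\eta)$ is a local Gaussian kernel. It can be seen from equation (3) that the optimal DAE reconstruction function at each point $x$ is given by a convolution with the density function $p$, that is, a weighted average of the points in the neighborhood of $x$.

Furthermore, for the Gaussian density $g_{\sigma_\eta}$ there exists

$$\nabla g_{\sigma_\eta}(\eta) = -\frac{\eta}{\sigma_\eta^2}\, g_{\sigma_\eta}(\eta)$$

so the autoencoder error $A_{\sigma_\eta}(x) - x$ is proportional to the log-likelihood gradient of the smoothed density, that is:

$$A_{\sigma_\eta}(x) - x = \sigma_\eta^2\, \nabla \log\big[\, g_{\sigma_\eta} * p \,\big](x) \qquad (5)$$

where $*$ is the convolution operator. DAEP therefore uses the magnitude of this mean-shift vector as the negative log-likelihood of the image prior, which is expressed as:

$$L_{\mathrm{DAEP}}(x) = \| A_{\sigma_\eta}(x) - x \|^2$$

As in Equation (5), the DAE learns from a given set of data samples a mean-shift vector field proportional to the gradient of the log prior. Therefore, Bigdeli et al. proposed a new prior called the deep mean-shift prior (DMSP), which is used in a gradient descent manner to achieve Bayes risk minimization. The formula for DMSP is expressed as:

$$\nabla \operatorname{prior}(x) = \nabla \log\big[\, g_{\sigma_\eta} * p \,\big](x) = \frac{1}{\sigma_\eta^2}\big( A_{\sigma_\eta}(x) - x \big)$$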
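The proportionality between the autoencoder error and the gradient of the smoothed log-density can be checked numerically in a one-dimensional toy setting. The sketch below assumes a standard Gaussian data density and a noise level of 0.5, both chosen only for illustration; it evaluates the optimal DAE by direct numerical integration rather than by training a network.

```python
import math

def gaussian(u, std):
    return math.exp(-0.5 * (u / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

S_DATA = 1.0    # std of the toy data density p (assumed)
SIGMA = 0.5     # std of the DAE training noise (assumed)
STEP = 0.01
GRID = [-12.0 + STEP * i for i in range(2401)]    # integration grid for u = x - eta

def p(u):
    return gaussian(u, S_DATA)

def smoothed_density(x):
    """(g_sigma * p)(x), approximated by a Riemann sum."""
    return STEP * sum(gaussian(x - u, SIGMA) * p(u) for u in GRID)

def optimal_dae(x):
    """Optimal DAE output: the average of u weighted by g_sigma(x - u) * p(u)."""
    weights = [gaussian(x - u, SIGMA) * p(u) for u in GRID]
    return sum(w * u for w, u in zip(weights, GRID)) / sum(weights)

def grad_log_smoothed(x, h=1e-4):
    """Central finite difference of log (g_sigma * p)."""
    return (math.log(smoothed_density(x + h)) - math.log(smoothed_density(x - h))) / (2.0 * h)

x0 = 0.8
residual = optimal_dae(x0) - x0                    # autoencoder error A(x) - x
mean_shift = SIGMA ** 2 * grad_log_smoothed(x0)    # sigma^2 * grad log (g * p)(x)
```

For this Gaussian example both quantities have the closed form $-\sigma_\eta^2 x / (s^2 + \sigma_\eta^2)$, where $s$ is the standard deviation of the data density, so `residual` and `mean_shift` agree up to numerical error.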

By extending the original DMSP and integrating multi-model aggregation and multi-channel network learning, a high-dimensional embedding network can also be employed, which builds on the preceding derivation and applies the learned prior to single-channel MRI reconstruction through a variable augmentation technique.

The TDAEP-CT provided by the present invention mainly includes two processes: learning the prior information in the 5-channel tensor space instead of the original CT pixel space; and introducing the prior information learned from the 5-channel tensor space into the iterative process for solving the CT image restoration problem.

First, in the learning phase, a TDAE network is trained from data pairs consisting of 5-channel tensors and their noisy versions. Correspondingly, the TDAEP prior is defined as:

$$R_{\mathrm{TDAEP}}(x) = \| X - A_{\sigma_\eta}(X) \|^2$$

where $x$ is the original image, and the 5-channel tensor in the transform domain is represented as $X = [Ix,\; Wx]$, in which the former component $Ix$ is the original image and the latter component $Wx$ represents the combination of the four subband images. The DAE is $A_{\sigma_\eta}$, its output is $A_{\sigma_\eta}(X)$, and $\|\cdot\|$ represents the two-norm.

The biggest innovation of the present invention is to learn the prior information in the transform domain and apply it to the image restoration task. In the image restoration task, the image wavelet domain is combined with the original pixel domain to obtain the image in the transformed domain, and it is used to drive the network to extract image priors.

The following illustrates that TDAEP is superior to DAEP in image feature extraction: using the image transform domain can enhance the image restoration (IR) process. The IR problem is modeled as:

$$\min_x\; \| y - Mx \|^2 + \lambda R(x)$$

where $y = Mx + n$ is the image degradation model, $x$ is the original image, $M$ is the degradation operator, $y$ is the degraded image, $n$ is the additive noise, $R$ is the regularization term, and the parameter $\lambda$ controls the trade-off between the data fidelity term and the regularization term.

Depending on whether the priors are obtained from the pixel domain and the wavelet domain separately or jointly, $R(x)$, $R(Wx)$ and $R([Ix, Wx])$ denote three regularization terms. Specifically, with $A_{\sigma_\eta}$ denoting the autoencoder, the regularization term extracted from the wavelet domain is as follows:

$$R(Wx) = \| Wx - A_{\sigma_\eta}(Wx) \|^2$$

where Wx represents the combination of four subband images. Then, the superiority of the proposed regularizer can be derived from the following inequality:

Compared with the prior-induced regularization obtained separately in the pixel domain or the wavelet domain, the present invention learns them jointly as a tensor by stacking, which yields a loss function with a lower penalty. The better learning ability helps the network to effectively extract redundant feature information and generate more compact representations. The multi-scale and multi-view properties of the transform domain are exploited by adding artificial noise to the pixel and wavelet domains simultaneously; the two domains complement each other to yield higher-quality prior information.
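Adding artificial noise to all channels at once, and scoring a network with the quadratic DAE loss, might look like the following. This is a hedged sketch: the equal noise level across channels, the random placeholder tensor and the function names `make_training_pair` and `dae_loss` are assumptions for illustration, and the "network" shown is an untrained identity map.

```python
import numpy as np

def make_training_pair(clean_tensor, sigma, rng):
    """Return (noisy input, clean target); the same noise level is used for all channels."""
    noise = rng.normal(0.0, sigma, size=clean_tensor.shape)
    return clean_tensor + noise, clean_tensor

def dae_loss(network, noisy, clean):
    """Quadratic DAE loss: mean squared error between network(noisy) and the clean tensor."""
    return float(np.mean((network(noisy) - clean) ** 2))

rng = np.random.default_rng(0)
clean = rng.random((5, 64, 64))          # placeholder for a 5-channel [Ix, Wx] tensor
noisy, target = make_training_pair(clean, sigma=0.1, rng=rng)

# An untrained "do nothing" network leaves the noise in place,
# so its loss is close to sigma squared.
identity_loss = dae_loss(lambda t: t, noisy, target)
```

A trained denoising autoencoder is expected to drive this loss below the identity baseline, because it removes part of the injected noise from both the pixel channel and the wavelet channels.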

Although TDAEP in Eq., Eqs. (5) and (6) provide promising regularization features, there are still challenges to solve. Specifically, gradient computation is expensive, and its derivation involves complex operations, i.e.,

Alternatively, to simplify the calculation, the TDAE network

Replace with an acceptable network

Assume

TDAEP can then become

Its gradient becomes:

Therefore, in one embodiment, the network is trained and used by the following two equations:

in this case,

close to the Gaussian noise η.

The network architecture of the present invention can use various types of end-to-end convolutional neural networks, such as ResNet, DenseNet and DualPathNet. Among these, base layers and building blocks are two popular tools for designing optimal architectures. In particular, ResNet introduces a shortcut connection scheme so that the output of the previous residual block flows directly into the next one, which improves information flow and avoids vanishing gradients. Given the good performance of ResNet in VDSR, EDSR and SRGAN, the TDAE network in the present invention uses ResNet blocks as building blocks.

In one embodiment, both the input and the output of the TDAE network are 5-channel tensors. The main body of the network includes five components, each of which is composed of "CONV+BN+ReLU", "CONV+BN" and "ReLU" parts. The abbreviations "CONV", "BN" and "ReLU" stand for convolutional layer, batch normalization (used to accelerate network learning) and rectified linear unit, respectively. The number of filters in each convolutional layer is set to 320, except for the last layer, which has 5 filters. The kernel size of each convolutional layer is set to 3 × 3. Apart from the network input and output and the additional ResNet blocks, the structure is similar to DnCNN (Denoising Convolutional Neural Network). It should be noted that a more complex network can be used in TDAE to ensure more efficient learning.

In step S130, an optimized denoising auto-encoder network is obtained by iterative solution.

In one embodiment, a proximal gradient method is employed to handle the nonlinearity of the network and the resulting model equations. Specifically, the model can be approximated by standard least squares minimization, expressed as:

in,

The equation function G(x) is

-Lipschitz smooth, and k denotes the iteration index. Here, β = 1 is set empirically in the experiments. With β = 1, equation (16) is a standard LS (least squares) problem, which can be solved by computing the gradient as follows:

This yields:

where R denotes the averaging operator applied to the first-channel image and the intermediate ITIWT (inverse TIWT) result.

has been learned during the network training phase. In addition, the network estimate

used to update the gradient component

and the LS solver of equation (18) are applied alternately until the final value of $x^{k+1}$ converges.
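The alternation between a prior step in the 5-channel space and a least-squares update can be sketched for the pure denoising case. Everything below is an illustrative assumption rather than the patented procedure: the degradation operator is taken as the identity, β = 1, the wavelet is a level-1 undecimated Haar transform, and the trained TDAE is replaced by a hypothetical stand-in (`stand_in_noise_estimate`) built from a 3 × 3 box blur.

```python
import numpy as np

def _filt(a, axis, sign, step):
    """Two-tap circular Haar filter: 0.5 * (a + sign * shifted a)."""
    return 0.5 * (a + sign * np.roll(a, step, axis=axis))

def analysis(image):
    """Image -> 5-channel tensor [image, LL, HL, LH, HH]."""
    lo, hi = _filt(image, 1, 1, 1), _filt(image, 1, -1, 1)
    bands = [_filt(lo, 0, 1, 1), _filt(hi, 0, 1, 1),
             _filt(lo, 0, -1, 1), _filt(hi, 0, -1, 1)]
    return np.stack([image, *bands])

def collapse(tensor):
    """Averaging operator: mean of channel 0 and the inverse transform of channels 1-4."""
    _, ll, hl, lh, hh = tensor
    lo = _filt(ll, 0, 1, -1) + _filt(lh, 0, -1, -1)
    hi = _filt(hl, 0, 1, -1) + _filt(hh, 0, -1, -1)
    inverse = _filt(lo, 1, 1, -1) + _filt(hi, 1, -1, -1)
    return 0.5 * (tensor[0] + inverse)

def stand_in_noise_estimate(tensor):
    """Placeholder for the trained network: tensor minus a 3x3 circular box blur."""
    blur = sum(np.roll(tensor, (di, dj), axis=(-2, -1))
               for di in (-1, 0, 1) for dj in (-1, 0, 1)) / 9.0
    return tensor - blur

def restore(y, lam=1.0, n_iter=5):
    """Denoising case (M = I): alternate the prior step and the closed-form LS update."""
    x = y.copy()
    for _ in range(n_iter):
        tensor = analysis(x)
        z = collapse(tensor - stand_in_noise_estimate(tensor))   # prior step with beta = 1
        x = (y + lam * z) / (1.0 + lam)    # minimiser of ||x - y||^2 + lam * ||x - z||^2
    return x

rng = np.random.default_rng(0)
i, j = np.meshgrid(np.arange(32), np.arange(32), indexing="ij")
clean = np.sin(2 * np.pi * i / 32) + np.cos(2 * np.pi * j / 32)
noisy = clean + 0.3 * rng.standard_normal(clean.shape)
restored = restore(noisy)
```

In an actual implementation the stand-in would be replaced by the trained network evaluated on the 5-channel tensor, and a general degradation operator would turn the closed-form update into a linear solve.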

Figure 3 is a network flowchart for TDAEP learning, in which the input is a 5-channel image plus artificial Gaussian noise. The middle part shows a 20-layer network consisting of 5 residual "blocks", 1 "CONV+ReLU", 3 "CONV+BN+ReLU" and 1 "CONV"; the specific structure of each "block" is shown in the upper part of Figure 3.
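The layer inventory described for Figure 3 can be written down as a plain specification to see how the pieces add up. The sketch below is an assumption-laden illustration: the ordering of the parts, the number of convolutions inside each residual block and the presence of bias terms are not stated in the source, and `tdae_layer_spec` is a hypothetical helper rather than the actual network definition.

```python
def tdae_layer_spec(channels=5, filters=320, blocks=5, convs_per_block=2):
    """List (name, in_channels, out_channels) for every convolution in the sketch."""
    spec = [("head CONV+ReLU", channels, filters)]
    for b in range(blocks):
        for c in range(convs_per_block):
            # Each convolution in a block is followed by BN; the block ends with
            # a skip addition and a ReLU, which add no convolution of their own.
            spec.append((f"block{b + 1} conv{c + 1}", filters, filters))
    for t in range(3):
        spec.append((f"tail{t + 1} CONV+BN+ReLU", filters, filters))
    spec.append(("output CONV", filters, channels))
    return spec

def conv_parameter_count(spec, kernel=3):
    """Kernel weights plus one bias per output channel (bias terms are an assumption)."""
    return sum(kernel * kernel * cin * cout + cout for _, cin, cout in spec)

spec = tdae_layer_spec()                       # two convolutions per block
n_convs = len(spec)
n_params = conv_parameter_count(spec)
n_convs_three = len(tdae_layer_spec(convs_per_block=3))
```

With two convolutions per block, as the "CONV+BN+ReLU" plus "CONV+BN" description suggests, the sketch contains 15 convolutions; three per block would give the 20 layers mentioned for Figure 3, so the exact block composition should be read from the figure itself.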

In order to further verify the effect of the present invention, simulation experiments are carried out. The results are shown in FIG. 4 , from left to right are the low-dose CT image, the high-dose CT image, and the CT image after denoising according to the present invention. It can be seen that the method of the present invention can effectively improve the peak signal-to-noise ratio and the structural similarity of the image, and at the same time, can restore the image detail information to a certain extent. A large number of experiments show that the present invention has a remarkable effect on CT denoising, and can be applied to other types of image reconstruction such as deblocking and demosaicing.

To sum up, the present invention extracts the prior in the transform domain, that is, it jointly extracts the prior of the degraded object in the pixel domain and the intermediate wavelet domain, rather than in the pixel domain or the wavelet domain separately. The transform domain is constructed from the original image and multi-channel transform features as a multi-channel tensor with multi-scale and multi-view properties. In particular, by employing the translation-invariant wavelet transform (TIWT), noise and high-frequency components can be efficiently optimized. In addition, different noise weighting strategies are adopted in the network design process, which makes the method more robust and stable for different restoration tasks; this strategy helps to avoid falling into local minima and makes the iterative process more stable. Further, after learning high-dimensional priors with the TDAE network, alternating optimization and proximal gradient descent techniques are employed to solve the non-convex image restoration minimization problem.

The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for causing a processor to implement various aspects of the present invention.

A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the above. Computer-readable storage media, as used herein, are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.

The computer readable program instructions described herein may be downloaded to various computing/processing devices from a computer readable storage medium, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from a network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .

The computer program instructions for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ and Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider). In some embodiments, electronic circuits, such as programmable logic circuits, field-programmable gate arrays (FPGAs) or programmable logic arrays (PLAs), are personalized by utilizing state information of the computer-readable program instructions, and these electronic circuits can execute the computer-readable program instructions to implement various aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowchart and/or block diagrams. These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions cause a computer, programmable data processing apparatus and/or other equipment to operate in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture comprising instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowchart and/or block diagrams.

Computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other equipment to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other equipment to produce a computer-implemented process , thereby causing instructions executing on a computer, other programmable data processing apparatus, or other device to implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It is also noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or actions, or can be implemented in a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.

Various embodiments of the present invention have been described above. The foregoing descriptions are exemplary, not exhaustive, and not limited to the disclosed embodiments. Numerous modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the various embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the various embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims

An image processing method based on transform domain denoising auto-encoder as a priori, comprising the following steps:

Step S1: construct a multi-channel tensor space with multi-scale and multi-view characteristics using the original image and multi-channel transformation features, and construct a training data set;

Step S2: train a denoising auto-encoder network based on the training data set so as to combine the image transform domain with the original pixel domain to obtain an image in the transform domain, and use the image in the transform domain to learn the prior information in the multi-channel tensor space;

Step S3: introduce the prior information learned from the multi-channel tensor space into the iterative process for solving the image restoration problem, so as to obtain an optimized denoising auto-encoder network.
The method according to claim 1, wherein step S1 comprises:

Normalize the images that do not correspond one to one;

Perform a wavelet transform on the image, and form a 4-channel feature image from the 1-level non-orthogonal wavelet coefficients: the original image is decomposed into 4 sub-band images, namely the low-frequency component LL, the horizontal component HL, the vertical component LH and the diagonal component HH, where the horizontal component HL, the vertical component LH and the diagonal component HH characterize the image details, and the low-frequency component LL characterizes the approximate part of the image;

By superimposing the original image in the pixel domain and the 4-channel feature image in the wavelet domain, a 5-dimensional image tensor is obtained, and the training data set is constructed.
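As an illustration of the tensor construction recited above, the following Python sketch decomposes an image into four sub-bands of the same size as the input and stacks them with the pixel-domain image. It is a minimal sketch under stated assumptions: the claim does not name the wavelet, so an undecimated Haar-style filter with periodic boundaries is used as a stand-in, and the function names `undecimated_haar` and `build_tensor`, the sub-band sign convention, and the channels-first layout are all hypothetical.

```python
def undecimated_haar(img):
    """One-level undecimated Haar-style decomposition (assumed stand-in filter).

    img is a list of rows. Each output sub-band has the same size as img,
    because no downsampling is applied; borders wrap around periodically.
    """
    h, w = len(img), len(img[0])
    ll = [[0.0] * w for _ in range(h)]
    hl = [[0.0] * w for _ in range(h)]
    lh = [[0.0] * w for _ in range(h)]
    hh = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # 2x2 neighbourhood anchored at (i, j)
            a = img[i][j]
            b = img[i][(j + 1) % w]
            c = img[(i + 1) % h][j]
            d = img[(i + 1) % h][(j + 1) % w]
            ll[i][j] = (a + b + c + d) / 4.0  # local average (approximation)
            hl[i][j] = (a - b + c - d) / 4.0  # difference along the row direction
            lh[i][j] = (a + b - c - d) / 4.0  # difference along the column direction
            hh[i][j] = (a - b - c + d) / 4.0  # diagonal difference
    return ll, hl, lh, hh


def build_tensor(img):
    """Stack the pixel-domain image with the four sub-bands (channels first)."""
    pixels = [[float(v) for v in row] for row in img]
    ll, hl, lh, hh = undecimated_haar(pixels)
    return [pixels, ll, hl, lh, hh]


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    tensor = build_tensor(image)
    print(len(tensor), len(tensor[0]), len(tensor[0][0]))
```

An undecimated transform is chosen here only because it keeps every sub-band at the input resolution, which makes the stacking step straightforward; a decimated transform would need resampling before the channels could be combined.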
The method according to claim 2, wherein step S2 comprises:

The denoising autoencoder network is trained with data pairs, each consisting of a 5-channel tensor and its noisy version. The training data is represented as
where component Ix is the original image and component Wx represents the combination of the four subband images.
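The pairing of a 5-channel tensor with its noisy version can be sketched as follows. This is an illustrative sketch only: additive zero-mean Gaussian noise and the noise level `sigma` are assumptions, and `make_training_pair` is a hypothetical helper name, not something defined in this document.

```python
import random


def make_training_pair(tensor, sigma, rng):
    """Return (noisy, clean) for one training sample.

    tensor is a list of channels, each channel a list of rows.
    Zero-mean Gaussian noise with standard deviation sigma (an assumed
    value) is added independently to every entry of every channel.
    """
    noisy = [
        [[v + rng.gauss(0.0, sigma) for v in row] for row in channel]
        for channel in tensor
    ]
    return noisy, tensor


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    clean = [[[0.0, 1.0], [1.0, 0.0]] for _ in range(5)]  # toy 5-channel tensor
    noisy, target = make_training_pair(clean, sigma=0.05, rng=rng)
    print(len(noisy), len(noisy[0]), len(noisy[0][0]))
```

The network would then be trained to map `noisy` back to `target`; the training loop itself is omitted because it depends on the chosen framework and architecture.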
The method according to claim 3, wherein, in step S3, the prior learned by the denoising autoencoder network is defined as:

where
is a 5-channel tensor representation in the transform domain, component Ix is the original image, component Wx represents the combination of the four subband images, and
represents the output of the denoising autoencoder network.
The method according to claim 4, characterized in that, in step S3, the optimization problem of the denoising autoencoder network is expressed as:

where y = Mx + n is the image degradation model, x is the original image, M is the degradation operator, y is the degraded image, n is the additive noise,

G(x) is
- Lipschitz smooth, k represents the iteration number index, β and λ are set parameters, and η is Gaussian noise.
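The degradation model y = Mx + n can be illustrated on a small 1-D signal. In this sketch the operator M is written as an explicit matrix that keeps every other sample, which is only one assumed example of a degradation operator; the helper name `degrade` and the noise level `sigma` are likewise hypothetical.

```python
import random


def degrade(x, M, sigma, rng):
    """Return y = M x + n, with n drawn as zero-mean Gaussian noise."""
    clean = [sum(m * v for m, v in zip(row, x)) for row in M]
    return [c + rng.gauss(0.0, sigma) for c in clean]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    # Assumed example operator: sub-sampling that keeps samples 0 and 2
    M = [[1, 0, 0, 0],
         [0, 0, 1, 0]]
    y = degrade(x, M, sigma=0.01, rng=random.Random(0))
    print(y)
```

Restoration is the inverse task: recovering x from y when M discards information and n is unknown, which is why a learned prior is introduced.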
The method of claim 5, wherein the optimization problem of the denoising autoencoder network is solved according to the following steps:

for the equation
set β=1;

The gradient is calculated by the following formula:

get:

where R represents the average operator used,
acquired through learning;

Using network estimation
to update the gradient components
until the set conditions are met.
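The iterative solving procedure can be pictured with the generic sketch below. The claim's own formulas are not available in this text, so the update rule used here is an assumption: plain gradient descent on a data term 0.5·||Mx − y||² plus a prior term whose gradient is approximated by λ·(x − D(x)), with D the denoiser. The trained network is replaced by a 3-tap moving average purely as a stand-in, and `restore`, `smooth`, `lam`, `step`, `max_iter` and `tol` are hypothetical names and values.

```python
def mat_vec(M, x):
    """Compute M x for a matrix given as a list of rows."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def mat_t_vec(M, r):
    """Compute M^T r."""
    return [sum(M[i][j] * r[i] for i in range(len(M))) for j in range(len(M[0]))]


def smooth(x):
    """Stand-in for the trained denoiser: 3-tap average, edges replicated."""
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0 for i in range(n)]


def restore(y, M, denoise, lam=1.0, step=0.5, max_iter=200, tol=1e-8):
    """Gradient descent with an assumed denoiser-residual prior gradient."""
    x = mat_t_vec(M, y)  # simple initial estimate
    for k in range(max_iter):
        residual = [a - b for a, b in zip(mat_vec(M, x), y)]
        data_grad = mat_t_vec(M, residual)                        # M^T (M x - y)
        prior_grad = [xi - di for xi, di in zip(x, denoise(x))]   # x - D(x)
        x_new = [xi - step * (g + lam * p)
                 for xi, g, p in zip(x, data_grad, prior_grad)]
        change = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:  # stopping condition (assumed form)
            break
    return x, k + 1


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    estimate, iterations = restore([0.0, 3.0, 0.0], identity, smooth)
    print(estimate, iterations)
```

With an identity operator and these assumed parameters, each step averages the observation with its smoothed estimate, so the isolated spike is pulled toward its neighbours while staying tied to the data.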
The method of claim 1, wherein the image is a CT image, a magnetic resonance image, a computed tomography image, or a positron emission computed tomography image.
An image processing method, comprising:

Transform the image to be processed to obtain a transform domain image;

Combine the image to be processed with the transform domain image, input the result to the optimized denoising auto-encoder network obtained by the method according to any one of claims 1 to 7, and output the reconstructed image.
A computer-readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
A computer device, comprising a memory and a processor, wherein a computer program that can be run on the processor is stored in the memory, characterized in that, when the processor executes the program, the steps of the method according to any one of claims 1 to 8 are implemented.