WO2022226886A1 - Image processing method based on transform domain denoising autoencoder as a priori - Google Patents
Image processing method based on transform domain denoising autoencoder as a priori
- Publication number
- WO2022226886A1 (application PCT/CN2021/090956)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- transform domain
- network
- denoising
- domain
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- the present invention relates to the technical field of medical image processing, and more particularly, to an image processing method based on transform domain denoising automatic encoder as a priori.
- X-ray computed tomography is used for diagnosis and intervention in hospitals and clinics.
- X-ray CT may pose a potential risk of cancer or genetic disease due to exposure to radiation.
- X-CT medical images have the advantages of high density resolution of tissue structures and little damage to the human body, and are very important for the study of pathology and anatomy.
- during X-CT scanning and image transmission, blurred images or indistinct borders can occur, which lowers the readability of X-CT medical images and prevents doctors from making an accurate diagnosis. Given the radiation risk, it is also necessary to reduce the X-ray dose.
- X-ray dose is a key indicator in X-CT medical imaging images. The higher the X-ray dose, the clearer the image. However, with the increase in the dose of X-rays, the harm to the human body continues to increase. At present, the equipment in many hospitals has reached the minimum dose requirements, but the minimum dose of CT will be accompanied by low quality and noise. Obtaining high-quality CT images under the condition of low dose (minimum harm to human body) has important scientific significance and broad application prospects for the field of medical diagnosis.
- the main defects of existing CT image processing are as follows. Considering the potential risk that X-ray exposure poses to patients, low-dose CT is commonly used as diagnostic evidence in clinical medicine, but low-dose CT imaging causes the reconstructed images to exhibit a lot of quantum noise and blurred morphological features. In existing deep-learning-based image reconstruction schemes, the dataset consists of paired low-dose and high-dose CT images, but in practice clean CT images with such a one-to-one correspondence are rare.
- in such schemes, low-dose images are generated by applying Poisson noise, with a blank-scan flux, to each detector element of a simulated normal-dose sinogram, which is complicated and inefficient.
- the purpose of the present invention is to overcome the above-mentioned defects of the prior art by providing an image processing method based on a transform domain denoising auto-encoder as a prior, a new method that denoises low-dose images using prior information obtained by unsupervised learning.
- an image processing method based on transform domain denoising auto-encoder as a priori includes the following steps:
- Step S1: construct a multi-channel tensor space with multi-scale and multi-view characteristics from the original image and multi-channel transform features, and construct a training data set;
- Step S2: train a denoising auto-encoder network on the training data set so as to combine the image transform domain with the original pixel domain, obtain an image in the transform domain, and use that image to learn the prior information in the multi-channel tensor space;
- Step S3: introduce the prior information learned from the multi-channel tensor space into the iterative process that solves the image restoration problem, and obtain an optimized denoising auto-encoder network.
- an image processing method includes: transforming the image to be processed to obtain a transform domain image;
- the to-be-processed image and the transform domain image are combined and input to the optimized denoising auto-encoder network obtained according to the present invention, which outputs the reconstructed image.
- the present invention has the advantage that it provides an image processing method based on a transform domain denoising auto-encoder as a prior, whose core idea is to enhance the classical denoising auto-encoder (DAE) by means of a transform domain.
- DAE classical denoising auto-encoder
- FIG. 1 is a flowchart of an image processing method based on transform domain denoising auto-encoder as a priori according to an embodiment of the present invention
- FIG. 2 is a schematic diagram of the overall process of an image processing method based on transform domain denoising auto-encoder as a priori according to an embodiment of the present invention
- FIG. 3 is a flow chart of network learning based on transform domain denoising auto-encoder as a priori according to an embodiment of the present invention
- FIG. 4 is a schematic diagram of experimental results according to an embodiment of the present invention.
- the image processing method based on transform domain denoising autoencoder as a priori provided by the present invention can be applied to various types of image reconstruction, such as magnetic resonance imaging, computed tomography and positron emission computed tomography, and to related tasks such as demosaicing.
- the following description takes CT image denoising as an example.
- the present invention proposes a CT denoising algorithm based on the transform domain denoising auto-encoder as a prior (TDAEP-CT).
- TDAEP-CT CT denoising with a transform domain denoising auto-encoder prior
- the method includes: using non-orthogonal wavelet coefficients to form a multi-channel feature image (such as a 4-channel feature image); obtaining a multi-channel tensor (such as a 5-channel tensor) by stacking the original image in the pixel domain with the multi-channel feature image in the wavelet domain; learning the prior with the transformed DAE (TDAE); and using auxiliary variable techniques to incorporate the learned prior into the iterative recovery process.
- TDAE transformed DAE
- the provided image processing method based on transform domain denoising autoencoder as a priori includes the following steps.
- Step S110: register the unpaired CT images (those without a one-to-one correspondence), generate a multi-channel CT tensor, and construct a training data set.
- this step includes: first, normalizing the unpaired CT images so that their size stays consistent in the training phase, and then performing a wavelet transform on the CT images (for example, using 1
- a 5-channel CT image tensor is formed by stacking the four wavelet images with the original image.
- the unpaired CT images are shown in Figure 2(a), the wavelet transform of the CT images is shown in Figure 2(b), and the formation of the 5-channel CT image tensor is shown in Figure 2(c).
- this embodiment uses a wavelet transform (WT) to generate the transform domain.
- WT wavelet transform
- Wavelet transform can effectively analyze image features, especially image details.
- the pseudo-Gibbs phenomenon occurs near the discontinuity of the extracted signal. It causes alternating undershoots and overshoots near singularities of the reconstructed signal and produces blocky artifacts in the processed image.
- TIWT Translation-Invariant Wavelet Transform, also known as cycle spinning
- TIWT computes the inner product between all (circular) translated versions of the image and wavelet basis functions. Restoration can be achieved sequentially by thresholding and averaging operators. Using TIWT can avoid the pseudo-Gibbs phenomenon in the denoising process, and obtain better gain than DWT (Discrete Wavelet Transform) in removing noise and restoring the reduced high frequency components.
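The shift, threshold, and average procedure described above can be sketched in a few lines. This is a minimal one-dimensional illustration, not the patent's implementation: the single-level Haar basis, the hard-threshold rule, and all function names are assumptions made for the example.

```python
import math

def haar_forward(x):
    # One level of the orthogonal (decimated) Haar transform; len(x) must be even.
    s = math.sqrt(2.0)
    approx = [(x[2 * k] + x[2 * k + 1]) / s for k in range(len(x) // 2)]
    detail = [(x[2 * k] - x[2 * k + 1]) / s for k in range(len(x) // 2)]
    return approx, detail

def haar_inverse(approx, detail):
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def hard_threshold(coeffs, t):
    # Zero out small detail coefficients (assumed thresholding rule).
    return [c if abs(c) > t else 0.0 for c in coeffs]

def circ_shift(x, k):
    # Circular shift to the right by k samples (negative k shifts left).
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)

def cycle_spin_denoise(x, t):
    # Denoise every circularly shifted copy, undo the shift, and average.
    n = len(x)
    acc = [0.0] * n
    for k in range(n):
        a, d = haar_forward(circ_shift(x, k))
        rec = circ_shift(haar_inverse(a, hard_threshold(d, t)), -k)
        acc = [u + v for u, v in zip(acc, rec)]
    return [v / n for v in acc]

signal = [1.0, 1.2, 0.9, 1.1, 4.0, 4.1, 3.9, 4.2]
denoised = cycle_spin_denoise(signal, 0.3)
```

Because the average runs over all circular shifts, shifting the input simply shifts the output, which is the translation invariance that a plain decimated DWT lacks.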
- DWT Discrete Wavelet Transform
- the overcomplete wavelet transform consists of N orthogonal wavelet transforms, each of which consists of a cyclic shift of a wavelet basis function.
- let the basic orthogonal wavelet transform matrix be given, together with the possible wavelet transform matrices obtained by applying circular image shifts to the basis functions; the TIWT matrix and its inverse are then expressed as:
- W T W ⁇ I and W is not orthogonal.
- the original image is decomposed into 4 subband images: the approximation part LL and the detail part, which comprises the horizontal component HL, the vertical component LH and the diagonal component HH (each having 1/4 the size of the original image).
- the low frequency component is the subband LL that contains most of the information of the original image.
- the subbands denoted HL, LH and HH contain the finest scale detail wavelet coefficients, corresponding to the higher frequency detail information of the original image. It should be noted that after 2D-TIWT decomposition, each subband image always has almost the same size as the original input image.
- a 2D inverse translation-invariant wavelet transform consisting of four subbands can completely reconstruct the original image.
- image priors with multi-scale and multi-view characteristics are learned by TIWT.
- the multi-faceted data obtained from the transform domain provides more contour prior information, which is of great help in restoration tasks.
- the embodiment of the present invention constructs a multi-faceted data composed of elements in the wavelet domain and the pixel domain, and forms a tensor as the network input.
- Figure 2(c) depicts the formation of a 5-channel tensor in the transform domain.
- the final training data consists of two components: the former, Ix, is the original image, and the latter, Wx, is the combination of the four subband images.
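The construction of the 5-channel tensor can be illustrated with a small sketch. It assumes a single-level undecimated Haar transform with periodic boundaries as a stand-in for the TIWT (the source does not name the wavelet at this point), and the function names and the `ll`/`lh`/`hl`/`hh` labels, which follow the filter order used here, are hypothetical.

```python
def haar_pair(a, b):
    # Undecimated Haar low-pass and high-pass outputs for two neighbouring samples.
    return (a + b) / 2.0, (a - b) / 2.0

def undecimated_haar2d(img):
    # Returns four subbands, each the same size as the input image.
    h, w = len(img), len(img[0])
    lo = [[0.0] * w for _ in range(h)]
    hi = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Filter along each row, pairing a pixel with its circular right neighbour.
            lo[i][j], hi[i][j] = haar_pair(img[i][j], img[i][(j + 1) % w])

    def along_columns(band):
        low = [[0.0] * w for _ in range(h)]
        high = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                low[i][j], high[i][j] = haar_pair(band[i][j], band[(i + 1) % h][j])
        return low, high

    ll, lh = along_columns(lo)
    hl, hh = along_columns(hi)
    return ll, lh, hl, hh

def build_five_channel_tensor(img):
    # Channel 0 is the pixel-domain image; channels 1 to 4 are the subbands.
    pixels = [[float(v) for v in row] for row in img]
    return [pixels] + list(undecimated_haar2d(img))

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
tensor = build_five_channel_tensor(image)
```

With this particular filter pair the four subbands sum back to the input image, which is a simple way to check the statement that the subbands allow complete reconstruction.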
- Step S120 using the training data set to train the denoising auto-encoder to learn the prior in the transform domain.
- the network design process is illustrated by taking CT Image-based Enhanced Classical Denoising Autoencoder (TDAEP-CT) as an example.
- TDAEP-CT CT Image-based Enhanced Classical Denoising Autoencoder
- according to Equation (3), the optimal DAE reconstruction function at each point x is given by a convolution with the density function p, that is, a weighted average of the points in the neighborhood of x.
- the autoencoder error is proportional to the log-likelihood gradient of the smoothed density, that is:
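The relation between the autoencoder error and the gradient of the smoothed log-density can be checked numerically in a case with a known closed form. The sketch below assumes one-dimensional Gaussian data and Gaussian corruption noise, which is an illustrative choice and not the patent's setting; the function names and integration parameters are hypothetical.

```python
import math

def gauss(u, std):
    return math.exp(-0.5 * (u / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def optimal_dae(z, data_std, noise_std, half_width=12.0, steps=24001):
    # Weighted average of clean values (z - eta) that could have produced z,
    # evaluated with a simple Riemann sum over the noise variable eta.
    num = 0.0
    den = 0.0
    for i in range(steps):
        eta = -half_width + 2.0 * half_width * i / (steps - 1)
        w = gauss(eta, noise_std) * gauss(z - eta, data_std)
        num += (z - eta) * w
        den += w
    return num / den

def smoothed_score(z, data_std, noise_std):
    # d/dz of log N(z; 0, data_std^2 + noise_std^2), the noise-smoothed density.
    return -z / (data_std ** 2 + noise_std ** 2)

data_std, noise_std = 1.0, 0.5
for z in (-1.5, 0.3, 2.0):
    error = optimal_dae(z, data_std, noise_std) - z
    print(z, error, noise_std ** 2 * smoothed_score(z, data_std, noise_std))
```

In this Gaussian case the two printed quantities agree, with the noise variance as the proportionality constant.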
- DAEP adopts the transfer characteristics of prior information and uses the magnitude of this mean shift vector as the negative log-likelihood of the image prior, which is expressed as:
- DMSP Deep Mean Shift Prior
- following the preceding derivation, a high-dimensional embedding network can also be employed, applying the learned prior to single-channel MRI reconstruction through variable augmentation techniques.
- the TDAEP-CT provided by the present invention mainly includes two processes: learning the prior information in the 5-channel tensor space instead of the original CT pixel space; and introducing the prior information learned from the 5-channel tensor space into the iterative process that solves the CT image restoration problem.
- to obtain the TDAEP prior, a TDAE network is trained on data pairs consisting of 5-channel tensors and their noisy versions.
- TDAEP prior is defined as:
- x is the original image
- a 5-channel tensor in the transform domain has two components: the former, Ix, is the original image, and the latter, Wx, is the combination of the four subband images.
- the prior is expressed in terms of the DAE and its output, where ‖·‖ represents the two-norm.
- the biggest innovation of the present invention is to learn the prior information in the transform domain and apply it to the image restoration (IR) task.
- in the image restoration task, the image wavelet domain is combined with the original pixel domain to obtain the image in the transform domain, which is used to drive the network to extract image priors.
- TDAEP is superior to DAEP in image feature extraction.
- using the image transform domain can enhance the image restoration process.
- x is the original image
- M is the degradation factor/operator
- y is the generated image after degradation
- n is the additive noise
- the regularization parameter controls the trade-off between the data fidelity term and the regularization term.
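As a compact summary of the quantities defined above, the restoration model can be written as follows. This is a sketch in assumed notation: the symbols $X$, $A_{\sigma}$, $R$ and $\lambda$ do not survive in the extracted text, and the exact form of the source equations may differ.

```latex
y = Mx + n, \qquad
X = \begin{bmatrix} Ix \\ Wx \end{bmatrix}, \qquad
R(x) = \left\| X - A_{\sigma}(X) \right\|_2^2, \qquad
\hat{x} = \arg\min_{x} \; \| Mx - y \|_2^2 + \lambda \, R(x)
```

Here $A_{\sigma}$ stands for the trained denoising auto-encoder, $R$ for the resulting prior term, and $\lambda$ for the regularization parameter.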
- the regular terms extracted from the wavelet domain are as follows:
- the present invention jointly learns them as tensors by superposition, accompanied by a loss function with lower penalty. Better learning ability helps the network to effectively extract redundant feature information and generate more compact representations.
- the multi-scale and multi-view properties of the transform domain are achieved by adding artificial noise to the pixel and wavelet domains simultaneously. They complement each other to obtain higher quality prior information.
- the network is trained and used by the following two equations:
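Since the equations themselves are missing here, the following sketch only illustrates the general denoising auto-encoder training setup the text describes: artificial Gaussian noise is added to every channel of the 5-channel tensor, and the network is scored on how well it recovers the clean tensor. The per-channel noise levels, the mean-squared-error loss and the identity "network" are assumptions for illustration, not values from the source.

```python
import random

def add_channel_noise(tensor, sigmas, rng):
    # Add zero-mean Gaussian noise to each channel with its own standard deviation.
    return [[[v + rng.gauss(0.0, s) for v in row] for row in channel]
            for channel, s in zip(tensor, sigmas)]

def dae_loss(network, clean, noisy):
    # Mean squared error between the network output on the noisy tensor
    # and the clean tensor (a standard denoising auto-encoder objective).
    output = network(noisy)
    total = 0.0
    count = 0
    for out_ch, clean_ch in zip(output, clean):
        for out_row, clean_row in zip(out_ch, clean_ch):
            for a, b in zip(out_row, clean_row):
                total += (a - b) ** 2
                count += 1
    return total / count

rng = random.Random(0)
clean = [[[float(c + i + j) for j in range(4)] for i in range(4)] for c in range(5)]
# Hypothetical noise levels: one for the pixel channel, one shared by the subbands.
sigmas = [0.10, 0.05, 0.05, 0.05, 0.05]
noisy = add_channel_noise(clean, sigmas, rng)
identity_network = lambda t: t  # placeholder for a trained TDAE
loss = dae_loss(identity_network, clean, noisy)
```

A real training loop would replace `identity_network` with a trainable model and minimize this loss over many tensors.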
- the network architecture of the present invention can use various types of end-to-end convolutional neural networks, such as ResNet, DenseNet and DualPathNet.
- ResNet introduces a shortcut connection scheme so that the output of the previous residual block flows directly into the next one. It therefore improves information flow and avoids vanishing gradients.
- owing to the good performance of ResNet in VDSR, EDSR and SRGAN, the TDAE network in the present invention uses ResNet blocks as building blocks.
- both the input and output of the TDAE network are 5-channel tensors.
- the main body of the network includes five components, each of which is composed of "CONV+BN+ReLU", "CONV+BN" and "ReLU" parts.
- CONV, BN and ReLU denote convolutional layers, batch normalization, and rectified linear units used to accelerate network learning, respectively.
- the number of filters in each convolutional layer is set to 320, except for the last layer, which has 5 filters.
- the kernel size of each convolutional layer is set to 3 × 3. Its structure is similar to DnCNN (Denoising Convolutional Neural Network) except for the network input and output and the additional ResNet blocks. It should be noted that a more complex network can be used in TDAE to ensure more efficient learning ability.
- step S130 an optimized denoising auto-encoder network is obtained by iterative solution.
- a proximal gradient method is employed to handle the nonlinearity of the network and the resulting model equations.
- the model can be approximated by standard least squares minimization, expressed as:
- Equation (16) is a standard LS (least squares) problem, which can be solved by computing the gradient as follows:
- R represents the averaging operator applied to the first channel image and the intermediate ITIWT results. The network itself has been learned during the network training phase.
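The iterative solution can be sketched for the simplest case, denoising, where the degradation operator is the identity. At each iteration the denoiser output is frozen, which turns the problem into a least-squares one whose gradient can be set to zero in closed form. The moving-average `denoiser` is only a stand-in for a trained TDAE, and `lam`, the iteration count and the one-dimensional test signal are hypothetical choices.

```python
def moving_average(x):
    # Stand-in denoiser: circular 3-point moving average (not a trained network).
    n = len(x)
    return [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]

def restore(y, denoiser, lam=3.0, iterations=50):
    # Alternates between evaluating the denoiser at the current estimate and
    # solving  min_x ||x - y||^2 + lam * ||x - a||^2  with a = denoiser(x_k) fixed.
    # Setting the gradient 2(x - y) + 2*lam*(x - a) to zero gives the update below.
    x = list(y)
    for _ in range(iterations):
        a = denoiser(x)
        x = [(yi + lam * ai) / (1.0 + lam) for yi, ai in zip(y, a)]
    return x

clean = [1.0] * 16
noisy = [1.0 + (0.5 if i % 2 == 0 else -0.5) for i in range(16)]
restored = restore(noisy, moving_average)
max_error_before = max(abs(v - c) for v, c in zip(noisy, clean))
max_error_after = max(abs(v - c) for v, c in zip(restored, clean))
```

For a general degradation operator M the closed-form update is replaced by a gradient step on the same least-squares objective.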
- Figure 3 is a network flow chart for TDAEP learning, in which the input is a 5-channel image plus artificial Gaussian noise; the middle part shows a 20-layer network consisting of 5 residual "blocks", 1 "CONV+ReLU", 3 "CONV+BN+ReLU" and 1 "CONV"; the specific structure of a "block" is shown in the upper part of Figure 3.
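The layer composition listed above can be tallied with a short script. The text does not state how many convolutions a residual block contains beyond its listed parts, nor the exact layer order, so both are assumptions here: `convs_per_block` is left as a parameter, and the order (head, three CONV+BN+ReLU layers, blocks, final CONV) is a guess. Batch-normalization parameters are ignored.

```python
def conv_params(c_in, c_out, k=3):
    # Weights plus biases of a k x k convolution.
    return k * k * c_in * c_out + c_out

def tdae_layer_spec(channels=5, width=320, n_blocks=5, convs_per_block=2):
    # Each entry: (label, input channels, output channels).
    spec = [("CONV+ReLU", channels, width)]
    spec += [("CONV+BN+ReLU", width, width)] * 3
    for _ in range(n_blocks):
        spec += [("block CONV", width, width)] * convs_per_block
    spec.append(("CONV", width, channels))
    return spec

def summarize(spec):
    # Number of convolutional layers and total convolution parameters.
    n_layers = len(spec)
    n_params = sum(conv_params(c_in, c_out) for _, c_in, c_out in spec)
    return n_layers, n_params

for convs in (2, 3):
    layers, params = summarize(tdae_layer_spec(convs_per_block=convs))
    print(convs, layers, params)
```

Under these assumptions, three convolutions per block give 20 convolutional layers, while two per block give 15.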
- the present invention extracts the prior in the transform domain, that is, jointly extracts the prior of the damaged object in the pixel domain and the intermediate wavelet domain, rather than in the pixel domain or the wavelet domain, respectively, which is constructed from the original image and multi-channel transform features.
- TIWT translation-invariant wavelet transform
- noise and high-frequency components can be efficiently optimized.
- different noise weighting strategies are adopted in the network design process, which makes the design process more robust and stable for different restoration tasks. This strategy is beneficial to avoid falling into local minima and make the iterative process more stable.
- alternating optimization and approximate gradient descent techniques are employed to solve the non-convex image restoration minimization problem.
- the present invention may be a system, method and/or computer program product.
- the computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for causing a processor to implement various aspects of the present invention.
- a computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device.
- the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
- a non-exhaustive list of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically coded devices such as punch cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the above.
- RAM random access memory
- ROM read-only memory
- EPROM erasable programmable read-only memory
- SRAM static random access memory
- CD-ROM compact disk read-only memory
- DVD digital versatile disk
- Computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.
- the computer readable program instructions described herein may be downloaded to various computing/processing devices from a computer readable storage medium, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network.
- the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
- a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in that computing/processing device.
- the computer program instructions for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ and Python, and conventional procedural programming languages such as the "C" language or similar programming languages.
- the computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
- the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
- LAN local area network
- WAN wide area network
- custom electronic circuits such as programmable logic circuits, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs)
- FPGAs field programmable gate arrays
- PLAs programmable logic arrays
- Computer readable program instructions are executed to implement various aspects of the present invention.
- These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
- These computer-readable program instructions can also be stored in a computer-readable storage medium; they cause a computer, programmable data processing apparatus and/or other equipment to operate in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture comprising instructions that implement various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
- Computer-readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other equipment to cause a series of operational steps to be performed there so as to produce a computer-implemented process, such that the instructions executing on the computer, other programmable data processing apparatus, or other equipment implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
- each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
- the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
- each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or actions, or can be implemented in a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Image Processing (AREA)
Abstract
Disclosed are an image processing method based on a transform domain denoising autoencoder as a priori. The method comprises: using an original image and multi-channel transformation features to construct a multi-channel tensor space having multi-scale and multi-view characteristics, and constructing a training data set; training a de-noising autoencoder network on the basis of the training data set so as to combine an image transform domain with an original pixel domain to obtain an image in the transform domain, and learning a priori information in the multi-channel tensor space by using the image in the transform domain; and introducing the a priori information learned from the multi-channel tensor space into an iteration process for processing image restoration problems to carry out solving, so as to obtain an optimized denoising autoencoder network. Using a reconstructed image obtained in the present invention, the quality of the image is improved while maintaining more texture details, and diagnostic requirements are better satisfied.
Description
The present invention relates to the technical field of medical image processing, and more particularly to an image processing method based on a transform domain denoising auto-encoder as a prior.
In recent years, medical image processing has been widely used in clinical guidance. For example, X-ray computed tomography (CT) is used for diagnosis and intervention in hospitals and clinics. Because of the exposure to radiation, X-ray CT may pose a potential risk of cancer or genetic disease. X-CT medical images have the advantages of high density resolution of tissue structures and little damage to the human body, and are very important for the study of pathology and anatomy. However, during X-CT scanning and image transmission, blurred images or indistinct borders can occur, which lowers the readability of X-CT medical images and prevents doctors from making an accurate diagnosis. It is therefore necessary to reduce the X-ray dose. State-of-the-art techniques generally attempt to solve this problem in two ways: by reducing the operating current and exposure time of the X-ray tube, or by reducing the number of sampled views. The former approach can address the noise problem introduced by low signal-to-noise ratio (SNR) projections. The latter approach is generally safer, but it produces insufficient projection data, i.e. the views are sparse and noisy. X-ray dose is a key indicator in X-CT medical imaging: the higher the X-ray dose, the clearer the image. However, as the X-ray dose increases, so does the harm to the human body. At present, the equipment in many hospitals already meets the minimum dose requirements, but minimum-dose CT comes with low image quality and noise. Obtaining high-quality CT images at low dose (with minimum harm to the human body) has important scientific significance and broad application prospects for the field of medical diagnosis.
Chen Hu et al. published the paper "Low-dose CT denoising with convolutional neural network" at the IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017) in 2017, successfully applying deep neural networks to low-dose CT denoising. The scheme denoises low-dose CT images with a deep neural network without access to the raw projection data. A deep convolutional neural network is trained to convert low-dose CT images step by step into normal-dose CT images. However, the dataset used by this scheme consists of paired, one-to-one corresponding low-dose and high-dose CT images, in which each low-dose image is generated by applying Poisson noise, for a given blank-scan flux, to every detector element of a simulated normal-dose sinogram.
Eunhee Kang et al. published the paper "Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network" in IEEE Transactions on Medical Imaging in 2018, proposing a denoising scheme based on a wavelet residual network. The scheme combines the expressive power of deep learning with the performance guarantees of denoising algorithms based on wavelet frames. However, its low-dose CT images are likewise simulated from high-dose images.
Zhanli Hu et al. published the paper "A feature refinement approach for statistical interior CT reconstruction" in Physics in Medicine and Biology in 2016, proposing a statistical interior tomography method for computed tomography. The scheme focuses on the statistical properties of local projection data and recovers the fine structure lost in conventional total variation (TV) minimization reconstruction. The proposed method uses a compressed sensing framework that only assumes the interior region of interest (ROI) to be piecewise constant or polynomial and requires no other prior knowledge. To incorporate the statistical distribution of the projection data, the objective function is built under the penalized weighted least-squares criterion (PWLS-TV). In the proposed method, an FBP reconstruction based on extrapolation of the interior projections is first used as the initial guess, to mitigate truncation artifacts and provide an extended field of view.
Analysis shows that existing CT image processing has the following main defects. Given the potential risk of exposing patients to X-rays, low-dose CT is a commonly used diagnostic tool in clinical medicine, but the low dose used during CT imaging produces substantial quantum noise and blurred morphological features in the reconstructed image. Existing deep-learning-based image reconstruction schemes rely on datasets of paired, one-to-one corresponding low-dose and high-dose CT images, yet clean one-to-one corresponding CT images are rare in practice. In the prior art, each low-dose image is generated by applying Poisson noise, for a given blank-scan flux, to every detector element of a simulated normal-dose sinogram, which makes the computation complicated and inefficient.
SUMMARY OF THE INVENTION
The purpose of the present invention is to overcome the above defects of the prior art by providing an image processing method that uses a transform-domain denoising autoencoder as a prior. It is a new technical solution that uses prior information obtained by unsupervised learning to denoise low-dose images.
According to a first aspect of the present invention, an image processing method using a transform-domain denoising autoencoder as a prior is provided. The method includes the following steps:
Step S1: construct a multi-channel tensor space with multi-scale and multi-view characteristics from the original image and multi-channel transform features, and build a training data set;
Step S2: train a denoising autoencoder network on the training data set so as to combine the image transform domain with the original pixel domain, obtain the image in the transform domain, and use that image to learn the prior information in the multi-channel tensor space;
Step S3: introduce the prior information learned from the multi-channel tensor space into the iterative process that solves the image restoration problem, and obtain an optimized denoising autoencoder network.
According to a second aspect of the present invention, an image processing method is provided. The method includes: transforming the image to be processed to obtain a transform-domain image;
combining the image to be processed with its transform-domain image, inputting the result to the optimized denoising autoencoder network obtained according to the present invention, and outputting the reconstructed image.
Compared with the prior art, the advantage of the present invention lies in the provided image processing method using a transform-domain denoising autoencoder as a prior. Its core idea is to enhance the classical denoising autoencoder (DAE) through the transform domain, so that the autoencoder captures complementary information from multiple views. This improves image quality while preserving more texture detail, making the processed image clearer and better suited to diagnostic needs.
Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments of the present invention with reference to the accompanying drawings.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
FIG. 1 is a flowchart of an image processing method using a transform-domain denoising autoencoder as a prior according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the overall process of an image processing method using a transform-domain denoising autoencoder as a prior according to an embodiment of the present invention;
FIG. 3 is a flowchart of network learning using a transform-domain denoising autoencoder as a prior according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of experimental results according to an embodiment of the present invention.
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention, its application, or its uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be considered part of the specification.
In all examples shown and discussed herein, any specific value should be construed as illustrative only and not as limiting. Other examples of the exemplary embodiments may therefore have different values.
It should be noted that like numerals and letters refer to like items in the following figures, so once an item is defined in one figure, it need not be discussed further in subsequent figures.
The image processing method provided by the present invention, which uses a transform-domain denoising autoencoder as a prior, can be applied to many types of image reconstruction, such as magnetic resonance imaging, computed tomography, and positron emission computed tomography, and to tasks such as image denoising, deblocking, and demosaicing. For clarity, the following description takes CT image denoising as an example.
Inspired by the strong performance of priors that jointly exploit the pixel domain and the wavelet domain, the present invention proposes a CT denoising algorithm that uses a transform-domain denoising autoencoder as a prior (TDAEP-CT). The core idea is to enhance the classical denoising autoencoder (DAE) through the transform domain, so that the autoencoder captures complementary information from multiple views. In short, the method includes: using non-orthogonal wavelet coefficients to form a multi-channel feature image (for example, a 4-channel feature image); stacking the original image in the pixel domain with the multi-channel feature image in the wavelet domain to obtain a multi-channel tensor (for example, a 5-channel tensor); training the transformed DAE (also called the TDAE) with this multi-channel tensor as the network input; and obtaining an optimized image prior from the trained autoencoder and incorporating it into the iterative restoration process by means of an auxiliary-variable technique.
Specifically, referring to FIG. 1 and taking CT image denoising as an example, the provided image processing method using a transform-domain denoising autoencoder as a prior includes the following steps.
Step S110: register the unpaired CT images, generate a multi-channel CT tensor, and build the training data set.
For example, referring to FIG. 2, this step includes: first normalizing the unpaired CT images so that their sizes are consistent during the training stage; then applying a wavelet transform to each CT image (for example, using level-1 non-orthogonal wavelet coefficients to form a 4-channel feature image); and stacking the four wavelet images with the original image to form a 5-channel CT image tensor. The unpaired CT images are shown in FIG. 2(a), the wavelet transform of the CT images is shown in FIG. 2(b), and the formation of the 5-channel CT image tensor is shown in FIG. 2(c).
As shown in FIG. 2(a), this embodiment uses the wavelet transform (WT) to generate the transform domain. The wavelet transform can analyze image features effectively, especially image details. Although the wavelet transform has been successful in information retrieval tasks, there is still room for improvement. In the conventional discrete wavelet transform, the pseudo-Gibbs phenomenon occurs near discontinuities of the signal: it causes alternating undershoot and overshoot near singularities of the reconstructed signal and produces blocky artifacts in the processed image. These practical defects can be alleviated by the TIWT (translation-invariant wavelet transform, also called cycle spinning), whose core idea is to "average out" the dependence on translation. The TIWT computes the inner products between the image and all (circularly) shifted versions of the wavelet basis functions. Restoration is then achieved by applying a thresholding operator followed by an averaging operator. The TIWT avoids the pseudo-Gibbs phenomenon during denoising and achieves a better gain than the DWT (discrete wavelet transform) in removing noise and recovering attenuated high-frequency components.
In one embodiment, the overcomplete wavelet transform consists of N orthogonal wavelet transforms, each formed by a circular shift of the wavelet basis functions. Starting from a basic orthogonal wavelet transform matrix, circular image shifts are applied to its basis functions to produce the possible wavelet transform matrices, and the TIWT matrix $W$ and its inverse are assembled from these N shifted transforms.
Therefore,
$$WW^{T}x = x, \qquad WW^{T} = I \qquad (2)$$
It is worth noting that $W^{T}W \neq I$, so $W$ is not orthogonal.
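The redundancy behind equation (2) can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration: a one-level Haar matrix stands in for the basic orthogonal transform, only two circular shifts are used, and the names `haar_matrix`, `shift_matrix`, `analysis`, and `synthesis` are hypothetical. The tall matrix `analysis` maps a signal to its coefficients, so here it is `synthesis @ analysis` that equals the identity, while the product in the other order does not; whether that is written $WW^{T}$ or $W^{T}W$ depends on which of the two operators the symbol $W$ denotes.

```python
import numpy as np

def haar_matrix(n):
    """One-level orthonormal Haar analysis matrix for an even length n."""
    h = np.zeros((n, n))
    s = 1.0 / np.sqrt(2.0)
    for i in range(n // 2):
        h[i, 2 * i], h[i, 2 * i + 1] = s, s                      # local averages
        h[n // 2 + i, 2 * i], h[n // 2 + i, 2 * i + 1] = s, -s   # local differences
    return h

def shift_matrix(n, k):
    """Permutation matrix that circularly shifts a length-n vector by k samples."""
    return np.roll(np.eye(n), k, axis=0)

n, shifts = 8, (0, 1)
psi = haar_matrix(n)
# Stack the shifted orthogonal transforms and scale by 1/sqrt(N), N = number of shifts.
analysis = np.vstack([psi @ shift_matrix(n, k) for k in shifts]) / np.sqrt(len(shifts))
synthesis = analysis.T

x = np.arange(n, dtype=float)
coeffs = analysis @ x            # twice as many coefficients as samples
x_rec = synthesis @ coeffs       # averaging over the shifts recovers the signal
reconstruction_is_exact = bool(np.allclose(x_rec, x))
redundant_product_is_identity = bool(np.allclose(analysis @ synthesis, np.eye(2 * n)))
```

The reconstruction is exact even though the coefficient vector is redundant, which is the property the overcomplete transform relies on.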
After a level-1 2D-TIWT, the original image is decomposed into four subband images: the approximation LL and the detail subbands, namely the horizontal component HL, the vertical component LH, and the diagonal component HH. The low-frequency subband LL contains most of the information in the original image. The subbands HL, LH, and HH contain the finest-scale detail wavelet coefficients, which correspond to the higher-frequency detail of the original image. Note that, unlike a decimated DWT, in which each component has 1/4 the size of the original image, each subband image after 2D-TIWT decomposition has almost the same size as the original input image. Applying the 2D inverse translation-invariant wavelet transform to the four subbands completely reconstructs the original image. In this embodiment, image priors with multi-scale and multi-view characteristics are learned through the TIWT.
The multi-faceted data obtained from the transform domain provides more contour prior information, which greatly helps restoration tasks. The embodiment of the present invention builds multi-faceted data from wavelet-domain and pixel-domain elements and forms a tensor that serves as the network input. FIG. 2(c) depicts the formation of the 5-channel tensor in the transform domain. In one embodiment, the final training data is $X = [Ix, Wx]$, where the first component $Ix$ is the original image and the second component $Wx$ is the combination of the four subband images.
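The construction of the 5-channel tensor and of its noisy counterpart can be sketched as follows. This is a minimal illustration under stated assumptions, not the transform used in the embodiment: a one-level undecimated Haar filter bank with circular boundaries stands in for the TIWT, the subband labels follow one common convention, the noise level 0.1 is arbitrary, and the function names `undecimated_haar_2d` and `build_tensor` are hypothetical.

```python
import numpy as np

def undecimated_haar_2d(img):
    """One-level undecimated Haar filter bank with circular boundaries.

    Returns four subbands (LL, HL, LH, HH), each with the same shape as img.
    """
    def low(a, axis):
        return (a + np.roll(a, -1, axis=axis)) / 2.0
    def high(a, axis):
        return (a - np.roll(a, -1, axis=axis)) / 2.0
    lo, hi = low(img, 1), high(img, 1)        # filter along the second axis
    ll, hl = low(lo, 0), low(hi, 0)           # then along the first axis
    lh, hh = high(lo, 0), high(hi, 0)
    return ll, hl, lh, hh

def build_tensor(img):
    """Normalize to [0, 1] and stack the image with its four subbands."""
    img = (img - img.min()) / (img.max() - img.min())
    return np.stack([img, *undecimated_haar_2d(img)], axis=-1)

rng = np.random.default_rng(0)
image = rng.random((16, 16))                  # placeholder for a CT slice
clean = build_tensor(image)                   # shape (16, 16, 5)
# Artificial Gaussian noise is added to all five channels at once,
# giving the (noisy, clean) pair used to train a denoising autoencoder.
noisy = clean + rng.normal(0.0, 0.1, size=clean.shape)
```

With this particular filter bank the four subbands sum back to the image, so the pixel channel and the wavelet channels carry redundant views of the same content.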
Step S120: train the denoising autoencoder on the training data set to learn the prior in the transform domain.
In this step, the network design process is explained using the enhanced classical denoising autoencoder for CT images (TDAEP-CT) as an example.
Specifically, building on the DAE, Bigdeli et al. proposed the denoising autoencoder prior (DAEP), which uses the magnitude of the DAE error as prior information for image restoration. Let the DAE be $A_{\sigma_\eta}$, with output $A_{\sigma_\eta}(x)$. Its optimum is obtained by training with Gaussian noise and an expected quadratic loss:
$$L_{DAE} = E_{x,\eta}\left[\left\|x - A_{\sigma_\eta}(x+\eta)\right\|^{2}\right]$$
where the expectation $E_{x,\eta}$ is taken over all images $x$ and over Gaussian noise $\eta$ with standard deviation $\sigma_\eta$. It can be derived that:
$$A_{\sigma_\eta}(x) = x - \frac{E_{\eta}\left[p(x-\eta)\,\eta\right]}{E_{\eta}\left[p(x-\eta)\right]} = \frac{\int g_{\sigma_\eta}(\eta)\,p(x-\eta)\,(x-\eta)\,d\eta}{\int g_{\sigma_\eta}(\eta)\,p(x-\eta)\,d\eta}$$
where $p(x)$ is the true data density and $g_{\sigma_\eta}$ is a local Gaussian kernel. It can be seen from equation (3) that the optimal DAE reconstruction function at each point $x$ is given by a convolution involving the density $p$, that is, a weighted average of the points in the neighborhood of $x$.
Furthermore, the Gaussian density satisfies $\nabla g_{\sigma_\eta}(\eta) = -\eta\, g_{\sigma_\eta}(\eta)/\sigma_\eta^{2}$, so the autoencoder error $A_{\sigma_\eta}(x) - x$ is proportional to the gradient of the log-likelihood of the smoothed density:
$$A_{\sigma_\eta}(x) - x = \sigma_\eta^{2}\,\nabla \log\left[g_{\sigma_\eta} * p\right](x)$$
where $*$ is the convolution operator. DAEP therefore exploits this mean-shift property of the prior and uses the magnitude of the mean-shift vector as the negative log-likelihood of the image prior:
$$L_{DAEP}(x) = \left\|A_{\sigma_\eta}(x) - x\right\|^{2}$$
As in equation (5), the DAE learns, from a given set of data samples, a mean-shift vector field that is proportional to the gradient of the log prior. On this basis, Bigdeli et al. proposed a new prior called the deep mean-shift prior (DMSP), which is used in a gradient-descent manner to minimize the Bayes risk. The DMSP formula is expressed as:
By extending the original DMSP and integrating multi-model aggregation and multi-channel network learning, a prior derived from a high-dimensional embedding network can also be used, with the learned prior applied to single-channel MRI reconstruction through a variable augmentation technique.
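The mean-shift relation that DAEP and DMSP rely on, namely that the error of an optimal DAE is proportional to the gradient of the log smoothed density, can be checked numerically in a toy setting. The sketch below assumes scalar Gaussian data, for which the optimal denoiser is linear; the standard deviations `s` and `sigma` and the sample size are arbitrary illustrative values, not values from this document.

```python
import numpy as np

rng = np.random.default_rng(0)
s, sigma = 2.0, 0.5                          # data std and noise std (toy values)
x = rng.normal(0.0, s, size=200_000)         # clean samples from p = N(0, s^2)
y = x + rng.normal(0.0, sigma, size=x.size)  # noisy inputs seen by the DAE

# Best linear denoiser A(y) = a * y under the quadratic loss E|x - A(y)|^2.
a_fit = float(np.dot(x, y) / np.dot(y, y))

# Prediction from A(y) - y = sigma^2 * d/dy log (g * p)(y).
# Here g * p = N(0, s^2 + sigma^2), so d/dy log (g * p)(y) = -y / (s^2 + sigma^2).
a_theory = 1.0 - sigma**2 / (s**2 + sigma**2)
```

The fitted slope `a_fit` agrees with `a_theory` up to sampling error, which is the scalar version of the statement that the autoencoder residual points along the gradient of the log smoothed density.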
The TDAEP-CT provided by the present invention mainly comprises two processes: learning the prior information in the 5-channel tensor space rather than in the original CT pixel space, and introducing the prior information learned from the 5-channel tensor space into the iterative process that solves the CT image restoration problem.
First, in the learning stage, the TDAE network is trained on data pairs consisting of 5-channel tensors and their noisy versions. Accordingly, the TDAEP prior is defined as:
$$R(X) = \left\|X - A_{\sigma_\eta}(X)\right\|^{2}$$
where $x$ is the original image and the 5-channel tensor in the transform domain is $X = [Ix, Wx]$, whose first component $Ix$ is the original image and whose second component $Wx$ is the combination of the four subband images. The DAE is $A_{\sigma_\eta}$, its output is $A_{\sigma_\eta}(X)$, and $\|\cdot\|$ denotes the 2-norm.
The main innovation of the present invention is to learn prior information in the transform domain and apply it to image restoration (IR) tasks. In an image restoration task, the image wavelet domain is combined with the original pixel domain to obtain the image in the transform domain, which is then used to drive the network to extract the image prior. As explained below, TDAEP outperforms DAEP in image feature extraction, because using the image transform domain strengthens the image restoration process.
In the restoration model, the image degradation is described by $y = Mx + n$, where $x$ is the original image, $M$ is the degradation operator, $y$ is the resulting degraded image, and $n$ is additive noise. The parameter $\lambda$ controls the trade-off between the data-fidelity term and the regularization term.
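The degradation model and the role of $\lambda$ can be made concrete with a small sketch. It is only an illustration under assumptions: $M$ is taken to be a diagonal 0/1 mask (with $M = I$ for pure denoising), a simple smoothness penalty stands in for the learned regularizer, and the names `degrade`, `objective`, and `smoothness` are hypothetical.

```python
import numpy as np

def degrade(x, mask, noise_std, rng):
    """Simulate y = M x + n, with M assumed to be a diagonal 0/1 mask."""
    return mask * x + rng.normal(0.0, noise_std, size=x.shape)

def objective(x, y, mask, lam, regularizer):
    """Data-fidelity term plus lam times the regularization term."""
    fidelity = float(np.sum((mask * x - y) ** 2))
    return fidelity + lam * regularizer(x)

def smoothness(v):
    """Stand-in regularizer (not the learned prior): squared first differences."""
    return float(np.sum(np.diff(v) ** 2))

rng = np.random.default_rng(0)
x_true = rng.random(32)
mask = np.ones(32)                               # M = I corresponds to pure denoising
y = degrade(x_true, mask, noise_std=0.0, rng=rng)

fidelity_only = objective(x_true, y, mask, lam=0.0, regularizer=smoothness)
value = objective(x_true, y, mask, lam=0.5, regularizer=smoothness)
```

With noise-free data the fidelity term vanishes at the true image, and increasing `lam` shifts the balance of the objective toward the regularizer.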
Depending on whether the prior is obtained from the pixel domain, from the wavelet domain, or jointly from both, $R(x)$, $R(Wx)$, and $R([Ix, Wx])$ denote the three corresponding regularization terms. The regularization term extracted from the wavelet domain is $R(Wx)$, where $Wx$ denotes the combination of the four subband images. The superiority of the proposed regularization term follows from an inequality relating these three terms.
Compared with prior-induced regularization obtained separately in the pixel domain or the wavelet domain, the present invention learns the two jointly as a tensor by stacking them, which yields a loss function with a lower penalty. The better learning ability helps the network extract redundant feature information effectively and produce a more compact representation. The multi-scale and multi-view properties of the transform domain are realized by adding artificial noise to the pixel domain and the wavelet domain simultaneously. The two domains complement each other to yield higher-quality prior information.
Although the TDAEP in equations (5) and (6) provides a promising regularizer, challenges remain. In particular, computing its gradient is expensive and the derivation involves complex operations, because differentiating the prior requires differentiating through the network itself. To simplify the computation, the TDAE network is replaced by a substitute network, which gives the TDAEP and its gradient a simpler form.
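The cost of the exact gradient can be seen in a small sketch. It assumes the prior has the form $\|X - A(X)\|^{2}$ and replaces the network by a linear map `B`, purely so that the Jacobian is known in closed form; the matrix `B` and the function names are hypothetical. For a real network the Jacobian-transpose product below is what makes the gradient expensive.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# Linear stand-in for the trained network: A(v) = B v.
B = 0.5 * np.eye(n) + 0.05 * rng.normal(size=(n, n))

def prior(v):
    """Squared norm of the autoencoder residual v - A(v)."""
    r = v - B @ v
    return float(r @ r)

def prior_grad(v):
    """Chain rule: 2 (I - J)^T (v - A(v)), with Jacobian J = B for a linear map."""
    return 2.0 * (np.eye(n) - B).T @ (v - B @ v)

v = rng.normal(size=n)
eps = 1e-6
# Central finite differences along each coordinate direction.
numeric = np.array([(prior(v + eps * e) - prior(v - eps * e)) / (2 * eps)
                    for e in np.eye(n)])
max_abs_diff = float(np.max(np.abs(numeric - prior_grad(v))))
```

The analytic and numerical gradients agree, confirming that the exact gradient involves the network Jacobian and not only the residual.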
Therefore, in one embodiment, the network is trained and used according to the following two equations:
The network architecture of the present invention can use various types of end-to-end convolutional neural networks, such as ResNet, DenseNet, and DualPathNet. Basic layers and building blocks are two popular tools for designing a good architecture. In particular, ResNet introduces shortcut connections, so that the output of one residual block flows directly into the next. This improves information flow and avoids vanishing gradients. Because ResNet performs well in VDSR, EDSR, and SRGAN, the TDAE network of the present invention uses ResNet as its building block.
In one embodiment, both the input and the output of the TDAE network are 5-channel tensors. The main body of the network comprises five building blocks, each composed of "CONV+BN+ReLU", "CONV+BN", and "ReLU" components. The abbreviations "CONV", "BN", and "ReLU" stand for a convolutional layer, batch normalization (used to accelerate network learning), and a rectified linear unit, respectively. The number of filters in each convolutional layer is set to 320, except for the last layer, which has 5 filters. The kernel size of each convolutional layer is set to 3×3. Apart from the network input and output and the additional ResNet blocks, the structure is similar to that of DnCNN (denoising convolutional neural network). It should be noted that a more complex network can be used for the TDAE to provide more efficient learning.
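The stated layer sizes translate directly into kernel-weight counts. The short sketch below assumes that the first convolution takes the 5-channel tensor directly as input and that biases and batch-normalization parameters are not counted; the helper name `conv_weights` is hypothetical.

```python
def conv_weights(in_ch, out_ch, k=3):
    """Number of kernel weights in a k x k convolution (biases excluded)."""
    return k * k * in_ch * out_ch

first = conv_weights(5, 320)      # 5-channel input -> 320 feature maps
hidden = conv_weights(320, 320)   # any intermediate 320 -> 320 layer
last = conv_weights(320, 5)       # 320 feature maps -> 5-channel output
```

Each intermediate layer holds 921,600 kernel weights, while the first and last layers hold 14,400 each, so almost all of the capacity sits in the 320-filter body of the network.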
Step S130: obtain the optimized denoising autoencoder network by iterative solution.
In one embodiment, the proximal gradient method is used to handle the nonlinearity of the network and the resulting model equation. Specifically, the model can be approximated by a standard least-squares minimization, expressed as:
The function $G(x)$ is Lipschitz smooth, and $k$ denotes the iteration index. Here, $\beta = 1$ is set empirically in the experiments. With $\beta = 1$, equation (16) is a standard least-squares (LS) problem, which can be solved by computing the gradient as follows:
This gives:
where $R$ denotes the averaging operator applied to the first-channel image and the intermediate ITIWT result. The network has already been learned in the network training stage. The network estimate is then used to update the gradient component, alternating with the LS solver of equation (18), until the value of $x^{k+1}$ converges.
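The alternation between a network evaluation and a least-squares update can be sketched for the denoising case $M = I$. The sketch is an illustration under assumptions, not the update of equations (16) to (18): a moving-average filter stands in for the trained network, the signal is one-dimensional, the transform and the averaging operator $R$ are omitted, and `lam`, `iters`, and the function names are hypothetical.

```python
import numpy as np

def moving_average(v, width=9):
    """Toy denoiser standing in for the trained network (circular boundaries)."""
    half = width // 2
    return sum(np.roll(v, s) for s in range(-half, half + 1)) / width

def restore(y, denoiser, lam=4.0, iters=30):
    """Alternate a denoiser pass with the closed-form LS update for M = I."""
    x = y.copy()
    for _ in range(iters):
        z = denoiser(x)                    # prior-driven estimate from the current iterate
        x = (y + lam * z) / (1.0 + lam)    # argmin_x |x - y|^2 + lam * |x - z|^2
    return x

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
clean = np.sin(3.0 * t)
noisy = clean + rng.normal(0.0, 0.3, size=t.size)
restored = restore(noisy, moving_average)

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_restored = float(np.mean((restored - clean) ** 2))
```

Each pass keeps the iterate close to the measurement while pulling it toward the denoiser output, and on this toy signal the restored error ends up below the error of the noisy input.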
FIG. 3 is the network flowchart for TDAEP learning. The input is a 5-channel image with added artificial Gaussian noise. The middle part shows a 20-layer network composed of 5 residual "blocks", 1 "CONV+ReLU", 3 "CONV+BN+ReLU", and 1 "CONV"; the specific structure of a "block" is shown in the upper part of FIG. 3.
To further verify the effect of the present invention, simulation experiments were carried out. The results are shown in FIG. 4, which shows, from left to right, the low-dose CT image, the high-dose CT image, and the CT image denoised according to the present invention. The method of the present invention effectively improves the peak signal-to-noise ratio and the structural similarity of the image and, at the same time, recovers image detail to a certain extent. Extensive experiments show that the present invention is effective for CT denoising and can be applied to other types of image reconstruction, such as deblocking and demosaicing.
In summary, the present invention extracts the prior in the transform domain; that is, it extracts the prior of the degraded object jointly in the pixel domain and the intermediate wavelet domain, rather than separately in either domain, and it constructs a channel tensor with multi-scale and multi-view characteristics from the original image and multi-channel transform features. In particular, the translation-invariant wavelet transform (TIWT) allows noise and high-frequency components to be handled effectively. In addition, different noise-weighting strategies are adopted in the network design, which makes the design more robust and stable across restoration tasks; this strategy helps avoid local minima and makes the iterative process more stable. Further, after the high-dimensional prior has been learned with the TDAE network, alternating optimization and approximate gradient descent are used to solve the non-convex image restoration minimization problem.
本发明可以是系统、方法和/或计算机程序产品。计算机程序产品可以包括计算机可读存储介质,其上载有用于使处理器实现本发明的各个方面的计算机可读程序指令。The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for causing a processor to implement various aspects of the present invention.
计算机可读存储介质可以是可以保持和存储由指令执行设备使用的 指令的有形设备。计算机可读存储介质例如可以是――但不限于――电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、静态随机存取存储器(SRAM)、便携式压缩盘只读存储器(CD-ROM)、数字多功能盘(DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。这里所使用的计算机可读存储介质不被解释为瞬时信号本身,诸如无线电波或者其他自由传播的电磁波、通过波导或其他传输媒介传播的电磁波(例如,通过光纤电缆的光脉冲)、或者通过电线传输的电信号。A computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device. The computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (non-exhaustive list) of computer readable storage media include: portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM) or flash memory), static random access memory (SRAM), portable compact disk read only memory (CD-ROM), digital versatile disk (DVD), memory sticks, floppy disks, mechanically coded devices, such as printers with instructions stored thereon Hole cards or raised structures in grooves, and any suitable combination of the above. Computer-readable storage media, as used herein, are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (eg, light pulses through fiber optic cables), or through electrical wires transmitted electrical signals.
这里所描述的计算机可读程序指令可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。The computer readable program instructions described herein may be downloaded to various computing/processing devices from a computer readable storage medium, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from a network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .
The computer program instructions for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by utilizing state information of the computer-readable program instructions; the electronic circuit can then execute the computer-readable program instructions, thereby implementing various aspects of the present invention.
Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus, and/or other devices to operate in a specific manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executing on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
Various embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.
Claims (10)
- An image processing method based on a transform-domain denoising autoencoder as a prior, comprising the following steps:
  Step S1: constructing a multi-channel tensor space with multi-scale and multi-view characteristics from the original image and multi-channel transform features, and building a training data set;
  Step S2: training a denoising autoencoder network on the training data set so as to combine the image transform domain with the original pixel domain, obtain images in the transform domain, and use the transform-domain images to learn the prior information in the multi-channel tensor space;
  Step S3: introducing the prior information learned from the multi-channel tensor space into the iterative process for solving the image restoration problem, to obtain an optimized denoising autoencoder network.
- The method according to claim 1, wherein step S1 comprises:
  normalizing images that are not in one-to-one correspondence (unpaired images);
  performing a wavelet transform on the image, using level-1 non-orthogonal wavelet coefficients to form a 4-channel feature image, the original image being decomposed into four sub-band images, namely the low-frequency component LL, the horizontal component HL, the vertical component LH, and the diagonal component HH, wherein HL, LH, and HH characterize image details and LL characterizes the approximation of the image;
  stacking the original image in the pixel domain and the 4-channel feature image in the wavelet domain to obtain a 5-dimensional image tensor, and building the training data set.
- The method according to claim 2, characterized in that step S2 comprises:
  training the denoising autoencoder network with data pairs each consisting of a 5-channel tensor and its noisy version, the training data being represented as [formula missing in source], where the component Ix is the original image and the component Wx represents the combination of the four sub-band images.
- The method according to claim 3, characterized in that, in step S3, the prior learned by the denoising autoencoder network is defined as: [formula missing in source]
- The method according to claim 4, characterized in that, in step S3, the optimization problem of the denoising autoencoder network is expressed as: [formula missing in source]
  where y = Mx + n is the image degradation model, x is the original image, M is the degradation factor/operator, y is the resulting degraded image, n is additive noise, G(x) is [constant missing in source]-Lipschitz smooth, k is the iteration index, β and λ are preset parameters, and η is Gaussian noise.
- The method according to claim 5, characterized in that the optimization problem of the denoising autoencoder network is solved according to the following steps:
  computing the gradient by the following formula: [formula missing in source]
  to obtain: [formula missing in source]
- The method according to claim 1, characterized in that the image is a CT image, a magnetic resonance image, a computed tomography image, or a positron emission computed tomography image.
- An image processing method, comprising:
  transforming an image to be processed to obtain a transform-domain image;
  combining the image to be processed with the image transform domain, inputting the result into the optimized denoising autoencoder network obtained by the method according to any one of claims 1 to 7, and outputting a reconstructed image.
- A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
- A computer device, comprising a memory and a processor, the memory storing a computer program executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 8.
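The data construction recited in claims 2 and 3 (normalize, decompose into four sub-bands, stack with the original image, pair with a noisy copy) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the claims do not name the wavelet, so a level-1 undecimated Haar transform with periodic boundaries is used as a stand-in because it is redundant (and therefore non-orthogonal) and keeps every sub-band at the image size; the function names, the sub-band naming convention, the Gaussian noise model, and the `sigma` value are all hypothetical.

```python
import numpy as np


def normalize(img):
    """Scale an image to [0, 1]; a constant image maps to all zeros."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


def undecimated_haar_level1(img):
    """Level-1 undecimated Haar transform with periodic boundaries.

    Stand-in for the wavelet of the claims (assumption). Returns the
    sub-bands LL, HL, LH, HH, each with the same shape as `img`.
    """
    # Low-pass / high-pass along the horizontal direction (axis=1).
    shifted = np.roll(img, -1, axis=1)
    lo_h = (img + shifted) / 2.0
    hi_h = (img - shifted) / 2.0

    def lo_v(a):  # low-pass along the vertical direction (axis=0)
        return (a + np.roll(a, -1, axis=0)) / 2.0

    def hi_v(a):  # high-pass along the vertical direction (axis=0)
        return (a - np.roll(a, -1, axis=0)) / 2.0

    # Sub-band naming follows one common convention (assumption).
    return lo_v(lo_h), lo_v(hi_h), hi_v(lo_h), hi_v(hi_h)


def build_tensor(img):
    """Stack the normalized image with its four sub-bands: shape (5, H, W)."""
    x = normalize(img)
    return np.stack([x, *undecimated_haar_level1(x)], axis=0)


def make_training_pair(img, sigma=0.1, seed=0):
    """Return (noisy, clean) 5-channel tensors.

    `sigma` is a hypothetical Gaussian noise level, not a value from the source.
    """
    clean = build_tensor(img)
    rng = np.random.default_rng(seed)
    noisy = clean + sigma * rng.standard_normal(clean.shape)
    return noisy, clean
```

For this particular stand-in transform the four sub-bands sum back to the normalized image, which gives a quick sanity check on the stacking order. A denoising autoencoder would then be trained to map `noisy` to `clean`.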
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/090956 WO2022226886A1 (en) | 2021-04-29 | 2021-04-29 | Image processing method based on transform domain denoising autoencoder as a priori |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022226886A1 (en) | 2022-11-03 |
Family
ID=83846603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/090956 WO2022226886A1 (en) | 2021-04-29 | 2021-04-29 | Image processing method based on transform domain denoising autoencoder as a priori |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022226886A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110223255A (en) * | 2019-06-11 | 2019-09-10 | Taiyuan University of Science and Technology | Shallow residual encoder-decoder recursive network for low-dose CT image denoising |
CN110246094A (en) * | 2019-05-13 | 2019-09-17 | Nanchang University | Denoising autoencoder prior information algorithm with 6-dimensional embedding for color image super-resolution reconstruction |
CN111047524A (en) * | 2019-11-13 | 2020-04-21 | Zhejiang University of Technology | Low-dose CT lung image denoising method based on deep convolutional neural network |
CN112330682A (en) * | 2020-11-09 | 2021-02-05 | Chongqing University of Posts and Telecommunications | Industrial CT image segmentation method based on deep convolutional neural network |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117011673A (en) * | 2023-10-07 | 2023-11-07 | Zhejiang Lab | Electrical impedance tomography image reconstruction method and device based on noise diffusion learning |
CN117011673B (en) * | 2023-10-07 | 2024-03-26 | Zhejiang Lab | Electrical impedance tomography image reconstruction method and device based on noise diffusion learning |
CN117495714A (en) * | 2024-01-03 | 2024-02-02 | Huaqiao University | Face image restoration method and device based on diffusion generation prior, and readable medium |
CN117495714B (en) * | 2024-01-03 | 2024-04-12 | Huaqiao University | Face image restoration method and device based on diffusion generation prior, and readable medium |
CN117689761A (en) * | 2024-02-02 | 2024-03-12 | Beihang University | Plug-and-play magnetic particle imaging reconstruction method and system based on diffusion model |
CN117689761B (en) * | 2024-02-02 | 2024-04-26 | Beihang University | Plug-and-play magnetic particle imaging reconstruction method and system based on diffusion model |
CN118521670A (en) * | 2024-07-19 | 2024-08-20 | Tencent Technology (Shenzhen) Co., Ltd. | Image artifact removal method, training method and device for image artifact removal model |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
WO2022226886A1 (en) | Image processing method based on transform domain denoising autoencoder as a priori | |
Ran et al. | Denoising of 3D magnetic resonance images using a residual encoder–decoder Wasserstein generative adversarial network | |
Li et al. | SACNN: Self-attention convolutional neural network for low-dose CT denoising with self-supervised perceptual loss network | |
Diwakar et al. | CT image denoising using NLM and correlation‐based wavelet packet thresholding | |
CN104182954B (en) | Real-time multi-modal medical image fusion method | |
Dakua et al. | Patient oriented graph-based image segmentation | |
WO2021168920A1 (en) | Low-dose image enhancement method and system based on multiple dose levels, and computer device, and storage medium | |
Yang et al. | Super-resolution of medical image using representation learning | |
CN115018728A (en) | Image fusion method and system based on multi-scale transformation and convolution sparse representation | |
Zhu et al. | STEDNet: Swin transformer‐based encoder–decoder network for noise reduction in low‐dose CT | |
Yin et al. | Unpaired low-dose CT denoising via an improved cycle-consistent adversarial network with attention ensemble | |
Zhao et al. | Dual-scale similarity-guided cycle generative adversarial network for unsupervised low-dose CT denoising | |
Manimala et al. | Sparse MR image reconstruction considering Rician noise models: A CNN approach | |
WO2023279316A1 (en) | Pet reconstruction method based on denoising score matching network | |
Liu et al. | DFSNE-Net: Deviant feature sensitive noise estimate network for low-dose CT denoising | |
CN112991220B (en) | Method for correcting image artifact by convolutional neural network based on multiple constraints | |
CN113129296B (en) | Image processing method based on denoising automatic encoder under transform domain as prior | |
Liu et al. | Windowed variation kernel Wiener filter model for image denoising with edge preservation | |
Jiang et al. | GDAFormer: Gradient-guided Dual Attention Transformer for Low-Dose CT image denoising | |
Li et al. | Dual-domain fusion deep convolutional neural network for low-dose CT denoising | |
KR102643601B1 (en) | Method and apparatus for low-dose x-ray computed tomography image processing based on efficient unsupervised learning using invertible neural network | |
Hein et al. | PPFM: Image denoising in photon-counting CT using single-step posterior sampling Poisson flow generative models | |
Mahmoud et al. | Variant Wasserstein Generative Adversarial Network Applied on Low Dose CT Image Denoising. | |
Kang et al. | Denoising Low-Dose CT Images Using a Multi-Layer Convolutional Analysis-Based Sparse Encoder Network | |
Li et al. | Adaptive weighted total variation expansion and Gaussian curvature guided low-dose CT image denoising network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21938367; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 32PN | EP: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 29-02-2024) |