CN113256500B - Deep learning neural network model system for multi-modal image synthesis

Info

Publication number
CN113256500B
CN113256500B CN202110746839.XA CN202110746839A CN113256500B CN 113256500 B CN113256500 B CN 113256500B CN 202110746839 A CN202110746839 A CN 202110746839A CN 113256500 B CN113256500 B CN 113256500B
Authority
CN
China
Prior art keywords
layer
image
layers
neural network
batch normalization
Legal status
Active
Application number
CN202110746839.XA
Other languages
Chinese (zh)
Other versions
CN113256500A (en)
Inventor
武王将
杨瑞杰
庄洪卿
王皓
Current Assignee
Peking University Third Hospital (Peking University Third Clinical Medical College)
Original Assignee
Peking University Third Hospital (Peking University Third Clinical Medical College)
Priority date
2021-07-02
Filing date
2021-07-02
Publication date
2021-10-01
Application filed by Peking University Third Hospital (Peking University Third Clinical Medical College)
Priority to CN202110746839.XA
Publication of CN113256500A
Application granted
Publication of CN113256500B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4046 Scaling the whole image or part thereof using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/003 Reconstruction from projections, e.g. tomography
    • G06T11/008 Specific post-processing after tomographic reconstruction, e.g. voxelisation, metal artifact correction

Abstract

The invention relates to a deep learning neural network model system for multi-modal image synthesis, comprising a multi-resolution residual deep neural network formed by combining a residual deep neural network (RDNN) with a multi-resolution optimization strategy. The RDNN includes A convolutional layers, B dropout layers, C batch normalization layers and D long-term residual connections, wherein the convolutional layers extract image features; the dropout layers prevent network overfitting; the batch normalization layers normalize the input of the corresponding convolution kernels; and the long-term residual connections preserve structural information in the input image. Each dropout layer has a convolutional layer on each of its two sides and is connected to both adjacent convolutional layers; a convolutional layer is arranged between each dropout layer and each batch normalization layer; the dropout layer, the convolutional layer and the batch normalization layer are connected in sequence; one end of each long-term residual connection is connected between a convolutional layer and a batch normalization layer, and the other end is connected between the convolutional layer and the batch normalization layer of another group.

Description

Deep learning neural network model system for multi-modal image synthesis
Technical Field
The invention relates to the technical field of medical image processing and image-guided treatment, and in particular to a deep learning neural network model system for multi-modal image synthesis.
Background
Adaptive radiotherapy (ART) based on cone-beam CT (CBCT) images can improve the delivery of radiation to tumors while protecting nearby organs at risk. However, CBCT images in the prior art have low Hounsfield unit (HU) accuracy, low soft-tissue resolution and severe artifacts. Therefore, to implement ART, it is first necessary to generate, from a CBCT image, a synthetic CT (sCT) image with high HU accuracy and structural fidelity.
U-Net and other deep learning networks have been widely used for sCT image generation. Studies show that the HU accuracy of the generated sCT images is markedly improved, so that dose calculation can be carried out on the basis of the sCT images. However, these sCT images have low fidelity to the structures present in the CBCT images and exhibit image blur.
One important reason for the low structural fidelity of these sCT images is that, because medical data are difficult to obtain, previous studies did not optimize the network for a specific anatomical site. For example, when training a network to generate head sCT images, some studies trained it with data from the head, abdomen, pelvis and other sites. A second reason is that the networks lack a constraint that improves the structural fidelity of the sCT image.
Disclosure of Invention
The invention aims to provide a deep learning neural network model system for multi-modal image synthesis that solves the technical problem of how to generate sCT images that not only have high HU accuracy but also have good fidelity to the structural information in the CBCT image.
To overcome the defects of the prior art, the invention provides a deep learning neural network model system for multi-modal image synthesis, comprising a multi-resolution residual deep learning network (multi-resolution RDNN) formed by combining a residual deep learning network (RDNN) with a multi-resolution optimization strategy; the residual deep learning network comprises A convolutional layers, B dropout layers, C batch normalization layers and D long-term residual connections; wherein the convolutional layers are used for extracting image features; the dropout layers are used for avoiding network overfitting; the batch normalization layers are used for normalizing the input of the corresponding convolution kernels, thereby stabilizing the network training process and improving learning efficiency; the long-term residual connections are used for preserving the structural information in the input image; a convolutional layer is arranged on each of the two sides of each dropout layer, and each dropout layer is connected to the convolutional layers adjacent to it on both sides; each convolutional layer feeds the image features it extracts into the adjacent dropout layer;
a convolutional layer is arranged between each dropout layer and each batch normalization layer; the dropout layer, the convolutional layer and the batch normalization layer are connected in sequence;
one end of each long-term residual connection is connected between a convolutional layer and a batch normalization layer; the other end of each long-term residual connection is connected between the convolutional layer and the batch normalization layer of another group;
the multi-resolution optimization strategy carries out the optimization process sequentially in different resolution modes, from low to high; at low resolution, the optimization process focuses only on the overall contour of the image and need not attend to local details in the image; the detail information in the image is learned in the higher-resolution modes.
Preferably, the residual deep learning network includes 15 convolutional layers.
Each of the convolutional layers may be represented in the form (k × k) conv, n, where k represents the size of the convolution kernels and n represents the number of convolution kernels; the activation function used is ELU.
Preferably, the dropout rate in the residual deep learning network is 20%.
Preferably, 7 dropout layers are provided.
Preferably, 7 batch normalization layers are arranged.
Preferably, there are 3 long-term residual connections.
Preferably, the residual deep learning network does not use pooling layers, so as to avoid loss of structural information in the image.
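The two element-wise operations named in the preferred settings above, the ELU activation and dropout at a 20% rate, can be sketched in plain Python as follows. This is only an illustrative sketch: the value alpha = 1.0 and the "inverted" rescaling of the surviving values by 1 / (1 - rate) are common conventions assumed here, not details stated in this specification, and the function names are hypothetical.

```python
import math
import random


def elu(x, alpha=1.0):
    """ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise.

    alpha = 1.0 is an assumed default; the specification does not give it.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def dropout(values, rate=0.2, rng=None):
    """Zero each value with probability `rate` (20% in the preferred setting).

    Surviving values are divided by (1 - rate) so that the expected value is
    unchanged; this inverted-dropout scaling is an assumption of this sketch.
    """
    rng = rng or random.Random(0)
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


if __name__ == "__main__":
    print([round(elu(x), 4) for x in (-2.0, -1.0, 0.0, 1.5)])
    print(dropout([1.0] * 10))
```

Dropout of this kind is applied only during training; at inference time the layer would pass its input through unchanged.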
Advantageous effects
Compared with the prior art, the invention has the beneficial effects that:
The deep learning neural network model system for multi-modal image synthesis provided by the invention optimizes and tests a specially designed multi-resolution residual deep neural network using images of a specific anatomical site of the patient, and combines the residual deep neural network provided by the invention with a multi-resolution optimization strategy to form the multi-resolution residual deep neural network (see Fig. 2). The network integrates the advantages of both, so that the generated sCT image has high HU accuracy and good fidelity to the structural information in the CBCT image.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and which constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention without limiting it.
Fig. 1 is a schematic structural diagram of a residual deep neural network according to the present invention.
Fig. 2 is a schematic structural diagram of a multi-resolution residual deep neural network according to the present invention.
Detailed Description
The present invention is described in more detail below to facilitate an understanding of the present invention.
As shown in Fig. 1 and Fig. 2, the deep learning neural network model system for multi-modal image synthesis according to the present invention includes a multi-resolution residual deep neural network formed by combining a residual deep neural network (RDNN) with a multi-resolution optimization strategy. The residual deep neural network includes 15 convolutional layers; each convolutional layer may be represented as (k × k) conv, n, where k represents the size of the convolution kernels and n represents the number of convolution kernels. To avoid network overfitting, 7 dropout layers (dropout rate of 20%) are added to the residual deep neural network; 7 batch normalization layers are used to normalize the input of the corresponding convolution kernels, thereby stabilizing the network training process and improving learning efficiency. To better preserve structural information in CBCT images, three long-term residual connections are added. Since pooling layers can lead to loss of structural information in the image, no pooling layers are used in the present network. A convolutional layer is arranged on each of the two sides of each dropout layer, and each dropout layer is connected to the convolutional layers adjacent to it on both sides; each convolutional layer feeds the image features it extracts into the adjacent dropout layer; a convolutional layer is arranged between each dropout layer and each batch normalization layer; the dropout layer, the convolutional layer and the batch normalization layer are connected in sequence; one end of each long-term residual connection is connected between a convolutional layer and a batch normalization layer; the other end of each long-term residual connection is connected between the convolutional layer and the batch normalization layer of another group.
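The layer counts and connection rules above can be checked against a concrete layer ordering. The sketch below builds one hypothetical ordering (an input convolution followed by 7 repeated dropout, convolution, batch normalization, convolution groups) that yields 15 convolutional layers, 7 dropout layers and 7 batch normalization layers. The exact ordering in Fig. 1 is not reproduced in this text, so this ordering, the pairing of groups joined by the three long-term residual connections, and all function names are assumptions made purely for illustration.

```python
def build_rdnn_spec(num_blocks=7, skip_pairs=((0, 6), (1, 5), (2, 4))):
    """Return (layers, skips) for one hypothetical RDNN layout.

    layers: ordered list of layer-type names.
    skips:  (source, target) positions of the convolutions whose outputs,
            taken just before batch normalization, are joined by a
            long-term residual connection. The pairing is an assumption.
    """
    layers = ["conv"]  # input convolution
    taps = {}          # group index -> position of the conv feeding the BN
    for block in range(num_blocks):
        layers.append("dropout")
        layers.append("conv")
        taps[block] = len(layers) - 1
        layers.append("batch_norm")
        layers.append("conv")
    skips = [(taps[a], taps[b]) for a, b in skip_pairs]
    return layers, skips


def check_constraints(layers, skips):
    """Verify the connection rules stated in the description."""
    for i, name in enumerate(layers):
        if name == "dropout":
            # A convolution on both sides of every dropout layer.
            assert layers[i - 1] == "conv" and layers[i + 1] == "conv"
            # Dropout, convolution, batch normalization in sequence.
            assert layers[i + 2] == "batch_norm"
    for src, dst in skips:
        for pos in (src, dst):
            # Each end sits between a convolution and a batch normalization.
            assert layers[pos] == "conv" and layers[pos + 1] == "batch_norm"
    return True


if __name__ == "__main__":
    layers, skips = build_rdnn_spec()
    print({name: layers.count(name) for name in ("conv", "dropout", "batch_norm")})
    print(skips, check_constraints(layers, skips))
```

Other orderings may satisfy the same counts and rules; the sketch shows only that the stated numbers (15, 7, 7 and 3) are mutually consistent with the connection rules.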
The multi-resolution optimization process is similar to the way the human visual system observes an object. The optimization is carried out sequentially in different resolution modes, from low to high. At low resolution, the optimization process can focus only on the overall contour of the image and need not attend to local details in the image. The detail information in the image is learned in the higher-resolution modes.
In the present application, the residual deep neural network is first trained using low-resolution images and then fine-tuned using medium-resolution and high-resolution images, so that the network can continue to learn detail information in the images until sCT images with high HU accuracy and high structural fidelity are synthesized.
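The low-to-high training order described above can be sketched as a simple schedule. In this sketch the downsampling factors (4, 2, 1), the block-averaging used to produce the lower-resolution images, and the `train_stage` callback are all hypothetical; the specification only states that low-, medium- and high-resolution images are used in that order. The block averaging here is data preparation and is separate from the network itself, which uses no pooling layers.

```python
def downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2D list image.

    Rows and columns that do not fill a complete block are dropped.
    """
    if factor == 1:
        return [row[:] for row in image]
    height = len(image) // factor
    width = len(image[0]) // factor
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            block = [
                image[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def multi_resolution_schedule(pairs, train_stage, factors=(4, 2, 1)):
    """Call train_stage on progressively finer copies of (cbct, ct) pairs.

    train_stage(stage, factor, data) is a user-supplied, hypothetical
    callback that trains (stage 0) or fine-tunes (later stages) the network.
    """
    history = []
    for stage, factor in enumerate(factors):
        data = [
            (downsample(cbct, factor), downsample(ct, factor))
            for cbct, ct in pairs
        ]
        history.append(train_stage(stage, factor, data))
    return history


if __name__ == "__main__":
    image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    log = multi_resolution_schedule(
        [(image, image)],
        lambda stage, factor, data: (stage, factor, len(data[0][0])),
    )
    print(log)
```

A network built only from convolution, dropout and batch normalization layers can in principle accept inputs of different sizes, which is what would let the same weights be carried from one resolution stage to the next.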
The applicant carried out comparison tests between a prior-art deep learning neural network model system for multi-modal image synthesis and the system of the present application. Unlike physical or chemical tests, the HU accuracy of an sCT image and its fidelity to the structural information in the CBCT image have no exact quantitative index in this test, and the assessment depends largely on the subjective judgment of physicians. The applicant therefore organized 50 physicians, each with more than 5 years of clinical oncology experience, to assess the sCT and CBCT images of 5 different patients. (In this test, 7 dropout layers, 7 batch normalization layers and 3 long-term residual connections were provided.) The test results are as follows:
Patient 1, HU accuracy of the sCT image: 0 physicians rated the prior-art sCT image as having the higher HU accuracy, and 50 rated the sCT image of the present application as having the higher HU accuracy.
Patient 1, fidelity to the structural information in the CBCT image: 0 physicians rated the prior art as better, 49 rated the present application as better, and 1 saw little difference between the two.
Patient 2, HU accuracy of the sCT image: 0 physicians rated the prior-art sCT image as having the higher HU accuracy, 48 rated the sCT image of the present application as having the higher HU accuracy, and 2 saw little difference between the two.
Patient 2, fidelity to the structural information in the CBCT image: 0 physicians rated the prior art as better, and 50 rated the present application as better.
Patient 3, HU accuracy of the sCT image: 0 physicians rated the prior-art sCT image as having the higher HU accuracy, and 50 rated the sCT image of the present application as having the higher HU accuracy.
Patient 3, fidelity to the structural information in the CBCT image: 0 physicians rated the prior art as better, 47 rated the present application as better, and 3 saw little difference between the two.
Patient 4, HU accuracy of the sCT image: 0 physicians rated the prior-art sCT image as having the higher HU accuracy, and 50 rated the sCT image of the present application as having the higher HU accuracy.
Patient 4, fidelity to the structural information in the CBCT image: 0 physicians rated the prior art as better, 49 rated the present application as better, and 1 saw little difference between the two.
Patient 5, HU accuracy of the sCT image: 0 physicians rated the prior-art sCT image as having the higher HU accuracy, and 50 rated the sCT image of the present application as having the higher HU accuracy.
Patient 5, fidelity to the structural information in the CBCT image: 0 physicians rated the prior art as better, 45 rated the present application as better, and 5 saw little difference between the two.
The above experimental data show that, compared with the prior art, the sCT images generated by the present application have higher HU accuracy and better fidelity to the structural information in the CBCT image.
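The per-patient counts reported above can be totalled with a few lines of Python. The tuples below copy the reported counts in the order (prior art, present application, little difference); where the text reports no "little difference" count, 0 is entered, an inference from the panel size of 50 rather than a figure stated in the text. The dictionary keys and the function name are hypothetical.

```python
# (prior art, present application, little difference) per patient 1..5.
VOTES = {
    "hu_accuracy": [(0, 50, 0), (0, 48, 2), (0, 50, 0), (0, 50, 0), (0, 50, 0)],
    "structural_fidelity": [(0, 49, 1), (0, 50, 0), (0, 47, 3), (0, 49, 1), (0, 45, 5)],
}


def summarize(votes):
    """Sum the ratings over all patients for each assessed criterion."""
    summary = {}
    for criterion, rows in votes.items():
        prior = sum(r[0] for r in rows)
        present = sum(r[1] for r in rows)
        no_diff = sum(r[2] for r in rows)
        total = prior + present + no_diff
        summary[criterion] = {
            "prior_art": prior,
            "present_application": present,
            "little_difference": no_diff,
            "total": total,
            "present_share": present / total,
        }
    return summary


if __name__ == "__main__":
    for criterion, stats in summarize(VOTES).items():
        print(criterion, stats)
```

Over the 250 ratings per criterion (50 physicians, 5 patients), 248 (99.2%) favor the present application for HU accuracy and 240 (96%) favor it for structural fidelity; none favor the prior art.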
The foregoing describes preferred embodiments of the present invention, but is not intended to limit the invention thereto. Modifications and variations of the embodiments disclosed herein may be made by those skilled in the art without departing from the scope and spirit of the invention.

Claims (7)

1. A deep learning neural network model system for multi-modal image synthesis, characterized by comprising a multi-resolution residual deep neural network formed by combining a residual deep neural network with a multi-resolution optimization strategy; the residual deep neural network is first trained using low-resolution images and then fine-tuned using medium-resolution and high-resolution images, so that the network can continue to learn detail information in the images until an sCT image with high HU accuracy and structural fidelity is synthesized; the residual deep neural network comprises A convolutional layers, B dropout layers, C batch normalization layers and D long-term residual connections; wherein the convolutional layers are used for extracting image features; the dropout layers are used for avoiding network overfitting; the batch normalization layers are used for normalizing the input of the corresponding convolution kernels, thereby stabilizing the network training process and improving learning efficiency; the long-term residual connections are used for preserving structural information in the input image;
a convolutional layer is arranged on each of the two sides of each dropout layer, and each dropout layer is connected to the convolutional layers adjacent to it on both sides; each convolutional layer feeds the image features it extracts into the adjacent dropout layer;
a convolutional layer is arranged between each dropout layer and each batch normalization layer; the dropout layer, the convolutional layer and the batch normalization layer are connected in sequence;
one end of each long-term residual connection is connected between a convolutional layer and a batch normalization layer; the other end of each long-term residual connection is connected between the convolutional layer and the batch normalization layer of another group.
2. The system of claim 1, wherein the residual deep neural network comprises 15 convolutional layers.
3. The system of claim 1, wherein the residual deep neural network has a dropout rate of 20%.
4. The system of claim 1, wherein there are 7 dropout layers.
5. The system of claim 1, wherein there are 7 batch normalization layers.
6. The system of claim 1, wherein there are 3 long-term residual connections.
7. The system of claim 1, wherein no pooling layer is used in the residual deep neural network to avoid loss of structural information in the image.
CN202110746839.XA 2021-07-02 2021-07-02 Deep learning neural network model system for multi-modal image synthesis Active CN113256500B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110746839.XA CN113256500B (en) 2021-07-02 2021-07-02 Deep learning neural network model system for multi-modal image synthesis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110746839.XA CN113256500B (en) 2021-07-02 2021-07-02 Deep learning neural network model system for multi-modal image synthesis

Publications (2)

Publication Number Publication Date
CN113256500A CN113256500A (en) 2021-08-13
CN113256500B CN113256500B (en) 2021-10-01

Family

ID=77190426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110746839.XA Active CN113256500B (en) 2021-07-02 2021-07-02 Deep learning neural network model system for multi-modal image synthesis

Country Status (1)

Country Link
CN (1) CN113256500B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631909A (en) * 2015-12-23 2016-06-01 浙江大学 CBCT iterative reconstruction method with artifact correction assistance
CN112767251A (en) * 2021-01-20 2021-05-07 重庆邮电大学 Image super-resolution method based on multi-scale detail feature fusion neural network

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100498839C (en) * 2006-03-08 2009-06-10 杭州电子科技大学 Multi-modality medical data three-dimensional visualization method
EP3189492A4 (en) * 2014-09-01 2018-05-30 Aditya Imaging Information Technologies A method and system for analyzing one or more multi-resolution medical images
CN106683067B (en) * 2017-01-20 2020-06-23 福建帝视信息科技有限公司 Deep learning super-resolution reconstruction method based on residual sub-images
US10853977B2 (en) * 2017-08-30 2020-12-01 Korea Advanced Institute Of Science And Technology Apparatus and method for reconstructing image using extended neural network
CN109325931A (en) * 2018-08-22 2019-02-12 中北大学 Based on the multi-modality images fusion method for generating confrontation network and super-resolution network
US11164067B2 (en) * 2018-08-29 2021-11-02 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing a multi-resolution neural network for use with imaging intensive applications including medical imaging
CN109685811B (en) * 2018-12-24 2019-12-13 北京大学第三医院 PET/CT high-metabolism lymph node segmentation method based on dual-path U-net convolutional neural network
CN109978778B (en) * 2019-03-06 2020-11-13 浙江工业大学 Convolutional neural network medical CT image denoising method based on residual learning
CN110223255B (en) * 2019-06-11 2023-03-14 太原科技大学 Low-dose CT image denoising and recursion method based on residual error coding and decoding network
US11547378B2 (en) * 2019-07-11 2023-01-10 Canon Medical Systems Corporation Apparatus and method combining deep learning (DL) with an X-ray computed tomography (CT) scanner having a multi-resolution detector
US10937158B1 (en) * 2019-08-13 2021-03-02 Hong Kong Applied Science and Technology Research Institute Company Limited Medical image segmentation based on mixed context CNN model
CN110473195B (en) * 2019-08-13 2023-04-18 中山大学 Medical focus detection framework and method capable of being customized automatically
CN111882514B (en) * 2020-07-27 2023-05-19 中北大学 Multi-mode medical image fusion method based on double-residual ultra-dense network
CN112507777A (en) * 2020-10-10 2021-03-16 厦门大学 Optical remote sensing image ship detection and segmentation method based on deep learning
CN112641471B (en) * 2020-12-30 2022-09-09 北京大学第三医院(北京大学第三临床医学院) Bladder capacity determination and three-dimensional shape assessment method and system special for radiotherapy

Also Published As

Publication number Publication date
CN113256500A (en) 2021-08-13

Similar Documents

Publication Publication Date Title
Zhang et al. Improving CBCT quality to CT level using deep learning with generative adversarial network
CN109598722B (en) Image analysis method based on recurrent neural network
CN111429379B (en) Low-dose CT image denoising method and system based on self-supervision learning
JP4919408B2 (en) Radiation image processing method, apparatus, and program
CN109949235A (en) A kind of chest x-ray piece denoising method based on depth convolutional neural networks
CN110223255B (en) Low-dose CT image denoising and recursion method based on residual error coding and decoding network
WO2022000183A1 (en) Ct image denoising system and method
Huang et al. Learning a deep CNN denoising approach using anatomical prior information implemented with attention mechanism for low-dose CT imaging on clinical patient data from multiple anatomical sites
Gholizadeh-Ansari et al. Low-dose CT denoising with dilated residual network
Xue et al. Cone beam CT (CBCT) based synthetic CT generation using deep learning methods for dose calculation of nasopharyngeal carcinoma radiotherapy
CN112258438B (en) LDCT image recovery method based on unpaired data
CN113256500B (en) Deep learning neural network model system for multi-modal image synthesis
CN116630738A (en) Energy spectrum CT imaging method based on depth convolution sparse representation reconstruction network
US20220292641A1 (en) Dynamic imaging and motion artifact reduction through deep learning
CN116402954A (en) Spine three-dimensional structure reconstruction method based on deep learning
Gholizadeh-Ansari et al. Low-dose CT denoising using edge detection layer and perceptual loss
CN116563533A (en) Medical image segmentation method and system based on target position priori information
CN113491529B (en) Single-bed PET (positron emission tomography) delayed imaging method without concomitant CT (computed tomography) radiation
WO2023000244A1 (en) Image processing method and system, and application of image processing method
Wang et al. A self-supervised guided knowledge distillation framework for unpaired low-dose CT image denoising
CN116385317B (en) Low-dose CT image recovery method based on self-adaptive convolution and transducer mixed structure
CN113256752B (en) Low-dose CT reconstruction method based on double-domain interleaving network
Wang et al. An unsupervised dual contrastive learning framework for scatter correction in cone-beam CT image
Jeihouni et al. Superresolution and Segmentation of OCT Scans Using Multi-Stage Adversarial Guided Attention Training
KR102441033B1 (en) Deep-learning based limited-angle computed tomography image reconstruction system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant