WO2021151275A1 - Image segmentation method and apparatus, device, and storage medium - Google Patents


Info

Publication number
WO2021151275A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
network model
feature map
layer
muscle
Prior art date
Application number
PCT/CN2020/098975
Other languages
French (fr)
Chinese (zh)
Inventor
章古月
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Priority claimed from CN202010431606.6A (CN111696082B)
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021151275A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06T 7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G06T 7/60 Analysis of geometric attributes
    • G06T 7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing

Definitions

  • This application relates to the field of artificial intelligence data processing, and in particular to an image segmentation method, device, equipment, and storage medium.
  • the analysis of human body components such as fat and skeletal muscle is an important method of medical research.
  • the content of fat and skeletal muscle in the human body is an important basis for evaluating individual nutritional status, and has important guiding significance in clinical aspects such as patient diagnosis, treatment, and prognosis.
  • quantitative analysis of fat and skeletal muscle based on imaging techniques such as Computed Tomography (CT) is a widely recognized evaluation method.
  • the skeletal muscle area, visceral fat area, subcutaneous fat area, and total abdominal fat volume in CT images of the umbilical plane have important clinical value.
  • manually marking the muscle dividing lines is very time-consuming and the accuracy of the dividing lines is poor, so the segmentation of abdominal muscle images and fat images takes a long time and the segmentation effect is poor.
  • this application proposes an image segmentation method, device, computer equipment, and computer-readable storage medium to solve the prior-art problems that the segmentation of abdominal muscle images and fat images is time-consuming and the segmentation accuracy is low.
  • this application proposes an image segmentation method, which includes the following steps: converting abdominal CT image data in DICOM format into an abdominal image in JPG format; constructing a generation network model based on the Vnet network model, and inputting the JPG-format abdominal image into the generation network model; generating 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels; and obtaining a predicted segmentation result image according to the 6-channel predicted segmentation labels, where the predicted segmentation result image includes a subcutaneous fat image, a muscle image, a bone image, a visceral fat image, an internal organ image, and a background image.
  • an image segmentation device including:
  • Conversion module, used to convert abdominal CT image data in DICOM format into an abdominal image in JPG format;
  • Processing module, used to construct a generation network model based on the Vnet network model, and to input the JPG-format abdominal image into the generation network model;
  • Generating module, used to generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
  • Obtaining module, used to obtain a predicted segmentation result image according to the 6-channel predicted segmentation labels, where the predicted segmentation result image includes subcutaneous fat images, muscle images, bone images, visceral fat images, internal organ images, and background images.
  • the present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the following steps are implemented when the processor executes the computer program:
  • 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
  • the predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program is executed by a processor to implement the following steps:
  • 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
  • the predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
  • the image segmentation method, device, computer equipment, and computer-readable storage medium proposed in this application input the JPG-format abdominal image into the generation network model constructed based on the Vnet network model; generate 6-channel predicted segmentation labels through the generation network model; and obtain a predicted segmentation result image according to the 6-channel predicted segmentation labels, where the predicted segmentation result image includes a subcutaneous fat image, a muscle image, a bone image, a visceral fat image, an internal organ image, and a background image.
  • Fig. 1 is a schematic flowchart of a first embodiment of an image segmentation method according to the present application
  • FIG. 2 is a schematic flowchart of step S102 of the image segmentation method of the present application.
  • FIG. 3 is a schematic flowchart of step S104 of the image segmentation method of the present application.
  • FIG. 4 is a schematic diagram of an embodiment of the CA module of the image segmentation device of the present application.
  • FIG. 5 is a schematic diagram of an embodiment of a predicted segmentation result image of the image segmentation device of the present application.
  • Fig. 6 is a schematic diagram of an embodiment of a gold standard image of the image segmentation device of the present application.
  • FIG. 7 is a schematic diagram of an embodiment of a discriminant network model of the image segmentation device according to the present application.
  • FIG. 8 is a schematic diagram of program modules of the first embodiment of the image segmentation device of the present application.
  • FIG. 9 is a schematic diagram of an embodiment of a processing module of the image segmentation device of the present application.
  • FIG. 10 is a schematic diagram of an embodiment of a generating module of the image segmentation device of the present application.
  • Fig. 11 is a schematic diagram of an optional hardware architecture of the computer device of the present application.
  • As shown in FIG. 1, it is a schematic flowchart of an image segmentation method provided by an embodiment of this application.
  • the method can be executed by a device, and the device can be implemented by software and/or hardware.
  • the image segmentation method includes:
  • step S100 the abdominal CT image data in DICOM format is converted into an abdominal image in JPG format.
  • specifically, the abdominal CT image data in the Digital Imaging and Communications in Medicine (DICOM) format is set to a specific window width and window level for the abdominal image, the DICOM-format CT image data is then converted into a JPG-format abdominal image through a format conversion program, and the JPG-format abdominal image is saved.
  • the DICOM-format abdominal CT image data and the JPG-format abdominal image can also be stored in a node of a blockchain.
  • the specific window width and window level for the abdominal image can be set to a window width of 400 HU and a window level of 10 HU.
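  • for illustration only, the following is a minimal Python sketch (not part of the original publication) of this DICOM-to-JPG conversion step, assuming pydicom, NumPy, and Pillow as the format conversion tools; the file paths and the helper name dicom_to_jpg are hypothetical, while the 400 HU window width and 10 HU window level follow the values given above.

```python
# Hedged sketch: convert a DICOM abdominal CT slice to a windowed 8-bit JPG image.
# Library choice (pydicom/NumPy/Pillow) and names are assumptions, not the patent's tooling.
import numpy as np
import pydicom
from PIL import Image

def dicom_to_jpg(dicom_path, jpg_path, window_width=400.0, window_level=10.0):
    ds = pydicom.dcmread(dicom_path)
    # Convert stored pixel values to Hounsfield units (HU).
    hu = ds.pixel_array.astype(np.float32) * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
    # Apply the abdominal window: keep values in [level - width/2, level + width/2].
    low, high = window_level - window_width / 2.0, window_level + window_width / 2.0
    hu = np.clip(hu, low, high)
    # Rescale the windowed values to 8-bit gray levels and save as JPG.
    img = ((hu - low) / (high - low) * 255.0).astype(np.uint8)
    Image.fromarray(img).save(jpg_path, format="JPEG")

# Example usage (hypothetical paths): dicom_to_jpg("slice_001.dcm", "slice_001.jpg")
```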
  • the abdominal CT image data in the DICOM format contains the patient's protected health information (PHI), such as name, gender, and age, as well as other image-related information, such as information about the equipment that captured and generated the images and some medical-context-related information.
  • the DICOM-format abdominal CT image data carries a lot of information, which can be divided into the following four categories: (a) Patient information, (b) Study (examination) information, (c) Series information, (d) Image information.
  • Patient information includes patient name, patient ID, patient gender, patient weight, etc.
  • Study information includes: inspection number, inspection instance number, inspection date, inspection time, inspection location, inspection description, etc.
  • Series information includes serial number, examination modality, image location, examination description and instructions, image orientation, image position, layer thickness, layer-to-layer spacing, actual relative position, and body position.
  • Image information includes information such as the time the image was taken, pixel spacing, image code, and sampling rate on the image. According to the pixel spacing, the conversion parameters between the pixel points and the physical space area can be obtained, and the actual area of the physical space corresponding to the pixel area can be calculated according to the conversion parameters.
  • step S102 a generation network model is constructed based on the Vnet network model, and the abdominal image in the JPG format is input into the generation network model.
  • the step S102 includes the following steps:
  • Step S1021 Set the convolution kernel in the encoding stage of the Vnet network model to a two-dimensional convolution kernel
  • Step S1022 replacing the deconvolution in the decoding stage of the Vnet network model with bilinear interpolation to obtain a modified Vnet network model
  • Step S1023, accessing the channel attention (CA) module in the modified Vnet network model to obtain the generation network model, where the CA module is used to obtain semantic information of the high-level feature maps generated in the encoding stage and the decoding stage of the modified Vnet network model, and to select pixel information belonging to the high-level feature map from the low-level feature map according to the semantic information;
  • the high-level feature map and the low-level feature map are determined according to the order in which feature maps are obtained in the encoding stage and the decoding stage.
  • in adjacent encoding layers of the encoding stage, the feature map obtained by the next encoding layer is higher-level than the feature map obtained by the previous encoding layer; in adjacent decoding layers of the decoding stage, the feature map obtained by the previous decoding layer is higher-level than the feature map obtained by the next decoding layer.
  • the Vnet network model is the medical imaging Vnet network model proposed by Fausto Milletari, Nassir Navab, Seyed-Ahmad Ahmadi, and others.
  • the Vnet network model is a typical encoding-decoding network model.
  • the coding stage includes multiple coding layers, and each coding layer includes a convolutional layer, an activation layer, and a down-sampling layer.
  • the decoding stage includes multiple decoding layers, and each decoding layer includes a deconvolution layer, an activation layer, and an upsampling layer.
  • the convolution kernel in the encoding stage of the original Vnet network model is a three-dimensional convolution kernel, but because the CT data is scanned with a relatively thick layer, the three-dimensional data is unreliable.
  • the convolution kernel in the encoding stage of the Vnet network model is set as a two-dimensional convolution kernel, and segmentation is performed separately based on the two-dimensional image.
  • the deconvolution in the decoding stage of the Vnet network model is replaced with bilinear interpolation.
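  • as a minimal illustration of the two modifications just described (two-dimensional convolution kernels in the encoding stage, and bilinear interpolation in place of deconvolution in the decoding stage), the following PyTorch sketch shows one possible encoder block and decoder block; the channel counts, kernel sizes, and class names are assumptions rather than the publication's exact configuration.

```python
# Hedged sketch of a modified Vnet building block: 2D convolutions in the encoder
# and bilinear upsampling (instead of deconvolution) in the decoder. Names are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EncoderBlock2D(nn.Module):
    """One encoding layer: 2D convolution, activation, then 2x downsampling."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.PReLU(out_ch)
        self.down = nn.Conv2d(out_ch, out_ch, kernel_size=2, stride=2)  # 2h*2w -> h*w

    def forward(self, x):
        feat = self.act(self.conv(x))       # feature map of this encoding layer
        return feat, self.down(feat)        # keep feat for the skip connection

class DecoderBlock2D(nn.Module):
    """One decoding layer: bilinear upsampling replaces deconvolution, then 2D convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.PReLU(out_ch)

    def forward(self, x):
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)  # h*w -> 2h*2w
        return self.act(self.conv(x))
```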
  • Step S104 Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels.
  • the step S104 includes the following steps:
  • Step S1041 Obtain a feature map of each coding layer through the coding stage of the generated network model
  • Step S1042 Obtain a feature map of each decoding layer through the decoding stage of the generating network model
  • Step S1043, in the encoding stage, performing a channelization operation and an activation operation through the CA module on the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layers to obtain first weight results of different channels, and multiplying the first weight results of the different channels by the low-level features of the 2h*2w*c dimension of the previous layer of the adjacent encoding layers to obtain a first feature map of the 2h*2w*c dimension;
  • Step S1044, in the decoding stage, performing a channelization operation and an activation operation through the CA module on the high-level features of the h*w*2c dimension of the previous layer of the adjacent decoding layers to obtain second weight results of different channels, and multiplying the second weight results of the different channels by the low-level features of the 2h*2w*c dimension of the next layer of the adjacent coding layers to obtain a second feature map of the 2h*2w*c dimension;
  • Step S1045, obtaining the 6-channel predicted segmentation labels according to the feature map obtained in each layer of the encoding stage, the feature map obtained in each layer of the decoding stage, the first feature map, and the second feature map.
  • the convolution operation is performed on the convolutional layer to extract features from the input abdominal CT image.
  • an appropriate stride is used to reduce the resolution. If the resolution of the previous layer is 2h*2w, the resolution of the next layer is reduced to h*w.
  • the number of features of the next layer in the encoding stage of the generation network model is doubled compared to that of the previous layer: if the number of features of the previous layer in the encoding stage of the generation network model is c, then the number of features of the next layer is 2c.
  • the feature map of each coding layer is obtained through the coding stage of the generating network model.
  • the feature map obtained by the next coding layer of the adjacent coding layer is higher than the feature map obtained by the previous coding layer.
  • the high-level features acquired in the next layer of the adjacent coding layer in the coding stage of the generative network model are high-level features of h*w*2c dimensions, where h represents the height of the graph, w represents the width of the graph, and 2c represents the number of features .
  • the low-level features obtained by the upper layer of the adjacent coding layer in the coding stage of the generating network model are low-level features of dimension 2h*2w*c, 2h represents the height of the graph, 2w represents the width of the graph, and c represents the number of features.
  • in the deconvolution layer, each input voxel is projected to a larger area through the kernel to increase the data size: if the resolution of the previous layer is h*w, then the resolution of the next layer is increased to 2h*2w.
  • the number of features of the next layer in the decoding stage of the generation network model is halved compared to that of the previous layer: if the number of features of the previous layer in the decoding stage of the generation network model is 2c, then the number of features of the next layer is c.
  • the feature map of each decoding layer is obtained through the decoding stage of the generative network model.
  • in adjacent decoding layers, the feature map obtained by the previous decoding layer is higher-level than the feature map obtained by the next decoding layer, and the feature map obtained by the next decoding layer is lower-level.
  • the high-level features acquired by the previous layer of the adjacent decoding layers in the decoding stage of the generation network model are high-level features of the h*w*2c dimension, where h represents the height of the feature map, w represents the width of the feature map, and 2c represents the number of features.
  • the low-level features acquired by the next layer of the adjacent decoding layers in the decoding stage of the generation network model are low-level features of the 2h*2w*c dimension, where 2h represents the height of the feature map, 2w represents the width of the feature map, and c represents the number of features.
  • a Channel-Attention (CA) module is connected to the modified Vnet network, and the misclassified pixels are corrected through the CA module.
  • in step S1043, performing the channelization operation and the activation operation through the CA module on the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layers in the encoding stage to obtain the first weight results of different channels includes the following steps:
  • the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layers are passed through the global average pooling, 1*1 convolution, Batch Normalization (BN) algorithm model, and nonlinear Rectified Linear Unit (ReLU) activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features; the 1*1*c feature channel is then passed through a fully connected layer and a sigmoid activation function to obtain the first weight results of the different channels.
  • in step S1044, performing the channelization operation and the activation operation through the CA module on the high-level features of the h*w*2c dimension of the previous layer of the adjacent decoding layers in the decoding stage to obtain the second weight results of different channels includes the following steps: the high-level features of the h*w*2c dimension of the previous layer of the adjacent decoding layers are passed through the global average pooling, 1*1 convolution, BN algorithm model, and ReLU activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features; the 1*1*c feature channel is then passed through a fully connected layer and a sigmoid activation function to obtain the second weight results of the different channels.
  • the processing flow of the CA module mainly includes channelization operation, activation operation, and weight assignment reweighting operation.
  • in the encoding stage, the CA module is used to channelize the high-level features of the next layer of the adjacent encoding layers, where the channelization operation includes: passing the high-level features of the next layer of the adjacent encoding layers through the global average pooling, 1*1 convolution, BN algorithm model, and ReLU activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features.
  • an activation operation is then performed on the 1*1*c feature channel, where the activation operation includes: passing the 1*1*c feature channel through a fully connected layer and a sigmoid activation function to obtain the weight results of the different channels; the weight results of the different channels are multiplied by the low-level features of the previous layer of the adjacent encoding layers to obtain a first feature map, and the first feature map is a feature map of the 2h*2w*c dimension.
  • in the decoding stage, the CA module is used to channelize the high-level features of the previous layer of the adjacent decoding layers, where the channelization operation includes: passing these high-level features through the global average pooling, 1*1 convolution, BN algorithm model, and ReLU activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features.
  • an activation operation is then performed on the 1*1*c feature channel, where the activation operation includes: passing the 1*1*c feature channel through a fully connected layer and a sigmoid activation function to obtain the weight results of the different channels; the weight results of the different channels are multiplied by the low-level features of the next layer of the adjacent coding layers to obtain a second feature map, and the second feature map is a feature map of the 2h*2w*c dimension.
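  • the CA processing flow described above (channelization, activation, reweighting) can be sketched as follows in PyTorch; this is an illustrative reading of the description, with the module name, argument names, and the exact placement of the fully connected layer taken as assumptions.

```python
# Hedged channel-attention (CA) sketch: high-level features (h*w*2c) produce per-channel
# weights that reweight low-level features (2h*2w*c). Names and details are assumptions.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, high_ch, low_ch):
        super().__init__()
        # Channelization: global average pooling, 1*1 convolution, BN, ReLU -> 1*1*c.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.reduce = nn.Sequential(
            nn.Conv2d(high_ch, low_ch, kernel_size=1),
            nn.BatchNorm2d(low_ch),
            nn.ReLU(inplace=True),
        )
        # Activation: fully connected layer + sigmoid -> per-channel weight results.
        self.fc = nn.Linear(low_ch, low_ch)
        self.sigmoid = nn.Sigmoid()

    def forward(self, high_feat, low_feat):
        # high_feat: (N, 2c, h, w); low_feat: (N, c, 2h, 2w)
        w = self.reduce(self.pool(high_feat))             # (N, c, 1, 1) feature channel
        w = self.sigmoid(self.fc(w.flatten(1)))           # (N, c) channel weights
        # Reweighting: multiply the low-level features by the channel weights.
        return low_feat * w.unsqueeze(-1).unsqueeze(-1)   # (N, c, 2h, 2w) feature map
```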
  • Step S106 Obtain a predicted segmentation result image according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat images, muscle images, bone images, visceral fat images, internal organs images, and background images.
  • the 6 channels of predicted segmentation labels respectively represent the predicted segmentation labels of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background, and these labels are filled with different colors to obtain the predicted segmentation result image. For example, subcutaneous fat can be drawn in red, muscle in green, bone in yellow, visceral fat in blue, internal organs in pink, and the background in black. Please refer to FIG. 5, in which different gray-scale colors are used to represent the six categories of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background.
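  • as a small illustration of this coloring step (not from the publication), the per-pixel class can be taken as the argmax over the 6 channels and mapped to the example colors above; the channel order and array names here are assumptions.

```python
# Hedged sketch: turn a 6-channel predicted label map (H x W x 6) into an RGB result
# image using the example colors mentioned above. The channel order is assumed.
import numpy as np

CLASS_COLORS = np.array([
    [255,   0,   0],   # subcutaneous fat (red)
    [  0, 255,   0],   # muscle (green)
    [255, 255,   0],   # bone (yellow)
    [  0,   0, 255],   # visceral fat (blue)
    [255, 192, 203],   # internal organs (pink)
    [  0,   0,   0],   # background (black)
], dtype=np.uint8)

def labels_to_result_image(pred_labels):
    """pred_labels: (H, W, 6) per-channel scores or probabilities."""
    class_index = np.argmax(pred_labels, axis=-1)   # (H, W) class index per pixel
    return CLASS_COLORS[class_index]                # (H, W, 3) predicted segmentation result image
```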
  • the image segmentation method further includes:
  • the number of pixels in the subcutaneous fat area, the visceral fat area, and the muscle area is determined from the predicted segmentation result image, and the conversion parameter between pixel points and physical space area is obtained from the CT image data in DICOM format; the actual areas of subcutaneous fat, visceral fat, and muscle are then determined by multiplying the number of pixels in the subcutaneous fat, visceral fat, and muscle areas by the square of the conversion parameter.
  • the image information of the CT image data in the DICOM format includes information such as the time when the image was taken, the pixel spacing, the image code, and the sampling rate on the image.
  • the conversion parameters between the pixel points and the physical space area can be obtained, and the actual areas of subcutaneous fat, visceral fat, and muscle can be calculated according to the following formula (1).
  • Formula (1): s = n * x^2, where s represents the actual area of subcutaneous fat, visceral fat, or muscle, n represents the total number of pixels in the corresponding subcutaneous fat, visceral fat, or muscle area, and x represents the conversion parameter.
  • the image segmentation method further includes:
  • the Series information of the abdominal CT image data in the DICOM format includes serial number, examination modality, image location, examination description and instructions, image orientation, image position, layer thickness, layer-to-layer spacing, actual relative position, body position, etc. Therefore, the scanning layer thickness information can be obtained from the CT image data in the DICOM format, and the actual areas of the subcutaneous fat area, the visceral fat area, and the muscle area are multiplied by the scanning layer thickness to obtain the actual volumes of subcutaneous fat, visceral fat, and muscle.
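  • the area and volume computations described above reduce to formula (1) plus one multiplication by the layer thickness; the following sketch assumes the conversion parameter is read from the DICOM PixelSpacing tag (square pixels assumed) and the layer thickness from SliceThickness, with all variable names hypothetical.

```python
# Hedged sketch of formula (1) and the volume step: area = pixel count * x^2, where x is
# the conversion parameter (pixel spacing), and volume = area * scanning layer thickness.
import numpy as np
import pydicom

def region_area_and_volume(class_index, target_class, dicom_path):
    ds = pydicom.dcmread(dicom_path)
    x = float(ds.PixelSpacing[0])           # conversion parameter in mm per pixel (square pixels assumed)
    thickness = float(ds.SliceThickness)    # scanning layer thickness in mm
    n = int(np.sum(class_index == target_class))   # number of pixels in the region
    area = n * x ** 2                       # formula (1): s = n * x^2, in mm^2
    volume = area * thickness               # actual volume, in mm^3
    return area, volume
```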
  • the image segmentation method further includes:
  • the predicted segmentation label and the real label corresponding to the gold standard image are respectively input into the discriminant network model, and the discriminant scores of the predicted segmentation result image and the gold standard image are obtained respectively, and the predicted segmentation is determined according to the discriminant scores Based on the gap between the result image and the gold standard image, parameter adjustment is performed on the generation network model based on the gap, so as to optimize the generation network model.
  • the generation network model can be optimized by adjusting the parameters of the generation network model, so as to improve the effect of abdominal image segmentation.
  • the gold standard image is a segmentation result manually annotated by a person, and is used to compare with the result of network estimation to evaluate the performance of the generated network model.
  • the gold standard image uses different colors to represent subcutaneous fat, muscle, bone, visceral fat, internal organs, and background. Please refer to FIG. 6.
  • FIG. 6 is a gold standard image representing the subcutaneous fat, muscle, bone, visceral fat, internal organs, and background areas in different gray-scale colors.
  • FIG. 7 is a schematic diagram of the architecture of the discriminant network model.
  • the discriminant network model includes 6 convolutional layers.
  • the first convolutional layer 802 includes a 3*3 convolutional layer and a nonlinear ReLU activation function;
  • the second convolutional layer 803 includes a 3*3 convolutional layer and a batch normalization (BN) algorithm model;
  • the third convolutional layer 804 includes a 3*3 convolutional layer, a BN algorithm model, and a nonlinear ReLU activation function;
  • the fourth convolutional layer 805 includes a 3*3 convolutional layer, a BN algorithm model, and a nonlinear ReLU activation function;
  • the fifth convolutional layer 806 includes a 3*3 convolutional layer, a BN algorithm model, and a nonlinear ReLU activation function;
  • the sixth convolutional layer 807 includes global average pooling and a 1*1 convolutional layer; 801 represents the 512*512*6-dimensional predicted segmentation label or the real label corresponding to the gold standard image.
  • the predicted segmentation label of 512*512*6 dimensions and the real label corresponding to the gold standard image are input into the discriminant network model, and a convolution operation with a size of 3 and a step size of 2 is used for down-sampling.
  • the number of downsampling corresponds to the number of downsampling of the encoder in the generative network model.
  • a total of 5 downsampling operations are performed to obtain a 16*16*256 feature map.
  • global average pooling and a 1*1 convolution kernel are then used to obtain the discriminant scores of the gold standard image and the predicted segmentation image.
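  • a minimal PyTorch sketch of this discriminant network is given below: five 3*3 stride-2 convolutions downsample the 512*512*6 input to a 16*16*256 feature map, and global average pooling followed by a 1*1 convolution produces the discriminant score; the intermediate channel widths (other than the final 256) are assumptions.

```python
# Hedged discriminator sketch: 3*3 stride-2 convolutions with BN/ReLU as described above,
# then global average pooling and a 1*1 convolution. Intermediate channel widths are assumed.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, in_ch=6):
        super().__init__()
        chs = [in_ch, 16, 32, 64, 128, 256]       # assumed widths; only the final 256 is stated
        layers = []
        for i in range(5):                        # 5 downsamplings: 512 -> 256 -> 128 -> 64 -> 32 -> 16
            layers.append(nn.Conv2d(chs[i], chs[i + 1], kernel_size=3, stride=2, padding=1))
            if i > 0:
                layers.append(nn.BatchNorm2d(chs[i + 1]))   # BN from the second layer on
            if i != 1:
                layers.append(nn.ReLU(inplace=True))        # the second layer has no ReLU above
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)                  # global average pooling
        self.score = nn.Conv2d(256, 1, kernel_size=1)        # 1*1 convolution -> discriminant score

    def forward(self, x):
        f = self.features(x)                                 # (N, 256, 16, 16) for a 512*512*6 input
        return self.score(self.pool(f)).flatten(1)           # (N, 1) discriminant score
```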
  • the optimization of the KL divergence (Kullback-Leibler divergence) between the predicted label result image and the gold standard image is replaced by the optimization of the Earth Mover's distance, and the Earth Mover's distance can always guide the optimization of the generation network model without being troubled by vanishing gradients.
  • gradient penalty is used to accelerate the convergence of the training process of the generating network model and the discriminant network model.
  • a zero-centered gradient penalty converges to the center point more easily, so the zero-centered gradient penalty is used.
  • the generating network model and the discriminant network model each have a corresponding loss function.
  • the loss function of the generated network model is as follows:
  • the loss function of the discriminant network model is as follows:
  • p_inter(I_inter) is a derived distribution obtained by interpolation between the true sample distribution and the false sample distribution.
  • in the above loss functions, Loss denotes a loss value; Orig denotes the original image; Dice denotes the Dice coefficient; Gen denotes the generation network model; I denotes an image; Mask denotes the mask; D denotes the discriminant network model; G denotes the generation network model; p_g denotes the false sample distribution; p_train denotes the true sample distribution; p_inter denotes the derived distribution obtained by interpolation between the true sample distribution and the false sample distribution; and C denotes the center, with C equal to 0 giving the zero center.
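  • the exact loss formulas appear only as images in the original publication and are not reproduced here; using the symbols listed above, a plausible reconstruction is a Dice segmentation term plus an adversarial term for the generation network model, and an Earth Mover's (Wasserstein-style) loss with a zero-centered gradient penalty for the discriminant network model, where the weighting coefficients lambda are assumptions:

```latex
% Hedged reconstruction, not the publication's exact formulas; \lambda terms are assumed.
\begin{align}
Loss_{Gen} &= Loss_{Dice}\bigl(Gen(I_{Orig}),\, I_{Mask}\bigr)
              - \lambda_{adv}\,\mathbb{E}_{I \sim p_g}\bigl[D(I)\bigr] \\
Loss_{D}   &= \mathbb{E}_{I \sim p_g}\bigl[D(I)\bigr]
              - \mathbb{E}_{I \sim p_{train}}\bigl[D(I)\bigr]
              + \lambda_{gp}\,\mathbb{E}_{I_{inter} \sim p_{inter}}
                \Bigl[\bigl(\lVert \nabla D(I_{inter}) \rVert_{2} - C\bigr)^{2}\Bigr],
              \qquad C = 0
\end{align}
```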
  • the generative network model and the discriminant network model reduce the values of these two loss functions through continuous learning to achieve the goal of optimization.
  • the image segmentation method proposed in this application inputs the JPG-format abdominal image into a generation network model constructed based on the Vnet network model; generates 6-channel predicted segmentation labels through the generation network model; and obtains a predicted segmentation result image according to the 6-channel predicted segmentation labels, where the predicted segmentation result image includes a subcutaneous fat image, a muscle image, a bone image, a visceral fat image, an internal organ image, and a background image.
  • As shown in FIG. 8, it is an image segmentation device 100 proposed in the present application.
  • the image segmentation device 100 described in this application can be installed in a computer device.
  • the image segmentation device may include a conversion module 101, a processing module 102, a generation module 103, and an acquisition module 104.
  • the module described in the present invention can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of a computer device and can complete fixed functions, and are stored in the memory of the computer device.
  • each module/unit is as follows:
  • the conversion module 101 is used to convert the abdominal CT image data in DICOM format into the abdominal image in JPG format.
  • specifically, the abdominal CT image data in the Digital Imaging and Communications in Medicine (DICOM) format is set to a specific window width and window level for the abdominal image, the DICOM-format CT image data is then converted into a JPG-format abdominal image through a format conversion program, and the JPG-format abdominal image is saved.
  • the DICOM-format abdominal CT image data and the JPG-format abdominal image can also be stored in a node of a blockchain.
  • the specific window width and window level for the abdominal image can be set to a window width of 400 HU and a window level of 10 HU.
  • the abdominal CT image data in the DICOM format contains the patient's protected health information (PHI), such as name, gender, and age, as well as other image-related information, such as information about the equipment that captured and generated the images and some medical-context-related information.
  • the DICOM-format abdominal CT image data carries a lot of information, which can be divided into the following four categories: (a) Patient information, (b) Study (examination) information, (c) Series information, (d) Image information.
  • Patient information includes patient name, patient ID, patient gender, patient weight, etc.
  • Study information includes: inspection number, inspection instance number, inspection date, inspection time, inspection location, inspection description, etc.
  • Series information includes serial number, inspection mode, image location, inspection description and instructions, image orientation, image location, layer thickness, layer-to-layer spacing, actual relative position, and body position.
  • Image information includes information such as the time the image was taken, pixel spacing, image code, and sampling rate on the image. According to the pixel spacing, the conversion parameters between the pixel points and the physical space area can be obtained, and the actual area of the physical space corresponding to the pixel area can be calculated according to the conversion parameters.
  • the processing module 102 is configured to construct a generation network model based on the Vnet network model, and input the abdominal image in JPG format into the generation network model.
  • the processing module 102 includes:
  • a setting sub-module 1021 is used to set the convolution kernel in the encoding stage of the Vnet network model to a two-dimensional convolution kernel
  • the replacement sub-module 1022 is used to replace the deconvolution in the decoding stage of the Vnet network model with bilinear interpolation to obtain a modified Vnet network model;
  • the access sub-module 1023 is used to access the channel attention (CA) module in the modified Vnet network model to obtain the generation network model, wherein the CA module is used to obtain semantic information of the high-level feature maps generated in the encoding stage and the decoding stage of the modified Vnet network model, and to select pixel information belonging to the high-level feature map from the low-level feature map according to the semantic information;
  • the high-level feature map and the low-level feature map are determined according to the order in which feature maps are obtained in the encoding stage and the decoding stage.
  • in adjacent encoding layers of the encoding stage, the feature map obtained by the next encoding layer is higher-level than the feature map obtained by the previous encoding layer; in adjacent decoding layers of the decoding stage, the feature map obtained by the previous decoding layer is higher-level than the feature map obtained by the next decoding layer.
  • the Vnet network model is the medical imaging Vnet network model proposed by Fausto Milletari, Nassir Navab, Seyed-Ahmad Ahmadi, and others.
  • the Vnet network model is a typical encoding-decoding network model.
  • the coding stage includes multiple coding layers, and each coding layer includes a convolutional layer, an activation layer, and a down-sampling layer.
  • the decoding stage includes multiple decoding layers, and each decoding layer includes a deconvolution layer, an activation layer, and an upsampling layer.
  • the convolution kernel in the encoding stage of the original Vnet network model is a three-dimensional convolution kernel, but because the CT data is scanned with a relatively thick layer, the three-dimensional data is unreliable.
  • the convolution kernel in the encoding stage of the Vnet network model is set as a two-dimensional convolution kernel, and segmentation is performed separately based on the two-dimensional image.
  • the deconvolution in the decoding stage of the Vnet network model is replaced with bilinear interpolation.
  • the generating module 103 is configured to generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels .
  • the generating module 103 includes:
  • the first obtaining sub-module 1031 is configured to obtain the feature map of each coding layer through the coding stage of the generated network model
  • the second obtaining sub-module 1032 is configured to obtain the feature map of each decoding layer through the decoding stage of the generation network model;
  • the first processing sub-module 1033 is used, in the encoding stage, to perform a channelization operation and an activation operation through the CA module on the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layers to obtain first weight results of different channels; the first weight results of the different channels are multiplied by the low-level features of the 2h*2w*c dimension of the previous layer of the adjacent encoding layers to obtain a first feature map of the 2h*2w*c dimension;
  • the second processing sub-module 1034 is used, in the decoding stage, to perform a channelization operation and an activation operation through the CA module on the high-level features of the h*w*2c dimension of the previous layer of the adjacent decoding layers to obtain second weight results of different channels; the second weight results of the different channels are multiplied by the low-level features of the 2h*2w*c dimension of the next layer of the adjacent coding layers to obtain a second feature map of the 2h*2w*c dimension;
  • the third processing sub-module 1035 is configured to obtain the 6-channel predicted segmentation labels according to the feature maps obtained in each layer of the encoding stage, the feature maps obtained in each layer of the decoding stage, the first feature map, and the second feature map.
  • the convolution operation is performed on the convolutional layer to extract features from the input abdominal CT image.
  • an appropriate stride is used to reduce the resolution. If the resolution of the previous layer is 2h*2w, the resolution of the next layer is reduced to h*w.
  • the number of features of the next layer in the encoding stage of the generation network model is doubled compared to that of the previous layer: if the number of features of the previous layer in the encoding stage of the generation network model is c, then the number of features of the next layer is 2c.
  • the feature map of each coding layer is obtained through the coding stage of the generating network model.
  • the feature map obtained by the next coding layer of the adjacent coding layer is higher than the feature map obtained by the previous coding layer.
  • the high-level features acquired by the next layer of the adjacent coding layer in the coding stage of the generative network model are high-level features of h*w*2c dimensions, where h represents the height of the graph, w represents the width of the graph, and 2c represents the number of features .
  • the low-level features obtained by the upper layer of the adjacent coding layer in the coding stage of the generating network model are low-level features of dimension 2h*2w*c, 2h represents the height of the graph, 2w represents the width of the graph, and c represents the number of features.
  • in the deconvolution layer, each input voxel is projected to a larger area through the kernel to increase the data size: if the resolution of the previous layer is h*w, then the resolution of the next layer is increased to 2h*2w.
  • the number of features of the next layer in the decoding stage of the generation network model is halved compared to that of the previous layer: if the number of features of the previous layer in the decoding stage of the generation network model is 2c, then the number of features of the next layer is c.
  • the feature map of each decoding layer is obtained through the decoding stage of the generative network model.
  • in adjacent decoding layers, the feature map obtained by the previous decoding layer is higher-level than the feature map obtained by the next decoding layer, and the feature map obtained by the next decoding layer is lower-level.
  • the high-level features acquired by the previous layer of the adjacent decoding layers in the decoding stage of the generation network model are high-level features of the h*w*2c dimension, where h represents the height of the feature map, w represents the width of the feature map, and 2c represents the number of features.
  • the low-level features acquired by the next layer of the adjacent decoding layers in the decoding stage of the generation network model are low-level features of the 2h*2w*c dimension, where 2h represents the height of the feature map, 2w represents the width of the feature map, and c represents the number of features.
  • a Channel-Attention (CA) module is connected to the modified Vnet network, and the misclassified pixels are corrected through the CA module.
  • the first processing sub-module 1033 is also used to pass the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layers through the global average pooling, 1*1 convolution, Batch Normalization (BN) algorithm model, and nonlinear Rectified Linear Unit (ReLU) activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features; the 1*1*c feature channel is then passed through a fully connected layer and a sigmoid activation function to obtain the first weight results of the different channels.
  • the second processing sub-module 1034 is also used to pass the high-level features of the h*w*2c dimension of the previous layer of the adjacent decoding layers through the global average pooling, 1*1 convolution, BN algorithm model, and ReLU activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features; the 1*1*c feature channel is then passed through a fully connected layer and a sigmoid activation function to obtain the second weight results of the different channels.
  • the processing flow of the CA module mainly includes a channelization operation, an activation operation, and a reweighting (weight assignment) operation.
  • in the encoding stage, the CA module is used to channelize the high-level features of the next layer of the adjacent encoding layers, where the channelization operation includes: passing the high-level features of the next layer of the adjacent encoding layers through the global average pooling, 1*1 convolution, BN algorithm model, and ReLU activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features.
  • an activation operation is then performed on the 1*1*c feature channel, where the activation operation includes: passing the 1*1*c feature channel through a fully connected layer and a sigmoid activation function to obtain the weight results of the different channels; the weight results of the different channels are multiplied by the low-level features of the previous layer of the adjacent encoding layers to obtain a first feature map, and the first feature map is a feature map of the 2h*2w*c dimension.
  • in the decoding stage, the CA module is used to channelize the high-level features of the previous layer of the adjacent decoding layers, where the channelization operation includes: passing these high-level features through the global average pooling, 1*1 convolution, BN algorithm model, and ReLU activation function of the CA module to obtain a 1*1*c feature channel, where c represents the number of features.
  • an activation operation is then performed on the 1*1*c feature channel, where the activation operation includes: passing the 1*1*c feature channel through a fully connected layer and a sigmoid activation function to obtain the weight results of the different channels; the weight results of the different channels are multiplied by the low-level features of the next layer of the adjacent coding layers to obtain a second feature map, and the second feature map is a feature map of the 2h*2w*c dimension.
  • the acquisition module 104 is configured to obtain a predicted segmentation result image according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, Background image.
  • the 6 channels of predicted segmentation labels respectively represent the predicted segmentation labels of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background, and these labels are filled with different colors to obtain the predicted segmentation result image. For example, subcutaneous fat can be drawn in red, muscle in green, bone in yellow, visceral fat in blue, internal organs in pink, and the background in black. Please refer to FIG. 5, in which different gray-scale colors are used to represent the six categories of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background.
  • the image segmentation device 100 further includes:
  • the determining module is used to determine the number of pixels in the subcutaneous fat area, the visceral fat area, and the muscle area from the predicted segmentation result image, and determine the subcutaneous fat based on the determined number of pixels and the pre-acquired physical space conversion parameters , The actual area of visceral fat and muscle.
  • the number of pixels in the subcutaneous fat area, the visceral fat area, and the muscle area is determined from the predicted segmentation result image, and the conversion parameter between pixel points and physical space area is obtained from the CT image data in DICOM format; the actual areas of subcutaneous fat, visceral fat, and muscle are then determined by multiplying the number of pixels in the subcutaneous fat, visceral fat, and muscle areas by the square of the conversion parameter.
  • the image information of the CT image data in the DICOM format includes information such as the time when the image was taken, the pixel spacing, the image code, and the sampling rate on the image.
  • the conversion parameters between the pixel points and the physical space area can be obtained, and the actual areas of subcutaneous fat, visceral fat, and muscle can be calculated according to the following formula (1).
  • Formula (1): s = n * x^2, where s represents the actual area of subcutaneous fat, visceral fat, or muscle, n represents the total number of pixels in the corresponding subcutaneous fat, visceral fat, or muscle area, and x represents the conversion parameter.
  • the image segmentation device 100 further includes:
  • the calculation module is used to obtain scanning layer thickness information from the abdominal CT image data, and to multiply the actual areas of the subcutaneous fat, visceral fat, and muscle by the scanning layer thickness to obtain the actual volumes of the subcutaneous fat, visceral fat, and muscle.
  • the Series information of the abdominal CT image data in the DICOM format includes serial number, examination modality, image location, examination description and instructions, image orientation, image position, layer thickness, layer-to-layer spacing, actual relative position, body position, etc. Therefore, the scanning layer thickness information can be obtained from the CT image data in the DICOM format, and the actual areas of the subcutaneous fat area, the visceral fat area, and the muscle area are multiplied by the scanning layer thickness to obtain the actual volumes of subcutaneous fat, visceral fat, and muscle.
  • the image segmentation device 100 further includes:
  • the optimization module is configured to input the predicted segmentation label and the real label corresponding to the gold standard image into the discriminant network model to obtain the discriminant scores of the predicted segmentation result image and the gold standard image, respectively, based on the discriminant scores Determine the gap between the predicted segmentation result image and the gold standard image, and adjust the parameters of the generation network model based on the gap to optimize the generation network model.
  • the generation network model can be optimized by adjusting the parameters of the generation network model, so as to improve the effect of abdominal image segmentation.
  • the gold standard image is a segmentation result manually annotated by a person, and is used to compare with the result of network estimation to evaluate the performance of the generated network model.
  • the gold standard image uses different colors to represent subcutaneous fat, muscle, bone, visceral fat, internal organs, and background. Please refer to FIG. 6.
  • Figure 6 is a gold standard image representing subcutaneous fat, muscle, bone, visceral fat, internal organs, and background areas with different grayscale colors.
  • FIG. 7 is a schematic diagram of the architecture of the discriminant network model.
  • the discriminant network model includes 6 convolutional layers.
  • the first convolutional layer 802 includes a 3*3 convolutional layer and a nonlinear ReLU activation function;
  • the second convolutional layer 803 includes a 3*3 convolutional layer and a batch normalization (BN) algorithm model;
  • the third convolutional layer 804 includes a 3*3 convolutional layer, a BN algorithm model, and a nonlinear ReLU activation function;
  • the fourth convolutional layer 805 includes a 3*3 convolutional layer, a BN algorithm model, and a nonlinear ReLU activation function;
  • the fifth convolutional layer 806 includes a 3*3 convolutional layer, a BN algorithm model, and a nonlinear ReLU activation function;
  • the sixth convolutional layer 807 includes global average pooling and a 1*1 convolutional layer; 801 represents the 512*512*6-dimensional predicted segmentation label or the real label corresponding to the gold standard image.
  • the predicted segmentation label of 512*512*6 dimensions and the real label corresponding to the gold standard image are input into the discriminant network model, and a convolution operation with a size of 3 and a step size of 2 is used for down-sampling.
  • the number of downsampling corresponds to the number of downsampling of the encoder in the generative network model.
  • a total of 5 downsampling operations are performed to obtain a 16*16*256 feature map.
  • global average pooling and a 1*1 convolution kernel are then used to obtain the discriminant scores of the gold standard image and the predicted segmentation image.
  • the optimization of the KL divergence (Kullback-Leibler divergence) between the predicted label result image and the gold standard image is replaced by the optimization of the Earth Mover's distance, and the Earth Mover's distance can always guide the optimization of the generation network model without being troubled by vanishing gradients.
  • gradient penalty is used to accelerate the convergence of the training process of the generating network model and the discriminant network model.
  • a zero-centered gradient penalty converges to the center point more easily, so the zero-centered gradient penalty is used.
  • the generating network model and the discriminant network model each have a corresponding loss function.
  • the loss function of the generated network model is as follows:
  • the loss function of the discriminant network model is as follows:
  • p_inter(I_inter) is a derived distribution obtained by interpolation between the true sample distribution and the false sample distribution.
  • in the above loss functions, Loss denotes a loss value; Orig denotes the original image; Dice denotes the Dice coefficient; Gen denotes the generation network model; I denotes an image; Mask denotes the mask; D denotes the discriminant network model; G denotes the generation network model; p_g denotes the false sample distribution; p_train denotes the true sample distribution; p_inter denotes the derived distribution obtained by interpolation between the true sample distribution and the false sample distribution; and C denotes the center, with C equal to 0 giving the zero center.
  • the generative network model and the discriminant network model reduce the values of these two loss functions through continuous learning to achieve the goal of optimization.
  • the image segmentation device proposed in this application inputs the JPG-format abdominal image into a generation network model constructed based on the Vnet network model; generates 6-channel predicted segmentation labels through the generation network model; and obtains a predicted segmentation result image according to the 6-channel predicted segmentation labels, where the predicted segmentation result image includes a subcutaneous fat image, a muscle image, a bone image, a visceral fat image, an internal organ image, and a background image.
  • As shown in FIG. 11, it is a schematic structural diagram of a computer device for implementing the image segmentation method of the present application.
  • the computer device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and runnable on the processor 10, such as an abdominal CT image segmentation program 12 based on a Vnet network model.
  • the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, mobile hard disk, multimedia card, card-type memory (such as SD or DX memory, etc.), magnetic memory, magnetic disk, CD etc.
  • the memory 11 may be an internal storage unit of the computer device 1 in some embodiments, for example, a mobile hard disk of the computer device 1.
  • the memory 11 may also be an external storage device of the computer device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card equipped on the computer device 1.
  • the memory 11 may also include both an internal storage unit of the computer device 1 and an external storage device.
  • the memory 11 can not only be used to store application software and various data installed in the computer device 1, such as the code of the abdominal CT image segmentation program based on the Vnet network model, but can also be used to temporarily store data that has been output or will be output.
  • the processor 10 may be composed of integrated circuits in some embodiments, for example, may be composed of a single packaged integrated circuit, or may be composed of multiple integrated circuits with the same function or different functions, including one or more Combinations of central processing unit (CPU), microprocessor, digital processing chip, graphics processor, and various control chips, etc.
  • the processor 10 is the control unit of the computer device; it uses various interfaces and lines to connect the various components of the entire computer device, runs or executes programs or modules stored in the memory 11 (for example, the abdominal CT image segmentation program based on the Vnet network model), and calls the data stored in the memory 11 to execute various functions of the computer device 1 and process data.
  • the bus may be a peripheral component interconnect standard (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the bus is configured to implement connection and communication between the memory 11 and at least one processor 10 and the like.
  • FIG. 10 only shows a computer device with certain components. Those skilled in the art can understand that the structure shown in FIG. 10 does not constitute a limitation on the computer device 1, which may include fewer or more components than shown in the figure, a combination of certain components, or a different arrangement of components.
  • the computer device 1 may also include a power source (such as a battery) for supplying power to various components.
  • the power source may be logically connected to the at least one processor 10 through a power management device, so that the power management device implements functions such as charge management, discharge management, and power consumption management.
  • the power supply may also include one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and other components.
  • the computer device 1 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the computer device 1 may also include a network interface.
  • the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is usually used to establish a communication connection between the computer device 1 and other computer devices.
  • the computer device 1 may also include a user interface.
  • the user interface may be a display (Display) and an input unit (such as a keyboard (Keyboard)).
  • the user interface may also be a standard wired interface or a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, or the like. The display may also be referred to as a display screen or display unit, and is used to display information processed in the computer device 1 and to present a visualized user interface.
  • the abdominal CT image segmentation program 12 based on the Vnet network model stored in the memory 11 of the computer device 1 is a combination of multiple instructions. When run by the processor 10, it can realize: converting abdominal CT image data in DICOM format into an abdominal image in JPG format; inputting the JPG-format abdominal image into a generative network model built on the Vnet network model; generating 6-channel predicted segmentation labels through the generative network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organ, and background predicted segmentation labels; and obtaining a predicted segmentation result image from the 6-channel predicted segmentation labels, where the predicted segmentation result image includes a subcutaneous fat image, a muscle image, a bone image, a visceral fat image, an internal organ image, and a background image.
  • the DICOM-format abdominal CT image data and the JPG-format abdominal image can also be stored in a node of a blockchain.
  • if the integrated module/unit of the computer device 1 is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a floppy disk, or a compact disc.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.
  • Blockchain is essentially a decentralized database: a chain of data blocks linked using cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

An image segmentation method relating to the field of artificial intelligence, the method comprising: converting abdominal CT image data in DICOM format into an abdominal image in JPG format (S100); inputting the JPG-format abdominal image into a generative network model constructed on the basis of a Vnet network model (S102); generating, by means of the generative network model, six channels of predicted segmentation labels, the six channels comprising subcutaneous fat, muscle, bone, visceral fat, internal organ, and background predicted segmentation labels (S104); and obtaining, on the basis of the six channels of predicted segmentation labels, a predicted segmentation result image comprising a subcutaneous fat image, a muscle image, a bone image, a visceral fat image, an internal organ image, and a background image (S106). The method further relates to blockchain technology: the DICOM-format abdominal CT image data and the JPG-format abdominal image can be stored in a blockchain. In this way, the effect of segmenting abdominal muscle images and fat images can be improved.

Description

Image segmentation method, apparatus, device, and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 20, 2020, with application number CN202010431606.6 and entitled "Image segmentation method, apparatus, electronic device, and computer-readable storage medium", the entire content of which is incorporated herein by reference.
Technical Field
本申请涉及人工智能的数据处理领域,尤其涉及一种图像分割方法、装置、设备及存储介质。This application relates to the field of artificial intelligence data processing, and in particular to an image segmentation method, device, equipment, and storage medium.
Background
脂肪与骨骼肌等人体成分分析是医学研究的重要手段,人体内含有的脂肪与骨骼肌等成分含量情况,是评价个体营养状态的重要依据,在患者的诊断、治疗与预后等临床环节具有重要指导意义。目前,基于电子计算机断层扫描(Computed Tomography,CT)等影像学技术的脂肪与骨骼肌定量分析是广受认可的评估手段。特别是脐平面CT图像的骨骼肌面积、内脏脂肪面积、皮下脂肪面积、全腹脂肪体积等指标具有重要的临床价值。The analysis of human body components such as fat and skeletal muscle is an important method of medical research. The content of fat and skeletal muscle in the human body is an important basis for evaluating individual nutritional status, and is important in clinical aspects such as patient diagnosis, treatment and prognosis. Guiding significance. At present, quantitative analysis of fat and skeletal muscle based on imaging techniques such as Computed Tomography (CT) is a widely recognized evaluation method. In particular, the skeletal muscle area, visceral fat area, subcutaneous fat area, and total abdominal fat volume in CT images of the umbilical plane have important clinical value.
发明人发现目前医生普遍方法是针对腹部脐平面图像,根据阈值将内脏脂肪和皮下脂肪分割出来,后续手动标注肌肉的分界线,对肌肉图像与脂肪图像进行分割。但是,由于手动标注肌肉的分界线非常耗时,分界线的精度不佳,导致存在腹部肌肉图像与脂肪图像分割耗时久、分割效果差的问题。The inventor found that the current common method for doctors is to segment the visceral fat and subcutaneous fat according to the threshold of the abdominal umbilical plane image, and then manually mark the dividing line of the muscle to segment the muscle image and the fat image. However, manually marking the dividing line of the muscle is very time-consuming, and the accuracy of the dividing line is not good, resulting in the problem that the segmentation of the abdominal muscle image and the fat image takes a long time and the segmentation effect is poor.
因此,如何在克服以上不足的情况下,提供基于CT腹部图像的图像处理方案,已经成为一个亟待解决的技术问题。Therefore, how to provide an image processing solution based on CT abdominal images while overcoming the above shortcomings has become an urgent technical problem to be solved.
Summary of the Invention
有鉴于此,本申请提出一种图像分割方法、装置计算机设备及计算机可读存储介质,以解决现有技术中腹部肌肉图像与脂肪图像分割耗时久、分割精度低的问题。In view of this, this application proposes an image segmentation method, device, computer equipment, and computer-readable storage medium to solve the problem of long time-consuming segmentation of abdominal muscle images and fat images and low segmentation accuracy in the prior art.
首先,为实现上述目的,本申请提出一种图像分割方法,所述方法包括步骤:First of all, in order to achieve the above objective, this application proposes an image segmentation method, which includes the steps:
将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Convert abdominal CT image data in DICOM format to abdominal image in JPG format;
基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Constructing a generation network model based on the Vnet network model, and inputting the abdominal image in JPG format into the generation network model;
通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
为实现上述目的,本申请还提供一种图像分割装置,包括:To achieve the above objective, the present application also provides an image segmentation device, including:
转换模块:用于将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Conversion module: used to convert abdominal CT image data in DICOM format into abdominal image in JPG format;
处理模块:用于基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Processing module: used to construct a generation network model based on the Vnet network model, and input the abdominal image in JPG format into the generation network model;
生成模块:用于通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generating module: used to generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
获取模块:用于根据所述6通道的预测分割标签获取预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。Obtaining module: used to obtain predicted segmentation result images according to the 6-channel predicted segmentation tags, where the predicted segmentation result images include subcutaneous fat images, muscle images, bone images, visceral fat images, internal organs images, and background images.
为实现上述目的,本申请还提供一种计算机设备,包括存储器、处理器以及存储在所 述存储器中并可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现如下步骤:In order to achieve the above object, the present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and running on the processor, and the processor executes the computer program when the computer program is executed. The following steps:
将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Convert abdominal CT image data in DICOM format to abdominal image in JPG format;
基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Constructing a generation network model based on the Vnet network model, and inputting the abdominal image in JPG format into the generation network model;
通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
为实现上述目的,本申请还提供一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机程序,其中,所述计算机程序被处理器执行时实现如下步骤:To achieve the foregoing objective, the present application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program is executed by a processor to implement the following steps:
将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Convert abdominal CT image data in DICOM format to abdominal image in JPG format;
基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Constructing a generation network model based on the Vnet network model, and inputting the abdominal image in JPG format into the generation network model;
通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
相较于现有技术,本申请所提出的图像分割方法、装置、计算机设备及计算机可读存储介质,通过将所述JPG格式的腹部图像输入基于Vnet网络模型构建的生成网络模型;通过所述生成网络模型生成6通道的预测分割标签;根据所述6通道的预测分割标签得到预测分割结果图像,其中,预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。这样,无须手动标注,就可以得到比较准确的腹部肌肉图像与脂肪图像,减少腹部肌肉图像、脂肪图像分割的时间,提高腹部肌肉图像、脂肪图像的分割效果。Compared with the prior art, the image segmentation method, device, computer equipment, and computer-readable storage medium proposed in this application input the JPG format abdominal image into the generation network model constructed based on the Vnet network model; Generate a network model to generate 6-channel predicted segmentation labels; obtain predicted segmentation result images according to the 6-channel predicted segmentation labels, where the predicted segmentation result images include subcutaneous fat images, muscle images, bone images, visceral fat images, and internal organs images , Background image. In this way, it is possible to obtain more accurate abdominal muscle images and fat images without manual labeling, reduce the time for segmentation of abdominal muscle images and fat images, and improve the segmentation effect of abdominal muscle images and fat images.
Description of the Drawings
图1是本申请图像分割方法第一实施例的流程示意图;Fig. 1 is a schematic flowchart of a first embodiment of an image segmentation method according to the present application;
图2是本申请图像分割方法步骤S102的流程示意图;FIG. 2 is a schematic flowchart of step S102 of the image segmentation method of the present application;
图3是本申请图像分割方法步骤S104的流程示意图;FIG. 3 is a schematic flowchart of step S104 of the image segmentation method of the present application;
图4是本申请图像分割装置的CA模块一实施例的示意图;4 is a schematic diagram of an embodiment of the CA module of the image segmentation device of the present application;
图5是本申请图像分割装置的预测分割结果图像一实施例的示意图;5 is a schematic diagram of an embodiment of a predicted segmentation result image of the image segmentation device of the present application;
图6是本申请图像分割装置的金标准图像一实施例的示意图;Fig. 6 is a schematic diagram of an embodiment of a gold standard image of the image segmentation device of the present application;
图7为本申请图像分割装置的判别网络模型一实施例的示意图;FIG. 7 is a schematic diagram of an embodiment of a discriminant network model of the image segmentation device according to the present application;
图8是本申请图像分割装置第一实施例的程序模块示意图;FIG. 8 is a schematic diagram of program modules of the first embodiment of the image segmentation device of the present application;
图9是本申请图像分割装置的处理模块一实施例的示意图;FIG. 9 is a schematic diagram of an embodiment of a processing module of the image segmentation device of the present application;
图10是本申请图像分割装置的生成模块一实施例的示意图;FIG. 10 is a schematic diagram of an embodiment of a generating module of the image segmentation device of the present application;
图11是本申请计算机设备一可选的硬件架构的示意图。Fig. 11 is a schematic diagram of an optional hardware architecture of the computer device of the present application.
本申请目的的实现、功能特点及优点将结合实施例,参照附图做进一步说明。The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Description of the Embodiments
应当理解,此处所描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。It should be understood that the specific embodiments described here are only used to explain the present application, and are not used to limit the present application.
本申请提出一种图像分割方法。参照图1所示,为本申请一实施例提供的图像分割方法的流程示意图。该方法可以由一个装置执行,该装置可以由软件和/或硬件实现。This application proposes an image segmentation method. Referring to FIG. 1, it is a schematic flowchart of an image segmentation method provided by an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.
在本实施例中,图像分割方法包括:In this embodiment, the image segmentation method includes:
步骤S100,将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像。In step S100, the abdominal CT image data in DICOM format is converted into an abdominal image in JPG format.
在本实施例中,对医学数字成像和通信(Digital Imaging and Communications in Medicine,DICOM)格式的CT腹部图像数据设置针对腹部图像特定的窗宽窗位,然后通过格式转换程序将DICOM格式的CT图像数据转换为JPG格式的腹部图像,并保存JPG格式的腹部图像。需要强调的是,为进一步保证上述DICOM格式的腹部CT图像数据、JPG格式的腹部图的私密和安全性,上述DICOM格式的腹部CT图像数据、JPG格式的腹部图还可以存储于一区块链的节点中。In this embodiment, the CT abdominal image data in the Digital Imaging and Communications in Medicine (DICOM) format is set to a specific window width and window level for the abdominal image, and then the CT image in DICOM format is converted through the format conversion program The data is converted to abdomen image in JPG format, and the abdomen image in JPG format is saved. It should be emphasized that in order to further ensure the privacy and security of the aforementioned DICOM format abdominal CT image data and JPG format abdominal map, the aforementioned DICOM format abdominal CT image data and JPG format abdominal map can also be stored in a blockchain In the node.
本实施例中,针对腹部图像的特定窗宽窗位可以设置为窗宽400HU、窗位10HU。可以理解的是,所述DICOM格式的腹部CT图像数据中包含患者的受保护的健康信息(Protected Health Information,PHI),例如姓名,性别,年龄,以及其他图像相关信息,比如捕获并生成图像的设备信息,医疗的一些上下文相关信息等。所述DICOM格式的腹部CT图像数据携带着大量的信息,这些信息具体可以分为以下四类:(a)病人Patient信息、(b)检查Study信息、(c)序列Series信息、(d)图像Image信息。Patient信息包括患者姓名、患者ID、患者性别、患者体重等。Study信息包括:检查号、检查实例号、检查日期、检查时间、检查部位、检查的描述等。Series信息包括序列号、检查模态、图像位置、检查描述和说明、图像方位、图像位置、层厚、层与层之间的间距、实际相对位置及身体位置等。Image信息包括影像拍摄的时间、像素间距pixel spacing、图像码、图像上的采样率等信息。根据像素间距pixel spacing,可以获取像素点与物理空间面积之间的换算参数,根据换算参数,可以计算像素区域相对应的物理空间的实际面积。In this embodiment, the specific window width and window level for the abdominal image can be set to a window width of 400 HU and a window level of 10 HU. It is understandable that the abdominal CT image data in the DICOM format contains protected health information (PHI) of the patient, such as name, gender, age, and other image-related information, such as captured and generated images Equipment information, some medical context related information, etc. The DICOM format abdominal CT image data carries a lot of information, which can be divided into the following four categories: (a) Patient information, (b) Examination and Study information, (c) Series information, (d) Image Image information. Patient information includes patient name, patient ID, patient gender, patient weight, etc. Study information includes: inspection number, inspection instance number, inspection date, inspection time, inspection location, inspection description, etc. Series information includes serial number, inspection mode, image location, inspection description and description, image orientation, image location, layer thickness, layer-to-layer spacing, actual relative position, and body position. Image information includes information such as the time the image was taken, pixel spacing, image code, and sampling rate on the image. According to the pixel spacing, the conversion parameters between the pixel points and the physical space area can be obtained, and the actual area of the physical space corresponding to the pixel area can be calculated according to the conversion parameters.
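To make the conversion step concrete, the following is a minimal sketch assuming the pydicom and Pillow libraries: it maps one DICOM slice to Hounsfield units, applies the abdominal window mentioned above (window width 400 HU, window level 10 HU), and writes an 8-bit JPG. It illustrates the preprocessing only and is not the specific format conversion program of the embodiment.

```python
import numpy as np
import pydicom
from PIL import Image

def dicom_to_jpg(dicom_path, jpg_path, window_width=400, window_level=10):
    """Convert one abdominal CT slice from DICOM to an 8-bit JPG using a fixed window."""
    ds = pydicom.dcmread(dicom_path)
    # Raw pixel values -> Hounsfield units (HU)
    hu = ds.pixel_array.astype(np.float32) * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
    # Apply the abdominal window: width 400 HU, level 10 HU
    low, high = window_level - window_width / 2, window_level + window_width / 2
    windowed = np.clip(hu, low, high)
    # Normalize to 0-255 and save as JPG
    img = ((windowed - low) / (high - low) * 255).astype(np.uint8)
    Image.fromarray(img).save(jpg_path, format="JPEG")
    # Pixel spacing (mm per pixel) and slice thickness are kept for later area/volume calculations
    return float(ds.PixelSpacing[0]), float(ds.SliceThickness)
```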
步骤S102,基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型。In step S102, a generation network model is constructed based on the Vnet network model, and the abdominal image in the JPG format is input into the generation network model.
可选的,请参阅图2,所述步骤S102包括以下步骤:Optionally, referring to FIG. 2, the step S102 includes the following steps:
步骤S1021,将所述Vnet网络模型编码阶段的卷积核设置为二维卷积核;Step S1021: Set the convolution kernel in the encoding stage of the Vnet network model to a two-dimensional convolution kernel;
步骤S1022,将所述Vnet网络模型解码阶段的反卷积替换为双线性插值,得到修改后的Vnet网络模型;Step S1022, replacing the deconvolution in the decoding stage of the Vnet network model with bilinear interpolation to obtain a modified Vnet network model;
步骤S1023,在所述修改后的Vnet网络模型中接入通道注意力CA模块,得到所述生成网络模型,其中,所述CA模块用于获取所述修改后的Vnet网络的编码阶段、解码阶段生成的高级特征图的语义信息,并根据所述语义信息从低级特征图中选取属于高级特征图的像素点信息;Step S1023, access the channel attention CA module in the modified Vnet network model to obtain the generative network model, where the CA module is used to obtain the encoding stage and the decoding stage of the modified Vnet network Generate semantic information of the high-level feature map, and select pixel information belonging to the high-level feature map from the low-level feature map according to the semantic information;
其中,所述高级特征图及低级特征图根据在编码阶段及解码阶段获得特征图的先后顺序确定,在所述编码阶段相邻编码层中,下一层编码层获得的特征图比上一编码层获得的特征图要高级;在所述解码阶段相邻编码层中,上一解密层获得的特征图比下一解密层获得的特征图要低级。Wherein, the high-level feature map and the low-level feature map are determined according to the sequence of obtaining feature maps in the encoding stage and the decoding stage. Among the adjacent encoding layers in the encoding stage, the feature map obtained by the next encoding layer is higher than that of the previous encoding. The feature map obtained by the layer is higher-level; in the adjacent coding layers of the decoding stage, the feature map obtained by the previous decryption layer is lower than the feature map obtained by the next decryption layer.
在本实施例中,所述Vnet网络模型为福斯托·米勒塔里(Fausto Milletari)、纳西尔·纳瓦卜(Nasir nawab)、赛义德·艾哈迈德·艾哈迈迪(Seyed-Ahmad Ahmadi)等提出的医学影像Vnet网络模型。所述Vnet网络模型是典型的编码-解码网络模型。在所述Vnet网络模型中,编码阶段包括多个编码层,每个编码层包括卷积层、激活层、下采样层。解码阶段包括多个解码层,每个解码层包括反卷积层、激活层、上采样层。In this embodiment, the Vnet network model is Fausto Milletari, Nasir Nawab, Said Ahmed Ahmad ( The medical imaging Vnet network model proposed by Seyed-Ahmad (Ahmadi) and others. The Vnet network model is a typical encoding-decoding network model. In the Vnet network model, the coding stage includes multiple coding layers, and each coding layer includes a convolutional layer, an activation layer, and a down-sampling layer. The decoding stage includes multiple decoding layers, and each decoding layer includes a deconvolution layer, an activation layer, and an upsampling layer.
所述Vnet网络模型编码阶段的卷积核是基于三维的卷积核,但由于CT数据扫描层的层厚较厚,导致三维数据并不可靠。在本实施例中,将所述Vnet网络模型编码阶段的卷积核设置为二维卷积核,基于二维图像单独进行分割。在本实施例中,为了减少可学习的参数量,将所述Vnet网络模型的解码阶段的反卷积替换为双线性插值。The convolution kernel in the encoding stage of the Vnet network model is based on a three-dimensional convolution kernel, but the three-dimensional data is unreliable due to the thicker CT data scanning layer. In this embodiment, the convolution kernel in the encoding stage of the Vnet network model is set as a two-dimensional convolution kernel, and segmentation is performed separately based on the two-dimensional image. In this embodiment, in order to reduce the amount of learnable parameters, the deconvolution in the decoding stage of the Vnet network model is replaced with bilinear interpolation.
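These two modifications can be illustrated with a short PyTorch-style sketch (the framework is an assumption; the application does not name one): the encoder uses ordinary two-dimensional convolutions, and the decoder upsamples with parameter-free bilinear interpolation instead of a transposed convolution, which removes the deconvolution parameters from the decoder.

```python
import torch.nn as nn

# Encoder block: 2D convolution (instead of Vnet's original 3D kernels);
# stride-2 downsampling halves the resolution while the channel count doubles.
def encoder_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# Decoder block: bilinear interpolation replaces the learnable deconvolution.
def decoder_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```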
步骤S104,通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签。Step S104: Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels.
可选的,请参阅图3,所述步骤S104包括以下步骤:Optionally, referring to FIG. 3, the step S104 includes the following steps:
步骤S1041,通过所述生成网络模型的编码阶段获取每个编码层的特征图;Step S1041: Obtain a feature map of each coding layer through the coding stage of the generated network model;
步骤S1042,通过所述生成网络模型的解码阶段获取每个解码层的特征图;Step S1042: Obtain a feature map of each decoding layer through the decoding stage of the generating network model;
步骤S1043,在编码阶段,通过所述CA模块将所述编码阶段相邻编码层的下一层h*w*2c维度的高级特征进行通道化操作、激活操作,得到不同通道的第一权重结果,将所述不同通道的第一权重结果与相邻编码层的上一层2h*2w*c维度的低级特征相乘,得到2h*2w*c维度的第一特征图;Step S1043: In the encoding stage, the CA module is used to channelize and activate the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layer in the encoding stage to obtain the first weight results of different channels , Multiplying the first weight results of the different channels with the low-level features of the upper 2h*2w*c dimension of the adjacent coding layer to obtain the first feature map of the 2h*2w*c dimension;
步骤S1044,在解码阶段,通过所述CA模块将所述解码阶段相邻解码层的上一层2h*2w*c维度的高级特征进行通道化操作、激活操作,得到不同通道的第二权重结果;将所述不同通道的第二权重结果与相邻编码层的下一层2h*2w*c维度的低级特征相乘,得到2h*2w*c维度的第二特征图;Step S1044, in the decoding stage, perform channelization operation and activation operation on the advanced features of the upper 2h*2w*c dimension of the adjacent decoding layer in the decoding stage through the CA module to obtain the second weight results of different channels Multiplying the second weight results of the different channels with the low-level features of the 2h*2w*c dimension of the next layer of the adjacent coding layer to obtain a second feature map of the 2h*2w*c dimension;
步骤S1045,根据所述编码阶段每一层获得的特征图、所述解码阶段每一层获得的特征图、所述第一特征图、所述第二特征图,得到所述6通道的预测分割标签。Step S1045: Obtain the 6-channel prediction segmentation according to the feature map obtained in each layer of the encoding stage, the feature map obtained in each layer of the decoding stage, the first feature map, and the second feature map. Label.
在本实施例,所述生成网络模型的编码阶段,通过卷积层执行卷积操作从输入的腹部CT图像中提取特征,在编码阶段的每一层结束后,使用适当的步幅来降低分辨率,若上一层分辨率为2h*2w,则下一层分辨率降低为h*w。在本实施例中,所述生成网络模型的编码阶段下一层特征比上一层的特征增大一倍,若所述生成网络模型的编码阶段上一层特征数量为c,则下一层的特征数量为2c。In this embodiment, in the coding stage of the generated network model, the convolution operation is performed on the convolutional layer to extract features from the input abdominal CT image. After each layer of the coding stage is completed, an appropriate stride is used to reduce the resolution. If the resolution of the previous layer is 2h*2w, the resolution of the next layer is reduced to h*w. In this embodiment, the features of the next layer in the encoding stage of the generative network model are doubled compared to the features of the previous layer. If the number of features of the previous layer in the encoding stage of the generative network model is c, then the next layer The number of features is 2c.
在本实施例,通过所述生成网络模型的编码阶段获取每个编码层的特征图,在所述编码阶段相邻编码层下一层编码层获得的特征图比上一编码层获得的特征图要高级。所述生成网络模型的编码阶段相邻编码层的下一层获取到的高级特征为h*w*2c维度的高级特征,其中,h代表图形的高,w代表图形的宽,2c代表特征数量。所述生成网络模型的编码阶段相邻编码层的上一层获取到的低级特征为2h*2w*c维度的低级特征,2h代表图形的高,2w代表图形的宽,c代表特征数量。In this embodiment, the feature map of each coding layer is obtained through the coding stage of the generating network model. In the coding stage, the feature map obtained by the next coding layer of the adjacent coding layer is higher than the feature map obtained by the previous coding layer. To be advanced. The high-level features acquired in the next layer of the adjacent coding layer in the coding stage of the generative network model are high-level features of h*w*2c dimensions, where h represents the height of the graph, w represents the width of the graph, and 2c represents the number of features . The low-level features obtained by the upper layer of the adjacent coding layer in the coding stage of the generating network model are low-level features of dimension 2h*2w*c, 2h represents the height of the graph, 2w represents the width of the graph, and c represents the number of features.
在本实施例,所述生成网络模型的解码阶段,通过反卷积层将每个输入体素通过内核投影到更大的区域来增加数据大小,若上一层分辨率为h*w,则下一层分辨率提高为2h*2w。在本实施例中,所述生成网络模型的解码阶段下一层特征比上一层的特征减小一倍,若所述生成网络模型的编码阶段上一层特征数量为2c,则下一层的特征数量为c。In this embodiment, in the decoding stage of the generated network model, each input voxel is projected to a larger area through the kernel through the deconvolution layer to increase the data size. If the resolution of the previous layer is h*w, then The resolution of the next layer is increased to 2h*2w. In this embodiment, the features of the next layer in the decoding stage of the generative network model are twice as small as the features of the previous layer. If the number of features of the previous layer in the encoding stage of the generative network model is 2c, then the next layer The number of features is c.
在本实施例中,通过所述生成网络模型的解码阶段获取每个解码层的特征图,在所述解码阶段相邻编码层中上一层解密层获得的特征图比下一层解密层获得的特征图要低级。在本实施例中,所述生成网络模型的解码阶段相邻解码层的上一层获取到的高级特征为h*w*2c维度的高级特征,其中,h代表图形的高,w代表图形的宽,2c代表特征数量。所述生成网络模型的解码码阶段相邻解码层的下一层获取到的低级特征为2h*2w*c维度的低级特征,2h代表图形的高,2w代表图形的宽,c代表特征数量。In this embodiment, the feature map of each decoding layer is obtained through the decoding stage of the generative network model. Among the adjacent coding layers in the decoding stage, the feature map obtained by the upper decryption layer is higher than that obtained by the next decryption layer. The feature map should be low-level. In this embodiment, the high-level features acquired by the upper layer of the adjacent decoding layer in the decoding stage of the generative network model are high-level features of h*w*2c dimensions, where h represents the height of the graph, and w represents the height of the graph. Wide, 2c represents the number of features. The low-level features acquired in the next layer of the adjacent decoding layer in the decoding code stage of the generating network model are low-level features of dimension 2h*2w*c, where 2h represents the height of the graph, 2w represents the width of the graph, and c represents the number of features.
需要说明的是,随着编码过程的不断加深,得到的特征表达也逐渐变得丰富。但由于多个卷积过程,以及非线性函数的应用,导致高级特征图中的位置信息大量丢失,从而造成大量像素点的错分类的现象。在所述修改后的Vnet网络中接入通道注意力(Channel-Attention,CA)模块,通过CA模块对错分类的像素点进行校正。It should be noted that as the coding process continues to deepen, the obtained feature expressions gradually become richer. However, due to multiple convolution processes and the application of non-linear functions, a large amount of position information in the high-level feature map is lost, resulting in the phenomenon of misclassification of a large number of pixels. A Channel-Attention (CA) module is connected to the modified Vnet network, and the misclassified pixels are corrected through the CA module.
步骤S1043中所述通过所述CA模块将所述编码阶段相邻编码层的下一层h*w*2c维度的高级特征进行通道化操作、激活操作,得到不同通道的第一权重结果,包括以下步骤:In step S1043, channelizing and activating the high-level features of the next layer of h*w*2c dimensions of the adjacent coding layer in the coding stage through the CA module to obtain the first weight results of different channels includes The following steps:
将相邻编码层的下一层的h*w*2c维度的高级特征通过所述CA模块的全局平均池化、1*1卷积、批标准化(Batch Normalization,BN)算法模型、非线性(Rectified Linear Units,ReLu)激活函数,得到1*1*c的特征通道,c表示特征数量;将所述1*1*c的特征通道通过全连接层及sigmoid激活函数,得到不同通道的第一权重结果。The high-level features of the h*w*2c dimension of the next layer of the adjacent coding layer are passed through the global average pooling, 1*1 convolution, batch normalization (BN) algorithm model, and nonlinearity of the CA module. Rectified Linear Units, ReLu) activation function to obtain 1*1*c feature channel, c represents the number of features; pass the 1*1*c feature channel through the fully connected layer and sigmoid activation function to obtain the first of different channels Weighted result.
步骤S1044中所述通过所述CA模块将所述解码阶段相邻解码层的上一层2h*2w*c维度的高级特征进行通道化操作、激活操作,得到不同通道的第二权重结果,包括以下步骤: 将相邻解码层的上一层的h*w*2c维度的高级特征通过所述CA模块的全局平均池化、1*1卷积、BN算法模型、ReL激活函数,得到1*1*c的特征通道,c表示特征数量;将所述1*1*c的特征通道通过全连接层及sigmoid激活函数,得到不同通道的第二权重结果。In step S1044, channelizing and activating the advanced features of the upper 2h*2w*c dimension of the adjacent decoding layer of the decoding stage through the CA module to obtain the second weight results of different channels includes The following steps: The advanced features of the h*w*2c dimension of the upper layer of the adjacent decoding layer are obtained through the global average pooling, 1*1 convolution, BN algorithm model, and ReL activation function of the CA module to obtain 1* 1*c feature channel, c represents the number of features; pass the 1*1*c feature channel through the fully connected layer and the sigmoid activation function to obtain the second weight results of different channels.
请参阅图4,所述CA模块处理流程主要包括通道化Channelization操作、激活Activation操作及权重赋值Reweighting操作。在编码阶段,通过所述CA模块将所述编码阶段相邻编码层的下一层的高级特征进行通道化操作,其中,所述通道化操作包括:将相邻编码层的下一层的高级特征通过所述CA模块的全局平均池化、1*1卷积、BN算法模型、ReLu激活函数,得到1*1*c的特征通道,c表示特征数量;将所述1*1*c的特征通道进行激活操作,其中,所述激活操作包括:将所述1*1*c的特征通道通过全连接层及sigmoid激活函数,得到不同通道的权重结果;将所述不同通道的权重结果与相邻编码层的上一层的低级特征相乘,得到第一特征图,所述第一特征图为2h*2w*c维度的特征图。Referring to FIG. 4, the processing flow of the CA module mainly includes channelization operation, activation operation, and weight assignment reweighting operation. In the encoding stage, the CA module is used to channelize the advanced features of the next layer of the adjacent encoding layer in the encoding stage, where the channelization operation includes: converting the advanced features of the next layer of the adjacent encoding layer Through the global average pooling of the CA module, the 1*1 convolution, the BN algorithm model, and the ReLu activation function, a feature channel of 1*1*c is obtained, and c represents the number of features; the 1*1*c The feature channel performs an activation operation, where the activation operation includes: passing the 1*1*c feature channel through a fully connected layer and a sigmoid activation function to obtain the weight results of different channels; and comparing the weight results of the different channels with The low-level features of the upper layer of adjacent coding layers are multiplied to obtain a first feature map, and the first feature map is a 2h*2w*c dimension feature map.
在解码阶段,通过所述CA模块将所述解码阶段相邻解码层的上一层的高级特征进行通道化操作,其中,所述通道化操作包括:将相邻编码层的上一层的高级特征通过所述CA模块的全局平均池化、1*1卷积、BN算法模型、ReLu激活函数,得到1*1*c的特征通道,c表示特征数量;将所述1*1*c的特征通道进行激活操作,其中,所述激活操作包括:将所述1*1*c的特征通道通过全连接层及sigmoid激活函数,得到不同通道的权重结果;将所述不同通道的权重结果与相邻编码层的下一层的低级特征相乘,得到第二特征图,述第二特征图为2h*2w*c维度的特征图。In the decoding stage, the CA module is used to channelize the advanced features of the upper layer of the adjacent decoding layer in the decoding stage, where the channelization operation includes: Through the global average pooling of the CA module, the 1*1 convolution, the BN algorithm model, and the ReLu activation function, a feature channel of 1*1*c is obtained, and c represents the number of features; the 1*1*c The feature channel performs an activation operation, where the activation operation includes: passing the 1*1*c feature channel through a fully connected layer and a sigmoid activation function to obtain the weight results of different channels; and comparing the weight results of the different channels with The low-level features of the next layer of adjacent coding layers are multiplied to obtain a second feature map. The second feature map is a 2h*2w*c dimension feature map.
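The channel-attention operations listed above (channelization by global average pooling, 1×1 convolution, batch normalization and ReLU; activation by a fully connected layer and sigmoid; reweighting by channel-wise multiplication with the low-level feature map) can be sketched as follows in PyTorch style; the layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Reweights a low-level feature map (2h x 2w x c) using a high-level feature map (h x w x 2c)."""

    def __init__(self, high_channels, low_channels):
        super().__init__()
        # Channelization: global average pooling, 1x1 convolution, BN, ReLU -> 1 x 1 x c
        self.channelize = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(high_channels, low_channels, kernel_size=1),
            nn.BatchNorm2d(low_channels),
            nn.ReLU(inplace=True),
        )
        # Activation: fully connected layer + sigmoid -> one weight per channel
        self.activation = nn.Sequential(
            nn.Linear(low_channels, low_channels),
            nn.Sigmoid(),
        )

    def forward(self, high_feat, low_feat):
        w = self.channelize(high_feat)      # (N, c, 1, 1)
        w = self.activation(w.flatten(1))   # (N, c) channel weights
        # Reweighting: multiply each channel of the low-level feature map by its weight
        return low_feat * w.unsqueeze(-1).unsqueeze(-1)
```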
步骤S106,根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。Step S106: Obtain a predicted segmentation result image according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat images, muscle images, bone images, visceral fat images, internal organs images, and background images.
在本实施例中,所述6通道的预测分割标签分别表示皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景的预测分割标签,用不同颜色进行填充得到预测分割结果图像,例如,可以用红色绘制皮下脂肪、绿色绘制肌肉、黄色绘制骨头、蓝色绘制内脏脂肪、粉色绘制内脏器官、黑色绘制背景。请参阅图5,在图5的图中用不同灰度的颜色代表皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景这6类分类。In this embodiment, the predicted segmentation labels of the 6-channel respectively represent the predicted segmentation labels of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background, which are filled with different colors to obtain the predicted segmentation result image. For example, you can use Red draws subcutaneous fat, green draws muscle, yellow draws bones, blue draws visceral fat, pink draws internal organs, and black draws background. Please refer to Figure 5. In Figure 5, different gray-scale colors are used to represent the six categories of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background.
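For illustration, a small sketch of this rendering step is given below. It assumes that each pixel's class is taken as the argmax over the six predicted channels and uses the colors listed above; the palette is an example, not mandated by the embodiment.

```python
import numpy as np

# Class order assumed: subcutaneous fat, muscle, bone, visceral fat, internal organs, background
PALETTE = np.array([
    [255, 0, 0],      # subcutaneous fat - red
    [0, 255, 0],      # muscle - green
    [255, 255, 0],    # bone - yellow
    [0, 0, 255],      # visceral fat - blue
    [255, 192, 203],  # internal organs - pink
    [0, 0, 0],        # background - black
], dtype=np.uint8)

def labels_to_image(pred):
    """pred: (6, H, W) predicted segmentation labels -> (H, W, 3) color result image."""
    class_map = np.argmax(pred, axis=0)  # per-pixel class index
    return PALETTE[class_map]
```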
这样,无须手动标注,就可以得到比较准确的腹部肌肉图像与脂肪图像,减少腹部肌肉图像、脂肪图像的分割时间,提高腹部肌肉图像、脂肪图像的分割效果。In this way, it is possible to obtain more accurate abdominal muscle images and fat images without manual labeling, reduce the segmentation time of abdominal muscle images and fat images, and improve the segmentation effect of abdominal muscle images and fat images.
可选的,所述图像分割方法还包括:Optionally, the image segmentation method further includes:
从所述预测分割结果图像确定皮下脂肪区域、内脏脂肪区域、肌肉区域的像素点个数,根据所述确定的像素点个数及预先获取的物理空间换算参数,确定皮下脂肪、内脏脂肪、肌肉的实际面积。Determine the number of pixels in the subcutaneous fat region, visceral fat region, and muscle region from the predicted segmentation result image, and determine the subcutaneous fat, visceral fat, and muscle based on the determined number of pixels and pre-acquired physical space conversion parameters The actual area.
在本实施例中,从所述预测分割结果图像中确定皮下脂肪区域、内脏脂肪区域、肌肉区域的像素点个数,从所述DICOM格式的CT图像数据获取像素点与物理空间面积之间的换算参数,根据所述皮下脂肪区域、内脏脂肪、肌肉区域的像素点个数乘以所述换算参数的平方,确定皮下脂肪、内脏脂肪、肌肉的实际面积。In this embodiment, the number of pixels in the subcutaneous fat area, visceral fat area, and muscle area is determined from the predicted segmentation result image, and the difference between the pixel points and the physical space area is obtained from the CT image data in the DICOM format. The conversion parameter is to determine the actual area of subcutaneous fat, visceral fat, and muscle based on the number of pixels in the subcutaneous fat area, visceral fat, and muscle area multiplied by the square of the conversion parameter.
进一步说明的是,所述DICOM格式的CT图像数据的image信息包括影像拍摄的时间、像素间距pixel spacing、图像码、图像上的采样率等信息。根据像素间距pixel spacing,可以获取像素点与物理空间面积之间的换算参数,根据以下公式(1)计算皮下脂肪、内脏脂肪、肌肉的实际面积。公式(1)s=n*x^2,其中,s表示皮下脂肪、内脏脂肪、肌肉的实际面积,n表示皮下脂肪区域、内脏脂肪区域、肌肉区域的总像素点个数,x表示换算参数。It is further explained that the image information of the CT image data in the DICOM format includes information such as the time when the image was taken, the pixel spacing, the image code, and the sampling rate on the image. According to the pixel spacing, the conversion parameters between the pixel points and the physical space area can be obtained, and the actual areas of subcutaneous fat, visceral fat, and muscle can be calculated according to the following formula (1). Formula (1) s=n*x^2, where s represents the actual area of subcutaneous fat, visceral fat, and muscle, n represents the total number of pixels in the subcutaneous fat area, visceral fat area, and muscle area, and x represents the conversion parameter .
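Formula (1) can be applied directly to the predicted segmentation result, for example as in the following sketch (the pixel spacing x is assumed to be in millimetres, so the area is in mm²):

```python
import numpy as np

def region_area(class_map, class_index, pixel_spacing_mm):
    """Area of one class (e.g. subcutaneous fat) from the predicted class map, formula (1): s = n * x^2."""
    n = int(np.sum(class_map == class_index))  # number of pixels belonging to this class
    return n * pixel_spacing_mm ** 2           # physical area in mm^2
```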
这样,可以得到准确的腹部脂肪及肌肉面积,提高实际脂肪及肌肉面积的准确率。In this way, accurate abdominal fat and muscle area can be obtained, and the accuracy of actual fat and muscle area can be improved.
可选的,所述图像分割方法还包括:Optionally, the image segmentation method further includes:
从所述腹部CT图像数据获取扫描层厚信息,将所述皮下脂肪、内脏脂肪、肌肉的实 际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积。Obtain scanning layer thickness information from the abdominal CT image data, and multiply the actual area of the subcutaneous fat, visceral fat, and muscle by the scanning layer thickness to obtain the actual volume of the subcutaneous fat, visceral fat, and muscle.
在本实施例中,所述DICOM格式的腹部CT图像数据的Series信息包括序列号、检查模态、图像位置、检查描述和说明、图像方位、图像位置、层厚、层与层之间的间距、实际相对位置及身体位置等。故从所述DICOM格式的CT图像数据可以获得扫描层厚信息。将所述皮下脂肪区域、内脏脂肪区域、肌肉区域的实际面积乘以所述扫描层厚,得到所述皮下脂肪、内脏脂肪及肌肉的实际体积。In this embodiment, the Series information of the abdominal CT image data in the DICOM format includes serial number, inspection modality, image location, inspection description and description, image orientation, image location, layer thickness, layer-to-layer spacing , Actual relative position and body position, etc. Therefore, the scanning layer thickness information can be obtained from the CT image data in the DICOM format. The actual area of the subcutaneous fat area, visceral fat area, and muscle area is multiplied by the scanning layer thickness to obtain the actual volume of the subcutaneous fat, visceral fat, and muscle.
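Continuing the previous sketch, the actual volume follows by multiplying the area by the scanning layer thickness read from the Series information (thickness assumed in millimetres, giving mm³):

```python
def region_volume(area_mm2, slice_thickness_mm):
    """Actual volume of subcutaneous fat, visceral fat, or muscle for one slice: area x slice thickness."""
    return area_mm2 * slice_thickness_mm
```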
可选的,所述图像分割方法还包括:Optionally, the image segmentation method further includes:
分别将所述预测分割标签与所述金标准图像对应的真实标签输入所述判别网络模型,分别得到所述预测分割结果图像与所述金标准图像的判别分数,依据判别分数判断所述预测分割结果图像与所述金标准图像之间的差距,基于所述差距对所述生成网络模型进行参数调整,以优化所述生成网络模型。The predicted segmentation label and the real label corresponding to the gold standard image are respectively input into the discriminant network model, and the discriminant scores of the predicted segmentation result image and the gold standard image are obtained respectively, and the predicted segmentation is determined according to the discriminant scores Based on the gap between the result image and the gold standard image, parameter adjustment is performed on the generation network model based on the gap, so as to optimize the generation network model.
这样,可以通过对生成网络模型进行参数调整,优化生成网络模型,以便提高腹部图像分割的效果。In this way, the generation network model can be optimized by adjusting the parameters of the generation network model, so as to improve the effect of abdominal image segmentation.
In this embodiment, the gold standard image is a segmentation result manually annotated by a person and is used for comparison with the result predicted by the network to evaluate the performance of the generative network model. The gold standard image uses different colors to represent subcutaneous fat, muscle, bone, visceral fat, internal organs, and background. Please refer to FIG. 6, which shows a gold standard image in which different grayscale colors represent the subcutaneous fat, muscle, bone, visceral fat, internal organ, and background regions.
请参阅图7,图7为判别网络模型的架构示意图。所述判别网络模型包括6个卷积层,第一卷积层802包括3*3卷积层、非线性ReLu激活函数;第二卷积层803包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第三卷积层804包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第四卷积层805包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第五卷积层806包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第六卷积层807包括全局平均池化、1*1卷积层。801代表512*512*6维度的预测分割标签或金标准图像对应的真实标签。Please refer to FIG. 7, which is a schematic diagram of the architecture of the discriminant network model. The discriminant network model includes 6 convolutional layers. The first convolutional layer 802 includes a 3*3 convolutional layer and a nonlinear ReLu activation function; the second convolutional layer 803 includes a 3*3 convolutional layer and a batch standardized algorithm model. , Non-linear ReLu activation function; the third convolutional layer 804 includes 3*3 convolutional layer, batch standardized algorithm model, nonlinear ReLu activation function; the fourth convolutional layer 805 includes 3*3 convolutional layer, batch standardized algorithm model , Non-linear ReLu activation function; The fifth convolutional layer 806 includes 3*3 convolutional layers, batch standardized algorithm models, and nonlinear ReLu activation functions; The sixth convolutional layer 807 includes global average pooling, 1*1 convolutional layer . 801 represents the predicted segmentation label of 512*512*6 dimensions or the real label corresponding to the gold standard image.
在本实施例中,分别将512*512*6维度的预测分割标签和金标准图像对应的真实标签输入所述判别网络模型,使用大小为3,步长为2的卷积操作进行下采样,下采样次数对应所述生成网络模型中编码器下采样的次数,共下采样5次,得到16*16*256的特征图,最后经过全局平均池化和1*1的卷积核分别得到金标准图像和预测分割图片的判别分数。In this embodiment, the predicted segmentation label of 512*512*6 dimensions and the real label corresponding to the gold standard image are input into the discriminant network model, and a convolution operation with a size of 3 and a step size of 2 is used for down-sampling. The number of downsampling corresponds to the number of downsampling of the encoder in the generative network model. A total of 5 downsampling is obtained to obtain a 16*16*256 feature map. Finally, the global average pooling and 1*1 convolution kernel are used to obtain gold The discriminant scores of the standard image and the predicted segmented image.
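A PyTorch-style sketch of a discriminant network with this shape is given below. The stride-2 3×3 convolutions, the five downsampling steps from a 512×512×6 input to a 16×16×256 feature map, and the final global average pooling with a 1×1 convolution follow the description above; the intermediate channel widths (16, 32, 64, 128) are assumptions, since only the final width of 256 is stated.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, use_bn=True):
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1)]
    if use_bn:
        layers.append(nn.BatchNorm2d(out_ch))
    layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)

discriminator = nn.Sequential(
    conv_block(6, 16, use_bn=False),   # layer 1: 3x3 conv + ReLU (512 -> 256)
    conv_block(16, 32),                # layer 2: 3x3 conv + BN + ReLU (256 -> 128)
    conv_block(32, 64),                # layer 3 (128 -> 64)
    conv_block(64, 128),               # layer 4 (64 -> 32)
    conv_block(128, 256),              # layer 5 (32 -> 16), giving a 16x16x256 feature map
    nn.AdaptiveAvgPool2d(1),           # layer 6: global average pooling
    nn.Conv2d(256, 1, kernel_size=1),  # 1x1 convolution -> discriminant score
    nn.Flatten(),
)
```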
在本实施例中,将对所述预测标签结果图像、金标准图像之间的KL散度(KullbackLeibler divergence)的优化,调整为对推土机距离(Earth Mover distance)的优化,所述推土机距离可以一直指导所述生成网络模型的优化,不受到梯度消失的困扰。In this embodiment, the optimization of the KL divergence (Kullback Leibler divergence) between the predicted label result image and the gold standard image is adjusted to the optimization of the bulldozer distance (Earth Mover distance), and the bulldozer distance can be always Guide the optimization of the generative network model without being troubled by the disappearance of the gradient.
本实施例中,通过梯度惩罚对所述生成网络模型及判别网络模型的训练过程进行加速收敛。零中心的梯度惩罚更加容易收敛到中心点,故而使用零中心的梯度惩罚。In this embodiment, gradient penalty is used to accelerate the convergence of the training process of the generating network model and the discriminant network model. The gradient penalty of the zero center is easier to converge to the center point, so the gradient penalty of the zero center is used.
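A hedged sketch of such a zero-centered gradient penalty is given below (PyTorch-style, assumed framework). It draws samples from an interpolation between real and generated label maps, consistent with the derived distribution p_inter described below, and penalizes the squared gradient norm of the discriminant network toward zero; it illustrates the idea only and is not the exact loss of the embodiment.

```python
import torch

def zero_centered_gradient_penalty(discriminator, real_labels, fake_labels, lam=10.0):
    """Gradient penalty centered at zero, computed on samples from the interpolated distribution p_inter."""
    alpha = torch.rand(real_labels.size(0), 1, 1, 1, device=real_labels.device)
    inter = (alpha * real_labels + (1 - alpha) * fake_labels).requires_grad_(True)
    scores = discriminator(inter)
    grads = torch.autograd.grad(outputs=scores.sum(), inputs=inter, create_graph=True)[0]
    grad_norm2 = grads.flatten(1).pow(2).sum(dim=1)  # squared gradient norm per sample
    return lam * grad_norm2.mean()                   # zero-centered: push the gradient norm toward 0
```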
在本实施例中,所述生成网络模型及判别网络模型分别具有对应的损失函数。In this embodiment, the generating network model and the discriminant network model each have a corresponding loss function.
The loss function of the generative network model is as follows:

[Equation image PCTCN2020098975-appb-000001]

where

[Equation image PCTCN2020098975-appb-000002]

λ = 0.001, and

[Equation image PCTCN2020098975-appb-000003]

The loss function of the discriminant network model is as follows:

[Equation image PCTCN2020098975-appb-000004]

where c = 0, λ = 10, and p_inter(I_inter) is a derived distribution obtained by interpolating between the real sample distribution and the fake sample distribution.
The notation used in the loss functions is as follows: Loss: loss; Orig: original image; Dice: Dice coefficient; Gen: generative network model; I: image; Mask: segmentation mask; D: discriminant network model; G: generative network model; p_g: fake sample distribution; p_train: real sample distribution; p_inter: derived distribution obtained by interpolating between the real and fake sample distributions; C: center, where C = 0 denotes a zero center.
The generative network model and the discriminant network model are optimized by continuously learning to reduce the values of these two loss functions.
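The exact loss functions are those given in the equation images above and are not reproduced here. The sketch below only illustrates, under stated assumptions, the alternating optimization that drives both loss values down; g_loss_fn and d_loss_fn are hypothetical callables standing in for the generative and discriminant losses of the embodiment.

```python
import torch

def train_step(G, D, g_opt, d_opt, image, real_label, g_loss_fn, d_loss_fn):
    """One alternating update; both losses are reduced over the course of training.
    g_loss_fn / d_loss_fn stand in for the losses defined by the equation images above."""
    # 1) Update the discriminant network model D
    d_opt.zero_grad()
    fake_label = G(image).detach()
    d_loss = d_loss_fn(D, image, real_label, fake_label)
    d_loss.backward()
    d_opt.step()

    # 2) Update the generative network model G
    g_opt.zero_grad()
    fake_label = G(image)
    g_loss = g_loss_fn(D, image, real_label, fake_label)
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```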
本申请所提出的图像分割方法,通过将所述JPG格式的腹部图像输入基于Vnet网络模型构建的生成网络模型;通过所述生成网络模型生成6通道的预测分割标签;根据所述6通道的预测分割标签得到预测分割结果图像,其中,预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。这样,无须手动标注,就可以得到比较准确的腹部肌肉图像与脂肪图像,减少腹部肌肉图像、脂肪图像分割的时间,提高腹部肌肉图像、脂肪图像的分割效果。The image segmentation method proposed in this application inputs the JPG format abdominal image into a generative network model constructed based on the Vnet network model; generates 6-channel predicted segmentation labels through the generative network model; according to the 6-channel prediction The segmentation label obtains the predicted segmentation result image, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image. In this way, it is possible to obtain more accurate abdominal muscle images and fat images without manual labeling, reduce the time for segmentation of abdominal muscle images and fat images, and improve the segmentation effect of abdominal muscle images and fat images.
参阅图8所示,是本申请提出一种图像分割装置100。Referring to FIG. 8, it is an image segmentation device 100 proposed in the present application.
本申请所述图像分割装置100可以安装于计算机设备中。根据实现的功能,所述图像分割装置可以包括转换模块101、处理模块102、生成模块103、获取模块104。本发所述模块也可以称之为单元,是指一种能够被计算机设备处理器所执行,并且能够完成固定功能的一系列计算机程序段,其存储在计算机设备的存储器中。The image segmentation device 100 described in this application can be installed in a computer device. According to the implemented functions, the image segmentation device may include a conversion module 101, a processing module 102, a generation module 103, and an acquisition module 104. The module described in the present invention can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of a computer device and can complete fixed functions, and are stored in the memory of the computer device.
在本实施例中,关于各模块/单元的功能如下:In this embodiment, the functions of each module/unit are as follows:
所述转换模块101,用于将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像。The conversion module 101 is used to convert the abdominal CT image data in DICOM format into the abdominal image in JPG format.
在本实施例中,对医学数字成像和通信(Digital Imaging and Communications in Medicine,DICOM)格式的CT腹部图像数据设置针对腹部图像特定的窗宽窗位,然后通过格式转换程序将DICOM格式的CT图像数据转换为JPG格式的腹部图像,并保存JPG格式的腹部图像。需要强调的是,为进一步保证上述DICOM格式的腹部CT图像数据、JPG格式的腹部图的私密和安全性,上述DICOM格式的腹部CT图像数据、JPG格式的腹部图还可以存储于一区块链的节点中。In this embodiment, the CT abdominal image data in the Digital Imaging and Communications in Medicine (DICOM) format is set to a specific window width and window level for the abdominal image, and then the CT image in DICOM format is converted through the format conversion program The data is converted to abdomen image in JPG format, and the abdomen image in JPG format is saved. It should be emphasized that in order to further ensure the privacy and security of the aforementioned DICOM format abdominal CT image data and JPG format abdominal map, the aforementioned DICOM format abdominal CT image data and JPG format abdominal map can also be stored in a blockchain In the node.
本实施例中,针对腹部图像的特定窗宽窗位可以设置为窗宽400HU、窗位10HU。可以理解的是,所述DICOM格式的腹部CT图像数据中包含患者的受保护的健康信息(Protected Health Information,PHI),例如姓名,性别,年龄,以及其他图像相关信息,比如捕获并生成图像的设备信息,医疗的一些上下文相关信息等。所述DICOM格式的腹部CT图像数据携带着大量的信息,这些信息具体可以分为以下四类:(a)病人Patient信息、(b)检查Study信息、(c)序列Series信息、(d)图像Image信息。Patient信息包括患者姓名、患者ID、患者性别、患者体重等。Study信息包括:检查号、检查实例号、检查日期、检查时间、检查部位、检查的描述等。Series信息包括序列号、检查模态、图像位置、检查描述和说明、图像方位、图像位置、层厚、层与层之间的间距、实际相对位置及身体位置等。Image信息包括影像拍摄的时间、像素间距pixel spacing、图像码、图像上的采样率等信息。根据像素间距pixel spacing,可以获取像素点与物理空间面积之间的换算参数,根据换算参数,可以计算像素区域相对应的物理空间的实际面积。In this embodiment, the specific window width and window level for the abdominal image can be set to a window width of 400 HU and a window level of 10 HU. It is understandable that the abdominal CT image data in the DICOM format contains protected health information (PHI) of the patient, such as name, gender, age, and other image-related information, such as captured and generated images Equipment information, some medical context related information, etc. The DICOM format abdominal CT image data carries a lot of information, which can be divided into the following four categories: (a) Patient information, (b) Examination and Study information, (c) Series information, (d) Image Image information. Patient information includes patient name, patient ID, patient gender, patient weight, etc. Study information includes: inspection number, inspection instance number, inspection date, inspection time, inspection location, inspection description, etc. Series information includes serial number, inspection mode, image location, inspection description and instructions, image orientation, image location, layer thickness, layer-to-layer spacing, actual relative position, and body position. Image information includes information such as the time the image was taken, pixel spacing, image code, and sampling rate on the image. According to the pixel spacing, the conversion parameters between the pixel points and the physical space area can be obtained, and the actual area of the physical space corresponding to the pixel area can be calculated according to the conversion parameters.
所述处理模块102,用于基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型。The processing module 102 is configured to construct a generation network model based on the Vnet network model, and input the abdominal image in JPG format into the generation network model.
可选的,请参阅图9,所述处理模块102包括:Optionally, referring to FIG. 9, the processing module 102 includes:
设置子模块1021,用于将所述Vnet网络模型编码阶段的卷积核设置为二维卷积核;A setting sub-module 1021 is used to set the convolution kernel in the encoding stage of the Vnet network model to a two-dimensional convolution kernel;
替换子模块1022,用于将所述Vnet网络模型解码阶段的反卷积替换为双线性插值,得到修改后的Vnet网络模型;The replacement sub-module 1022 is used to replace the deconvolution in the decoding stage of the Vnet network model with bilinear interpolation to obtain a modified Vnet network model;
接入子模块1023,用于在所述修改后的Vnet网络模型中接入通道注意力CA模块,得到所述生成网络模型,其中,所述CA模块用于获取所述修改后的Vnet网络的编码阶段、解码阶段生成的高级特征图的语义信息,并根据所述语义信息从低级特征图中选取属于高级特征图的像素点信息;The access sub-module 1023 is used to access the channel attention CA module in the modified Vnet network model to obtain the generated network model, wherein the CA module is used to obtain the modified Vnet network model Semantic information of the high-level feature map generated in the encoding stage and the decoding stage, and selecting pixel information belonging to the high-level feature map from the low-level feature map according to the semantic information;
其中,所述高级特征图及低级特征图根据在编码阶段及解码阶段获得特征图的先后顺序确定,在所述编码阶段相邻编码层中,下一层编码层获得的特征图比上一编码层获得的特征图要高级;在所述解码阶段相邻编码层中,上一解密层获得的特征图比下一解密层获得的特征图要低级。Wherein, the high-level feature map and the low-level feature map are determined according to the sequence of obtaining feature maps in the encoding stage and the decoding stage. Among the adjacent encoding layers in the encoding stage, the feature map obtained by the next encoding layer is higher than that of the previous encoding. The feature map obtained by the layer is higher-level; in the adjacent coding layers of the decoding stage, the feature map obtained by the previous decryption layer is lower than the feature map obtained by the next decryption layer.
在本实施例中,所述Vnet网络模型为福斯托·米勒塔里(Fausto Milletari)、纳西尔·纳瓦卜(Nasir nawab)、赛义德·艾哈迈德·艾哈迈迪(Seyed-Ahmad Ahmadi)等提出的医学影像Vnet网络模型。所述Vnet网络模型是典型的编码-解码网络模型。在所述Vnet网络模型中,编码阶段包括多个编码层,每个编码层包括卷积层、激活层、下采样层。解码阶段包括多个解码层,每个解码层包括反卷积层、激活层、上采样层。In this embodiment, the Vnet network model is Fausto Milletari, Nasir Nawab, Said Ahmed Ahmad ( The medical imaging Vnet network model proposed by Seyed-Ahmad (Ahmadi) and others. The Vnet network model is a typical encoding-decoding network model. In the Vnet network model, the coding stage includes multiple coding layers, and each coding layer includes a convolutional layer, an activation layer, and a down-sampling layer. The decoding stage includes multiple decoding layers, and each decoding layer includes a deconvolution layer, an activation layer, and an upsampling layer.
所述Vnet网络模型编码阶段的卷积核是基于三维的卷积核,但由于CT数据扫描层的层厚较厚,导致三维数据并不可靠。在本实施例中,将所述Vnet网络模型编码阶段的卷积核设置为二维卷积核,基于二维图像单独进行分割。在本实施例中,为了减少可学习的参数量,将所述Vnet网络模型的解码阶段的反卷积替换为双线性插值。The convolution kernel in the encoding stage of the Vnet network model is based on a three-dimensional convolution kernel, but the three-dimensional data is unreliable due to the thicker CT data scanning layer. In this embodiment, the convolution kernel in the encoding stage of the Vnet network model is set as a two-dimensional convolution kernel, and segmentation is performed separately based on the two-dimensional image. In this embodiment, in order to reduce the amount of learnable parameters, the deconvolution in the decoding stage of the Vnet network model is replaced with bilinear interpolation.
所述生成模块103,用于通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签。The generating module 103 is configured to generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels .
可选的,请参阅图10,所述生成模块103包括:Optionally, referring to FIG. 10, the generating module 103 includes:
第一获取子模块1031,用于通过所述生成网络模型的编码阶段获取每个编码层的特征图;The first obtaining sub-module 1031 is configured to obtain the feature map of each coding layer through the coding stage of the generated network model;
第二获取子模块1032,用于通过所述生成网络模型的解码阶段获取每个解码层的特征图;The second obtaining sub-module 1032 is configured to obtain the feature map of each decoding layer through the decoding stage of the generation network model;
第一处理子模块1033，用于在编码阶段，通过所述CA模块将所述编码阶段相邻编码层的下一层h*w*2c维度的高级特征进行通道化操作、激活操作，得到不同通道的第一权重结果，将所述不同通道的第一权重结果与相邻编码层的上一层2h*2w*c维度的低级特征相乘，得到2h*2w*c维度的第一特征图；The first processing sub-module 1033 is used, in the encoding stage, to perform a channelization operation and an activation operation on the h*w*2c-dimensional high-level features of the next layer of the adjacent encoding layers through the CA module, to obtain first weight results of different channels, and to multiply the first weight results of the different channels by the 2h*2w*c-dimensional low-level features of the previous layer of the adjacent encoding layers, to obtain a first feature map of dimension 2h*2w*c;
第二处理子模块1034，用于在解码阶段，通过所述CA模块将所述解码阶段相邻解码层的上一层2h*2w*c维度的高级特征进行通道化操作、激活操作，得到不同通道的第二权重结果；将所述不同通道的第二权重结果与相邻编码层的下一层2h*2w*c维度的低级特征相乘，得到2h*2w*c维度的第二特征图；The second processing sub-module 1034 is used, in the decoding stage, to perform a channelization operation and an activation operation on the 2h*2w*c-dimensional high-level features of the previous layer of the adjacent decoding layers through the CA module, to obtain second weight results of different channels, and to multiply the second weight results of the different channels by the 2h*2w*c-dimensional low-level features of the next layer of the adjacent encoding layers, to obtain a second feature map of dimension 2h*2w*c;
第三处理子模块1035，用于根据所述编码阶段每一层获得的特征图、所述解码阶段每一层获得的特征图、所述第一特征图、所述第二特征图，得到所述6通道的预测分割标签。The third processing sub-module 1035 is used to obtain the 6-channel predicted segmentation labels according to the feature maps obtained at each layer of the encoding stage, the feature maps obtained at each layer of the decoding stage, the first feature map, and the second feature map.
在本实施例，所述生成网络模型的编码阶段，通过卷积层执行卷积操作从输入的腹部CT图像中提取特征，在编码阶段的每一层结束后，使用适当的步幅来降低分辨率，若上一层分辨率为2h*2w，则下一层分辨率降低为h*w。在本实施例中，所述生成网络模型的编码阶段下一层特征比上一层的特征增大一倍，若所述生成网络模型的编码阶段上一层特征数量为c，则下一层的特征数量为2c。In this embodiment, in the encoding stage of the generation network model, convolution operations are performed by the convolutional layers to extract features from the input abdominal CT image. After each layer of the encoding stage, an appropriate stride is used to reduce the resolution: if the resolution of the previous layer is 2h*2w, the resolution of the next layer is reduced to h*w. In this embodiment, the number of features of the next layer in the encoding stage of the generation network model is twice that of the previous layer: if the number of features of the previous layer in the encoding stage is c, the number of features of the next layer is 2c.
在本实施例，通过所述生成网络模型的编码阶段获取每个编码层的特征图，在所述编码阶段相邻编码层中，下一层编码层获得的特征图比上一编码层获得的特征图要高级。所述生成网络模型的编码阶段相邻编码层的下一层获取到的高级特征为h*w*2c维度的高级特征，其中，h代表图形的高，w代表图形的宽，2c代表特征数量。所述生成网络模型的编码阶段相邻编码层的上一层获取到的低级特征为2h*2w*c维度的低级特征，2h代表图形的高，2w代表图形的宽，c代表特征数量。In this embodiment, the feature map of each encoding layer is obtained through the encoding stage of the generation network model; among adjacent encoding layers in the encoding stage, the feature map obtained by the next encoding layer is higher-level than the feature map obtained by the previous encoding layer. The high-level features obtained by the next layer of the adjacent encoding layers in the encoding stage of the generation network model are high-level features of dimension h*w*2c, where h represents the height of the image, w represents the width of the image, and 2c represents the number of features. The low-level features obtained by the previous layer of the adjacent encoding layers are low-level features of dimension 2h*2w*c, where 2h represents the height of the image, 2w represents the width of the image, and c represents the number of features.
在本实施例，所述生成网络模型的解码阶段，通过反卷积层将每个输入体素通过内核投影到更大的区域来增加数据大小，若上一层分辨率为h*w，则下一层分辨率提高为2h*2w。在本实施例中，所述生成网络模型的解码阶段下一层特征比上一层的特征减小一倍，若所述生成网络模型的解码阶段上一层特征数量为2c，则下一层的特征数量为c。In this embodiment, in the decoding stage of the generation network model, each input voxel is projected through a kernel onto a larger area to increase the data size: if the resolution of the previous layer is h*w, the resolution of the next layer is increased to 2h*2w. In this embodiment, the number of features of the next layer in the decoding stage of the generation network model is half that of the previous layer: if the number of features of the previous layer in the decoding stage is 2c, the number of features of the next layer is c.
在本实施例中，通过所述生成网络模型的解码阶段获取每个解码层的特征图，在所述解码阶段相邻解码层中，上一解码层获得的特征图比下一解码层获得的特征图要低级。在本实施例中，所述生成网络模型的解码阶段相邻解码层的上一层获取到的高级特征为h*w*2c维度的高级特征，其中，h代表图形的高，w代表图形的宽，2c代表特征数量。所述生成网络模型的解码阶段相邻解码层的下一层获取到的低级特征为2h*2w*c维度的低级特征，2h代表图形的高，2w代表图形的宽，c代表特征数量。In this embodiment, the feature map of each decoding layer is obtained through the decoding stage of the generation network model; among adjacent decoding layers in the decoding stage, the feature map obtained by the previous decoding layer is lower-level than the feature map obtained by the next decoding layer. In this embodiment, the high-level features obtained by the previous layer of the adjacent decoding layers in the decoding stage of the generation network model are high-level features of dimension h*w*2c, where h represents the height of the image, w represents the width of the image, and 2c represents the number of features. The low-level features obtained by the next layer of the adjacent decoding layers in the decoding stage are low-level features of dimension 2h*2w*c, where 2h represents the height of the image, 2w represents the width of the image, and c represents the number of features.
需要说明的是,随着编码过程的不断加深,得到的特征表达也逐渐变得丰富。但由于多个卷积过程,以及非线性函数的应用,导致高级特征图中的位置信息大量丢失,从而造成大量像素点的错分类的现象。在所述修改后的Vnet网络中接入通道注意力(Channel-Attention,CA)模块,通过CA模块对错分类的像素点进行校正。It should be noted that as the coding process continues to deepen, the obtained feature expressions gradually become richer. However, due to multiple convolution processes and the application of non-linear functions, a large amount of position information in the high-level feature map is lost, resulting in the phenomenon of misclassification of a large number of pixels. A Channel-Attention (CA) module is connected to the modified Vnet network, and the misclassified pixels are corrected through the CA module.
所述第一处理子模块1033，还用于将相邻编码层的下一层的h*w*2c维度的高级特征通过所述CA模块的全局平均池化、1*1卷积、批标准化(Batch Normalization,BN)算法模型、非线性(Rectified Linear Units,ReLu)激活函数，得到1*1*c的特征通道，c表示特征数量；将所述1*1*c的特征通道通过全连接层及sigmoid激活函数，得到不同通道的第一权重结果。The first processing sub-module 1033 is further used to pass the h*w*2c-dimensional high-level features of the next layer of the adjacent encoding layers through the global average pooling, 1*1 convolution, Batch Normalization (BN) algorithm model, and Rectified Linear Units (ReLu) activation function of the CA module, to obtain 1*1*c feature channels, where c represents the number of features, and to pass the 1*1*c feature channels through a fully connected layer and a sigmoid activation function, to obtain the first weight results of different channels.
所述第二处理子模块1034，还用于将相邻解码层的上一层的h*w*2c维度的高级特征通过所述CA模块的全局平均池化、1*1卷积、BN算法模型、ReLu激活函数，得到1*1*c的特征通道，c表示特征数量；将所述1*1*c的特征通道通过全连接层及sigmoid激活函数，得到不同通道的第二权重结果。The second processing sub-module 1034 is further used to pass the h*w*2c-dimensional high-level features of the previous layer of the adjacent decoding layers through the global average pooling, 1*1 convolution, BN algorithm model, and ReLu activation function of the CA module, to obtain 1*1*c feature channels, where c represents the number of features, and to pass the 1*1*c feature channels through a fully connected layer and a sigmoid activation function, to obtain the second weight results of different channels.
请再次参阅图4，所述CA模块处理流程主要包括通道化Channelization操作、激活Activation操作及权重赋值Reweighting操作。在编码阶段，通过所述CA模块将所述编码阶段相邻编码层的下一层的高级特征进行通道化操作，其中，所述通道化操作包括：将相邻编码层的下一层的高级特征通过所述CA模块的全局平均池化、1*1卷积、BN算法模型、ReLu激活函数，得到1*1*c的特征通道，c表示特征数量；将所述1*1*c的特征通道进行激活操作，其中，所述激活操作包括：将所述1*1*c的特征通道通过全连接层及sigmoid激活函数，得到不同通道的权重结果；将所述不同通道的权重结果与相邻编码层的上一层的低级特征相乘，得到第一特征图，所述第一特征图为2h*2w*c维度的特征图。Referring again to FIG. 4, the processing flow of the CA module mainly includes a Channelization operation, an Activation operation, and a Reweighting operation. In the encoding stage, the CA module performs the channelization operation on the high-level features of the next layer of the adjacent encoding layers, where the channelization operation includes: passing the high-level features of the next layer of the adjacent encoding layers through the global average pooling, 1*1 convolution, BN algorithm model, and ReLu activation function of the CA module to obtain 1*1*c feature channels, where c represents the number of features. The activation operation is then performed on the 1*1*c feature channels, where the activation operation includes: passing the 1*1*c feature channels through a fully connected layer and a sigmoid activation function to obtain the weight results of different channels. The weight results of the different channels are multiplied by the low-level features of the previous layer of the adjacent encoding layers to obtain a first feature map, and the first feature map is a feature map of dimension 2h*2w*c.
在解码阶段，通过所述CA模块将所述解码阶段相邻解码层的上一层的高级特征进行通道化操作，其中，所述通道化操作包括：将相邻解码层的上一层的高级特征通过所述CA模块的全局平均池化、1*1卷积、BN算法模型、ReLu激活函数，得到1*1*c的特征通道，c表示特征数量；将所述1*1*c的特征通道进行激活操作，其中，所述激活操作包括：将所述1*1*c的特征通道通过全连接层及sigmoid激活函数，得到不同通道的权重结果；将所述不同通道的权重结果与相邻编码层的下一层的低级特征相乘，得到第二特征图，所述第二特征图为2h*2w*c维度的特征图。In the decoding stage, the CA module performs the channelization operation on the high-level features of the previous layer of the adjacent decoding layers, where the channelization operation includes: passing the high-level features of the previous layer of the adjacent decoding layers through the global average pooling, 1*1 convolution, BN algorithm model, and ReLu activation function of the CA module to obtain 1*1*c feature channels, where c represents the number of features. The activation operation is then performed on the 1*1*c feature channels, where the activation operation includes: passing the 1*1*c feature channels through a fully connected layer and a sigmoid activation function to obtain the weight results of different channels. The weight results of the different channels are multiplied by the low-level features of the next layer of the adjacent encoding layers to obtain a second feature map, and the second feature map is a feature map of dimension 2h*2w*c.
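As an illustrative sketch only (not the original implementation), the channel-attention flow described above can be written in PyTorch as follows: channelization (global average pooling, 1*1 convolution, BN, ReLu) produces a 1*1*c descriptor from the high-level features, activation (fully connected layer and sigmoid) turns it into per-channel weights, and reweighting multiplies the low-level features by those weights. Channel counts are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Illustrative CA module: high-level features (h*w*2c) re-weight low-level features (2h*2w*c)."""
    def __init__(self, high_ch, low_ch):
        super().__init__()
        # Channelization: global average pooling + 1*1 convolution + BN + ReLu -> 1*1*c
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.reduce = nn.Sequential(
            nn.Conv2d(high_ch, low_ch, kernel_size=1),
            nn.BatchNorm2d(low_ch),
            nn.ReLU(inplace=True),
        )
        # Activation: fully connected layer + sigmoid -> per-channel weights in (0, 1)
        self.fc = nn.Sequential(nn.Linear(low_ch, low_ch), nn.Sigmoid())

    def forward(self, high_feat, low_feat):
        n, c = low_feat.shape[0], low_feat.shape[1]
        w = self.reduce(self.pool(high_feat))        # N x c x 1 x 1 channel descriptor
        w = self.fc(w.view(n, c)).view(n, c, 1, 1)   # weight results of the different channels
        # Reweighting: multiply the low-level features by the channel weights
        return low_feat * w
```

In the encoding stage, for example, high_feat would be the h*w*2c features of the next encoding layer and low_feat the 2h*2w*c features of the previous encoding layer, giving the 2h*2w*c first feature map.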
所述获取模块104,用于根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The acquisition module 104 is configured to obtain a predicted segmentation result image according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, Background image.
在本实施例中，所述6通道的预测分割标签分别表示皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景的预测分割标签，用不同颜色进行填充得到预测分割结果图像，例如，可以用红色绘制皮下脂肪、绿色绘制肌肉、黄色绘制骨头、蓝色绘制内脏脂肪、粉色绘制内脏器官、黑色绘制背景。请参阅图5，在图5的图中用不同灰度的颜色代表皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景这6类分类。In this embodiment, the 6-channel predicted segmentation labels respectively represent the predicted segmentation labels of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background, and the predicted segmentation result image is obtained by filling them with different colors. For example, subcutaneous fat may be drawn in red, muscle in green, bone in yellow, visceral fat in blue, internal organs in pink, and the background in black. Referring to FIG. 5, different gray-scale colors are used in FIG. 5 to represent the six classes of subcutaneous fat, muscle, bone, visceral fat, internal organs, and background.
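A minimal sketch (assumed channel order and palette, not fixed by the filing) of turning the 6-channel prediction into a colored result image by taking the per-pixel argmax and filling each class with a color:

```python
import numpy as np

# Assumed channel order and colors; both are illustrative choices.
PALETTE = np.array([
    [255, 0, 0],      # subcutaneous fat: red
    [0, 255, 0],      # muscle: green
    [255, 255, 0],    # bone: yellow
    [0, 0, 255],      # visceral fat: blue
    [255, 192, 203],  # internal organs: pink
    [0, 0, 0],        # background: black
], dtype=np.uint8)

def labels_to_color(pred):
    """pred: (6, H, W) array of per-channel scores; returns an (H, W, 3) color image."""
    class_map = pred.argmax(axis=0)   # per-pixel class index in [0, 5]
    return PALETTE[class_map]         # fancy indexing paints each class with its color
```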
这样,无须手动标注,就可以得到比较准确的腹部肌肉图像与脂肪图像,减少腹部肌肉图像、脂肪图像的分割时间,提高腹部肌肉图像、脂肪图像的分割效果。In this way, it is possible to obtain more accurate abdominal muscle images and fat images without manual labeling, reduce the segmentation time of abdominal muscle images and fat images, and improve the segmentation effect of abdominal muscle images and fat images.
可选的,所述图像分割装置100还包括:Optionally, the image segmentation device 100 further includes:
确定模块，用于从所述预测分割结果图像确定皮下脂肪区域、内脏脂肪区域、肌肉区域的像素点个数，根据所述确定的像素点个数及预先获取的物理空间换算参数，确定皮下脂肪、内脏脂肪、肌肉的实际面积。The determining module is used to determine the numbers of pixels in the subcutaneous fat region, the visceral fat region, and the muscle region from the predicted segmentation result image, and to determine the actual areas of subcutaneous fat, visceral fat, and muscle according to the determined numbers of pixels and a pre-acquired physical-space conversion parameter.
在本实施例中，从所述预测分割结果图像中确定皮下脂肪区域、内脏脂肪区域、肌肉区域的像素点个数，从所述DICOM格式的CT图像数据获取像素点与物理空间面积之间的换算参数，根据所述皮下脂肪区域、内脏脂肪、肌肉区域的像素点个数乘以所述换算参数的平方，确定皮下脂肪、内脏脂肪、肌肉的实际面积。In this embodiment, the numbers of pixels in the subcutaneous fat region, the visceral fat region, and the muscle region are determined from the predicted segmentation result image, the conversion parameter between pixels and physical-space area is obtained from the CT image data in DICOM format, and the actual areas of subcutaneous fat, visceral fat, and muscle are determined by multiplying the numbers of pixels in the subcutaneous fat, visceral fat, and muscle regions by the square of the conversion parameter.
进一步说明的是,所述DICOM格式的CT图像数据的image信息包括影像拍摄的时间、像素间距pixel spacing、图像码、图像上的采样率等信息。根据像素间距pixel spacing,可以获取像素点与物理空间面积之间的换算参数,根据以下公式(1)计算皮下脂肪、内脏脂肪、肌肉的实际面积。公式(1)s=n*x^2,其中,s表示皮下脂肪、内脏脂肪、肌肉的实际面积,n表示皮下脂肪区域、内脏脂肪区域、肌肉区域的总像素点个数,x表示换算参数。It is further explained that the image information of the CT image data in the DICOM format includes information such as the time when the image was taken, the pixel spacing, the image code, and the sampling rate on the image. According to the pixel spacing, the conversion parameters between the pixel points and the physical space area can be obtained, and the actual areas of subcutaneous fat, visceral fat, and muscle can be calculated according to the following formula (1). Formula (1) s=n*x^2, where s represents the actual area of subcutaneous fat, visceral fat, and muscle, n represents the total number of pixels in the subcutaneous fat area, visceral fat area, and muscle area, and x represents the conversion parameter .
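As a hedged illustration of formula (1), assuming a per-pixel class map and the pixel spacing read from the DICOM data are available (function and variable names below are hypothetical):

```python
import numpy as np

def region_area(class_map, class_id, pixel_spacing_mm):
    """Formula (1): s = n * x^2, where n is the number of pixels of the region
    and x is the conversion parameter taken from the pixel spacing (e.g. mm per pixel)."""
    n = int(np.sum(class_map == class_id))  # total number of pixels of this class
    x = float(pixel_spacing_mm)             # conversion parameter
    return n * x ** 2                       # actual area, e.g. in mm^2
```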
这样,可以得到准确的腹部脂肪及肌肉面积,提高实际脂肪及肌肉面积的准确率。In this way, accurate abdominal fat and muscle area can be obtained, and the accuracy of actual fat and muscle area can be improved.
可选的,所述图像分割装置100还包括:Optionally, the image segmentation device 100 further includes:
计算模块，用于从所述腹部CT图像数据获取扫描层厚信息，将所述皮下脂肪、内脏脂肪、肌肉的实际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积。The calculation module is used to obtain scanning slice-thickness information from the abdominal CT image data, and to multiply the actual areas of the subcutaneous fat, visceral fat, and muscle by the scanning slice thickness to obtain the actual volumes of the subcutaneous fat, visceral fat, and muscle.
在本实施例中,所述DICOM格式的腹部CT图像数据的Series信息包括序列号、检查模态、图像位置、检查描述和说明、图像方位、图像位置、层厚、层与层之间的间距、实际相对位置及身体位置等。故从所述DICOM格式的CT图像数据可以获得扫描层厚信息。将所述皮下脂肪区域、内脏脂肪区域、肌肉区域的实际面积乘以所述扫描层厚,得到所述皮下脂肪、内脏脂肪及肌肉的实际体积。In this embodiment, the Series information of the abdominal CT image data in the DICOM format includes serial number, inspection modality, image location, inspection description and description, image orientation, image location, layer thickness, layer-to-layer spacing , Actual relative position and body position, etc. Therefore, the scanning layer thickness information can be obtained from the CT image data in the DICOM format. The actual area of the subcutaneous fat area, visceral fat area, and muscle area is multiplied by the scanning layer thickness to obtain the actual volume of the subcutaneous fat, visceral fat, and muscle.
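Continuing the sketch above, the volume follows by multiplying the area by the slice thickness read from the DICOM header; PixelSpacing and SliceThickness are standard DICOM keywords, while region_area (from the previous sketch) and the file path are hypothetical:

```python
import pydicom

def region_volume(class_map, class_id, dicom_path):
    ds = pydicom.dcmread(dicom_path)
    spacing = float(ds.PixelSpacing[0])   # mm per pixel (conversion parameter)
    thickness = float(ds.SliceThickness)  # scanning slice thickness in mm
    area = region_area(class_map, class_id, spacing)
    return area * thickness               # actual volume, e.g. in mm^3
```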
可选的,所述图像分割装置100还包括:Optionally, the image segmentation device 100 further includes:
优化模块，用于分别将所述预测分割标签与所述金标准图像对应的真实标签输入所述判别网络模型，分别得到所述预测分割结果图像与所述金标准图像的判别分数，依据判别分数判断所述预测分割结果图像与所述金标准图像之间的差距，基于所述差距对所述生成网络模型进行参数调整，以优化所述生成网络模型。The optimization module is used to input the predicted segmentation labels and the real labels corresponding to the gold standard image into the discriminant network model respectively, to obtain the discriminant scores of the predicted segmentation result image and of the gold standard image respectively, to judge the gap between the predicted segmentation result image and the gold standard image according to the discriminant scores, and to adjust the parameters of the generation network model based on the gap, so as to optimize the generation network model.
这样,可以通过对生成网络模型进行参数调整,优化生成网络模型,以便提高腹部图像分割的效果。In this way, the generation network model can be optimized by adjusting the parameters of the generation network model, so as to improve the effect of abdominal image segmentation.
在本实施例中，金标准图像为人手动标注过的分割结果，用来与网络预估的结果相比较，来评定生成网络模型的性能。金标准图像用不同颜色代表皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景。请参阅图6，图6为用不同灰度颜色代表皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景区域的金标准图像。In this embodiment, the gold standard image is a segmentation result manually annotated by a person, and is used for comparison with the result estimated by the network to evaluate the performance of the generation network model. The gold standard image uses different colors to represent subcutaneous fat, muscle, bone, visceral fat, internal organs, and background. Referring to FIG. 6, FIG. 6 is a gold standard image in which different gray-scale colors represent the subcutaneous fat, muscle, bone, visceral fat, internal organ, and background regions.
请参阅图7,图7为判别网络模型的架构示意图。所述判别网络模型包括6个卷积层,第一卷积层802包括3*3卷积层、非线性ReLu激活函数;第二卷积层803包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第三卷积层804包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第四卷积层805包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第五卷积层806包括3*3卷积层、批标准化算法模型、非线性ReLu激活函数;第六卷积层807包括全局平均池化、1*1卷积层。801代表512*512*6维度的预测分割标签或金标准图像对应的真实标签。Please refer to FIG. 7, which is a schematic diagram of the architecture of the discriminant network model. The discriminant network model includes 6 convolutional layers. The first convolutional layer 802 includes a 3*3 convolutional layer and a nonlinear ReLu activation function; the second convolutional layer 803 includes a 3*3 convolutional layer and a batch standardized algorithm model. , Non-linear ReLu activation function; the third convolutional layer 804 includes 3*3 convolutional layer, batch standardized algorithm model, nonlinear ReLu activation function; the fourth convolutional layer 805 includes 3*3 convolutional layer, batch standardized algorithm model , Non-linear ReLu activation function; The fifth convolutional layer 806 includes 3*3 convolutional layers, batch standardized algorithm models, and nonlinear ReLu activation functions; The sixth convolutional layer 807 includes global average pooling, 1*1 convolutional layer . 801 represents the predicted segmentation label of 512*512*6 dimensions or the real label corresponding to the gold standard image.
在本实施例中，分别将512*512*6维度的预测分割标签和金标准图像对应的真实标签输入所述判别网络模型，使用大小为3，步长为2的卷积操作进行下采样，下采样次数对应所述生成网络模型中编码器下采样的次数，共下采样5次，得到16*16*256的特征图，最后经过全局平均池化和1*1的卷积核分别得到金标准图像和预测分割图片的判别分数。In this embodiment, the 512*512*6-dimensional predicted segmentation labels and the real labels corresponding to the gold standard image are respectively input into the discriminant network model, and down-sampling is performed using convolution operations with a kernel size of 3 and a stride of 2. The number of down-sampling operations corresponds to the number of down-sampling operations of the encoder in the generation network model: down-sampling is performed 5 times in total to obtain a 16*16*256 feature map, and finally the discriminant scores of the gold standard image and of the predicted segmentation image are obtained through global average pooling and a 1*1 convolution kernel.
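An illustrative PyTorch sketch of the discriminator described above (five size-3, stride-2 down-sampling stages from a 512*512*6 input to a 16*16*256 feature map, followed by global average pooling and a 1*1 convolution); the intermediate channel widths are assumptions:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        chs = [6, 16, 32, 64, 128, 256]   # assumed widths ending at 256 channels
        layers = []
        for i in range(5):
            # size-3, stride-2 convolution halves the resolution at every stage
            layers.append(nn.Conv2d(chs[i], chs[i + 1], kernel_size=3, stride=2, padding=1))
            if i > 0:
                layers.append(nn.BatchNorm2d(chs[i + 1]))  # first stage uses conv + ReLu only
            layers.append(nn.ReLU(inplace=True))
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)            # global average pooling
        self.score = nn.Conv2d(256, 1, kernel_size=1)  # 1*1 convolution -> scalar score

    def forward(self, x):                          # x: N x 6 x 512 x 512 label maps
        f = self.features(x)                       # N x 256 x 16 x 16
        return self.score(self.pool(f)).view(-1)   # one discriminant score per sample
```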
在本实施例中，将对所述预测标签结果图像、金标准图像之间的KL散度(Kullback-Leibler divergence)的优化，调整为对推土机距离(Earth Mover distance)的优化，所述推土机距离可以一直指导所述生成网络模型的优化，不受到梯度消失的困扰。In this embodiment, the optimization of the KL divergence (Kullback-Leibler divergence) between the predicted label result image and the gold standard image is replaced by the optimization of the Earth Mover distance, and the Earth Mover distance can always guide the optimization of the generation network model without suffering from vanishing gradients.
本实施例中，通过梯度惩罚对所述生成网络模型及判别网络模型的训练过程进行加速收敛。零中心的梯度惩罚更加容易收敛到中心点，故而使用零中心的梯度惩罚。In this embodiment, a gradient penalty is used to accelerate the convergence of the training process of the generation network model and the discriminant network model. A zero-centered gradient penalty converges to the center point more easily, so the zero-centered gradient penalty is used.
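The zero-centered gradient penalty can be sketched as below; this follows a common WGAN-style formulation on samples interpolated between real and generated label maps, with c = 0 and λ = 10 as stated in the text, and should be read as an assumption rather than the exact formula of the filing:

```python
import torch

def zero_centered_gradient_penalty(discriminator, real_labels, fake_labels, weight=10.0):
    """Push the discriminator's gradient norm toward 0 (the zero center)
    on samples interpolated between real and fake label maps."""
    alpha = torch.rand(real_labels.size(0), 1, 1, 1, device=real_labels.device)
    inter = (alpha * real_labels + (1.0 - alpha) * fake_labels).requires_grad_(True)
    scores = discriminator(inter)
    grads = torch.autograd.grad(outputs=scores.sum(), inputs=inter, create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return weight * (grad_norm ** 2).mean()   # zero-centered: penalty is lambda * ||grad||^2
```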
在本实施例中,所述生成网络模型及判别网络模型分别具有对应的损失函数。In this embodiment, the generating network model and the discriminant network model each have a corresponding loss function.
所述生成网络模型的损失函数如下，其中λ=0.001(该公式在原始申请文件中以图像形式给出：Figure PCTCN2020098975-appb-000005至Figure PCTCN2020098975-appb-000007)。The loss function of the generation network model is as follows, where λ=0.001 (the formula is provided as images in the original filing: Figure PCTCN2020098975-appb-000005 to Figure PCTCN2020098975-appb-000007).
所述判别网络模型的损失函数如下(该公式在原始申请文件中以图像形式给出：Figure PCTCN2020098975-appb-000008)。The loss function of the discriminant network model is as follows (the formula is provided as an image in the original filing: Figure PCTCN2020098975-appb-000008).
其中c=0，λ=10，p_inter(I_inter)是由真样本分布和假样本分布插值得到的衍生分布。Where c=0, λ=10, and p_inter(I_inter) is a derived distribution obtained by interpolating between the real-sample distribution and the fake-sample distribution.
下面对损失函数中的英文符号进行说明：Loss：损失；Orig：原图；Dice：dice系数；Gen：生成网络模型；I：图像；Mask：掩膜；D：判别网络模型；G：生成网络模型；p_g：假样本分布；p_train：真样本分布；p_inter：由真样本分布和假样本分布插值得到的衍生分布；C：中心，C等于0为零中心。The English symbols in the loss functions are explained as follows: Loss: loss; Orig: original image; Dice: Dice coefficient; Gen: generation network model; I: image; Mask: mask; D: discriminant network model; G: generation network model; p_g: fake-sample distribution; p_train: real-sample distribution; p_inter: derived distribution obtained by interpolating between the real-sample and fake-sample distributions; C: center, where C equal to 0 means zero-centered.
所述生成网络模型及判别网络模型通过不断学习,降低这两个损失函数的值,来达到优化的目的。The generative network model and the discriminant network model reduce the values of these two loss functions through continuous learning to achieve the goal of optimization.
本申请所提出的图像分割装置，通过将所述JPG格式的腹部图像输入基于Vnet网络模型构建的生成网络模型；通过所述生成网络模型生成6通道的预测分割标签；根据所述6通道的预测分割标签得到预测分割结果图像，其中，预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。这样，无须手动标注，就可以得到比较准确的腹部肌肉图像与脂肪图像，减少腹部肌肉图像、脂肪图像分割的时间，提高腹部肌肉图像、脂肪图像的分割效果。The image segmentation apparatus proposed in this application inputs the abdominal image in JPG format into a generation network model constructed based on the Vnet network model, generates 6-channel predicted segmentation labels through the generation network model, and obtains a predicted segmentation result image according to the 6-channel predicted segmentation labels, where the predicted segmentation result image includes a subcutaneous fat image, a muscle image, a bone image, a visceral fat image, an internal organ image, and a background image. In this way, relatively accurate abdominal muscle images and fat images can be obtained without manual labeling, the time for segmenting abdominal muscle images and fat images is reduced, and the segmentation effect of abdominal muscle images and fat images is improved.
如图10所示,是本申请实现图像分割方法的计算机设备的结构示意图。As shown in FIG. 10, it is a schematic structural diagram of a computer device for implementing the image segmentation method of the present application.
所述计算机设备1可以包括处理器10、存储器11和总线，还可以包括存储在所述存储器11中并可在所述处理器10上运行的计算机程序，如基于Vnet网络模型的腹部CT图像分割程序12。The computer device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and executable on the processor 10, such as an abdominal CT image segmentation program 12 based on the Vnet network model.
其中，所述存储器11至少包括一种类型的可读存储介质，所述可读存储介质包括闪存、移动硬盘、多媒体卡、卡型存储器(例如：SD或DX存储器等)、磁性存储器、磁盘、光盘等。所述存储器11在一些实施例中可以是计算机设备1的内部存储单元，例如该计算机设备1的移动硬盘。所述存储器11在另一些实施例中也可以是计算机设备1的外部存储设备，例如计算机设备1上配备的插接式移动硬盘、智能存储卡(Smart Media Card,SMC)、安全数字(Secure Digital,SD)卡、闪存卡(Flash Card)等。进一步地，所述存储器11还可以既包括计算机设备1的内部存储单元也包括外部存储设备。所述存储器11不仅可以用于存储安装于计算机设备1的应用软件及各类数据，例如基于Vnet网络模型的腹部CT图像分割程序的代码等，还可以用于暂时地存储已经输出或者将要输出的数据。The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes a flash memory, a mobile hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 11 may be an internal storage unit of the computer device 1, for example, a mobile hard disk of the computer device 1. In other embodiments, the memory 11 may also be an external storage device of the computer device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card equipped on the computer device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the computer device 1. The memory 11 can be used not only to store the application software installed in the computer device 1 and various types of data, such as the code of the abdominal CT image segmentation program based on the Vnet network model, but also to temporarily store data that has been output or is to be output.
所述处理器10在一些实施例中可以由集成电路组成,例如可以由单个封装的集成电路所组成,也可以是由多个相同功能或不同功能封装的集成电路所组成,包括一个或者多个中央处理器(Central Processing unit,CPU)、微处理器、数字处理芯片、图形处理器及各种控制芯片的组合等。所述处理器10是所述计算机设备的控制核心(Control Unit),利用各种接口和线路连接整个计算机设备的各个部件,通过运行或执行存储在所述存储器11内的程序或者模块(例如基于Vnet网络模型的腹部CT图像分割程序等),以及调用存储在所述存储器11内的数据,以执行计算机设备1的各种功能和处理数据。The processor 10 may be composed of integrated circuits in some embodiments, for example, may be composed of a single packaged integrated circuit, or may be composed of multiple integrated circuits with the same function or different functions, including one or more Combinations of central processing unit (CPU), microprocessor, digital processing chip, graphics processor, and various control chips, etc. The processor 10 is the control unit of the computer device, which uses various interfaces and lines to connect the various components of the entire computer device, and runs or executes programs or modules stored in the memory 11 (for example, based on The abdominal CT image segmentation program of the Vnet network model, etc.), and call the data stored in the memory 11 to execute various functions of the computer device 1 and process data.
所述总线可以是外设部件互连标准(peripheral component interconnect,简称PCI)总线或扩展工业标准结构(extended industry standard architecture,简称EISA)总线等。该总线可以分为地址总线、数据总线、控制总线等。所述总线被设置为实现所述存储器11以及至少一个处理器10等之间的连接通信。The bus may be a peripheral component interconnect standard (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The bus can be divided into address bus, data bus, control bus and so on. The bus is configured to implement connection and communication between the memory 11 and at least one processor 10 and the like.
图10仅示出了具有部件的计算机设备，本领域技术人员可以理解的是，图10示出的结构并不构成对所述计算机设备1的限定，可以包括比图示更少或者更多的部件，或者组合某些部件，或者不同的部件布置。FIG. 10 only shows a computer device with certain components. Those skilled in the art can understand that the structure shown in FIG. 10 does not constitute a limitation on the computer device 1, and the computer device may include fewer or more components than shown in the figure, a combination of certain components, or a different arrangement of components.
例如，尽管未示出，所述计算机设备1还可以包括给各个部件供电的电源(比如电池)，优选地，电源可以通过电源管理装置与所述至少一个处理器10逻辑相连，从而通过电源管理装置实现充电管理、放电管理、以及功耗管理等功能。电源还可以包括一个或一个以上的直流或交流电源、再充电装置、电源故障检测电路、电源转换器或者逆变器、电源状态指示器等任意组件。所述计算机设备1还可以包括多种传感器、蓝牙模块、Wi-Fi模块等，在此不再赘述。For example, although not shown, the computer device 1 may further include a power supply (such as a battery) that supplies power to the various components. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power-consumption management are implemented through the power management device. The power supply may further include any components such as one or more DC or AC power supplies, a recharging device, a power-failure detection circuit, a power converter or inverter, and a power status indicator. The computer device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which will not be repeated here.
进一步地，所述计算机设备1还可以包括网络接口，可选地，所述网络接口可以包括有线接口和/或无线接口(如WI-FI接口、蓝牙接口等)，通常用于在该计算机设备1与其他计算机设备之间建立通信连接。Further, the computer device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface or a Bluetooth interface), which is usually used to establish a communication connection between the computer device 1 and other computer devices.
可选地,该计算机设备1还可以包括用户接口,用户接口可以是显示器(Display)、输入单元(比如键盘(Keyboard)),可选地,用户接口还可以是标准的有线接口、无线接口。可选地,在一些实施例中,显示器可以是LED显示器、液晶显示器、触控式液晶显示器等。其中,显示器也可以适当的称为显示屏或显示单元,用于显示在计算机设备1中处理的信息以及用于显示可视化的用户界面。Optionally, the computer device 1 may also include a user interface. The user interface may be a display (Display) and an input unit (such as a keyboard (Keyboard)). Optionally, the user interface may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, or the like. Among them, the display can also be called a display screen or a display unit as appropriate, and is used to display the information processed in the computer device 1 and to display a visualized user interface.
应该了解,所述实施例仅为说明之用,在专利申请范围上并不受此结构的限制。It should be understood that the embodiments are only for illustrative purposes, and are not limited by this structure in the scope of the patent application.
所述计算机设备1中的所述存储器11存储的基于Vnet网络模型的腹部CT图像分割程序12是多个指令的组合,在所述处理器10中运行时,可以实现:The abdominal CT image segmentation program 12 based on the Vnet network model stored in the memory 11 in the computer device 1 is a combination of multiple instructions. When running in the processor 10, it can realize:
将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Convert abdominal CT image data in DICOM format to abdominal image in JPG format;
基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Constructing a generation network model based on the Vnet network model, and inputting the abdominal image in JPG format into the generation network model;
通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
具体地,所述处理器10对上述指令的具体实现方法可参考图1对应实施例中相关步骤的描述,在此不赘述。需要强调的是,为进一步保证上述DICOM格式的腹部CT图像数据、所述JPG格式的腹部图像的私密和安全性,所述DICOM格式的腹部CT图像数据、所述JPG格式的腹部图像还可以存储于一区块链的节点中。Specifically, for the specific implementation method of the above-mentioned instructions by the processor 10, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which will not be repeated here. It should be emphasized that, in order to further ensure the privacy and security of the aforementioned DICOM format abdominal CT image data and the JPG format abdominal image, the DICOM format abdominal CT image data and the JPG format abdominal image can also be stored In a node of a blockchain.
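For the DICOM-to-JPG conversion step listed above, a minimal sketch is given here; the soft-tissue window values and file paths are illustrative assumptions, and the RescaleSlope/RescaleIntercept handling applies only when those standard DICOM tags are present:

```python
import numpy as np
import pydicom
from PIL import Image

def dicom_to_jpg(dicom_path, jpg_path, window_center=40.0, window_width=400.0):
    """Read one abdominal CT slice, apply an assumed window, rescale to 0-255 and save as JPG."""
    ds = pydicom.dcmread(dicom_path)
    hu = ds.pixel_array.astype(np.float32)
    # convert stored values to Hounsfield units when the rescale tags exist
    hu = hu * float(getattr(ds, "RescaleSlope", 1.0)) + float(getattr(ds, "RescaleIntercept", 0.0))
    lo, hi = window_center - window_width / 2.0, window_center + window_width / 2.0
    img = np.clip((hu - lo) / (hi - lo), 0.0, 1.0) * 255.0
    Image.fromarray(img.astype(np.uint8)).save(jpg_path)
```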
进一步地,所述计算机设备1集成的模块/单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。所述计算机可读存储介质可以是非易失性,也可以是易失性,所述计算机可读介质可以包括:能够携带所述计算机程序代码的任何实体或装置、记录介质、U盘、移动硬盘、磁碟、光盘。Further, if the integrated module/unit of the computer device 1 is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium. The computer-readable storage medium may be non-volatile or volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, or a mobile hard disk , Floppy disks, compact discs.
在本申请所提供的几个实施例中,应该理解到,所揭露的设备,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。In the several embodiments provided in this application, it should be understood that the disclosed equipment, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.
所述作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本申请各个实施例中的各功能模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能模块的形式实现。In addition, the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.
对于本领域技术人员而言,显然本申请不限于上述示范性实施例的细节,而且在不背离本申请的精神或基本特征的情况下,能够以其他的具体形式实现本申请。For those skilled in the art, it is obvious that the present application is not limited to the details of the foregoing exemplary embodiments, and the present application can be implemented in other specific forms without departing from the spirit or basic characteristics of the present application.
本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。The blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm. Blockchain, essentially a decentralized database, is a series of data blocks associated with cryptographic methods. Each data block contains a batch of network transaction information for verification. The validity of the information (anti-counterfeiting) and the generation of the next block.
此外，显然"包括"一词不排除其他单元或步骤，单数不排除复数。系统权利要求中陈述的多个单元或装置也可以由一个单元或装置通过软件或者硬件来实现。第二等词语用来表示名称，而并不表示任何特定的顺序。In addition, it is obvious that the word "including" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices stated in the system claims may also be implemented by one unit or device through software or hardware. Words such as "second" are used to indicate names and do not indicate any particular order.

Claims (20)

  1. 一种图像分割方法,其中,所述方法包括步骤:An image segmentation method, wherein the method includes the steps:
    将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Convert abdominal CT image data in DICOM format to abdominal image in JPG format;
    基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Constructing a generation network model based on the Vnet network model, and inputting the abdominal image in JPG format into the generation network model;
    通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
    根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
  2. 如权利要求1所述的图像分割方法,其中,所述DICOM格式的腹部CT图像数据、所述JPG格式的腹部图像存储于区块链中,所述基于Vnet网络模型构建生成网络模型,包括以下步骤:The image segmentation method according to claim 1, wherein the abdominal CT image data in DICOM format and the abdominal image in JPG format are stored in a blockchain, and the generating network model based on the Vnet network model includes the following step:
    将所述Vnet网络模型编码阶段的卷积核设置为二维卷积核;Setting the convolution kernel in the encoding stage of the Vnet network model to a two-dimensional convolution kernel;
    将所述Vnet网络模型解码阶段的反卷积替换为双线性插值,得到修改后的Vnet网络模型;Replacing the deconvolution in the decoding stage of the Vnet network model with bilinear interpolation to obtain a modified Vnet network model;
    在所述修改后的Vnet网络模型中接入通道注意力CA模块,得到所述生成网络模型,其中,所述CA模块用于获取所述修改后的Vnet网络的编码阶段、解码阶段生成的高级特征图的语义信息,并根据所述语义信息从低级特征图中选取属于高级特征图的像素点信息;The channel attention CA module is connected to the modified Vnet network model to obtain the generative network model, where the CA module is used to obtain the advanced level generated during the encoding and decoding stages of the modified Vnet network. The semantic information of the feature map, and selecting pixel information belonging to the high-level feature map from the low-level feature map according to the semantic information;
    其中,所述高级特征图及低级特征图根据在编码阶段及解码阶段获得特征图的先后顺序确定,在所述编码阶段相邻编码层中,下一层编码层获得的特征图比上一编码层获得的特征图要高级;在所述解码阶段相邻编码层中,上一解密层获得的特征图比下一解密层获得的特征图要低级。Wherein, the high-level feature map and the low-level feature map are determined according to the sequence of obtaining feature maps in the encoding stage and the decoding stage. Among the adjacent encoding layers in the encoding stage, the feature map obtained by the next encoding layer is higher than that of the previous encoding. The feature map obtained by the layer is higher-level; in the adjacent coding layers of the decoding stage, the feature map obtained by the previous decryption layer is lower than the feature map obtained by the next decryption layer.
  3. 如权利要求1所述的图像分割方法，其中，所述通过所述生成网络模型生成6通道的预测分割标签，包括以下步骤：3. The image segmentation method according to claim 1, wherein said generating a 6-channel predicted segmentation label through said generating network model comprises the following steps:
    通过所述生成网络模型的编码阶段获取每个编码层的特征图;Acquiring a feature map of each coding layer through the coding stage of the generating network model;
    通过所述生成网络模型的解码阶段获取每个解码层的特征图;Acquiring a feature map of each decoding layer through the decoding stage of the generating network model;
    在编码阶段,通过所述CA模块将所述编码阶段相邻编码层的下一层h*w*2c维度的高级特征进行通道化操作、激活操作,得到不同通道的第一权重结果,将所述不同通道的第一权重结果与相邻编码层的上一层2h*2w*c维度的低级特征相乘,得到2h*2w*c维度的第一特征图;In the encoding stage, the CA module is used to channelize and activate the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layer in the encoding stage to obtain the first weight results of different channels, and then The first weight results of the different channels are multiplied by the low-level features of the upper 2h*2w*c dimension of the adjacent coding layer to obtain the first feature map of the 2h*2w*c dimension;
    在解码阶段,通过所述CA模块将所述解码阶段相邻解码层的上一层2h*2w*c维度的高级特征进行通道化操作、激活操作,得到不同通道的第二权重结果;将所述不同通道的第二权重结果与相邻编码层的下一层2h*2w*c维度的低级特征相乘,得到2h*2w*c维度的第二特征图;In the decoding stage, the CA module is used to channelize and activate the advanced features of the upper 2h*2w*c dimension of the adjacent decoding layer in the decoding stage to obtain the second weight results of different channels; The second weight results of the different channels are multiplied by the low-level features of the next layer of 2h*2w*c dimensions of the adjacent coding layer to obtain a second feature map of 2h*2w*c dimensions;
    根据所述编码阶段每一层获得的特征图、所述解码阶段每一层获得的特征图、所述第一特征图、所述第二特征图,得到所述6通道的预测分割标签。According to the feature map obtained in each layer of the encoding stage, the feature map obtained in each layer of the decoding stage, the first feature map, and the second feature map, the 6-channel prediction segmentation label is obtained.
  4. 如权利要求1至3中任一项所述的图像分割方法,其中,所述根据所述6通道的预测分割标签得到预测分割结果图像之后,所述方法还包括以下步骤:The image segmentation method according to any one of claims 1 to 3, wherein after the predicted segmentation result image is obtained according to the predicted segmentation label of the 6 channels, the method further comprises the following steps:
    从所述预测分割结果图像确定皮下脂肪区域、内脏脂肪区域、肌肉区域的像素点个数,根据所述确定的像素点个数及预先获取的物理空间换算参数,确定皮下脂肪、内脏脂肪、肌肉的实际面积。Determine the number of pixels in the subcutaneous fat region, visceral fat region, and muscle region from the predicted segmentation result image, and determine the subcutaneous fat, visceral fat, and muscle based on the determined number of pixels and pre-acquired physical space conversion parameters The actual area.
  5. 如权利要求4所述的图像分割方法,其中,所述根据所述确定的像素点个数及预先获取的物理空间换算参数,确定皮下脂肪、内脏脂肪、肌肉的实际面积之后,所述方法还包括以下步骤:The image segmentation method of claim 4, wherein after the actual area of subcutaneous fat, visceral fat, and muscle is determined according to the determined number of pixels and the physical space conversion parameters obtained in advance, the method further It includes the following steps:
    从所述腹部CT图像数据获取扫描层厚信息,将所述皮下脂肪、内脏脂肪、肌肉的实际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积。Obtain scanning layer thickness information from the abdominal CT image data, and multiply the actual area of the subcutaneous fat, visceral fat, and muscle by the scanning layer thickness to obtain the actual volume of the subcutaneous fat, visceral fat, and muscle.
  6. 如权利要求5所述的图像分割方法,其中,所述将所述皮下脂肪区域、内脏脂肪区域、肌肉区域的实际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积之后,所述方法还包括以下步骤:The image segmentation method of claim 5, wherein the actual area of the subcutaneous fat area, visceral fat area, and muscle area is multiplied by the scanning layer thickness to obtain the actual area of the subcutaneous fat, visceral fat, and muscle. After volume, the method further includes the following steps:
    分别将所述预测分割标签与所述金标准图像对应的真实标签输入所述判别网络模型,分别得到所述预测分割结果图像与所述金标准图像的判别分数,依据所述判别分数判断所述预测分割结果图像与金标准图像之间的差距,基于所述差距对所述生成网络模型进行参数调整,以优化所述生成网络模型。The real labels corresponding to the predicted segmentation label and the gold standard image are respectively input into the discriminant network model to obtain the discriminant scores of the predicted segmentation result image and the gold standard image respectively, and the discriminant scores are used to determine the Predict the gap between the segmentation result image and the gold standard image, and adjust the parameters of the generation network model based on the gap to optimize the generation network model.
  7. 如权利要求2所述的图像分割方法，其中，所述DICOM格式的CT图像数据的信息包括影像拍摄的时间、像素间距、图像码、及图像上的采样率。7. The image segmentation method of claim 2, wherein the information of the CT image data in the DICOM format includes the time when the image was taken, the pixel pitch, the image code, and the sampling rate on the image.
  8. 一种图像分割装置,其中,所述装置包括:An image segmentation device, wherein the device includes:
    转换模块:用于将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Conversion module: used to convert abdominal CT image data in DICOM format into abdominal image in JPG format;
    处理模块:用于基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Processing module: used to construct a generation network model based on the Vnet network model, and input the abdominal image in JPG format into the generation network model;
    生成模块:用于通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generating module: used to generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
    获取模块:用于根据所述6通道的预测分割标签获取预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。Obtaining module: used to obtain predicted segmentation result images according to the 6-channel predicted segmentation tags, where the predicted segmentation result images include subcutaneous fat images, muscle images, bone images, visceral fat images, internal organs images, and background images.
  9. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,其中,所述处理器执行所述计算机程序时实现如下步骤:A computer device includes a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor implements the following steps when the processor executes the computer program:
    将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Convert abdominal CT image data in DICOM format to abdominal image in JPG format;
    基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Constructing a generation network model based on the Vnet network model, and inputting the abdominal image in JPG format into the generation network model;
    通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
    根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
  10. 如权利要求9所述的计算机设备,其中,所述DICOM格式的腹部CT图像数据、所述JPG格式的腹部图像存储于区块链中,所述基于Vnet网络模型构建生成网络模型,包括以下步骤:The computer device according to claim 9, wherein the abdomen CT image data in DICOM format and the abdominal image in JPG format are stored in a blockchain, and the building and generating network model based on the Vnet network model comprises the following steps :
    将所述Vnet网络模型编码阶段的卷积核设置为二维卷积核;Setting the convolution kernel in the encoding stage of the Vnet network model as a two-dimensional convolution kernel;
    将所述Vnet网络模型解码阶段的反卷积替换为双线性插值,得到修改后的Vnet网络模型;Replacing the deconvolution in the decoding stage of the Vnet network model with bilinear interpolation to obtain a modified Vnet network model;
    在所述修改后的Vnet网络模型中接入通道注意力CA模块,得到所述生成网络模型,其中,所述CA模块用于获取所述修改后的Vnet网络的编码阶段、解码阶段生成的高级特征图的语义信息,并根据所述语义信息从低级特征图中选取属于高级特征图的像素点信息;The channel attention CA module is connected to the modified Vnet network model to obtain the generative network model, where the CA module is used to obtain the advanced level generated during the encoding and decoding stages of the modified Vnet network. The semantic information of the feature map, and selecting pixel information belonging to the high-level feature map from the low-level feature map according to the semantic information;
    其中,所述高级特征图及低级特征图根据在编码阶段及解码阶段获得特征图的先后顺序确定,在所述编码阶段相邻编码层中,下一层编码层获得的特征图比上一编码层获得的特征图要高级;在所述解码阶段相邻编码层中,上一解密层获得的特征图比下一解密层获得的特征图要低级。Wherein, the high-level feature map and the low-level feature map are determined according to the sequence of obtaining feature maps in the encoding stage and the decoding stage. Among the adjacent encoding layers in the encoding stage, the feature map obtained by the next encoding layer is higher than that of the previous encoding. The feature map obtained by the layer is higher-level; in the adjacent coding layers of the decoding stage, the feature map obtained by the previous decryption layer is lower than the feature map obtained by the next decryption layer.
  11. 如权利要求9所述的计算机设备，其中，所述通过所述生成网络模型生成6通道的预测分割标签，包括以下步骤：11. The computer device according to claim 9, wherein said generating a 6-channel predicted segmentation label through said generating network model comprises the following steps:
    通过所述生成网络模型的编码阶段获取每个编码层的特征图;Acquiring a feature map of each coding layer through the coding stage of the generating network model;
    通过所述生成网络模型的解码阶段获取每个解码层的特征图;Acquiring a feature map of each decoding layer through the decoding stage of the generating network model;
    在编码阶段,通过所述CA模块将所述编码阶段相邻编码层的下一层h*w*2c维度的高级特征进行通道化操作、激活操作,得到不同通道的第一权重结果,将所述不同通道的第一权重结果与相邻编码层的上一层2h*2w*c维度的低级特征相乘,得到2h*2w*c维度的第一特征图;In the encoding stage, the CA module is used to channelize and activate the high-level features of the h*w*2c dimension of the next layer of the adjacent encoding layer in the encoding stage to obtain the first weight results of different channels, and then The first weight results of the different channels are multiplied by the low-level features of the upper 2h*2w*c dimension of the adjacent coding layer to obtain the first feature map of the 2h*2w*c dimension;
    在解码阶段,通过所述CA模块将所述解码阶段相邻解码层的上一层2h*2w*c维度的高级特征进行通道化操作、激活操作,得到不同通道的第二权重结果;将所述不同通道的第二权重结果与相邻编码层的下一层2h*2w*c维度的低级特征相乘,得到2h*2w*c维度的第二特征图;In the decoding stage, through the CA module, channelize and activate the high-level features of the upper 2h*2w*c dimension of the adjacent decoding layer in the decoding stage to obtain the second weight results of different channels; The second weight results of the different channels are multiplied by the low-level features of the 2h*2w*c dimension of the next layer of the adjacent coding layer to obtain the second feature map of the 2h*2w*c dimension;
    根据所述编码阶段每一层获得的特征图、所述解码阶段每一层获得的特征图、所述第一特征图、所述第二特征图,得到所述6通道的预测分割标签。According to the feature map obtained at each layer of the encoding stage, the feature map obtained at each layer of the decoding stage, the first feature map, and the second feature map, the 6-channel prediction segmentation label is obtained.
  12. 如权利要求9至11中任一项所述的计算机设备,其中,所述根据所述6通道的预测分割标签得到预测分割结果图像之后,所述处理器执行所述计算机程序时还实现以下步骤:The computer device according to any one of claims 9 to 11, wherein after the predicted segmentation result image is obtained according to the predicted segmentation label of the 6 channels, the processor further implements the following steps when executing the computer program :
    从所述预测分割结果图像确定皮下脂肪区域、内脏脂肪区域、肌肉区域的像素点个数,根据所述确定的像素点个数及预先获取的物理空间换算参数,确定皮下脂肪、内脏脂肪、肌肉的实际面积。Determine the number of pixels in the subcutaneous fat region, visceral fat region, and muscle region from the predicted segmentation result image, and determine the subcutaneous fat, visceral fat, and muscle based on the determined number of pixels and pre-acquired physical space conversion parameters The actual area.
  13. 如权利要求12所述的计算机设备,其中,所述根据所述确定的像素点个数及预先获取的物理空间换算参数,确定皮下脂肪、内脏脂肪、肌肉的实际面积之后,所述方法还包括以下步骤:The computer device according to claim 12, wherein, after determining the actual areas of subcutaneous fat, visceral fat, and muscle according to the determined number of pixels and pre-acquired physical space conversion parameters, the method further comprises The following steps:
    从所述腹部CT图像数据获取扫描层厚信息,将所述皮下脂肪、内脏脂肪、肌肉的实际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积。Obtain scanning layer thickness information from the abdominal CT image data, and multiply the actual area of the subcutaneous fat, visceral fat, and muscle by the scanning layer thickness to obtain the actual volume of the subcutaneous fat, visceral fat, and muscle.
  14. 如权利要求13所述的计算机设备,其中,所述将所述皮下脂肪区域、内脏脂肪区域、肌肉区域的实际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积之后,所述处理器执行所述计算机程序时还实现以下步骤:The computer device of claim 13, wherein the actual area of the subcutaneous fat area, visceral fat area, and muscle area is multiplied by the scanning layer thickness to obtain the actual volume of the subcutaneous fat, visceral fat, and muscle After that, the processor further implements the following steps when executing the computer program:
    分别将所述预测分割标签与所述金标准图像对应的真实标签输入所述判别网络模型,分别得到所述预测分割结果图像与所述金标准图像的判别分数,依据所述判别分数判断所述预测分割结果图像与金标准图像之间的差距,基于所述差距对所述生成网络模型进行参数调整,以优化所述生成网络模型。The real labels corresponding to the predicted segmentation label and the gold standard image are respectively input into the discriminant network model to obtain the discriminant scores of the predicted segmentation result image and the gold standard image respectively, and the discriminant scores are used to determine the Predict the gap between the segmentation result image and the gold standard image, and adjust the parameters of the generation network model based on the gap to optimize the generation network model.
  15. 如权利要求10所述的计算机设备，其中，所述DICOM格式的CT图像数据的信息包括影像拍摄的时间、像素间距、图像码、及图像上的采样率。15. The computer device of claim 10, wherein the information of the CT image data in the DICOM format includes the time when the image was taken, the pixel pitch, the image code, and the sampling rate on the image.
  16. 一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机程序,其中,所述计算机程序被处理器执行时实现如下步骤:A computer-readable storage medium having a computer program stored on the computer-readable storage medium, wherein, when the computer program is executed by a processor, the following steps are implemented:
    将DICOM格式的腹部CT图像数据转换为JPG格式的腹部图像;Convert abdominal CT image data in DICOM format to abdominal image in JPG format;
    基于Vnet网络模型构建生成网络模型,将所述JPG格式的腹部图像输入所述生成网络模型;Constructing a generation network model based on the Vnet network model, and inputting the abdominal image in JPG format into the generation network model;
    通过所述生成网络模型生成6通道的预测分割标签,其中,所述6通道的预测分割标签包括皮下脂肪、肌肉、骨头、内脏脂肪、内脏器官、背景预测分割标签;Generate 6-channel predicted segmentation labels through the generation network model, where the 6-channel predicted segmentation labels include subcutaneous fat, muscle, bone, visceral fat, internal organs, and background predicted segmentation labels;
    根据所述6通道的预测分割标签得到预测分割结果图像,其中,所述预测分割结果图像包括皮下脂肪图像、肌肉图像、骨头图像、内脏脂肪图像、内脏器官图像、背景图像。The predicted segmentation result image is obtained according to the 6-channel predicted segmentation label, where the predicted segmentation result image includes subcutaneous fat image, muscle image, bone image, visceral fat image, internal organ image, and background image.
  17. 如权利要求16所述的计算机可读存储介质，其中，所述根据所述6通道的预测分割标签得到预测分割结果图像之后，所述计算机程序被处理器执行时还实现如下步骤：17. The computer-readable storage medium according to claim 16, wherein after the predicted segmentation result image is obtained according to the 6-channel predicted segmentation labels, the computer program further implements the following steps when being executed by the processor:
    从所述预测分割结果图像确定皮下脂肪区域、内脏脂肪区域、肌肉区域的像素点个数,根据所述确定的像素点个数及预先获取的物理空间换算参数,确定皮下脂肪、内脏脂肪、肌肉的实际面积。Determine the number of pixels in the subcutaneous fat region, visceral fat region, and muscle region from the predicted segmentation result image, and determine the subcutaneous fat, visceral fat, and muscle based on the determined number of pixels and pre-acquired physical space conversion parameters The actual area.
  18. 如权利要求17所述的计算机可读存储介质，其中，所述根据所述确定的像素点个数及预先获取的物理空间换算参数，确定皮下脂肪、内脏脂肪、肌肉的实际面积之后，所述计算机程序被处理器执行时还实现如下步骤：18. The computer-readable storage medium of claim 17, wherein after the actual areas of subcutaneous fat, visceral fat, and muscle are determined according to the determined number of pixels and the pre-acquired physical-space conversion parameter, the computer program further implements the following steps when being executed by the processor:
    从所述腹部CT图像数据获取扫描层厚信息,将所述皮下脂肪、内脏脂肪、肌肉的实际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积。Obtain scanning layer thickness information from the abdominal CT image data, and multiply the actual area of the subcutaneous fat, visceral fat, and muscle by the scanning layer thickness to obtain the actual volume of the subcutaneous fat, visceral fat, and muscle.
  19. 如权利要求18所述的计算机可读存储介质,其中,所述将所述皮下脂肪区域、内脏脂肪区域、肌肉区域的实际面积乘以所述扫描层厚得到所述皮下脂肪、内脏脂肪及肌肉的实际体积之后,所述方法还包括以下步骤:The computer-readable storage medium of claim 18, wherein the actual area of the subcutaneous fat area, visceral fat area, and muscle area is multiplied by the scanning layer thickness to obtain the subcutaneous fat, visceral fat, and muscle area. After the actual volume, the method further includes the following steps:
    分别将所述预测分割标签与所述金标准图像对应的真实标签输入所述判别网络模型,分别得到所述预测分割结果图像与所述金标准图像的判别分数,依据所述判别分数判断所述预测分割结果图像与金标准图像之间的差距,基于所述差距对所述生成网络模型进行参数调整,以优化所述生成网络模型。The real labels corresponding to the predicted segmentation label and the gold standard image are respectively input into the discriminant network model to obtain the discriminant scores of the predicted segmentation result image and the gold standard image respectively, and the discriminant scores are used to determine the Predict the gap between the segmentation result image and the gold standard image, and adjust the parameters of the generation network model based on the gap to optimize the generation network model.
  20. The computer-readable storage medium according to claim 16, wherein the abdominal CT image data in DICOM format and the abdominal image in JPG format are stored in a blockchain, and the constructing a generation network model based on the Vnet network model comprises the following steps:
    setting the convolution kernels in the encoding stage of the Vnet network model as two-dimensional convolution kernels;
    replacing the deconvolution in the decoding stage of the Vnet network model with bilinear interpolation to obtain a modified Vnet network model;
    connecting a channel attention (CA) module to the modified Vnet network model to obtain the generation network model, wherein the CA module is configured to obtain semantic information of the high-level feature maps generated in the encoding stage and the decoding stage of the modified Vnet network, and to select, from the low-level feature maps according to the semantic information, pixel information belonging to the high-level feature maps;
    wherein the high-level feature maps and the low-level feature maps are determined according to the order in which the feature maps are obtained in the encoding stage and the decoding stage: among adjacent encoding layers in the encoding stage, the feature map obtained by the next encoding layer is higher-level than the feature map obtained by the previous encoding layer; among adjacent decoding layers in the decoding stage, the feature map obtained by the previous decoding layer is lower-level than the feature map obtained by the next decoding layer.
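A minimal PyTorch sketch of the modifications named in this claim follows, under assumed tensor shapes: bilinear interpolation in place of deconvolution, and a channel-attention (CA) block that squeezes the semantics of a high-level feature map into per-channel weights used to select information from a low-level feature map; the module name and fusion by concatenation are illustrative choices, not the patented architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttentionFusion(nn.Module):
    """CA-style fusion: high-level semantics gate the low-level feature map."""
    def __init__(self, high_channels: int, low_channels: int):
        super().__init__()
        # map pooled high-level semantics to one weight per low-level channel
        self.fc = nn.Sequential(nn.Linear(high_channels, low_channels), nn.Sigmoid())

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        # high: (N, C_high, h, w) deeper feature map; low: (N, C_low, H, W) shallower one
        semantics = F.adaptive_avg_pool2d(high, 1).flatten(1)      # (N, C_high) global semantics
        weights = self.fc(semantics).unsqueeze(-1).unsqueeze(-1)   # (N, C_low, 1, 1) channel weights
        gated_low = low * weights                                  # keep low-level pixels relevant to the high-level map
        up = F.interpolate(high, size=low.shape[-2:],
                           mode="bilinear", align_corners=False)   # bilinear upsampling instead of deconvolution
        return torch.cat([up, gated_low], dim=1)

# Encoder stages would stack ordinary 2D convolutions (nn.Conv2d) rather than Vnet's
# original 3D kernels, matching the "two-dimensional convolution kernel" step above.
```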
PCT/CN2020/098975 2020-05-20 2020-06-29 Image segmentation method and apparatus, device, and storage medium WO2021151275A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010431606.6 2020-05-20
CN202010431606.6A CN111696082B (en) 2020-05-20 Image segmentation method, device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2021151275A1 true WO2021151275A1 (en) 2021-08-05

Family

ID=72478096

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098975 WO2021151275A1 (en) 2020-05-20 2020-06-29 Image segmentation method and apparatus, device, and storage medium

Country Status (1)

Country Link
WO (1) WO2021151275A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180108138A1 (en) * 2015-04-29 2018-04-19 Siemens Aktiengesellschaft Method and system for semantic segmentation in laparoscopic and endoscopic 2d/2.5d image data
US20190057488A1 (en) * 2017-08-17 2019-02-21 Boe Technology Group Co., Ltd. Image processing method and device
CN109146899A (en) * 2018-08-28 2019-01-04 众安信息技术服务有限公司 CT image jeopardizes organ segmentation method and device
CN109754403A (en) * 2018-11-29 2019-05-14 中国科学院深圳先进技术研究院 Tumour automatic division method and system in a kind of CT image
CN110097557A (en) * 2019-01-31 2019-08-06 卫宁健康科技集团股份有限公司 Automatic medical image segmentation method and system based on 3D-UNet
CN110223300A (en) * 2019-06-13 2019-09-10 北京理工大学 CT image abdominal multivisceral organ dividing method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114332381A (en) * 2022-01-05 2022-04-12 Beijing Institute of Technology Aorta CT image key point detection method and system based on three-dimensional reconstruction
CN114332070A (en) * 2022-01-05 2022-04-12 Beijing Institute of Technology Meteor crater detection method based on intelligent learning network model compression
CN114332070B (en) * 2022-01-05 2024-05-28 Beijing Institute of Technology Meteor crater detection method based on intelligent learning network model compression

Also Published As

Publication number Publication date
CN111696082A (en) 2020-09-22

Similar Documents

Publication Publication Date Title
US11282205B2 (en) Structure correcting adversarial network for chest x-rays organ segmentation
CN111429421B (en) Model generation method, medical image segmentation method, device, equipment and medium
US20220406410A1 (en) System and method for creating, querying, and displaying a miba master file
CN111047605B (en) Construction method and segmentation method of vertebra CT segmentation network model
WO2021189913A1 (en) Method and apparatus for target object segmentation in image, and electronic device and storage medium
TW202125415A (en) Training method, equipment and storage medium of 3d target detection and model
US11526994B1 (en) Labeling, visualization, and volumetric quantification of high-grade brain glioma from MRI images
Kong et al. Automated maxillofacial segmentation in panoramic dental x-ray images using an efficient encoder-decoder network
Koshino et al. Narrative review of generative adversarial networks in medical and molecular imaging
WO2021151275A1 (en) Image segmentation method and apparatus, device, and storage medium
US10878564B2 (en) Systems and methods for processing 3D anatomical volumes based on localization of 2D slices thereof
Pal et al. A fully connected reproducible SE-UResNet for multiorgan chest radiographs segmentation
Gaggion et al. CheXmask: a large-scale dataset of anatomical segmentation masks for multi-center chest x-ray images
Liu et al. Tracking-based deep learning method for temporomandibular joint segmentation
CN111209946A (en) Three-dimensional image processing method, image processing model training method, and medium
Sheng et al. Modeling nodule growth via spatial transformation for follow-up prediction and diagnosis
Liu et al. Joint cranial bone labeling and landmark detection in pediatric CT images using context encoding
CN115294023A (en) Liver tumor automatic segmentation method and device
CN111696082B (en) Image segmentation method, device, electronic equipment and computer readable storage medium
Qu et al. Advancing diagnostic performance and clinical applicability of deep learning-driven generative adversarial networks for Alzheimer's disease
Tian et al. A revised approach to orthodontic treatment monitoring from oralscan video
Lee et al. Learning radiologist’s step-by-step skill for cervical spinal injury examination: Line drawing, prevertebral soft tissue thickness measurement, and swelling detection
Rickmann et al. Vertex Correspondence in Cortical Surface Reconstruction
CN117393100B (en) Diagnostic report generation method, model training method, system, equipment and medium
US20240020827A1 (en) System and method for generating a diagnostic model and user interface presentation based on individual anatomical features

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20917106

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20917106

Country of ref document: EP

Kind code of ref document: A1