CN112950582B

CN112950582B - 3D lung focus segmentation method and device based on deep learning

Info

Publication number: CN112950582B
Application number: CN202110223645.1A
Authority: CN
Inventors: 杜强; 陈相儒; 郭雨晨; 聂方兴; 唐超
Original assignee: Beijing Xiao Bai Century Network Technology Co ltd
Current assignee: Beijing Xiao Bai Century Network Technology Co ltd
Priority date: 2021-03-01
Filing date: 2021-03-01
Publication date: 2023-11-24
Anticipated expiration: 2041-03-01
Also published as: CN112950582A

Abstract

The invention discloses a 3D lung focus segmentation method, a device, electronic equipment and a storage medium based on deep learning, wherein the method is characterized in that a lung nodule dicom image is obtained, and the dicom image is preprocessed; three-dimensionally stacking the pre-processed dicom images to obtain a 3D image block, and cutting the 3D image block; performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph; the method is simpler, more convenient and faster than the current main stream 3D segmentation method, accords with the shape characteristics of a nodular sphere, achieves the accuracy close to that of the main stream segmentation method, and provides a new thought.

Description

3D lung focus segmentation method and device based on deep learning

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a 3D lung focus segmentation method and device based on deep learning, electronic equipment and a storage medium.

Background

In medicine, in the medical field, medical imaging has become accurate diagnosis, focus location is an integral step, but because some focuses are smaller, the focus is difficult to find, and a doctor often ignores the focus to cause misdiagnosis, and because of inconsistent focus shapes, the focus is different in size, and the doctor is relatively difficult to measure the focus.

The deep learning method is proposed from the end of the 20 th century, has been well developed so far, simulates the nerve composition of a human, and utilizes a large amount of data to complete the simulation of nerve connection, thereby achieving the effect of simulating the brain of the human and completing the work which cannot be completed even by some professionals.

LIDC-IDRI is a dataset consisting of chest medical image files (e.g., CT, X-ray) and corresponding diagnostic result lesion labels. This data was collected by the national cancer institute (National Cancer Institute) at the initiative to investigate early cancer detection in high risk populations. The data set comprises 1018 CT images of patients, the lung nodule labeling at the pixel-by-pixel level is completed, and certain labeling is carried out on the classification of the lung nodules.

The method for detecting and segmenting the LIDC-IDRI by using the deep learning scheme at present mainly detects lung nodules, and segments pixel by pixel (2D) or voxel by voxel (3D) are carried out according to detection results, so that the description of the volumes of the lung nodules and other corresponding descriptions are obtained. Current methods involve the use of two-stage models, or one-stage detection and segmentation. However, such an approach results in a correspondingly high time-consuming or high memory consumption.

Disclosure of Invention

The invention aims to provide a 3D lung focus segmentation method, a device, electronic equipment and a storage medium based on deep learning, which can directly obtain corresponding results by segmentation through a 3D image without detecting and re-segmenting.

In a first aspect, an embodiment of the present invention provides a method for deep learning 3D lung lesion segmentation, the method comprising the steps of:

acquiring a lung nodule dicom image, and preprocessing the dicom image;

three-dimensionally stacking the pre-processed dicom images to obtain a 3D image block, and cutting the 3D image block;

performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;

and calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of center point coordinates, and obtaining the coordinates of the regressive points through the center point coordinates to obtain a segmentation result.

Optionally, the training process of the spherical segmentation model includes:

preprocessing the labeling data of the dicom image to obtain spherical coordinate data;

extracting a feature map based on the constructed network;

constructing a feature map pyramid network FPN and a cyclic feature pyramid network RFP based on ResNeSt-34;

and dividing the feature map into three parts through the FPN and the RFP, calculating a loss function of the spherical coordinate data of the feature map, and solving an optimized model.

Optionally, preprocessing the dicom image includes;

image normalization and image enhancement;

the image normalization comprises the step of obtaining an image normalized to pixel values of 0-255 by adjusting the image value of the dicom image through window width and window level by using window width and window level values of 500-1500 respectively.

Optionally, preprocessing the labeling data of the dicom image to obtain spherical coordinate data includes:

according to the labeling data of the dicom image, coordinate information and a segmentation contour of the data are obtained;

carrying out consistent interpolation on the labeling data of the dicom image according to the space distance, and setting the unit value of each dimension of the voxel to be the same value through an interpolation mode of 3 times of linear interpolation;

after the labeling data interpolation of the dicom image is completed, calculating the spherical coordinate space position of the regression pixel; wherein, the spatial expression of the spherical coordinates is as follows:

z＝rcosθ

wherein,n takes the value 36, < > in the present invention>m has a value of 36 in the present invention, r is the spherical center coordinates from each of the corresponding θ and +.>The angle reaches the length of the object edge.

Further, the calculating of the loss function of the spherical coordinate data of the feature map by dividing the feature map into three parts through the FPN and RFP includes:

the size of the category corresponding to the first partial regression feature map is D.H.W.k, wherein DHW respectively corresponds to three dimensions of length, width and height of the feature map, k is the number of categories, 1 is set, the activation function is sigmoid, namely whether the feature map is a nodule or not is judged through a probability value, and the loss function used is Focal loss;

the second part is the centrality of the spherical coordinates corresponding to the feature map, the size of the centrality is D x H x W x 1, dhw is the three dimensions of the length, width and height of the feature map, and the calculation formula of the centrality is as follows:

wherein d is _i Is the length of the first ray, where the spherical coordinates return n rays, where n is 72 (representing 36+36) in the present invention, according to the θ sum of the spherical coordinatesUniformly taking the values, namely, taking 36 values from 0-2 pi at equal intervals, and obtaining the +.>36 values are taken from 0-2 pi at equal intervals to obtain 72 regression lines, and the regression lines represent regression distances from the central point to 72 directions;

the third part is the distance from the center point on the spherical coordinates n angles corresponding to the regression feature map, the size of the distance is D.times.H.times.W.times.n, the DHW is the three dimensions of length, width, height and n is n angles, and as shown in the second part, the value of n is 72 in the invention.

In a second aspect, an embodiment of the present invention provides a 3D lung focus segmentation apparatus based on deep learning, the segmentation apparatus including:

the image processing module is used for acquiring a lung nodule dicom image and preprocessing the dicom image;

the 3D image acquisition module is used for carrying out three-dimensional stacking on the pre-processed dicom images to obtain a 3D image block, and cutting the 3D image block;

the regression subgraph acquisition module is used for extracting features of the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;

and the calculation module is used for calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of central point coordinates, and obtaining the coordinates of the regressed points through the central point coordinates to obtain a segmentation result.

In a third aspect, the present invention provides an electronic device, comprising:

a processor; a memory for storing processor-executable instructions;

wherein the processor implements the method described above by executing the executable instructions.

In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon computer instructions which when executed by a processor perform the steps of the above method.

Advantageous effects

The invention provides a 3D lung focus segmentation method, a device, electronic equipment and a storage medium based on deep learning, wherein the method is characterized in that a lung nodule dicom image is obtained, and the dicom image is preprocessed; three-dimensionally stacking the pre-processed dicom images to obtain a 3D image block, and cutting the 3D image block; performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph; compared with the existing mainstream 3D segmentation method, the method is simpler, more convenient and faster, accords with the shape characteristics of a nodular sphere, achieves the accuracy close to that of the mainstream segmentation method, and provides a new way.

Drawings

FIG. 1 is a flow chart of a deep learning based 3D lung lesion segmentation method according to an embodiment of the present invention;

FIG. 2 is a training process of a spherical segmentation model according to an embodiment of the present invention;

FIG. 3 is a flowchart of preprocessing labeling data of a dicom image to obtain spherical coordinate data according to an embodiment of the present invention;

FIG. 4 shows coordinate information and a segmentation contour of data obtained by performing a univocal interpolation according to a spatial distance by labeling data of a dicom image according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of the calculation of a loss function of the spherical coordinate data of the feature map by dividing the feature map into three parts by the FPN and RFP according to an embodiment of the present invention;

FIG. 6 is a block diagram of a 3D lung lesion segmentation device based on deep learning according to an embodiment of the present invention;

fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions of the present invention will be clearly and completely described in connection with the embodiments, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The invention aims to provide a 3D lung focus segmentation method, a device, electronic equipment and a storage medium based on deep learning, which are capable of directly obtaining corresponding results through segmentation of 3D images without detection and re-segmentation, and the invention is further described below with reference to the accompanying drawings and specific embodiments:

fig. 1 shows a flowchart of a 3D lung focus segmentation method based on deep learning according to an embodiment of the present invention, as shown in fig. 1, the segmentation method includes the following steps:

s20, acquiring a lung nodule dicom image, and preprocessing the dicom image;

s40, three-dimensionally stacking the pre-processed dicom images to obtain a 3D image block, and cutting the 3D image block;

s60, extracting features of the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;

and S80, calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of center point coordinates, and obtaining the coordinates of the regressive points through the center point coordinates to obtain a segmentation result.

The method of the embodiment comprises the steps of preprocessing a dicom image of a lung nodule by acquiring the dicom image; three-dimensionally stacking the pre-processed dicom images to obtain a 3D image block, and cutting the 3D image block; performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph; compared with the prior mainstream 3D segmentation method, the method is simpler, more convenient and faster, accords with the shape characteristics of a nodular sphere, achieves the accuracy close to that of the mainstream segmentation method, and provides a new thought.

Specifically, as shown in fig. 2, the training process of the spherical segmentation model includes:

s601, preprocessing the labeling data of the dicom image to obtain spherical coordinate data;

s602, extracting a feature map based on the constructed network;

s603, constructing a feature map pyramid network FPN and a cyclic feature pyramid network RFP based on ResNeSt-34;

s604, dividing the feature map into three parts through the FPN and the RFP, calculating a loss function of spherical coordinate data of the feature map, and solving an optimized model.

Specifically, as shown in fig. 3-4, preprocessing the labeling data of the dicom image to obtain spherical coordinate data includes:

s6011, obtaining coordinate information and a segmentation contour of data according to the labeling data of the dicom image;

s6012, carrying out consistent interpolation on the labeling data of the dicom image according to the space distance, and setting the unit value of each dimension of the voxel to be the same value in an interpolation mode of 3 times of linear interpolation;

s6013, calculating the spherical coordinate space position of the regression pixel after the labeling data interpolation of the dicom image is completed; wherein, the spatial expression of the spherical coordinates is as follows:

wherein, among them,n takes the value 36, < > in the present invention>m has a value of 36 in the present invention, r is the spherical center coordinates from each of the corresponding θ and +.>The angle reaches the length of the object edge.

In the data processing process, the embodiment obtains coordinate information and a segmentation contour of data through labeling data of a dicom image, performs consistent interpolation according to a spatial distance, and sets the size of a voxel (a unit value of each dimension of the voxel) to be the same value through interpolation, that is, the unit size of each dimension is the same millimeter number, for example, one voxel corresponds to a spatial size of 0.6x0.6x0.6mm. The corresponding regression radius and angle are processed by the method, so that the model is easier to carry out radius and added regression on the same pixel value, and the model can be easily trained. After the original data interpolation is completed, calculating the spherical coordinate space position of the regression pixel, wherein the space expression of the spherical coordinate is shown as a formula (1), in the data processing process, the distance from the center point of each pixel to the nodule segmentation edge is calculated, and n values are equally spaced according to theta from 0-2 pi, wherein n is 36 in the invention, and in the same way, phi is equally spaced from 0-2 pi, and the number of values is n; phi and n are respectively used as sub-modules of two 36 channels after sampling based on the feature map pyramid network FPN and the cyclic feature pyramid network RFP output of ResNeSt-34 for regression training, and the regression aims at the distance from the center point to the outer edge of the nodule at the angle corresponding to each phi and n. And (3) regressing distances from the center point in 36 phi and 36 n different directions on each pixel, thereby obtaining a segmentation result of the sphere coordinates.

Specifically, as shown in fig. 5, the calculation of the loss function of the spherical coordinate data of the feature map by dividing the feature map into three parts by the FPN and RFP includes:

the size of the category corresponding to the first partial regression feature map is D.H.W.k, wherein DHW respectively corresponds to three dimensions of length, width and height of the feature map, k is the number of categories, 1 is set, the activation function is sigmoid, namely whether the feature map is a nodule or not is judged through a probability value, and the loss function used is Focal loss; the probability value is calculated by the neural network FPN and RFP, which learn the label pattern, so that the probability calculation result on each feature map corresponds to the classification part in fig. 4.

In some embodiments, preprocessing the dicom image comprises;

image normalization and image enhancement;

the image normalization comprises the step of obtaining an image normalized to pixel values of 0-255 by adjusting the image value of the dicom image through window width and window level by using window width and window level values of 500-1500 respectively. Specifically, first x _c,i,j Representing the pixel values of the ith row and jth column in the c-th channel of a picture, the window width window level values are respectively 500-1500 in the invention, namely the image obtained from the dicom image is obtained by adjusting the image values through the window width window level, and the image normalized to the pixel values of 0-255 is obtained. In the present invention c= {1},training parameter configuration:

the initial learning rate in training is set to be 0.01, the training algebra is 50 epochs, the learning rate updating mode is WarmUpCosinelearningRate, namely, in the first 5 epochs, the learning rate is increased from 0.002 to 0.01 according to the fluctuation range of 0.2 of each epochs, and 45 epochs are sequentially increased according to the formulaCalculated, where n is the total number of epochs trained, 150 in this experiment and e is the current number of epochs. The optimizer used in the training is Adam optimizer and the loss function is MSE.

In the actual data processing process, the invention carries out 128 x 128 pixel block taking on n x 512 images under the limitation of display card storage, thus the values can be seen at the same time to ensure that the program runs smoothly without errors, the results of each block are spliced back to the final result, wherein n is the number of images, and a plurality of dicoms are stacked to form a 3D structure due to each dicom sequence. And finally obtaining a final result of the product of the centrality and the probability value of each feature image pixel, and sequencing the results according to the product of the centrality and the probability value.

In practical scene application, a dicom image is read out through pydicom, fixed window width and window level conversion is carried out on the dicom image, three-dimensional stacking is carried out on a plurality of dicom images to obtain a 3D image block, the 3D image block is cut, the size of the cut image is 128 x 128, the characteristic extraction is carried out on the cut image through a Spherical Segmentation model, a regression subgraph is obtained, the product of the centrality and the probability value is calculated to obtain a plurality of central point coordinates, and the coordinates of the regression points of the central point coordinates are obtained through the central point coordinates, so that a final segmentation result can be obtained.

Based on the same inventive concept, the embodiment of the present invention also provides a deep learning-based 3D lung lesion segmentation device, which can be used to implement the method described in the above embodiment, as described in the following embodiment. Because the principle of the deep learning-based 3D lung focus segmentation device for solving the problem is similar to that of a deep learning-based 3D lung focus segmentation method, implementation of the deep learning-based 3D lung focus segmentation device can be referred to implementation of the deep learning-based 3D lung focus segmentation method, and repeated parts are not repeated. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the system described in the following embodiments is preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.

As shown in fig. 6, the splitting device provided by the embodiment of the present invention includes:

an image processing module 20, configured to acquire a lung nodule dicom image, and perform preprocessing on the dicom image;

a 3D image obtaining module 40, configured to three-dimensionally stack the pre-processed dicom images to obtain a 3D image block, and crop the 3D image block;

the regression subgraph acquisition module 60 is used for extracting features of the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;

and the calculating module 80 is configured to calculate the product of the centrality and the probability of the regression subgraph to obtain a plurality of center point coordinates, and obtain the coordinates of the regressed points through the center point coordinates to obtain the segmentation result.

The segmenting device acquires a lung nodule dicom image through an image processing module 20 and pre-processes the dicom image; the 3D image acquisition module 40 performs three-dimensional stacking on the pre-processed dicom images to obtain 3D image blocks, and cuts the 3D image blocks; the regression subgraph acquisition module 60 extracts the features of the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph; the calculation module 80 calculates the product of the centrality and the probability of the regression subgraph to obtain a plurality of center point coordinates, and obtains the coordinates of the regressive points through the center point coordinates to obtain a segmentation result.

s602, extracting a feature map based on a network ResNeSt-34;

The embodiment of the present invention also provides an electronic device, fig. 7 shows a schematic configuration of an electronic device to which the embodiment of the present invention can be applied, and as shown in fig. 7, the electronic device includes a Central Processing Unit (CPU) 701 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the system operation are also stored. The CPU 701, ROM 702, and RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.

The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output portion 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. The drive 310 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read therefrom is mounted into the storage section 708 as necessary.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The present invention also provides a computer readable storage medium, which may be a computer readable storage medium included in a deep learning-based 3D lung focus segmentation apparatus in the above embodiment; or may be a computer-readable storage medium, alone, that is not incorporated into an electronic device. The computer readable storage medium stores one or more programs for use by one or more processors to perform the deep learning based 3D lung lesion segmentation method described herein.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A deep learning-based 3D lung lesion segmentation method, the method comprising the steps of:

acquiring a lung nodule dicom image, and preprocessing the dicom image;

performing feature extraction on the cut 3D image block through a pre-trained spherical segmentation model to obtain a regression subgraph;

2. The segmentation method as set forth in claim 1, wherein the training process of the spherical segmentation model includes:

constructing a feature pyramid network FPN and a cyclic feature pyramid network RFP based on ResNeSt-34;

extracting a feature map based on the constructed network;

3. The segmentation method according to claim 1, characterized in that preprocessing the dicom image includes;

image normalization and image enhancement;

4. The segmentation method according to claim 2, wherein preprocessing the labeling data of the dicom image to obtain spherical coordinate data includes:

z＝rcosθ

5. The segmentation method according to claim 4, wherein the dividing the feature map into three parts by the FPN and RFP to calculate a loss function of the spherical coordinate data of the feature map includes:

the size of the category corresponding to the first partial regression feature map is D.H.W.k, wherein DHW respectively corresponds to three dimensions of length, width and height of the feature map, k is the number of categories, 1 is set, the activation function is sigmoid, namely whether the feature map is a nodule or not is judged through a probability value, and the loss function used is Focalloss;

6. A deep learning-based 3D pulmonary lesion segmentation device, the segmentation device comprising:

the 3D image acquisition module is used for carrying out three-dimensional stacking on the pre-processed dicom images to obtain 3D image blocks, and cutting the 3D image blocks;

the regression subgraph acquisition module is used for extracting the characteristics of the cut 3D image block through a pre-trained spherical segmentation model to obtain a regression subgraph;

and the calculation module is used for calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of center point coordinates, and obtaining the coordinates of the regressed points through the center point coordinates to obtain a segmentation result.

7. An electronic device, comprising:

a processor, a memory for storing processor-executable instructions;

wherein the processor is configured to implement the method of any of claims 1-5 by executing the executable instructions.

8. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any of claims 1-5.