CN113470044A - CT image liver automatic segmentation method based on deep convolutional neural network - Google Patents


Info

Publication number
CN113470044A
CN113470044A
Authority
CN
China
Prior art keywords
network
layer
liver
sampling
training
Prior art date
Legal status
Pending
Application number
CN202110641080.9A
Other languages
Chinese (zh)
Inventor
付冲
贾体慧
戴黎明
Current Assignee
Northeastern University China
Original Assignee
Northeastern University China
Priority date
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN202110641080.9A priority Critical patent/CN113470044A/en
Publication of CN113470044A publication Critical patent/CN113470044A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis > G06T 7/10 Segmentation; Edge detection > G06T 7/11 Region-based segmentation
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement > G06T 2207/10 Image acquisition modality > G06T 2207/10072 Tomographic images > G06T 2207/10081 Computed x-ray tomography [CT]
    • G06T 2207/20 Special algorithmic details > G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30 Subject of image; Context of image processing > G06T 2207/30004 Biomedical image processing > G06T 2207/30056 Liver; Hepatic


Abstract

The invention relates to a CT image automatic liver segmentation method based on a deep convolutional neural network, which comprises the following steps. Step 1: establish a training sample set, including reading the original files of a liver CT data set, producing a liver segmentation data set, unifying it to a fixed size, and dividing it proportionally into a test set and a training set. Step 2: construct a depth attention inverted residual module, performing down-sampling in a parameter-free manner to further reduce the network parameter count. Step 3: establish the depth attention inverted residual network. Step 4: train the network with the training set, screening and storing well-performing training models. Step 5: verify model availability on the test set. The invention processes liver CT images and segments the liver region accurately and quickly; it can be used in any liver lesion detection system, has a small network parameter scale, and is practical.

Description

CT image liver automatic segmentation method based on deep convolutional neural network
Technical Field
The invention discloses a CT image liver automatic segmentation method based on a deep convolutional neural network, and relates to the field of medical image computer-aided diagnosis.
Background Art
According to the latest 2020 global cancer statistics released by the World Health Organization International Agency for Research on Cancer (WHO IARC), China recorded up to 411,300 new liver cancer cases in 2020, accounting for 45.3% of liver cancer cases worldwide and ranking first among all countries. Among liver cancer deaths worldwide, more than 390,000 occurred in China, second only to deaths from lung cancer. Because the liver has a strong compensatory capacity (liver function is unaffected as long as 30-50% of the liver remains undamaged), lacks surrounding pain nerves, and its disease state is easily masked by other background diseases, liver cancer is difficult to discover in early clinical diagnosis; even in developed countries, fewer than 30% of liver cancer patients are found early enough to be treated surgically. Therefore, for populations at high risk of liver disease, regular targeted examination of the liver, with detection and treatment as soon as lesions occur, can effectively reduce mortality from liver diseases such as liver cancer and improve the survival probability and quality of life of liver disease patients.
With the development and application of computer vision technology, computer-aided image analysis for doctors has become a mainstream research direction. The core of assisting doctors in diagnosing liver CT is helping them clearly segment and identify the liver region: computer technology assists the doctor in identifying and segmenting liver diseases (such as liver cancer) from whole abdominal CT, followed by targeted analysis and diagnosis. Traditional segmentation algorithms have been applied on a large scale in computer-aided diagnosis systems, yet they have problems that are difficult to solve in both workflow and technique. Because traditional algorithms are generally designed around low-level information, e.g. region growing, level-set methods, superpixel methods, and machine learning methods such as SVMs, high-level semantic information is not fully utilized and segmentation of blurred boundaries and adhering organs is poor. Liver CT images also contain many interfering factors such as other human organs and bones, and medical image segmentation based on traditional and machine learning methods depends on manual feature extraction and feature design, giving low segmentation accuracy for the liver and a complex design process.
With the rapid development of deep learning in recent years, related techniques have gradually been applied to the medical field. Compared with traditional methods, deep-learning-based segmentation focuses more on high-level semantic information and expresses the semantic representation of the image more accurately, while eliminating the tedious steps of manual feature design; network construction is concise and clear compared with traditional methods, providing the technical possibility of designing fully automatic detection and segmentation algorithms for abdominal CT images. Although existing convolutional-neural-network liver segmentation algorithms have made great progress in accuracy, the following defects remain:
(1) Segmentation frameworks based on deep convolutional neural networks derive from natural image tasks, and most existing liver segmentation algorithms are ported from natural image segmentation algorithms; the ported segmentation network structures are excessively redundant, with huge parameter counts and poor practicability.
(2) Existing medical segmentation networks are general-purpose networks that emphasize generalization across various medical images; the segmentation framework is not adjusted or optimized for the actual image characteristics of the liver, so specificity is poor.
(3) Existing lightweight modules are not optimized in their design for the characteristics of medical images, so segmentation and identification accuracy is low, making them difficult to use in clinical applications.
Disclosure of Invention
The invention aims to provide a CT image liver automatic segmentation method based on a deep convolutional neural network, which is used for rapidly segmenting a liver region by utilizing a training model and providing accurate liver region information for further diagnosing liver diseases. The method ensures higher liver segmentation precision under the condition of greatly reducing the parameter quantity required by the network, provides practical reference information for the auxiliary diagnosis of subsequent liver diseases, and has considerable practical value.
The technical scheme adopted for realizing the purpose of the invention is as follows:
step 1: establishing a training sample set, including reading the original files of a liver CT data set, producing a liver segmentation data set, unifying it to a fixed size, and dividing it into a test set and a training set;
step 2: constructing a depth attention inverted residual module, including designing a feature layer attention mechanism and constructing a selectable-expansion-rate depth inverted residual module;
step 3: establishing the depth attention inverted residual network, including integrating the depth attention inverted residual module, constructing down-sampling and up-sampling network layers, and designing lateral connections to fuse the information of the up-sampling and down-sampling layers;
step 4: training the network with the training set to obtain a well-performing training model;
step 5: verifying model availability on the test set.
The invention has the beneficial effects that, aiming at the three defects in existing convolutional-neural-network liver segmentation research set forth in the technical background, its innovation points are as follows:
1. The liver segmentation network is simplified with lightweight modules. Drawing on the idea of high-dimensional feature mapping, a selectable-expansion-rate inverted residual module is constructed so that the network module retains good feature extraction capability after being made lightweight; the expansion rate can be adjusted according to network depth, keeping the overall parameter scale of the network from growing too large. This overcomes the inability of existing segmentation networks to balance performance against parameter scale, and improves the practicability of the model.
2. A feature layer attention mechanism is designed based on the pseudo-colorization principle, linearly enhancing the liver feature maps that carry only gray-scale features, so that each network layer pays more attention to the information correlation between feature map levels and the feature information flow between network layers is supplemented. This overcomes defects of existing liver segmentation networks such as low parameter utilization and under-fitting.
3. The convolutional down-sampling commonly adopted by current lightweight networks is abandoned; parameter-free max pooling is used in the down-sampling process instead, further reducing the network parameter count and giving the network high practicability.
Drawings
In order to make the technical scheme of the invention clearer, the invention is explained in detail below with reference to the accompanying drawings, wherein:
fig. 1 is an overall flow chart of the liver segmentation model design.
Fig. 2 is a schematic diagram of the production of the liver training and test sets.
FIG. 3 is a schematic illustration of the feature layer attention mechanism.
Fig. 4 is a schematic diagram of the selectable-expansion-rate inverted residual structure.
Fig. 5 is a schematic diagram of the depth attention inverted residual module.
Fig. 6a is a schematic diagram of the overall structure of the depth attention inverted residual network.
Fig. 6b is a schematic diagram of the overall structure of the depth attention inverted residual module.
Fig. 7 is a table of details of the depth attention inverted residual network structure.
Fig. 8 is a diagram illustrating the effect of different expansion rates on network performance.
Fig. 9 is a diagram illustrating the effect of different expansion rates on the number of network parameters.
FIG. 10a is a schematic diagram comparing the mean intersection-over-union index and parameter count of different segmentation models.
FIG. 10b is a schematic diagram comparing the Dice index and parameter count of different segmentation models.
FIG. 10c is a schematic diagram comparing the Jaccard index and parameter count of different segmentation models.
FIG. 11 is a comparison of segmentation results on the test set.
Detailed Description
The present invention will be described in further detail with reference to the following detailed description for more clearly illustrating the objects, technical means and advantages of the present invention. The specific embodiments described herein are merely illustrative of and not limiting of the invention. The invention can also be applied by other specific implementation methods. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention without departing from its spirit or essential characteristics.
The invention provides a CT image liver automatic segmentation method based on a deep convolutional neural network, the whole flow chart of which is shown as the attached figure 1, and the method specifically comprises the following steps:
step 1: and establishing a training sample set. The method specifically comprises the following steps:
step 1.1: the liver data set comes from the LiTS (ISBI 2017 Liver Tumor Segmentation Challenge) data set and contains CT scans of 130 patients. The data are stored in .nii format, which can be read with MRIcron software. During reading, the window values of the original liver images are set to (0, 200), and the window values of the liver and tumor labels to (0, 1) and (0, 2); finally, the original data sets are produced according to these window values;
step 1.2: make a general picture format (.png) data set: crop the abdominal images containing liver information exported by MRIcron, ensure that the pixel values of the original image and the label correspond one-to-one, and save the images in .png format;
step 1.3: dividing the original liver segmentation data set into a test picture set and a training picture set according to a ratio of about 8:2, wherein the training set covers CT slices of about 110 patients, and the test set covers CT slices of about 20 patients;
step 1.4: manufacturing a liver segmentation label, specifically processing an original label through binarization operation to manufacture an accurate liver region segmentation label;
step 1.5: make the final data set by scaling the original images and labels produced in the preceding steps to a unified size, which can be set to 256 × 256;
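The data preparation in steps 1.1-1.5 can be sketched as follows. This is a minimal NumPy illustration using synthetic slices in place of the LiTS .nii files; the helper names `window_clip`, `binarize_label` and `resize_nearest` are ours, not the patent's:

```python
import numpy as np

def window_clip(ct_slice, lo=0.0, hi=200.0):
    """Clip raw CT intensities to the liver window (0, 200) described in step 1.1."""
    return np.clip(ct_slice, lo, hi)

def binarize_label(label):
    """Step 1.4: collapse the original label (0 background, 1 liver, 2 tumor)
    into a binary liver mask; tumor pixels lie inside the liver."""
    return (label > 0).astype(np.uint8)

def resize_nearest(img, size=(256, 256)):
    """Step 1.5: unify every slice to a fixed 256 x 256 size (nearest neighbour)."""
    h, w = img.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]

# Example: one synthetic 512 x 512 slice and its label
ct = np.random.uniform(-1000, 1000, (512, 512))
lbl = np.random.randint(0, 3, (512, 512))
x = resize_nearest(window_clip(ct))
y = resize_nearest(binarize_label(lbl))
```

A real pipeline would iterate over the MRIcron-exported .png files and split them roughly 8:2 into training and test sets as in step 1.3.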
step 2: construct the depth attention inverted residual module, which comprises a feature layer attention mechanism and a selectable-expansion-rate depth inverted residual module; an up-sampling or down-sampling structure is attached behind each depth attention inverted residual module. The method specifically comprises the following steps:
step 2.1: design the feature layer attention mechanism, as shown in fig. 3. Specifically, constant terms are first allocated by channel to the input features X (height H, width W, channel number C) from the preceding convolution layer, with N constant terms λ_i set according to the required output channel number N. Each constant term λ_i is point-multiplied with X as a weight, the results are spliced by a Concat(·) operation, and finally a BN (batch normalization) layer is added so that the output feature maps have a uniform distribution; the channel dimension of the final output is N. The mechanism is defined as formula (1):
H_o(X) = BN(Concat(λ_0 ⊙ X, λ_1 ⊙ X, ..., λ_{N-1} ⊙ X))    (1)

wherein H_o(·) represents the attention enhancement mechanism; X represents the input features, which can be written by channel as X = {x_0, x_1, ..., x_{C-1}}; λ_i ⊙ X represents a linear transformation of X by the constant term λ_i; and Concat(·) represents the per-channel splicing operation.
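A minimal numerical sketch of formula (1), under the assumption (suggested by the pseudo-colorization analogy) that the input is a single-channel gray feature map and that the N constant terms are the scalars `lambdas`; the BN step is implemented here without learnable affine parameters:

```python
import numpy as np

def feature_layer_attention(x, lambdas, eps=1e-5):
    """Sketch of formula (1): H_o(X) = BN(Concat(lambda_0*X, ..., lambda_{N-1}*X)).

    x       : (H, W) single-channel gray liver feature map (C = 1 assumed)
    lambdas : N scalar weights (learnable in the patent; fixed constants here)
    Returns : (N, H, W) pseudo-colorized, batch-normalized feature stack.
    """
    stacked = np.stack([lam * x for lam in lambdas])   # Concat along the channel axis
    mean = stacked.mean(axis=(1, 2), keepdims=True)    # per-channel BN statistics
    var = stacked.var(axis=(1, 2), keepdims=True)
    return (stacked - mean) / np.sqrt(var + eps)       # BN without affine terms

gray = np.random.rand(64, 64)
out = feature_layer_attention(gray, lambdas=[0.5, 1.0, 1.5, 2.0])
```

Each of the N output channels is a differently weighted copy of the gray map, which is how the mechanism supplements information flow between feature map levels.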
Step 2.2: designing an optional expansion rate depth inverse residual structure, wherein a characteristic graph channel output after a characteristic layer attention mechanism is assumed to be N-2, and the specific steps are as follows, firstly, performing dimensionality-increasing expansion on the characteristic graph output by the characteristic layer attention mechanism according to the channel by adopting 1 × 1 convolution, and ensuring that useful characteristics of a liver region can be fully extracted after a lightweight structure is adopted under the high-dimensional condition; then, designing an optional expansion rate structure, for example, three switch structures in fig. 3 respectively correspond to three different expansion rates designed by a deep and shallow network layer, and when the three expansion rates are specifically designed, the three expansion rates can be respectively set to be 2, 4 and 6, then the feature diagram channels obtained through the three expansion rates are respectively 4, 8 and 12, and after the feature diagram is subjected to dimension enhancement, a BN layer and a ReLU activation function are respectively added to perform batch standardization and activation operation on the features; then, performing feature extraction on the feature map after the dimension is raised by adopting 3 multiplied by 3 depth separable convolution, and adding a BN layer and an activation layer for normalization and activation; and finally, continuously reducing the dimension of the feature map to be consistent with the input by adopting 1 × 1 convolution, and adding a BN layer to perform feature batch standardization, namely finally outputting the number N of the feature map channels to be 2. The overall architecture of the optional extension rate depth anti-residual structure is shown in fig. 4.
Step 2.3: an up-sampling structure and a down-sampling structure are designed, the down-sampling is carried out by using a parametrically-free maximum pooling mode, and the up-sampling is carried out by using a 3 x 3 transposition convolution. The constructed depth attention anti-residual error modules are shown in fig. 5, wherein the number of the depth anti-residual error modules with selectable expansion rates can be increased or decreased by setting, for example, a plurality of modules can be set, and the modules can be cascaded by jump connection, in this example, the number of the modules is set to be 2, and the modules include two jump connection structures.
step 3: establish the depth attention inverted residual network, comprising a down-sampling network layer and an up-sampling network layer. The specific steps are as follows:
step 3.1: construct the down-sampling network layer, which contains depth attention inverted residual modules with three expansion rates. The expansion rates decrease in a gradient with network depth, being 6, 4 and 2 respectively (see the details in fig. 7), with corresponding feature map channel numbers of 32/64, 128 and 256. That is, deep network layers use a smaller expansion rate and shallow layers a larger one, preventing the deep layer structure from becoming excessively redundant. As shown in fig. 7, the network contains five depth attention inverted residual modules from shallow to deep, and the numbers of selectable-expansion-rate depth inverted residual blocks in the modules are 2, 3 and 3 respectively.
Step 3.2: an up-sampling network layer is constructed, the up-sampling network layer also comprises three expansion rate depth attention anti-residual modules, the expansion rates among the modules are increased in a gradient mode along with the network depth and are respectively 2, 4 and 6, the numbers of the corresponding characteristic map channels are respectively 256, 128 and 64/32, namely, the network layer close to an output layer adopts a larger expansion rate, otherwise, a smaller expansion rate is adopted, and the phenomenon of under-fitting caused by too little available characteristic information of a deep network is prevented. The whole network layer also comprises five depth attention anti-residual modules, and the number of the depth anti-residual modules in each module is respectively set to be 3, 2 and 1. The whole of the down-sampling network layer and the up-sampling network layer tend to be symmetrically designed to form a U-shaped network structure, as shown in fig. 6 a.
Step 3.3: adding a transverse connection structure (a bit adding operation in figure 6 a) into an up-down sampling layer structure, adding output feature maps obtained by an up-down sampling layer according to bits, fusing feature information extracted by a deep-shallow layer network, avoiding loss of detail information in the shallow layer network, keeping a fused channel consistent with output channels of the up-down sampling layer and the down-sampling layer, wherein the output channel of the down-sampling layer is 128, the output of the up-sampling layer is also 128, and the output channel of the feature map obtained by bit fusion of the up-down sampling layer and the up-sampling layer is still 128. The overall framework of the built depth attention anti-residual error network is shown in fig. 6a, fig. 6b shows the specific application position of the selectable expansion rate depth anti-residual error module in the overall network, and other specific detail information of the network structure is shown in the table of fig. 7.
step 4: train the network with the data set to obtain the training model, specifically comprising the following steps:
step 4.1: and adjusting various hyper-parameters of the training process, such as learning rate, training round number, attenuation factor, network optimizer and the like.
Step 4.2: and sending the divided training data sets into a network, randomly extracting a certain proportion of data sets to be used as a verification set, and starting to train the model.
Step 4.3: and in the training process, recording the performance of each round model on the verification set, and screening and storing the optimal model.
step 5: verify model availability with the test set, specifically comprising the following steps:
step 5.1: select three indexes, mean intersection-over-union (mIoU), Dice index and Jaccard index, as the evaluation indexes on the test set, and compare and analyze them against existing classical and lightweight networks.
Step 5.2: and recording the network parameter quantity and the memory occupation condition, and analyzing and comparing with the classical network.
Step 5.3: and performing a segmentation test on the test set to verify the actual segmentation effect of the model.
Results and analysis of the experiments
1. Segmentation evaluation index, experimental result and analysis
(1) Index for evaluation of segmentation
The performance evaluation adopts the Dice index, the mean intersection-over-union (mIoU) index and the Jaccard index to evaluate the segmentation algorithm, with the Dice index as the primary performance index. The Dice index is a statistical method for comparing the similarity of two sample sets: the higher its value, the closer the algorithm's segmentation result is to the true segmentation and the better the segmentation performance. The corresponding calculation formula is shown in formula (2):
Dice(A, B) = 2|A ∩ B| / (|A| + |B|)    (2)
wherein, A and B represent the real result of the segmentation and the segmentation result of the algorithm in this document, respectively.
The Jaccard index (Jaccard) is similar to the Dice index, and is also used for comparing the similarity degree of the two sets, and the higher the Jaccard index is, the better the performance of the segmentation algorithm is represented, and the formula is as follows:
Jaccard(A, B) = |A ∩ B| / |A ∪ B|
wherein A and B represent the true segmentation and the algorithm's segmentation result respectively. Let p_{i,j} represent the predicted value of the pixel at position (i, j) of the liver segmentation network's final output image and t_{i,j} the true value; then the Dice index and the Jaccard index can also be expressed as follows:
Dice = 2 Σ_{i,j} p_{i,j} t_{i,j} / (Σ_{i,j} p_{i,j} + Σ_{i,j} t_{i,j})

Jaccard = Σ_{i,j} p_{i,j} t_{i,j} / (Σ_{i,j} p_{i,j} + Σ_{i,j} t_{i,j} - Σ_{i,j} p_{i,j} t_{i,j})
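The pixel-wise forms of the Dice and Jaccard indexes translate directly into array code; a NumPy sketch with a small worked example:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Pixel-wise Dice: 2*sum(p*t) / (sum(p) + sum(t))."""
    inter = (pred * target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)

def jaccard(pred, target, eps=1e-7):
    """Pixel-wise Jaccard: sum(p*t) / (sum(p) + sum(t) - sum(p*t))."""
    inter = (pred * target).sum()
    return inter / (pred.sum() + target.sum() - inter + eps)

p = np.array([[1, 1, 0], [0, 1, 0]])
t = np.array([[1, 0, 0], [0, 1, 1]])
# intersection = 2, |p| = 3, |t| = 3 -> Dice = 4/6, Jaccard = 2/4
```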
the average cross-over ratio, i.e., the mlou index, is described below. Firstly, it is necessary to know that semantic segmentation is pixel-level classification, and on the basis, there are determination indexes of sensitivity and specificity. In machine learning, the following 4 indicators are commonly used to evaluate the performance of the classification algorithm:
(1) True Positives (TP), the number of positive samples the model judges as positive, i.e. correctly judged positive samples.
(2) False Positives (FP), the number of negative samples the model judges as positive, i.e. misjudged negative samples.
(3) True Negatives (TN), the number of negative samples the model judges as negative, i.e. correctly judged negative samples.
(4) False Negatives (FN), the number of positive samples the model judges as negative, i.e. misjudged positive samples.
The actual definition of the mean intersection-over-union (mIoU) index is the ratio of the intersection to the union of the true value set and the predicted value set. This has the same practical meaning as the Jaccard index, though the implementation details differ. mIoU can be regarded as TP (the intersection) over the sum of TP, FP and FN (the union), namely:
mIoU = TP / (TP + FP + FN)
if the number of the segmentation classes is k, nijRepresenting the true value as the number of times i is predicted to be j, the average cross-over ratio index can also be expressed as follows:
mIoU = (1/k) Σ_i [ n_{ii} / (Σ_j n_{ij} + Σ_j n_{ji} - n_{ii}) ],  i = 0, ..., k-1
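The mean intersection-over-union can be computed from the confusion counts n_{ij}; a NumPy sketch with k = 2 (liver vs. background):

```python
import numpy as np

def miou(pred, target, k=2):
    """Mean IoU over k classes: for each class i,
    IoU_i = n_ii / (sum_j n_ij + sum_j n_ji - n_ii),
    where n_ij counts pixels of true class i predicted as class j."""
    n = np.zeros((k, k), dtype=np.int64)
    for t, p in zip(target.ravel(), pred.ravel()):
        n[t, p] += 1                                   # accumulate confusion counts
    ious = [n[i, i] / (n[i, :].sum() + n[:, i].sum() - n[i, i]) for i in range(k)]
    return float(np.mean(ious))

pred = np.array([0, 0, 1, 1])
true = np.array([0, 1, 1, 1])
# class 0: IoU = 1/2; class 1: IoU = 2/3 -> mIoU = 7/12
```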
(2) results of segmentation experiment
The experiment was performed on a test data set containing abdominal slices of approximately 20 patients, with the mean intersection-over-union (mIoU) index, the Dice index and the Jaccard index as evaluation indexes. Over multiple tests, the segmentation model reached 96.71%, 93.76% and 96.22% on the three indexes respectively, outperforming other classical segmentation network models, while its parameter count is only 5.8M, just 50% of that of the classical segmentation network FC-DenseNet.
(3) Ablation experimental result analysis
Table 1 shows the effect of the characteristic attention mechanism on the network performance. The experimental result shows that after the characteristic attention mechanism is added, the overall performance of the network is more excellent, and the maximum performance is approximately improved by 1 percentage point.
TABLE 1 Effect of characteristic attention mechanism on the results of the experiment
TABLE 2 impact on network Performance Using convolutional downsampling and maximum pooling strategies
Table 2 and table 3 show the impact of downsampling using the maximum pooling strategy and downsampling using the popular convolution downsampling strategy on network performance and parameter amount memory, respectively. The experimental result shows that the maximum pooling strategy is adopted, compared with convolution downsampling, the gain of the network is higher, and the parameter quantity and the memory occupation are smaller.
TABLE 3 impact on network parameters and memory footprint using convolutional downsampling and max pooling strategies
In addition, a gradient-varying expansion rate is specially designed to suit feature extraction layers of different depths, striking a beneficial balance between network performance and parameter count; the influence of the specific settings on network performance and parameter count is shown in fig. 8 and fig. 9.
(4) Contrastive analysis with other deep learning algorithms
Tables 4 and 5 show experimental comparisons of this algorithm against classical segmentation algorithms and against current mainstream lightweight algorithms, respectively. From the overall experimental data, this algorithm achieves the highest mIoU, Dice and Jaccard indexes among all the compared liver segmentation algorithms, showing excellent segmentation performance.
Table 4 compares the results with the classical segmentation algorithm
TABLE 5 comparison of experimental results with lightweight modules
In addition, the trade-off between performance and required parameter count is compared across algorithms, as shown in figs. 10a-10c. As the parameter counts and performance indexes there show, the segmentation model of the present disclosure maintains higher performance while requiring fewer parameters than most algorithms, demonstrating that its practical value exceeds that of the other models.

Claims (6)

1. A method for automatic liver segmentation in CT images based on a deep convolutional neural network, characterized by comprising the following steps:
Step 1: establishing a training sample set: reading the raw files of a liver CT dataset, producing a liver segmentation dataset, resizing it to a uniform fixed size, and dividing it into a test set and a training set according to a fixed ratio;
Step 2: constructing a depth attention inverted residual module, comprising a feature-layer attention mechanism that models the hierarchical relations among attention feature maps, and an inverted residual module with selectable expansion rate that controls the scale of the network parameters; down-sampling is performed in a parameter-free manner to further reduce the number of network parameters;
Step 3: establishing a depth attention inverted residual network, wherein a down-sampling network layer structure and an up-sampling network layer structure are constructed, and lateral connection and fusion of up-sampling layer information with down-sampling layer information is designed;
Step 4: training the depth attention inverted residual network with the training set, and screening and saving the trained models that perform well;
Step 5: verifying model availability on the test set.
2. The method for automatic liver segmentation in CT images based on a deep convolutional neural network according to claim 1, characterized in that step 1, establishing the training sample set, specifically comprises the following steps:
Step 1.1: the liver dataset is taken from a subset of the LiTS challenge and contains the CT slices of patients 50-200; the data are stored in .nii format, which can be read with the MRIcron software; during reading, the window values of the original liver images are set to (0, 200) and the window values of the liver and tumor labels are set to (0, 1) and (0, 2) respectively, and the raw datasets are then produced according to these window values;
Step 1.2: producing a dataset in a general picture format (.png): the abdominal images containing liver information captured with the MRIcron software are cropped, a one-to-one correspondence between the pixel values of the original images and the labels is ensured, and the images are saved in the general picture format (.png);
Step 1.3: dividing the original liver segmentation dataset into a test picture set and a training picture set at a ratio of approximately 8:2, the training set covering the CT slices of patients 50-200 and the test set covering the CT slices of patients 5-40;
Step 1.4: producing the liver segmentation labels, specifically processing the original labels with a binarization operation to obtain accurate liver-region segmentation labels;
Step 1.5: producing the final dataset, specifically scaling the original images and the labels produced in the preceding steps to a uniform size, which may be 256 × 256.
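The preprocessing pipeline of steps 1.1-1.5 can be sketched as follows. This is a minimal NumPy illustration only, assuming the (0, 200) liver window, binary label merging, and 256 × 256 target size stated in the claim; the function names are illustrative, and the nearest-neighbour resize is a stand-in for whatever scaling the MRIcron workflow or an image library would perform.

```python
import numpy as np

def window_ct(slice_hu, lo=0.0, hi=200.0):
    """Clip a CT slice to the (0, 200) liver window and scale to [0, 1]."""
    windowed = np.clip(slice_hu, lo, hi)
    return (windowed - lo) / (hi - lo)

def binarize_label(label):
    """Step 1.4: merge liver (1) and tumor (2) pixels into one binary liver mask."""
    return (label > 0).astype(np.uint8)

def resize_nearest(img, size=256):
    """Nearest-neighbour resize to size x size (stand-in for a library resize)."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows[:, None], cols[None, :]]

def split_dataset(indices, test_ratio=0.2, seed=0):
    """Step 1.3: shuffle and split sample indices roughly 8:2 into train/test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(indices)
    n_test = int(round(len(idx) * test_ratio))
    return idx[n_test:], idx[:n_test]  # train, test

# Tiny demonstration on a synthetic 2x2 "slice"
slice_hu = np.array([[-100.0, 50.0], [150.0, 400.0]])
label = np.array([[0, 1], [2, 0]])
print(window_ct(slice_hu))      # values clipped and scaled into [0, 1]
print(binarize_label(label))    # binary liver mask
train, test = split_dataset(np.arange(100))
print(len(train), len(test))    # roughly 8:2 split
```

The windowing step also serves as intensity normalization, so the network sees inputs on a fixed [0, 1] scale regardless of the original Hounsfield range.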
3. The method for automatic liver segmentation in CT images based on a deep convolutional neural network according to claim 1, characterized in that step 2, constructing the depth attention inverted residual module, which comprises the feature-layer attention mechanism and the depth inverted residual module with selectable expansion rate, specifically comprises the following steps:
Step 2.1: designing the feature-layer attention mechanism: first, constant terms are assigned per channel to the input feature X from the preceding convolutional layer, N constant terms being set according to the required number of output channels N; each constant term α_i is weighted dot-multiplied with X, the results are spliced by a Concat(·) operation, and finally a BN layer is added so that the feature distribution of the output feature map is uniform, the finally output channel dimension being N; this is defined by formula (1):

H_o(X) = BN(Concat(α_0 · X, α_1 · X, …, α_{N-1} · X))    (1)

where H_o(·) denotes the attention enhancement mechanism, X denotes the input features, which can be written per channel as X = {x_0, x_1, …, x_{C-1}} with C the number of channels, α_i denotes a linear transformation, and Concat(·) denotes the per-channel splicing operation;
Step 2.2: designing the depth inverted residual structure with selectable expansion rate; assume the number of feature map channels output by the feature-layer attention mechanism is N = 2; the specific steps are as follows: first, a 1 × 1 convolution expands the dimensionality of the feature map output by the feature-layer attention mechanism along the channel axis, ensuring that useful features of the liver region can still be fully extracted in the high-dimensional space after the lightweight structure is adopted; a selectable expansion-rate structure is designed, in which three switch structures correspond to three different expansion rates designed for the deep and shallow network layers; in a concrete design the three expansion rates may be set to 2, 4 and 6, giving 4, 8 and 12 feature map channels respectively; after the dimensionality expansion, a BN layer and a ReLU activation function are added for feature normalization and activation; next, a 3 × 3 depthwise separable convolution performs feature extraction on the expanded feature map, again followed by a BN layer and a ReLU activation layer for batch normalization and activation; finally, a 1 × 1 convolution reduces the feature map dimensionality back to that of the input, followed by a BN layer for batch normalization of the features, so the number of finally output feature map channels is N = 2; this constitutes the overall framework of the depth inverted residual structure with selectable expansion rate;
Step 2.3: designing the up-sampling and down-sampling structures: down-sampling uses parameter-free max pooling, and up-sampling uses a 3 × 3 transposed convolution; the number of depth attention inverted residual modules with selectable expansion rate can be increased or decreased by configuration, several modules may be arranged, and the modules can be cascaded through skip connections.
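The feature-layer attention mechanism of step 2.1 (formula (1)) can be illustrated numerically. In the sketch below, which is an assumption-laden NumPy re-implementation rather than the patent's actual code, each of the N constant terms α_i is treated as a weight vector over the C input channels, weighted dot-multiplied with X, the results concatenated, and a simplified BN (per-channel standardization, no learnable scale/shift) applied:

```python
import numpy as np

def feature_layer_attention(X, alphas, eps=1e-5):
    """H_o(X) = BN(Concat(alpha_0 . X, ..., alpha_{N-1} . X)).

    X      : input features, shape (C, H, W)
    alphas : N constant terms, shape (N, C); each row is weighted
             dot-multiplied with X across the channel axis.
    Returns: output feature map, shape (N, H, W).
    """
    # Weighted dot product per output channel, then concatenation:
    # out[i] = sum_c alphas[i, c] * X[c]
    out = np.einsum('nc,chw->nhw', alphas, X)
    # Simplified BN layer: standardize each output channel so the
    # feature distribution of the output feature map is uniform.
    mean = out.mean(axis=(1, 2), keepdims=True)
    var = out.var(axis=(1, 2), keepdims=True)
    return (out - mean) / np.sqrt(var + eps)

C, N, H, W = 3, 2, 4, 4
X = np.random.default_rng(0).normal(size=(C, H, W))
Y = feature_layer_attention(X, np.ones((N, C)))
print(Y.shape)  # N output channels, spatial size preserved
```

Viewed this way, the mechanism is essentially a learnable per-channel reweighting (a 1 × 1 linear map across channels) followed by normalization, which is why it adds very few parameters.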
4. The method for automatic liver segmentation in CT images based on a deep convolutional neural network according to claim 1, characterized in that step 3, establishing the depth attention inverted residual network, which comprises a down-sampling network layer and an up-sampling network layer, specifically comprises the following steps:
Step 3.1: constructing the down-sampling network layer, which contains depth attention inverted residual modules with three expansion rates; the expansion rates decrease in a gradient with network depth, being 6, 4 and 2, and correspond to 32/64, 128 and 256 feature map channels respectively; that is, the deep network uses a smaller expansion rate and the shallow network uses a larger one, preventing the deep network layer structure from becoming excessively redundant; from shallow to deep the whole network contains five depth attention inverted residual modules, and the numbers of depth inverted residual modules with selectable expansion rate within each module are 2, 3 and 3 respectively;
Step 3.2: constructing the up-sampling network layer, which likewise contains depth attention inverted residual modules with three expansion rates; the expansion rates increase in a gradient with network depth, being 2, 4 and 6, with 256, 128 and 64/32 feature map channels respectively; that is, the network layers near the output layer use a larger expansion rate and the others a smaller one, preventing under-fitting caused by too little usable feature information in the deep network; the whole network layer also contains five depth attention inverted residual modules, the numbers of depth inverted residual modules within each module being set to 3, 2 and 1 respectively; the down-sampling and up-sampling network layers are designed to be approximately symmetric as a whole, forming a U-shaped network structure;
Step 3.3: adding a lateral connection structure to the up/down-sampling layer structure: the output feature maps obtained by the up-sampling and down-sampling layers are added element-wise, fusing the feature information extracted by the deep and shallow networks and avoiding the loss of detail information in the shallow network; the fused channel count remains consistent with the output channels of the up-sampling and down-sampling layers, e.g. with 128 output channels from the down-sampling layer and 128 from the up-sampling layer, the feature map after element-wise fusion still has 128 channels; at this point the construction of the overall framework of the depth attention inverted residual network is complete.
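The sampling and lateral-connection bookkeeping of steps 3.1-3.3 can be sanity-checked numerically. The NumPy sketch below is an illustration under stated assumptions, not the patent's network: parameter-free 2 × 2 max pooling stands for the down-sampling, nearest-neighbour upsampling stands in for the 3 × 3 transposed convolution, and element-wise addition implements the lateral fusion, which leaves the channel count unchanged (128 in, 128 out, as in the claim's example).

```python
import numpy as np

def max_pool_2x2(x):
    """Parameter-free down-sampling: 2x2 max pooling on a (C, H, W) tensor."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample_2x(x):
    """Nearest-neighbour 2x upsampling (stand-in for a 3x3 transposed conv)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def lateral_fuse(down_feat, up_feat):
    """Step 3.3: element-wise addition of matching down/up feature maps."""
    assert down_feat.shape == up_feat.shape  # channels and size must match
    return down_feat + up_feat

skip = np.random.default_rng(1).normal(size=(128, 32, 32))  # down-path output
deep = max_pool_2x2(skip)        # halved spatial size, deeper stage
up = upsample_2x(deep)           # restored spatial size, up-path output
fused = lateral_fuse(skip, up)   # channel count stays 128
print(deep.shape, up.shape, fused.shape)
```

Because the fusion is additive rather than a channel-wise concatenation, no extra convolution is needed afterwards to restore the channel count, which is consistent with the lightweight design goal.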
5. The method for automatic liver segmentation in CT images based on a deep convolutional neural network according to claim 1, characterized in that step 4, training the network with the dataset to obtain a trained model, specifically comprises the following steps:
Step 4.1: adjusting the hyper-parameters of the training process: learning rate, number of training epochs, decay factor and network optimizer;
Step 4.2: feeding the divided training dataset into the network, randomly extracting a certain proportion of it as a validation set, and starting to train the model;
Step 4.3: during training, recording the performance of each epoch's model on the validation set, and screening and saving the best model.
6. The method for automatic liver segmentation in CT images based on a deep convolutional neural network according to claim 1, characterized in that step 5, verifying model availability with the test set, specifically comprises the following steps:
Step 5.1: selecting the mean intersection-over-union (mIoU), the Dice index and the Jaccard index as evaluation indices on the test set, and performing comparative analysis against existing classical and lightweight networks;
Step 5.2: recording the network parameter count and memory occupation, and analyzing and comparing them with classical networks;
Step 5.3: performing a segmentation test on the test set to verify the actual segmentation effect of the model.
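The three evaluation indices of step 5.1 can be computed directly from a predicted and a ground-truth binary mask. A minimal NumPy sketch follows; for the binary liver/background case assumed here, the Jaccard index coincides with the foreground IoU, and mIoU averages the IoU of the foreground and background classes.

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice index: 2|A ∩ B| / (|A| + |B|) on binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

def jaccard(pred, gt, eps=1e-7):
    """Jaccard index (IoU): |A ∩ B| / |A ∪ B|."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / (union + eps)

def miou(pred, gt):
    """Mean IoU over both classes (liver foreground and background)."""
    return 0.5 * (jaccard(pred, gt) + jaccard(1 - pred, 1 - gt))

pred = np.array([[1, 1], [0, 0]])
gt   = np.array([[1, 0], [0, 0]])
print(dice(pred, gt), jaccard(pred, gt), miou(pred, gt))
```

The small `eps` term keeps the indices defined when both masks are empty, which happens on CT slices that contain no liver at all.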
CN202110641080.9A 2021-06-09 2021-06-09 CT image liver automatic segmentation method based on deep convolutional neural network Pending CN113470044A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110641080.9A CN113470044A (en) 2021-06-09 2021-06-09 CT image liver automatic segmentation method based on deep convolutional neural network


Publications (1)

Publication Number Publication Date
CN113470044A true CN113470044A (en) 2021-10-01

Family

ID=77869420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110641080.9A Pending CN113470044A (en) 2021-06-09 2021-06-09 CT image liver automatic segmentation method based on deep convolutional neural network

Country Status (1)

Country Link
CN (1) CN113470044A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109872306A (en) * 2019-01-28 2019-06-11 腾讯科技(深圳)有限公司 Medical image cutting method, device and storage medium
CN110189334A (en) * 2019-05-28 2019-08-30 南京邮电大学 The medical image cutting method of the full convolutional neural networks of residual error type based on attention mechanism
CN110675406A (en) * 2019-09-16 2020-01-10 南京信息工程大学 CT image kidney segmentation algorithm based on residual double-attention depth network
CN110889853A (en) * 2018-09-07 2020-03-17 天津大学 Tumor segmentation method based on residual error-attention deep neural network
CN110889852A (en) * 2018-09-07 2020-03-17 天津大学 Liver segmentation method based on residual error-attention deep neural network
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image super-resolution reconstruction method based on fusion attention mechanism residual error network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Hou Jinquan: "Research on CT Image Liver Tumor Segmentation Methods Based on Fully Convolutional Neural Networks", China Master's Theses Full-text Database *
Li Wen: "Research on CT Image Liver Tumor Segmentation Methods Based on Deep Convolutional Neural Networks", China Master's Theses Full-text Database *

Similar Documents

Publication Publication Date Title
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
WO2021203795A1 (en) Pancreas ct automatic segmentation method based on saliency dense connection expansion convolutional network
CN112446891B (en) Medical image segmentation method based on U-Net network brain glioma
CN113674253A (en) Rectal cancer CT image automatic segmentation method based on U-transducer
CN111598875A (en) Method, system and device for building thyroid nodule automatic detection model
CN112862830A (en) Multi-modal image segmentation method, system, terminal and readable storage medium
CN114693933A (en) Medical image segmentation device based on generation of confrontation network and multi-scale feature fusion
CN112884788B (en) Cup optic disk segmentation method and imaging method based on rich context network
CN111899259A (en) Prostate cancer tissue microarray classification method based on convolutional neural network
CN114998265A (en) Liver tumor segmentation method based on improved U-Net
CN112381846A (en) Ultrasonic thyroid nodule segmentation method based on asymmetric network
CN114926477A (en) Brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning
CN115546605A (en) Training method and device based on image labeling and segmentation model
CN112750137A (en) Liver tumor segmentation method and system based on deep learning
CN114119515A (en) Brain tumor detection method based on attention mechanism and MRI multi-mode fusion
CN115063435A (en) Multi-scale inter-class based tumor and peripheral organ segmentation method
CN114119516A (en) Virus focus segmentation method based on transfer learning and cascade adaptive hole convolution
CN113129310B (en) Medical image segmentation system based on attention routing
CN112861881A (en) Honeycomb lung recognition method based on improved MobileNet model
CN116486156A (en) Full-view digital slice image classification method integrating multi-scale feature context
CN115409812A (en) CT image automatic classification method based on fusion time attention mechanism
CN113470044A (en) CT image liver automatic segmentation method based on deep convolutional neural network
CN115131628A (en) Mammary gland image classification method and equipment based on typing auxiliary information
CN110992309A (en) Fundus image segmentation method based on deep information transfer network
CN115239688B (en) Brain metastasis recognition method and system based on magnetic resonance contrast enhancement 3D-T1WI image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination