CN114119635B

CN114119635B - Fatty liver CT image segmentation method based on cavity convolution

Info

Publication number: CN114119635B
Application number: CN202111391577.6A
Authority: CN
Inventors: 邹倩颖; 徐泓扉; 徐中华; 黄林森; 倘一鸣; 马瑞
Original assignee: Chengdu College of University of Electronic Science and Technology of China
Current assignee: Chengdu College of University of Electronic Science and Technology of China
Priority date: 2021-11-23
Filing date: 2021-11-23
Publication date: 2024-05-24
Anticipated expiration: 2041-11-23
Also published as: CN114119635A

Abstract

The invention discloses a fatty liver CT image segmentation method based on hole convolution, which comprises the steps of hole convolution, multi-language fusion, pooling and other operations of hole convolution, fully integrates liver multi-scale characteristics, adds initial attributes of different decoding layers in the decoding process, and performs corresponding image pooling treatment, thereby improving the accuracy of image segmentation.

Description

Fatty liver CT image segmentation method based on cavity convolution

Technical Field

The invention relates to a fatty liver CT image segmentation method based on cavity convolution.

Background

The liver is a high-incidence area of human disease. Automatic liver segmentation may help doctors diagnose and treat patients early. In the field of medical imaging, computed Tomography (CT) is one of the most widely used medical examinations.

In recent years, with the rapid development of deep learning, and in particular, the application of revolutionary neural networks (CNNs) in computer vision tasks, such as image segmentation Long et al, have proposed a complete convolutional network (FCN) based on CNNs that can accept any size of image input. The recovered image is very rough. Through segmentation and fusion of single-channel pixels, thicker feature layers can be obtained, and additional semantic information is provided for subsequent segmentation.

For two-dimensional segmentation of liver, zhang et al propose three-dimensional inverse segmentation of a conditional network, and adopt a three-dimensional automatic segmentation framework from coarse to fine, introduce a spatial pyramid pool and a fertility mechanics learning technology into the u-net, and propose a liver CT automatic segmentation method based on sar-u-net. The 3du network can effectively utilize the segmentation features between adjacent liver segments to obtain better segmentation results.

Considering that 3D u-net utilizes information between disks in CT images, the 3D CNN of Lu et al simultaneously detects and segments the liver, lei et al designed a three-dimensional narrow back block and middle pool block suitable for LV network. By decoupling the transverse correction and the spatial correlation of the liver CT image and combining the extended convolution hybrid technology, the interlayer characteristics of the liver CT image are extracted, and the liver can be effectively segmented through the three-dimensional folding and the interlayer disk extracted by the background extraction.

Due to the characteristics of CT and u-net images, liver segmentation has the following problems:

(1) The low and high attributes are equally important for liver segmentation; the bottom-up function in u-net ignores the importance of the bottom-up feature, resulting in low network segmentation performance.

(2) The shape and size of the liver are variable, and its gray scale value is similar to that of the adjacent organs, so that information is easily lost.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide the novel solar cell.

The aim of the invention is realized by the following technical scheme:

the fatty liver CT image segmentation method based on the hole convolution comprises the step of training a hole convolution neural network, wherein the step of training the hole convolution neural network comprises the following steps of:

s1, building a cavity convolutional neural network, wherein the cavity convolutional neural network comprises a plurality of convolutional layers, a plurality of cavity convolutional layers and a plurality of deconvolution layers which are sequentially connected;

S2, initializing a weight value of the cavity convolutional neural network;

S3, inputting the fatty liver CT image in the pre-established image segmentation training camp into the cavity convolutional neural network initialized in the step S2, and obtaining the weight of the advanced semantic information through a Sigmoid function; then weighting the high-level features by using the detail information in the adjacent low-level features, and weighting the low-level features by using the semantic information in the adjacent high-level features; finally, the addition is performed to enhance the feature transfer, let F _l∈R^H×W×C denote F _l that the encoder respectively represents the height, width and channel number of the feature in the H, W, C th, and the low-level feature channel attention vector V ^c and the spatial attention vector V ^s obtained by using this module are:

S4, inputting the fatty liver CT image in the pre-established picture test set into the cavity convolutional neural network obtained through training in the step S3, and outputting a corresponding picture depth map and a picture surface normal vector map; the image segmentation training camp is at least partially different from the fatty liver CT images in the picture test set; the image surface normal vector diagram is obtained by obtaining point cloud data of pixels through the depth diagram and adopting least square plane fitting;

S5, judging whether the prediction precision of the hole convolutional neural network, which is obtained through training in the step S3, on the picture depth and the picture surface normal vector accords with preset requirements according to the picture depth map and the picture surface normal vector map which are output in the step S4: if yes, finishing training; if not, returning to the step S3 to continue training until the training is performed to obtain the cavity convolutional neural network meeting the preset requirement.

Further, the step S3 further includes a fusion method of the neural network and the multi-scale semantic feature module: the high-level feature channel attention vector Z ^c and the feature space attention vector Z ^s obtained by using the multi-scale semantic feature module are:

Where l ε [1,4], fg (·) represents global average pooling, conv (·) represents convolution operations, Φ (·) represents Singmoid activation operations, vector is computed And/>Element multiplication is carried out on the low-level output characteristics and F _l respectively, so that weighted low-level output characteristics L _l are obtained; vector/>And/>Element multiplication is carried out on the weighted high-level output characteristics H _l,L_l,L_l∈R^H×W×C by F _l respectively; when l=1, vector/>And/>The output result obtained by the input multi-scale semantic feature module can be expressed as:

Wherein Concat denotes a feature channel fusion operation.

Further, the step S4 further includes a method for encoding-decoding the hole convolution image, where the hole convolution image encodes the lost data in the hole and stores the encoded data together with the convolution kernel, and when the stored data needs to be output, the data is expanded by decoding.

Further, the step S5 further includes a pooling operation for the pictures, which specifically includes the following steps:

the method comprises the steps of respectively carrying out 1×1 convolution on features obtained through parallel cavity convolution to strengthen the extracted multi-scale features, fusing the multi-scale features with a feature map obtained after average pooling, carrying out 1×1 convolution on fused output to obtain output F' with fixed size, wherein the specific steps are as follows:

F″＝σ(f^1×1[f^1×1(P_A(F′));f^1×1(f^3×3,d＝6(F′));f^1×1(f^3×3,d＝12(F′))；

f^1×1(f^3×3,d＝18(F′));P_A(F′)]

Wherein PA represents average pooling and d represents expansion rate;

let the feature map X of the afferent parameter pooling layer be W in width, H in height, and C in channel number, i.e. W H C matrix, then its output G can be according to the formula

X _c is the C-th channel of the input feature map, and G _c represents the input corresponding to the C-th channel; w is a weight parameter matrix, whose shape is p×q, W _c represents the C-th channel of the parameter pooling layer weight, p, q is the length and width of the pooling core, β is a preactivation function, whose function is to convert the correlation operation into an interpretable pooling operation, any one value of the weight parameter is mapped to the real interval (0, 1) using the sigmoid function as the preactivation function, the parameter pooling layer assigns a parameter W _c to each channel X _c of the feature map, each parameter value being specified between 0 and 1;

After the weight parameters are added, the pooling calculation method is based on The method comprises the following steps:

Wherein, the subscript C represents the C-th channel of the calculation feature map, beta is the preactivation function, w is the weight parameter and the weight parameter needs to be updated, the gradient of the layer parameter to the loss function C needs to be calculated, the first layer is a parameter pooling layer, and the gradient to C is that

According toObtaining

Optimizing a parametric formula for the layer input using a falling gradient is/>

The beneficial effects of the invention are as follows: the multi-scale characteristics of the liver are fully integrated, the initial attributes of different decoding layers and corresponding image pooling processing are added in the decoding process, and the accuracy of image segmentation is improved.

Drawings

FIG. 1 is a schematic diagram of a flow architecture of the present invention;

FIG. 2 is a schematic diagram of a three-dimensional 3D visualization model of the hole convolution of the present invention;

FIG. 3 is a schematic diagram of a residual convolution block of the present invention;

Fig. 4 is a schematic diagram of a hole residual convolution block.

Detailed Description

The technical solutions of the present invention will be clearly and completely described below with reference to the embodiments, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by a person skilled in the art without any inventive effort, are intended to be within the scope of the present invention, based on the embodiments of the present invention.

Referring to fig. 1-4, the present invention provides a technical solution: the fatty liver CT image segmentation method based on the cavity convolution comprises the step of training a cavity convolution neural network, wherein the step of training the cavity convolution neural network comprises the following steps of:

s1, constructing a cavity convolutional neural network, wherein the cavity convolutional neural network comprises a plurality of convolutional layers, a plurality of cavity convolutional layers and a plurality of deconvolution layers which are sequentially connected;

S2, initializing a weight value of the cavity convolutional neural network;

S4, inputting the fatty liver CT image in the pre-established picture test set into the cavity convolutional neural network obtained through training in the step S3, and outputting a corresponding picture depth map and a picture surface normal vector map; the fatty liver CT images in the image segmentation training set and the image test set are at least partially different; the normal vector diagram of the picture surface is obtained by obtaining point cloud data of pixels through a depth diagram and adopting least square plane fitting;

As shown in fig. 2, the present invention proposes a technique based on hole convolution, in which each pixel is enlarged while the image pixel is enlarged, but different liver shapes and sizes depend on the patient, so that the variation of the data amount is very important, especially in the distribution image. Top-down examples are typically used to obtain a better receptive field, but the spatial resolution is greatly reduced, while the spatial resolution is better controlled by the hole convolution method, so that the problem of local distortion is basically solved. Compared with the common convolution, the cavity convolution adds a cavity into the convolution kernel so as to achieve the purpose of expanding the receptive field on the premise of ensuring that the size of an output characteristic diagram is not changed, the receptive field is an indispensable ring in the convolution neural network, the image mapping function after convolution is firmly controlled, a point on the characteristic diagram corresponds to a region on an input diagram, further, the cavity residual convolution used by us consists of two 3d cavity convolutions with the cavity ratios of 1 and 2 and residual errors after Conv1 is 1*1, and the following diagram is adopted, so that the model can fully perceive the obtained liver multi-scale structure under the condition of not increasing the depth and the complexity, namely not increasing the workload, and the foundation is laid for the subsequent multi-scale characteristic fusion, and the steps are shown in fig. 3 and 4.

Further, step S3 further includes a fusion method of the neural network and the multi-scale semantic feature module: the high-level feature channel attention vector Z ^c and the feature space attention vector Z ^s obtained by using the multi-scale semantic feature module are:

where l ε [1,4], fg (·) represents global average pooling, conv (·) represents convolution operations, Φ (·) represents Sigmoid activation operations, and vectors are computed And/>Element multiplication is carried out on the low-level output characteristics and F _l respectively, so that weighted low-level output characteristics L _l are obtained; vector/>And/>Element multiplication is carried out on the weighted high-level output characteristics H _l,L_l,L_l∈R^H×W×C by F _l respectively; when l=1, vector/>And

The output result obtained by the input multi-scale semantic feature module can be expressed as:

Wherein Concat denotes a feature channel fusion operation.

Further, step S4 further includes a method for encoding-decoding the hole convolution image, where the hole convolution image encodes the lost data in the hole and stores the encoded data together with the convolution kernel, and when the stored data needs to be output, the data is expanded by decoding.

Further, step S5 further includes a pooling operation for the pictures, which specifically includes the following steps:

f^1×1(f^3×3,d＝18(F′))P_A(F′)]

Wherein PA represents average pooling and d represents expansion rate;

According toObtaining

The foregoing is merely a preferred embodiment of the invention, and it is to be understood that the invention is not limited to the form disclosed herein and is not to be considered as excluding other embodiments, but is capable of numerous other combinations, modifications and environments and is capable of modifications within the scope of the inventive concept, either by the foregoing teachings or by the teaching of the relevant art. And that modifications and variations which do not depart from the spirit and scope of the invention are intended to be within the scope of the appended claims.

Claims

1. A fatty liver CT image segmentation method based on cavity convolution is characterized by comprising the following steps of: the method comprises the step of training the hole convolutional neural network, wherein the step of training the hole convolutional neural network comprises the following steps of:

S2, initializing a weight value of the cavity convolutional neural network;

S3, inputting the fatty liver CT image in the pre-established image segmentation training camp into the cavity convolutional neural network initialized in the step S2, and obtaining the weight of the advanced semantic information through a Sigmoid function; then weighting the high-level features by using the detail information in the adjacent low-level features, and weighting the low-level features by using the semantic information in the high-level features; finally, adding is performed to enhance the transfer of the features, let F _l∈R^H×W×C, where F _l represents the processed added features, where H, W, C represents the height, width and channel number of the features, respectively, and the low-level feature channel attention vector V ^c and the spatial attention vector V ^s obtained using the processed added features are:

2. The fatty liver CT image segmentation method based on hole convolution as set forth in claim 1, wherein: the step S3 further comprises a fusion method of the neural network and the multi-scale semantic feature module: the high-level feature channel attention vector Z ^c and the feature space attention vector Z ^s obtained by using the multi-scale semantic feature module are:

Where l ε [1,4], f _g (·) represents global average pooling, conv (·) represents convolution operations, Φ (·) represents Sigmoid activation operations, and the vector is computed And/>Element multiplication is carried out on the low-level output characteristics and F _l respectively, so that weighted low-level output characteristics L _l are obtained; vector/>And/>Element multiplication is carried out on the weighted high-level output characteristics H _l,L_l,L_l∈R^H×W×C by F _l respectively; when l=1, vector/>And/>The output result obtained by the input multi-scale semantic feature module can be expressed as:

Wherein Concat denotes a feature channel fusion operation.

3. The fatty liver CT image segmentation method based on hole convolution as set forth in claim 1, wherein: the step S4 also comprises a coding-decoding method of the cavity convolution image, wherein the lost data in the cavity is coded by the cavity convolution image and then stored together with the convolution kernel, and when the stored data needs to be output, the data is unfolded through decoding.

4. The fatty liver CT image segmentation method based on hole convolution as set forth in claim 1, wherein: the step S5 further comprises the pooling operation of the pictures, and the specific steps are as follows:

The method comprises the steps of respectively carrying out 1X 1 convolution on features obtained through parallel cavity convolution to strengthen the extracted multi-scale features, fusing the multi-scale features with a feature map obtained after average pooling, carrying out 1X 1 convolution on fused output to obtain output F' with fixed size, wherein the specific steps are as follows:

f^1×1(f^3×3,d＝18(F′));P_A(F′)]

Wherein P _A represents average pooling and d represents expansion rate;

X _i is the C-th channel of the input feature map, and G _c represents the input corresponding to the C-th channel; w is a weight parameter matrix, whose shape is p×q, W _c represents the C-th channel of the parameter pooling layer weight, p, q is the length and width of the pooling core, β is a preactivation function, whose function is to convert the correlation operation into an interpretable pooling operation, any one value of the weight parameter is mapped to the real interval (0, 1) using the sigmoid function as the preactivation function, the parameter pooling layer assigns a parameter W _c to each channel X _c of the feature map, each parameter value being specified between 0 and 1;

Wherein, the subscript C represents the C-th channel of the calculation feature map, beta is the preactivation function, w is the weight parameter and the weight parameter needs to be updated, the gradient of the parameter pooling layer parameter to the loss function C needs to be calculated, the 1 st layer is the parameter pooling layer, and the gradient to C is that

According toObtaining