CN117078692B - Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion - Google Patents
- Publication number: CN117078692B (application CN202311321509.1A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06T7/10 — Segmentation; Edge detection
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/08 — Learning methods
- G06T7/0012 — Biomedical image inspection
- G06V10/806 — Fusion of extracted features
- G06V10/82 — Image or video recognition using neural networks
- G06T2207/10132 — Ultrasound image
- G06T2207/20004 — Adaptive image processing
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
Abstract
The invention relates to the technical field of image processing and discloses a medical ultrasound image segmentation method and system based on adaptive feature fusion. The method acquires an ultrasound image; performs sample augmentation and label transformation on the ultrasound image to obtain an ultrasound image data set; constructs an adaptive integrated neural network model based on a ResNet-101 network and a plurality of multi-depth convolutions; and trains the adaptive integrated neural network model on the PyTorch framework. An ultrasound image array is input into the ResNet-101 network to obtain shallow and deep features; the deep features are combined through a plurality of convolution branches of different depths to output multi-scale deep features; the multi-scale deep features are fused using a residual skip structure to output fused features; and the fused features are converted into a predicted category result using a softmax function. Compared with prior schemes, the invention improves segmentation efficiency, detail recognition, edge fineness and scene adaptability, achieving finer, faster and more accurate medical ultrasound image segmentation.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a medical ultrasound image segmentation method and system based on adaptive feature fusion.
Background
Medical ultrasound imaging is a practical technique that enables non-invasive visualization of soft tissue through acoustic wave propagation and reflection. Medical ultrasound image segmentation aims to automatically identify and extract objects of interest in ultrasound images, such as body tissues, organs or lesion areas, thereby increasing diagnostic efficiency and accuracy. Accurate segmentation improves the reliability of quantitative analysis of ultrasound images, reduces the workload of doctors, and supports subsequent diagnosis and surgery.
However, raw ultrasound images often contain a large amount of stray noise that hinders the identification of key regions and subsequent quantitative analysis, so the images must be processed before use. Meanwhile, traditional image segmentation methods suffer from loose structure and poor information flow between models, while deep-learning segmentation models suffer from insufficient feature integration, limited expression of multi-scale semantic information and high computational cost, resulting in low segmentation accuracy and low segmentation efficiency.
Disclosure of Invention
In order to improve the image segmentation efficiency, the invention provides a medical ultrasonic image segmentation method and a system based on self-adaptive feature fusion, which are used for segmenting images more finely, more rapidly and more accurately.
In a first aspect, the present invention provides a medical ultrasound image segmentation method based on adaptive feature fusion, which adopts the following technical scheme:
a medical ultrasonic image segmentation method based on self-adaptive feature fusion comprises the following steps:
acquiring an ultrasonic image;
performing sample augmentation and label transformation on the ultrasonic image to obtain an ultrasonic image data set;
constructing an adaptive integrated neural network model based on a ResNet-101 network and a plurality of deep convolutions;
training the adaptive integrated neural network model based on the PyTorch framework;
inputting an ultrasonic image array into the ResNet-101 network to obtain shallow features and deep features; combining the deep features through a plurality of deep convolution processes to output multi-scale deep features; fusing the multi-scale deep features using a residual skip structure to output fused features; and converting the fused features into a predicted category result using a softmax function.
Further, the performing sample augmentation and label transformation on the ultrasonic image to obtain an ultrasonic image data set includes:
sequentially performing translation transformation, multi-scale rotation transformation, scaling, flipping, noise augmentation and image enhancement on the ultrasonic image to obtain an ultrasonic image array.
Further, the constructing an adaptive integrated neural network model based on the ResNet-101 network and a plurality of deep convolutions includes:
the self-adaptive integrated neural network model comprises a basic network block, a multi-depth convolution block, a self-adaptive integrated block and an up-sampling block;
inputting an ultrasonic image array into the basic network block, wherein the basic network block adopts a ResNet-101 network; the ResNet-101 network comprises a plurality of residual blocks, each using 3×3 convolution layers and a skip connection; the ResNet-101 network completes preliminary feature extraction of the image and outputs the results of the plurality of residual blocks, which are the shallow and deep features of the network.
Further, the multi-depth convolution block comprises 4 convolution branches: the first branch consists of a 1×1 convolution layer, and the 2nd-4th branches each consist of a 1×1 convolution layer followed by 2-4 serial 3×3 convolution layers; zero padding is applied on each 3×3 convolution layer to compensate for features missing at the borders;
the multi-depth convolution block receives the deep features output by ResNet-101, extracts multi-scale features in parallel using multi-branch small-size convolutions of different depths, concatenates all the outputs of the four branches, and outputs multi-scale deep features.
Further, the adaptive integrated block performs primary fusion on the multi-scale deep feature fusion based on a 3×3 fusion convolution to obtain primary fusion features;
the method comprises the steps of inputting primary fusion features into a residual jump structure, embedding a channel attention structure into the residual jump structure, mapping and converting the primary fusion features into D-dimensional weight vectors with global average pooling by using global pooling branches by the channel attention structure, wherein D represents the number of channels, multiplying the primary fusion features of each channel by corresponding elements in the weight vectors to obtain a plurality of vector values, performing matrix point addition on the vector multiplied vector values, and outputting the fusion features.
Further, the up-sampling block comprises up-sampling and a 1×1 convolution layer; the fused features are enlarged by up-sampling, the 1×1 convolution layer reduces the number of channels, the enlarged fused features pass through the 1×1 convolution layer to output a feature map, and the feature map is converted into a predicted category result through a softmax function.
Further, the training the adaptive integrated neural network model based on the Pytorch framework includes:
based on the PyTorch framework, measuring the probability-distribution distance between the predicted category result and the real label, training the network model with the Adam algorithm, and supervising training and optimizing the model with a cross-entropy loss function.
In a second aspect, the present invention provides a multi-scale medical ultrasound image segmentation system based on adaptive feature fusion, comprising:
a data acquisition module configured to scan the patient's body by the ultrasound probe module to acquire an ultrasound image;
a data conversion module configured to perform sample augmentation and label transformation on the ultrasonic image to form an ultrasonic image data set;
an adaptive integrated neural network model module configured to provide an adaptive integrated neural network model;
and the result output module is configured to input the real-time ultrasonic image into the trained self-adaptive integrated neural network model, and the self-adaptive integrated neural network model outputs an ultrasonic image segmentation result.
In a third aspect, a computer readable storage medium stores a plurality of instructions adapted to be loaded by a processor of a terminal device to perform the medical ultrasound image segmentation method based on adaptive feature fusion.
In a fourth aspect, a terminal device includes a processor and a computer readable storage medium, the processor configured to implement instructions; the computer readable storage medium stores a plurality of instructions adapted to be loaded by the processor to perform the medical ultrasound image segmentation method based on adaptive feature fusion.
In summary, the invention has the following beneficial technical effects:
the invention builds a self-adaptive integrated neural network model, realizes the integration of shallow and deep features in the neural network for an ultrasonic image, and particularly provides a method for extracting multi-scale features by using a multi-depth convolution block, wherein multi-scale information can be obtained by using simple convolution of multi-branches 1×1 and 3×3 under very small calculation cost, and the invention provides a self-adaptive integrated block which can finish feature fusion in space and channel dimension by simply convoluting and global pooling after splicing deep and shallow features.
At low computational cost, the invention effectively improves the detail recognition, edge fineness and scene adaptability of the segmentation, achieves faster and more accurate medical ultrasound image segmentation than prior schemes, and provides powerful technical support for clinicians to better judge a patient's condition.
Drawings
FIG. 1 is a flow chart of a method and system for medical ultrasound image segmentation based on adaptive feature fusion in accordance with the present invention.
FIG. 2 is a flow chart of an adaptive integrated neural network model in a medical ultrasound image segmentation method and system based on adaptive feature fusion.
FIG. 3 is a block diagram of a multi-depth convolution block in a medical ultrasound image segmentation method and system based on adaptive feature fusion according to the present invention.
Fig. 4 is a diagram of the configuration of an adaptive integration block in a medical ultrasound image segmentation method and system based on adaptive feature fusion according to the present invention.
Detailed Description
The following describes the embodiments of the present invention clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, embodiments of the invention.
The embodiment of the invention discloses a medical ultrasonic image segmentation method and system based on self-adaptive feature fusion.
Example 1
Referring to fig. 1, the medical ultrasound image segmentation method based on adaptive feature fusion according to the present embodiment includes:
scanning the body of a patient through a preset ultrasonic probe module to obtain an ultrasonic image;
performing sample augmentation and label transformation on the ultrasonic image to form an ultrasonic image data set;
constructing an adaptive integrated neural network model, and inputting an ultrasonic image data set into the adaptive integrated neural network model for training;
inputting the real-time ultrasonic image into a trained self-adaptive integrated neural network model, and outputting an ultrasonic image segmentation result by the self-adaptive integrated neural network model.
The method specifically comprises the following steps:
firstly, acquiring ultrasonic images, scanning the bodies of a plurality of patients through ultrasonic probes preset in a hospital, and acquiring a plurality of groups of ultrasonic images for use by a subsequent training model if different patients possibly have different parameters such as age, wearing and the like.
Secondly, to make the model more robust, sample augmentation of the original ultrasound images is required, with corresponding changes applied synchronously to the ground-truth label images. According to the characteristics of medical ultrasound images, and without changing image properties, the sample augmentation methods used in this scheme are as follows:
translation transformation, namely translating the image along the horizontal axis and the vertical axis, and expanding the number of samples.
Multi-scale rotation transformation: randomly rotate the sample within a certain angle range and fill the regions missing after rotation by mirroring; compared with existing methods that use only a few fixed rotation angles, this yields samples at more angles.
Scaling: scale the sample within a certain range to obtain samples at multiple resolutions.
Flipping: flip the sample in the horizontal and vertical directions, reversing the sample distribution.
Noise augmentation: add noise of different types and levels to the image to improve model robustness. The ground-truth label image remains unchanged.
Image enhancement: adjust brightness, contrast and the like. The label image is unchanged.
Synchronized sample augmentation and label transformation significantly expand the scale of the ultrasound image training set, improve the model's adaptability to various conditions, and enhance its generalization, while strictly preserving the semantic consistency between augmented samples and labels.
After a series of sample augmentation operations, a richer ultrasound image dataset is output.
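The synchronized augmentations above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's exact pipeline: the shift amount, noise level and brightness factor are assumed, rotation is omitted, and the geometric operations are applied to the label synchronously while noise and brightness leave it unchanged, as described.

```python
import numpy as np

def augment(image, label, rng):
    """Return a list of (image, label) pairs: the original plus four
    augmented variants (illustrative subset of the patent's methods)."""
    samples = [(image, label)]
    # translation along the horizontal axis, zero-filling the vacated border
    shifted = np.zeros_like(image)
    shifted[:, 5:] = image[:, :-5]
    shifted_lbl = np.zeros_like(label)
    shifted_lbl[:, 5:] = label[:, :-5]
    samples.append((shifted, shifted_lbl))
    # horizontal flip: the label is flipped synchronously
    samples.append((image[:, ::-1], label[:, ::-1]))
    # additive Gaussian noise: the label stays unchanged
    noisy = np.clip(image + rng.normal(0.0, 0.05, image.shape), 0.0, 1.0)
    samples.append((noisy, label))
    # brightness adjustment: the label stays unchanged
    samples.append((np.clip(image * 1.2, 0.0, 1.0), label))
    return samples
```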
Thirdly, an adaptive integrated neural network model is constructed, consisting of a basic network block, a multi-depth convolution block, an adaptive integration block and an up-sampling block. The basic network block is a ResNet-101 network. The multi-depth convolution block receives the features output by ResNet-101 and extracts features with different receptive fields through convolution operations of different depths, completing multi-scale feature extraction. Finally, three adaptive integration blocks complete the up-sampling of the feature maps so that the network outputs feature maps with the same resolution as the label images; the adaptive integration blocks perform weighted fusion of the features from different branches and introduce an attention mechanism in the channel dimension. The overall structure of the model is shown in fig. 2, and the specific structure of each module is as follows:
(1) The ResNet-101 network contains multiple residual blocks, each using 3×3 convolution layers and a skip connection. The skip connection passes the input directly to the output, realizing residual learning and avoiding gradient vanishing or explosion. The ResNet-101 network used here is extensively pre-trained on the ImageNet data set, which increases the effective number of samples and yields rich general image feature representations from a large-scale image data set, providing a good feature-extraction basis for the ultrasound image processing task. Through its skip connections, the ResNet-101 network completes preliminary feature extraction of the image and outputs the results of several residual blocks, namely the shallow and deep features of the network.
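The residual block just described can be sketched in PyTorch, the framework the patent trains with. This is a simplified sketch matching the text (3×3 convolutions plus a skip connection); the class name and the BatchNorm placement are assumptions, and ResNet-101's actual blocks are 1×1-3×3-1×1 bottlenecks.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: two 3x3 conv layers plus a skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # the skip connection passes the input directly to the output
        return torch.relu(self.body(x) + x)
```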
The convolution layer formula is as follows:

$$y_{K}(i,j)=\sum_{c}\sum_{m=1}^{h}\sum_{n=1}^{w}W_{K,c}(m,n)\,x_{c}(i+m,\,j+n)+bias_{K}\tag{1}$$

where K is the output channel index, c the input channel index, and (m, n) the coordinates within a convolution kernel of size h×w (references to 1×1 and 3×3 convolutions in this disclosure refer to the kernel size); bias_K is the offset of the corresponding output channel, and the sum runs over all kernel positions (m, n) and all input channels c.
It should be noted that all convolution layers in the whole adaptive integrated neural network model follow this formula.
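Formula (1) can be checked with a direct, deliberately naive NumPy implementation. This is a sketch of the formula only (valid convolution, no padding or stride), not an efficient kernel.

```python
import numpy as np

def conv2d(x, weight, bias):
    """Direct implementation of formula (1).
    x: (C_in, H, W); weight: (C_out, C_in, h, w); bias: (C_out,)."""
    c_out, c_in, h, w = weight.shape
    H, W = x.shape[1] - h + 1, x.shape[2] - w + 1
    y = np.zeros((c_out, H, W))
    for K in range(c_out):              # output channel index K
        for i in range(H):
            for j in range(W):
                # sum over all kernel positions (m, n) and input channels c
                y[K, i, j] = np.sum(x[:, i:i + h, j:j + w] * weight[K]) + bias[K]
    return y
```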
(2) The multi-depth convolution block receives the deep features output by ResNet-101 as input and uses multi-branch small-size convolutions of different depths to extract multi-scale deep features in parallel: different network depths correspond to different receptive fields and thus map to different scales.
Referring to fig. 3, this module contains 4 parallel branches. The first branch consists only of a 1×1 convolution layer, which reduces the number of channels, lowers model complexity and preserves the original-scale features. Branches 2-4 consist of a 1×1 convolution layer followed by 2-4 serial 3×3 convolution layers: the 3×3 convolutions capture local features, and the multi-layer stacks extract semantic information at different scales. In addition, a zero-padding operation is used on each 3×3 convolution layer, i.e. zeros are padded around the input feature map so that the output feature map keeps the same size as the input. Extracting multiple scales through simple multi-branch convolutions greatly improves computational efficiency and avoids the information loss brought by pooling or dilated convolution. Batch normalization is applied after each convolution to reduce internal covariate shift, and then all the outputs of the four branches are concatenated to output multi-scale deep features. Because the proposed multi-depth convolution block uses only simple multi-branch 1×1 and 3×3 convolutions, it reduces the parameter count and computational cost to the greatest extent compared with existing multi-scale feature extraction methods; moreover, because the multiple scales are acquired without pooling or dilated convolution, more detail features are preserved and the gridding effect is avoided.
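The four-branch layout above can be sketched as a PyTorch module. Branch widths (`branch_ch`) and the exact placement of BatchNorm/ReLU are assumptions; the patent fixes only the 1×1 / serial-3×3 structure, the zero padding, and the concatenation of the four outputs.

```python
import torch
import torch.nn as nn

class MultiDepthConvBlock(nn.Module):
    """Sketch of the 4-branch multi-depth convolution block."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        def make_branch(num_3x3):
            layers = [nn.Conv2d(in_ch, branch_ch, kernel_size=1),
                      nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True)]
            for _ in range(num_3x3):
                # padding=1 is the zero padding keeping output size == input size
                layers += [nn.Conv2d(branch_ch, branch_ch, kernel_size=3, padding=1),
                           nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True)]
            return nn.Sequential(*layers)
        # branch 1: 1x1 only; branches 2-4: 1x1 followed by 2, 3, 4 serial 3x3 convs
        self.branches = nn.ModuleList(make_branch(n) for n in (0, 2, 3, 4))

    def forward(self, x):
        # concatenate all four branch outputs along the channel dimension
        return torch.cat([branch(x) for branch in self.branches], dim=1)
```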
Finally, a ReLU function is used as the activation function, enabling the multi-depth convolution block to learn more nonlinear features and obtain better model performance.
The formula of the ReLU activation function is as follows:

$$f(x)=\max(0,\,x)\tag{2}$$

where x is the input: if x > 0 then f(x) = x, i.e. the input is returned unchanged; if x ≤ 0 then f(x) = 0, i.e. the input is truncated to 0. ReLU is simple to compute, performs nonlinear processing, and avoids the vanishing-gradient problem.
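As a sanity check, formula (2) is a single NumPy expression:

```python
import numpy as np

def relu(x):
    """Formula (2): element-wise max(0, x)."""
    return np.maximum(0.0, x)
```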
(3) The main body of the adaptive integration block is a residual skip structure nested with a channel attention structure. Specifically, the module receives the shallow features output from the ResNet-101 network and the deep features output from the multi-depth convolution block, and performs preliminary spatial fusion of the two through a 3×3 fusion convolution, which simultaneously reduces feature dimensionality and computation. A residual skip structure is then added to the fused features so that they are further fused along the channel dimension, realizing feature fusion in both the spatial and channel dimensions. Compared with existing methods, performing dimensionality reduction first and integrating spatial and channel fusion into one module avoids complex multi-module fusion and reduces redundant computation.
Then, considering that features from different channels generally have different importance for the semantic labeling task, a channel attention structure is embedded in the residual structure. This structure uses a global pooling branch to transform the feature map, via global average pooling, into a D-dimensional weight vector (D is the number of channels); each element represents the weight of a channel, i.e. how important that channel's features are. All features of each channel are then multiplied by the corresponding element of the weight vector, which enhances the output features of the more important, higher-weight channels. Obtaining the channel information through global pooling also greatly reduces the parameter count and computation of the channel attention.
After the channel-wise multiplication, matrix point-wise addition is performed and the fused features are output. The structure of the adaptive integration block is shown in fig. 4.
Wherein the matrix point addition formula is:
(3)
,i=1,2,…m,j=1,2,…n (4)
wherein matrix C is the result of the addition of the points of matrices a and B.iAndjrepresenting the matrix coordinates. I.e. the first of the result matrix CiLine 1jThe column elements are the direct addition of the corresponding position elements of matrix a and matrix B.
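The channel-attention path of the adaptive integration block can be sketched in NumPy. This is a minimal sketch under stated assumptions: the global-average-pooled vector is used directly as the weight vector, whereas the patent's exact mapping from pooled values to weights may involve additional layers.

```python
import numpy as np

def channel_attention_fusion(feat):
    """feat: (D, H, W) feature map. Returns the residual-fused result."""
    weights = feat.mean(axis=(1, 2))           # global average pooling -> D-dim weight vector
    attended = feat * weights[:, None, None]   # multiply each channel by its weight
    return feat + attended                     # matrix point-wise (residual) addition
```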
(4) The fused features are then up-sampled and passed through a 1×1 convolution layer that reduces the number of channels to the number of classes; the resulting feature map is converted into probability values of the predicted classes by a softmax function. The softmax function formula is as follows:

$$\mathrm{softmax}(z)_{i}=\frac{e^{z_{i}}}{\sum_{k=1}^{K}e^{z_{k}}},\quad i = 1,2,\dots,K\tag{5}$$

where e^{z_i} is e raised to the power of the i-th input element and the denominator sums the exponentials of all K elements. Each output element lies in the interval (0, 1) and the outputs sum to 1, representing the classification probabilities.
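Formula (5) in NumPy, with the customary max-subtraction added for numerical stability (an implementation detail not stated in the patent):

```python
import numpy as np

def softmax(z):
    """Formula (5): exponentiate and normalize so outputs sum to 1."""
    e = np.exp(z - np.max(z))  # max-subtraction avoids overflow, result unchanged
    return e / e.sum()
```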
Fourthly, based on the PyTorch framework, the probability-distribution distance between the predicted category result and the real label is measured, the network model is trained with the Adam algorithm, and training is supervised and the model optimized via a cross-entropy loss function.
The training is implemented with the PyTorch framework, using cross entropy as the loss function to measure the probability-distribution distance between the predicted result and the ground-truth annotation. During model training, all weight parameters use He initialization, which accounts for the number of input units and samples from a uniform distribution.
The Adam optimization algorithm is basically a combination of Momentum and RMSprop.
1. Set initial values, generally V_dw = 0, S_dw = 0, V_db = 0, S_db = 0;
2. In the t-th iteration, compute dw and db by the mini-batch gradient descent method;
3. Compute the Momentum exponentially weighted averages;
4. Update with RMSprop;
5. Compute the bias corrections for Momentum and RMSprop;
6. Update the weights.
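The six steps above can be sketched for a single scalar parameter as follows (the function name `adam_step` and the default hyperparameters are illustrative; the standard Adam defaults are used):

```python
def adam_step(w, dw, v, s, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update combining Momentum (v) and RMSprop (s),
    with bias correction, following steps 1-6 above."""
    v = beta1 * v + (1 - beta1) * dw        # step 3: Momentum EWA of the gradient
    s = beta2 * s + (1 - beta2) * dw * dw   # step 4: RMSprop EWA of the squared gradient
    v_hat = v / (1 - beta1 ** t)            # step 5: bias corrections
    s_hat = s / (1 - beta2 ** t)
    w = w - lr * v_hat / (s_hat ** 0.5 + eps)  # step 6: weight update
    return w, v, s
```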
In this scheme, the initial learning rate is set to 1e-3 and is decayed in stages during training down to a final value of 1e-5, and the momentum value is set to 0.9 to ensure that the model converges sufficiently; Adam's adaptive learning rate and momentum mechanism gives faster convergence. An early-stopping mechanism is also set: training stops when the model's metric on the validation set fails to improve for 10 consecutive epochs (iterations over the data), preventing overfitting. Further, the batch size is set to 16, and the weight parameters are penalized with L2 regularization of strength 1e-4, so the model tends to select smaller weights, reducing model complexity.
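A Pytorch configuration matching these hyperparameters might look like the following sketch. The `nn.Linear` model and the `early_stop` helper are illustrative assumptions, and the decay milestones are placeholders since the patent does not specify the stage boundaries:

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model

# Adam with beta1 = 0.9 (the momentum value) and L2 strength 1e-4 via weight_decay.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), weight_decay=1e-4)

# Staged decay from 1e-3 toward 1e-5 (milestone epochs are assumed).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 60], gamma=0.1)

best_metric, patience, bad_epochs = float("-inf"), 10, 0

def early_stop(val_metric):
    """Return True once the validation metric has not improved
    for 10 consecutive epochs."""
    global best_metric, bad_epochs
    if val_metric > best_metric:
        best_metric, bad_epochs = val_metric, 0
    else:
        bad_epochs += 1
    return bad_epochs >= patience
```

Per epoch one would call `scheduler.step()` after the optimizer updates and break out of the training loop when `early_stop(metric)` returns True.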
The trained adaptive integrated neural network model is then applied: a hospital can use it to perform intelligent segmentation prediction on newly acquired ultrasound images and output ultrasound image segmentation results, accurately extracting lesion areas, organ boundaries and the like. Based on the segmentation results, quantitative analysis of lesion size, shape and other information can be realized, providing references for doctors to judge the illness state and plan the treatment scheme.
Example 2
A multi-scale medical ultrasound image segmentation system based on adaptive feature fusion, comprising:
a data acquisition module configured to scan the patient's body by the ultrasound probe module to acquire an ultrasound image;
the data conversion module is configured to perform sample increment and marking transformation on the ultrasonic image to form an ultrasonic image data set;
an adaptive integrated neural network model module configured to provide an adaptive integrated neural network model;
and the result output module is configured to input the real-time ultrasonic image into the trained self-adaptive integrated neural network model, and the self-adaptive integrated neural network model outputs an ultrasonic image segmentation result.
A computer readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to execute the medical ultrasound image segmentation method based on adaptive feature fusion.
A terminal device comprising a processor and a computer readable storage medium, the processor configured to implement instructions; the computer readable storage medium is for storing a plurality of instructions adapted to be loaded by the processor and to execute the medical ultrasound image segmentation method based on adaptive feature fusion.
The above embodiments are not intended to limit the scope of the present invention; all equivalent changes in structure, shape and principle of the invention should be covered within the scope of protection of the invention.
Claims (6)
1. A medical ultrasonic image segmentation method based on self-adaptive feature fusion is characterized in that: comprising the following steps:
acquiring an ultrasonic image;
performing sample increment and marking transformation on the ultrasonic image to obtain an ultrasonic image data set;
constructing an adaptive integrated neural network model based on a ResNet-101 network and a plurality of deep convolutions;
training the self-adaptive integrated neural network model based on a Pytorch framework;
inputting an ultrasonic image array into a ResNet-101 network to obtain shallow features and deep features, combining the deep features through a plurality of deep convolution processes to output multi-scale deep features, fusing the multi-scale deep features by using a residual jump structure, outputting fused features, and converting the fused features into a prediction category result by using a softmax function;
the construction of the adaptive integrated neural network model based on the ResNet-101 network and a plurality of deep convolutions comprises the following steps:
the self-adaptive integrated neural network model comprises a basic network block, a multi-depth convolution block, a self-adaptive integrated block and an up-sampling block;
inputting an ultrasonic image array into a basic network block, wherein the basic network block adopts a ResNet-101 network comprising a plurality of residual blocks, each residual block using 3×3 convolution layers and jump connections; the ResNet-101 network completes the preliminary feature extraction of the image and outputs the results of the plurality of residual blocks, which are the shallow features and the deep features of the network;
the multi-depth convolution block comprises 4 convolution branches, wherein the first branch consists of a 1×1 convolution layer, and the 2nd to 4th branches each consist of a 1×1 convolution layer followed in series by 2 to 4 3×3 convolution layers; if a 3×3 convolution layer has missing features, zero padding is used for the missing features at each 3×3 convolution layer;
the multi-depth convolution block receives deep features output by ResNet-101, multi-branch small-size convolution with different depths is used for extracting multi-scale features in parallel, all outputs of four branches are combined and connected, and multi-scale deep features are output;
the self-adaptive integrated block performs preliminary fusion of the multi-scale deep features based on a 3×3 fusion convolution to obtain preliminary fusion features;
inputting the preliminary fusion features into a residual jump structure in which a channel attention structure is embedded; the channel attention structure uses a global pooling branch to map and convert the preliminary fusion features, via global average pooling, into a D-dimensional weight vector, wherein D represents the number of channels; multiplying the preliminary fusion features of each channel by the corresponding elements in the weight vector to obtain a plurality of vector values, performing matrix point addition on the multiplied vector values, and outputting the fused features;
the up-sampling block comprises up-sampling and a 1×1 convolution layer; the up-sampling is used to enlarge the fused features, the 1×1 convolution layer reduces the number of channels to one channel, the enlarged fused features pass through the channel processed by the 1×1 convolution layer to output a feature map, and the obtained feature map is converted into a prediction class result by a softmax function.
2. The medical ultrasound image segmentation method based on adaptive feature fusion according to claim 1, wherein: the ultrasonic image is subjected to sample increment and marking transformation to obtain an ultrasonic image data set, and the method comprises the following steps:
and carrying out translation transformation, multi-scale rotation transformation, scale scaling, overturn inversion, noise augmentation and image enhancement on the ultrasonic image in sequence to obtain an ultrasonic image array.
3. The medical ultrasound image segmentation method based on adaptive feature fusion according to claim 1, wherein: training the adaptive integrated neural network model based on the Pytorch framework comprises the following steps:
based on the Pytorch framework, mapping probability distribution distance between a prediction category result and a real label, training a network model by using an Adam algorithm, and supervising and training and optimizing the model by using a cross entropy loss function.
4. A multi-scale medical ultrasound image segmentation system based on adaptive feature fusion, based on the medical ultrasound image segmentation method based on adaptive feature fusion of claim 1, comprising:
a data acquisition module configured to scan the patient's body by the ultrasound probe module to acquire an ultrasound image;
the data conversion module is configured to perform sample increment and mark conversion on the ultrasonic image to form an ultrasonic image data set;
an adaptive integrated neural network model module configured to provide an adaptive integrated neural network model;
and the result output module is configured to input the real-time ultrasonic image into the trained self-adaptive integrated neural network model, and the self-adaptive integrated neural network model outputs an ultrasonic image segmentation result.
5. A computer readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform a medical ultrasound image segmentation method based on adaptive feature fusion as claimed in claim 1.
6. A terminal device comprising a processor and a computer readable storage medium, the processor configured to implement instructions; a computer readable storage medium is for storing a plurality of instructions adapted to be loaded by a processor and to perform a medical ultrasound image segmentation method based on adaptive feature fusion as defined in claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311321509.1A CN117078692B (en) | 2023-10-13 | 2023-10-13 | Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117078692A CN117078692A (en) | 2023-11-17 |
CN117078692B true CN117078692B (en) | 2024-02-06 |
Family
ID=88702779
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311321509.1A Active CN117078692B (en) | 2023-10-13 | 2023-10-13 | Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117078692B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117272245B (en) * | 2023-11-21 | 2024-03-12 | 陕西金元新能源有限公司 | Fan gear box temperature prediction method, device, equipment and medium |
CN117708706B (en) * | 2024-02-06 | 2024-05-28 | 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) | Method and system for classifying breast tumors by enhancing and selecting end-to-end characteristics |
CN117982106B (en) * | 2024-04-02 | 2024-06-25 | 天津市肿瘤医院(天津医科大学肿瘤医院) | MRI image-based breast cancer chemotherapy curative effect prediction system and method |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111507990A (en) * | 2020-04-20 | 2020-08-07 | 南京航空航天大学 | Tunnel surface defect segmentation method based on deep learning |
CN112381097A (en) * | 2020-11-16 | 2021-02-19 | 西南石油大学 | Scene semantic segmentation method based on deep learning |
CN112446890A (en) * | 2020-10-14 | 2021-03-05 | 浙江工业大学 | Melanoma segmentation method based on void convolution and multi-scale fusion |
CN112861795A (en) * | 2021-03-12 | 2021-05-28 | 云知声智能科技股份有限公司 | Method and device for detecting salient target of remote sensing image based on multi-scale feature fusion |
CN113077471A (en) * | 2021-03-26 | 2021-07-06 | 南京邮电大学 | Medical image segmentation method based on U-shaped network |
CN113393469A (en) * | 2021-07-09 | 2021-09-14 | 浙江工业大学 | Medical image segmentation method and device based on cyclic residual convolutional neural network |
CN113850825A (en) * | 2021-09-27 | 2021-12-28 | 太原理工大学 | Remote sensing image road segmentation method based on context information and multi-scale feature fusion |
CN114419015A (en) * | 2022-01-24 | 2022-04-29 | 东北大学 | Brain function fusion analysis method based on multi-modal registration |
WO2022241995A1 (en) * | 2021-05-18 | 2022-11-24 | 广东奥普特科技股份有限公司 | Visual image enhancement generation method and system, device, and storage medium |
CN115393584A (en) * | 2022-08-02 | 2022-11-25 | 哈尔滨理工大学 | Establishment method based on multi-task ultrasonic thyroid nodule segmentation and classification model, segmentation and classification method and computer equipment |
CN115424103A (en) * | 2022-08-18 | 2022-12-02 | 重庆理工大学 | Improved U-Net brain tumor segmentation method based on attention mechanism and multi-scale feature fusion |
WO2023063874A1 (en) * | 2021-10-14 | 2023-04-20 | Exo Imaging, Inc. | Method and system for image processing based on convolutional neural network |
CN116385832A (en) * | 2023-04-10 | 2023-07-04 | 思腾合力(天津)科技有限公司 | Bimodal biological feature recognition network model training method |
CN116563204A (en) * | 2023-03-08 | 2023-08-08 | 江苏科技大学 | Medical image segmentation method integrating multi-scale residual attention |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111739075B (en) * | 2020-06-15 | 2024-02-06 | 大连理工大学 | Deep network lung texture recognition method combining multi-scale attention |
US11580646B2 (en) * | 2021-03-26 | 2023-02-14 | Nanjing University Of Posts And Telecommunications | Medical image segmentation method based on U-Net |
Non-Patent Citations (2)
Title |
---|
Building segmentation in remote sensing images based on a multi-scale feature fusion model; Xu Shengjun; Ouyang Puyan; Guo Xueyuan; Khan Taha Muthar; Computer Measurement & Control (No. 07); full text *
Left ventricle segmentation in ultrasound images with a deep aggregation residual dense network; Wu Xuanyan; Gou Xinke; Zhu Zizhong; Wei Yulin; Wang Kai; Journal of Image and Graphics (No. 09); full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110211140B (en) | Abdominal blood vessel segmentation method based on 3D residual U-Net and weighting loss function | |
Cai et al. | A review of the application of deep learning in medical image classification and segmentation | |
CN117078692B (en) | Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion | |
CN112529894B (en) | Thyroid nodule diagnosis method based on deep learning network | |
CN110689543A (en) | Improved convolutional neural network brain tumor image segmentation method based on attention mechanism | |
CN112258514B (en) | Segmentation method of pulmonary blood vessels of CT (computed tomography) image | |
CN113450328B (en) | Medical image key point detection method and system based on improved neural network | |
CN113516659A (en) | Medical image automatic segmentation method based on deep learning | |
CN115496771A (en) | Brain tumor segmentation method based on brain three-dimensional MRI image design | |
CN113570627B (en) | Training method of deep learning segmentation network and medical image segmentation method | |
CN113223005B (en) | Thyroid nodule automatic segmentation and grading intelligent system | |
CN112819831B (en) | Segmentation model generation method and device based on convolution Lstm and multi-model fusion | |
CN111091575B (en) | Medical image segmentation method based on reinforcement learning method | |
CN113034507A (en) | CCTA image-based coronary artery three-dimensional segmentation method | |
CN114549538A (en) | Brain tumor medical image segmentation method based on spatial information and characteristic channel | |
Sun et al. | ISSMF: Integrated semantic and spatial information of multi-level features for automatic segmentation in prenatal ultrasound images | |
Aslam et al. | Liver-tumor detection using CNN ResUNet | |
Li et al. | AttentionNet: Learning where to focus via attention mechanism for anatomical segmentation of whole breast ultrasound images | |
Liu et al. | 3-D prostate MR and TRUS images detection and segmentation for puncture biopsy | |
CN113538363A (en) | Lung medical image segmentation method and device based on improved U-Net | |
WO2022227193A1 (en) | Liver region segmentation method and apparatus, and electronic device and storage medium | |
CN113379691A (en) | Breast lesion deep learning segmentation method based on prior guidance | |
Peng et al. | 3D Segment and Pickup Framework for Pancreas Segmentation | |
CN117952995B (en) | Cardiac image segmentation system capable of focusing, prompting and optimizing | |
Hwang et al. | RBCA-Net: Reverse boundary channel attention network for kidney tumor segmentation in CT images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||