CN113409326A - Image segmentation method and system - Google Patents
Image segmentation method and system
- Publication number
- CN113409326A (application CN202110600755.5A, CN202110600755A)
- Authority
- CN
- China
- Prior art keywords
- image
- image segmentation
- attention
- segmented
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000003709 image segmentation Methods 0.000 title claims abstract description 190
- 238000000034 method Methods 0.000 title claims abstract description 70
- 230000007246 mechanism Effects 0.000 claims abstract description 74
- 238000012549 training Methods 0.000 claims abstract description 41
- 238000004458 analytical method Methods 0.000 claims abstract description 35
- 230000011218 segmentation Effects 0.000 claims abstract description 30
- 238000010586 diagram Methods 0.000 claims description 31
- 230000006870 function Effects 0.000 claims description 23
- 238000000605 extraction Methods 0.000 claims description 18
- 230000004927 fusion Effects 0.000 claims description 18
- 238000004590 computer program Methods 0.000 claims description 8
- 238000003860 storage Methods 0.000 claims description 8
- 238000004422 calculation algorithm Methods 0.000 claims description 4
- 210000003734 kidney Anatomy 0.000 description 14
- 206010020524 Hydronephrosis Diseases 0.000 description 12
- 230000004913 activation Effects 0.000 description 10
- 238000011176 pooling Methods 0.000 description 9
- 238000004891 communication Methods 0.000 description 7
- 238000012545 processing Methods 0.000 description 5
- 238000002604 ultrasonography Methods 0.000 description 5
- 238000013135 deep learning Methods 0.000 description 4
- 238000009825 accumulation Methods 0.000 description 3
- 230000003044 adaptive effect Effects 0.000 description 3
- 238000013528 artificial neural network Methods 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 210000004027 cell Anatomy 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- water Substances 0.000 description 3
- 238000013473 artificial intelligence Methods 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 238000013527 convolutional neural network Methods 0.000 description 2
- 238000003745 diagnosis Methods 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000010606 normalization Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 210000000056 organ Anatomy 0.000 description 2
- 206010000077 Abdominal mass Diseases 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- 208000037157 Azotemia Diseases 0.000 description 1
- 206010020772 Hypertension Diseases 0.000 description 1
- 206010023434 Kidney rupture Diseases 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 238000002059 diagnostic imaging Methods 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 208000006750 hematuria Diseases 0.000 description 1
- 208000017169 kidney disease Diseases 0.000 description 1
- 230000003902 lesion Effects 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000012285 ultrasound imaging Methods 0.000 description 1
- 208000009852 uremia Diseases 0.000 description 1
- 238000005303 weighing Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/20—ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10132—Ultrasound image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30084—Kidney; Renal
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Epidemiology (AREA)
- Medical Informatics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention provides an image segmentation method and an image segmentation system. The image segmentation method comprises the following steps: inputting an image to be segmented into an image segmentation model to obtain a segmented target image output by the image segmentation model; the image segmentation model is obtained by training on a sample image set, and is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual network, and for determining the segmented target image based on the target feature map and a pyramid scene analysis network. Important features in the image to be segmented are effectively emphasized and unnecessary features are suppressed, so that blurring of the segmented image is avoided and the segmentation precision is improved.
Description
Technical Field
The invention relates to the technical field of deep learning, in particular to an image segmentation method and an image segmentation system.
Background
Hydronephrosis is a common kidney disease, and can cause a series of complications such as abdominal mass, hematuria, uremia, hypertension and even kidney rupture. The B-mode ultrasonic examination is a basic examination for suspected hydronephrosis patients, and is convenient, time-saving, economical and radiation-free. B-mode ultrasound imaging scans the human body with ultrasound beams to obtain two-dimensional sectional tomographic images reflecting tissues and organs of the human body.
The combination of medical imaging and artificial intelligence has become prevalent in recent years. Artificial-intelligence medical imaging techniques alleviate, to a large extent, the shortage and uneven distribution of medical resources across regions and strengthen physicians' diagnostic capabilities. If a disease can be judged and classified with a deep learning method already at the ultrasound examination stage, unnecessary follow-up examinations can be avoided, saving a large amount of money, manpower and medical resources.
However, because human kidneys vary from person to person, the kidney region and the water accumulation region in a kidney ultrasound image usually overlap, and individual shapes and textures differ greatly. Conventional deep-learning-based methods for segmenting kidney ultrasound images therefore suffer from blurry kidney images, unclear interior contours, poorly delineated boundaries with surrounding tissue, irregular shapes, and similar problems.
Therefore, there is an urgent need for an image segmentation method and system that focus on the important features in the image to be segmented, suppress unnecessary features, avoid blurring of the segmented image, and improve segmentation accuracy.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an image segmentation method and an image segmentation system.
The invention provides an image segmentation method, which comprises the following steps: inputting an image to be segmented into an image segmentation model to obtain a segmented target image output by the image segmentation model;
the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
According to the image segmentation method provided by the invention, the image segmentation model comprises the following steps: a feature extraction layer and an image segmentation layer;
correspondingly, the inputting of the image to be segmented into the image segmentation model to obtain the segmented target image output by the image segmentation model specifically comprises:
inputting the image to be segmented into the feature extraction layer, and determining a target feature map corresponding to the image to be segmented based on the attention mechanism and the residual error network;
inputting the target feature map into the image segmentation layer, determining feature pyramid set global features based on the pyramid scene analysis network, and determining the segmented target image based on the feature pyramid set global features.
According to the image segmentation method provided by the invention, the feature extraction layer comprises: a first attention layer, a depth residual layer, and a second attention layer;
correspondingly, the inputting the image to be segmented into the feature extraction layer, and determining the target feature map corresponding to the image to be segmented based on the attention mechanism and the residual error network specifically include:
inputting the image to be segmented into a first attention layer, and determining a first feature map based on the attention mechanism;
inputting the first feature map into the depth residual layer, and determining a second feature map based on a residual network;
inputting the second feature map into the second attention layer, and determining a target feature map based on the attention mechanism.
According to the image segmentation method provided by the invention, the first attention layer and the second attention layer comprise: and a convolution attention mechanism module.
According to the image segmentation method provided by the invention, the first attention layer comprises: a first channel attention layer, a first spatial attention layer, and a first fusion layer;
correspondingly, the inputting the image to be segmented into a first attention layer, and determining a first feature map based on the attention mechanism specifically includes:
inputting the image to be segmented into the first channel attention layer, and determining a first channel attention diagram;
inputting the first channel attention map into the first spatial attention layer, determining a first spatial attention map;
inputting the first channel attention map and the first spatial attention map into the first fusion layer, and determining a first feature map.
According to the image segmentation method provided by the invention, the second attention layer comprises: a second channel attention layer and a second spatial attention layer and a second fusion layer;
correspondingly, the inputting the second feature map into the second attention layer and determining a target feature map based on the attention mechanism specifically include:
inputting the second feature map into the second channel attention layer, and determining a second channel attention map;
inputting the second channel attention map into the second spatial attention, determining a second spatial attention map;
and inputting the second channel attention diagram and the second space attention diagram into the second fusion layer, and determining a target feature diagram.
According to the image segmentation method provided by the invention, before the step of inputting the image to be segmented into the image segmentation model and obtaining the segmented target image output by the image segmentation model, the method further comprises: training the image segmentation model;
the training of the image segmentation model specifically includes:
training the image segmentation model using the sample image set;
based on a preset loss function, obtaining a gradient for a network parameter of the image segmentation model by using a back propagation algorithm; the preset loss function consists of a cross entropy loss function and a dice loss function;
updating the network parameters of the image segmentation model based on the gradient, and performing iterative training on the image segmentation model based on the updated network parameters until the image segmentation model converges.
The present invention also provides an image segmentation system, comprising: an image segmentation unit;
the image segmentation unit is used for inputting an image to be segmented into an image segmentation model to obtain a segmented target image output by the image segmentation model;
the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
The invention also provides electronic equipment which comprises a memory and a processor, wherein the processor and the memory finish mutual communication through a bus; the memory stores program instructions executable by the processor, which when invoked by the processor are capable of performing the various steps of the image segmentation method as described above.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the image segmentation method as described above.
With the image segmentation method and system provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network, so that the probability of segmentation errors is reduced at very little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented region are alleviated.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flowchart of an image segmentation method provided by the present invention;
FIG. 2 is an ultrasound image of hydronephrosis of a patient provided by the present invention;
FIG. 3 is a segmented target image of an ultrasound image of hydronephrosis of a patient according to the present invention;
FIG. 4 is a schematic diagram of an image segmentation model structure provided by the present invention;
FIG. 5 is a schematic diagram of an image segmentation system according to the present invention;
fig. 6 is a schematic physical structure diagram of an electronic device provided in the present invention.
Reference numerals:
410: a first attention layer; 420: a depth residual layer;
430: a second attention layer; 440: a target feature map;
450: an image segmentation layer; 460: collecting global features by a pyramid;
470: and (4) the segmented target image.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to facilitate the following detailed description of the present invention, some terms are described:
Attention mechanism: an attention model imitates how a person looks at an object: after a quick glance at the whole image, attention is focused on its salient part, rather than giving equal importance to every region of the picture.
Pyramid Scene Parsing Network (PSPNet for short): an improvement over FCN (Fully Convolutional Networks). Objects of different sizes exist in an image, and different objects have different characteristics: simple targets can be distinguished using shallow features, while complex objects require deep features. PSPNet introduces more context information; when the segmentation layer has more global information, the probability of mis-segmentation is lower.
Convolutional Block Attention Module (CBAM): a lightweight, general-purpose attention module. It can be seamlessly integrated into any CNN architecture with negligible overhead and trained end to end together with the base CNN.
Fig. 1 is a flowchart of an image segmentation method provided by the present invention, and as shown in fig. 1, the present invention provides an image segmentation method, including:
step S1, inputting the image to be segmented into an image segmentation model to obtain a segmented target image output by the image segmentation model;
the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
Specifically, in step S1, the image to be segmented is input into a pre-trained image segmentation model, where the image segmentation model is used to determine a target feature map corresponding to the image to be segmented based on an attention mechanism and a Residual Network (ResNet), and determine and output the segmented target image based on the target feature map and a Pyramid Scene Parsing Network (PSPNet).
For example: fig. 2 is an ultrasound image of hydronephrosis of a patient provided by the present invention, and fig. 3 is the segmented target image obtained from that ultrasound image. The ultrasound image of the patient's hydronephrosis (fig. 2) is collected as the image to be segmented and input into a pre-trained kidney ultrasound image segmentation model, which determines the segmented target image. As shown in fig. 3, in the segmented target image the water accumulation region and the kidney region are delineated, where region a is the water accumulation region, region b is the kidney region, and region c is the area outside the kidney. The contours of the water accumulation region and the kidney region are effectively separated.
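For illustration only, this inference step can be sketched as follows in PyTorch, assuming the trained image segmentation model is available as a torch.nn.Module and the input has already been preprocessed to the network's square input size (the preprocessing is described in the embodiment further below); the checkpoint name, the 473 × 473 input size and the class-index convention are assumptions, not details taken from this disclosure.

```python
# Hypothetical inference sketch (PyTorch); the checkpoint name, 473 x 473 input size and
# class indices are assumptions, not details specified by this disclosure.
import torch

def segment(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """x: preprocessed image tensor of shape (1, 3, 473, 473).
    Returns a (473, 473) label map, e.g. 0 = outside the kidney, 1 = kidney, 2 = water accumulation."""
    model.eval()
    with torch.no_grad():
        logits = model(x)                      # (1, num_classes, 473, 473) class scores
    return logits.argmax(dim=1).squeeze(0)     # per-pixel class decision

# usage (hypothetical file names):
# model = torch.load("kidney_seg_model.pt", map_location="cpu")
# label_map = segment(model, preprocessed_tensor)
```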
Before the image to be segmented is segmented, an image segmentation model needs to be trained in advance, the image segmentation model is obtained by training a sample image set, for the images in the sample image training set, corresponding labels are set in the images, and the model is trained based on the labels and the images. The selection of the specific sample data set, the setting of the label and the sample training method may be adjusted based on the actual situation, which is not limited in the present invention.
Secondly, in the scheme, the network structure of the model can be improved on the basis of the existing residual error network and pyramid scene analysis network, and an attention mechanism is added. Other adjustments can be made according to actual needs, which are not limited in the present invention.
In addition, it is to be understood that the present invention is not limited to the segmentation of hydronephrosis ultrasound images; it may also be applied to the segmentation of other medical images (e.g., CT, MR, PET) to aid the diagnosis of other conditions or to detect physiological changes. Beyond the medical field, the method can be adapted to other fields by adjusting the sample image set, and the present invention is not limited in this respect.
In the image segmentation method provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network, so that the probability of segmentation errors is reduced at very little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented region are alleviated.
Optionally, according to the image segmentation method provided by the present invention, the image segmentation model includes: a feature extraction layer and an image segmentation layer;
correspondingly, the inputting of the image to be segmented into the image segmentation model to obtain the segmented target image output by the image segmentation model specifically comprises:
inputting the image to be segmented into the feature extraction layer, and determining a target feature map corresponding to the image to be segmented based on the attention mechanism and the residual error network;
inputting the target feature map into the image segmentation layer, determining feature pyramid set global features based on the pyramid scene analysis network, and determining the segmented target image based on the feature pyramid set global features.
Specifically, the image segmentation model includes: a feature extraction layer and an image segmentation layer.
The feature extraction layer is used for extracting a feature map of the image to be segmented, inputting the image to be segmented into the feature extraction layer, extracting features of the image to be segmented by combining an attention mechanism and a residual error network, and determining a target feature map corresponding to the image to be segmented.
The residual network preserves the depth of a deep network while avoiding the degradation problem, has low complexity and high processing speed, and is well suited to segmenting blurry images. On top of the residual network, attention units are embedded; adaptive feature refinement through the attention mechanism effectively improves the capability of the model with almost no additional complexity and no extra overhead.
It should be noted that attention mechanisms can be divided into soft attention and hard attention. When selecting information, a soft attention mechanism computes a weighted average over multiple inputs and feeds this weighted average into the neural network. In contrast, a hard attention mechanism selects information at one position of the input sequence, for example by sampling one piece of information at random or by picking the one with the highest probability. The specific type of attention mechanism used in the present invention can be chosen according to actual requirements, and the present invention is not limited in this respect.
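As an illustrative aside, the soft-attention weighted average described above can be written in a few lines of PyTorch; the tensor names and shapes below are purely illustrative assumptions.

```python
# Minimal soft-attention sketch (PyTorch); names and shapes are illustrative only.
import torch
import torch.nn.functional as F

scores = torch.randn(1, 5)                  # unnormalized relevance of 5 input items
values = torch.randn(1, 5, 16)              # the 5 input feature vectors (dim 16)

weights = F.softmax(scores, dim=-1)         # soft attention: differentiable weights
context = (weights.unsqueeze(-1) * values).sum(dim=1)   # weighted average, shape (1, 16)
```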
The image segmentation layer is used to perform image segmentation based on the global features of the target feature map. The determined target feature map is input into the image segmentation layer and, based on the pyramid scene analysis network, is pooled into several sub-regions of different sizes to obtain feature representations at different positions. After each pyramid level, the context features are reduced to a low-dimensional representation; the low-dimensional feature maps are then upsampled back to the scale of the original target feature map, and the feature maps of the different levels are concatenated to obtain the feature-pyramid global features. The target feature map is segmented based on these feature-pyramid global features, and the segmented target image is determined.
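For illustration, a minimal pyramid pooling head in the spirit of the pyramid scene analysis network might look like the following PyTorch sketch; the bin sizes (1, 2, 3, 6) and channel counts follow a common PSPNet configuration and are assumptions rather than limitations of this disclosure.

```python
# Illustrative pyramid pooling head (PSPNet-style); bin sizes and channel counts are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    def __init__(self, in_channels: int = 2048, bins=(1, 2, 3, 6)):
        super().__init__()
        reduced = in_channels // len(bins)          # low-dimensional representation per level
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),            # pool the feature map into b x b sub-regions
                nn.Conv2d(in_channels, reduced, kernel_size=1, bias=False),
                nn.BatchNorm2d(reduced),
                nn.ReLU(inplace=True),
            )
            for b in bins
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        # upsample each pooled level back to the input scale and concatenate with the input
        pyramids = [x] + [
            F.interpolate(stage(x), size=(h, w), mode="bilinear", align_corners=False)
            for stage in self.stages
        ]
        return torch.cat(pyramids, dim=1)           # feature-pyramid global feature

# e.g. PyramidPooling(2048)(torch.randn(1, 2048, 30, 30)).shape -> (1, 4096, 30, 30)
```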
In the medical field, the characteristics of organs and lesions in medical images differ from patient to patient, and even within the same category the contours in patients' medical images can differ enormously. This means that high-performance image segmentation cannot be achieved with basic features alone; more semantic information must be obtained, and the feature pyramid helps capture global features and improve segmentation accuracy.
In the image segmentation method provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network, so that the probability of segmentation errors is effectively reduced at very little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented regions are alleviated.
Optionally, according to the image segmentation method provided by the present invention, the feature extraction layer includes: a first attention layer, a depth residual layer, and a second attention layer;
correspondingly, the inputting the image to be segmented into the feature extraction layer, and determining the target feature map corresponding to the image to be segmented based on the attention mechanism and the residual error network specifically include:
inputting the image to be segmented into a first attention layer, and determining a first feature map based on the attention mechanism;
inputting the first feature map into the depth residual layer, and determining a second feature map based on a residual network;
inputting the second feature map into the second attention layer, and determining a target feature map based on the attention mechanism.
Specifically, fig. 4 is a schematic structural diagram of an image segmentation model provided by the present invention, and as shown in fig. 4, the feature extraction layer includes: a first attention layer 410, a depth residual layer 420, and a second attention layer 430. Attention layers are provided at both the front and back ends of the depth residual layer 420 (residual network). Correspondingly, inputting an image to be segmented into the feature extraction layer, and determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, specifically comprising:
inputting an image to be segmented into a first attention layer 410, determining a first feature map based on an attention mechanism, inputting the first feature map into a depth residual layer 420, determining a second feature map based on a residual network, inputting the second feature map into a second attention layer 430, and determining a target feature map 440 based on the attention mechanism.
Further, the determined target feature map 440 is input into the image segmentation layer 450. Based on the pyramid scene analysis network, the target feature map 440 is pooled into several sub-regions of different sizes to obtain feature representations at different positions; after each pyramid level the context features are reduced to a low-dimensional representation, the low-dimensional feature maps are upsampled to the same scale as the original target feature map, and the feature maps of the different levels are concatenated to obtain the feature-pyramid global features 460. Based on the feature-pyramid global features 460, the target feature map is segmented and the segmented target image 470 is determined.
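A minimal sketch of this overall structure (first attention layer, depth residual layer, second attention layer, image segmentation layer) is given below in PyTorch. It assumes a torchvision ResNet-50 backbone and reuses the PyramidPooling sketch above and the CBAM sketch given later with the attention-layer details; the channel counts and output stride of a plain ResNet-50 differ slightly from the embodiment described further below (which keeps 30 × 30 feature maps, e.g. via dilated convolutions), so this is only an illustrative approximation, not the exact network of the invention.

```python
# Illustrative end-to-end structure: attention layer -> residual backbone -> attention layer
# -> pyramid scene analysis head. The ResNet-50 backbone and all module names are assumptions.
# CBAM and PyramidPooling refer to the sketches given elsewhere in this description.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class AttentionPSPSegmenter(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        backbone = resnet50(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
        self.cbam_front = CBAM(64)                 # first attention layer (CBAM sketched later)
        self.residual = nn.Sequential(backbone.layer1, backbone.layer2,
                                      backbone.layer3, backbone.layer4)
        self.cbam_back = CBAM(2048)                # second attention layer
        self.psp = PyramidPooling(2048)            # pyramid pooling head (sketched above)
        self.classifier = nn.Conv2d(4096, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        feat = self.cbam_front(self.stem(x))       # first feature map
        feat = self.residual(feat)                 # second feature map
        feat = self.cbam_back(feat)                # target feature map
        feat = self.classifier(self.psp(feat))     # per-class scores at reduced resolution
        return F.interpolate(feat, size=(h, w), mode="bilinear", align_corners=False)
```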
It should be noted that the first attention layer and the second attention layer may use attention modules such as CBAM, SE (Squeeze-and-Excitation) or SAM (Spatial Attention Module); the choice of attention mechanism can be adjusted according to the actual situation, and the present invention is not limited in this respect.
In the image segmentation method provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network. The original structure of the residual network is not damaged, the number of parameters is effectively reduced, and an existing pre-trained model can be reused, so that the probability of segmentation errors is effectively reduced at little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented region are alleviated.
Optionally, according to the image segmentation method provided by the present invention, the first attention layer and the second attention layer include: and a convolution attention mechanism module.
Specifically, a soft attention mechanism computes a weighted average of multiple inputs and feeds this weighted average into the neural network. Soft attention is differentiable, which means the attention weights can be learned by computing gradients with the neural network through forward propagation and backpropagation.
The convolutional block attention module CBAM is a soft attention mechanism that combines the channel domain and the spatial domain. Its main advantage is that, through the channel dimension, it teaches the network what to look at and, through the spatial dimension, where to look, while remaining very lightweight and easy to deploy end to end. Adding CBAM at the basic feature extraction stage helps the network better identify the regions of interest in the image to be segmented and helps keep model complexity low.
Compared with other model-improvement methods that consume more resources by increasing the depth, width or cardinality of the network, embedding CBAM at the front and rear ends of the residual network broadly improves the representation capability of the basic feature extraction layer without damaging the original network structure of the residual network, and the number of parameters can be kept effectively low. Therefore, if a pre-trained model exists, it can still be used, avoiding the trouble of retraining the model.
It should be noted that the CBAM module computes attention maps sequentially along two independent dimensions (channel and space). Since the placement of the attention units within a CBAM module can be changed while keeping the serial CBAM form, the order of the spatial unit and the channel unit can be adjusted when designing the CBAM attention module, for example: in both the front and rear CBAM attention modules the spatial unit precedes the channel unit, or the channel unit precedes the spatial unit, or only one kind of unit is used (the front and rear CBAM modules each contain only a spatial unit or only a channel unit). The specific placement of the spatial unit and the channel unit can be adjusted according to actual requirements, and the present invention is not limited in this respect.
In the image segmentation method provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network. The original structure of the residual network is not damaged, the number of parameters is effectively reduced, and an existing pre-trained model can be reused, so that the probability of segmentation errors is effectively reduced at little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented region are alleviated.
Optionally, according to the image segmentation method provided by the present invention, the first attention layer includes: a first channel attention layer, a first spatial attention layer, and a first fusion layer;
correspondingly, the inputting the image to be segmented into a first attention layer, and determining a first feature map based on the attention mechanism specifically includes:
inputting the image to be segmented into the first channel attention layer, and determining a first channel attention diagram;
inputting the first channel attention map into the first spatial attention layer, determining a first spatial attention map;
inputting the first channel attention map and the first spatial attention map into the first fusion layer, and determining a first feature map.
Specifically, the convolution attention mechanism module is arranged with the channel unit first and the spatial unit second. The first attention layer includes: a first channel attention layer (i.e., the channel unit of the convolution attention mechanism module), a first spatial attention layer (i.e., the spatial unit of the convolution attention mechanism module), and a first fusion layer. The output of the convolutional layer first passes through the first channel attention layer to obtain a weighted result, then through the first spatial attention layer, and finally fusion weighting is applied to obtain the first feature map.
The core idea of the channel attention layer is to generate a channel attention map by utilizing the relationship between channels of features. And inputting the image to be segmented into the first channel attention layer to obtain a final first channel attention diagram Mc.
The core idea of the spatial attention layer is to generate a spatial attention map by using spatial relations among features. Inputting a first channel attention map Mc (a feature map after channel attention refinement) into a first spatial attention layer, and determining a first spatial attention map Ms;
the first channel attention map Mc and the first spatial attention map Ms are input into the first fusion layer to be weighted, and a first feature map is determined.
When the channel attention map is generated using the inter-channel relationships of the features, a C-channel feature map is input and average pooling and max pooling are applied to it separately (to better aggregate the information of the feature map and reduce the number of parameters), yielding two C-dimensional pooled descriptors, F_avg and F_max. F_avg and F_max are fed into a multilayer perceptron (MLP) with one hidden layer, producing two 1×1×C channel attention maps. To reduce the number of parameters, the number of hidden-layer neurons is C/r, where r is called the reduction (compression) ratio. The corresponding elements of the two channel attention maps produced by the MLP are added and passed through an activation to obtain the final output channel attention map Mc with dimensions 1×1×C.
When the spatial attention map is generated using the spatial relationships of the features, the channel-attention-refined feature map F' is input. Max pooling and average pooling are applied to F' along the channel direction, yielding two feature maps F_avg and F_max, each of dimension 1×H×W, and the two maps are concatenated along the channel dimension to obtain a stitched feature map. A convolutional layer with a 7 × 7 kernel is then applied to the stitched feature map to generate the spatial attention map Ms.
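The channel-attention and spatial-attention computations described above can be sketched as follows in PyTorch. The class names and the default reduction ratio are assumptions; the shared MLP, the dual average/max pooling, the 7 × 7 convolution and the sigmoid activations follow the description above.

```python
# Illustrative CBAM-style attention units (channel first, then spatial), following the
# description above; class names and the reduction-ratio default are assumptions.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(                       # shared MLP with one hidden layer (C/r)
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))   # F_avg branch
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))    # F_max branch
        return torch.sigmoid(avg + mx)                  # Mc, shape (N, C, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = torch.mean(x, dim=1, keepdim=True)        # channel-wise average, (N, 1, H, W)
        mx, _ = torch.max(x, dim=1, keepdim=True)       # channel-wise max, (N, 1, H, W)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # Ms, (N, 1, H, W)

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel(x)                         # channel-refined feature map F'
        return x * self.spatial(x)                      # spatially refined output
```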
It should be noted that, the method for generating a feature map through a channel attention layer and a spatial attention layer is only used as an example to explain the present invention, and in the practical application process of the image segmentation method provided by the present invention, the specific structural details of the channel attention layer and the spatial attention layer of the model may also be adaptively adjusted according to the practical requirements, which is not limited by the present invention.
In the image segmentation method provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network. The original structure of the residual network is not damaged, the number of parameters is effectively reduced, and an existing pre-trained model can be reused, so that the probability of segmentation errors is effectively reduced at little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented region are alleviated.
Optionally, according to the image segmentation method provided by the present invention, the second attention layer includes: a second channel attention layer and a second spatial attention layer and a second fusion layer;
correspondingly, the inputting the second feature map into the second attention layer and determining a target feature map based on the attention mechanism specifically include:
inputting the second feature map into the second channel attention layer, and determining a second channel attention map;
inputting the second channel attention map into the second spatial attention, determining a second spatial attention map;
and inputting the second channel attention diagram and the second space attention diagram into the second fusion layer, and determining a target feature diagram.
Specifically, the convolution attention mechanism module is arranged with the channel unit first and the spatial unit second. The second attention layer includes: a second channel attention layer (i.e., the channel unit of the convolution attention mechanism module), a second spatial attention layer (i.e., the spatial unit of the convolution attention mechanism module), and a second fusion layer. The output of the convolutional layer first passes through the second channel attention layer to obtain a weighted result, then through the second spatial attention layer, and finally fusion weighting is applied to obtain the target feature map.
The core idea of the channel attention layer is to generate a channel attention map by exploiting the inter-channel relationships of the features. The second feature map is input into the second channel attention layer to obtain the final second channel attention map Mc'.
The core idea of the spatial attention layer is to generate a spatial attention map by using spatial relations among features. Inputting a second channel attention map Mc '(a feature map subjected to channel attention thinning) into a second spatial attention layer, and determining a second spatial attention map Ms';
and inputting the second channel attention diagram Mc 'and the second spatial attention diagram Ms' into the second fusion layer for weighting, and determining the target feature diagram.
It can be understood that the output of the invention can be expressed by a formula as F_out = CBAM_back(Res(CBAM_front(F_in))): two CBAM modules are disposed before and after the residual module, and the output is obtained after processing by the front CBAM module, the residual module, and the rear CBAM module in sequence.
It should be noted that, the setting rule of the second channel attention layer and the second spatial attention layer is completely the same as the setting rule of the first channel attention layer and the first spatial attention layer, and the specific characteristic diagram generating method thereof may refer to the explanation of the method for generating the characteristic diagram through the channel attention layer and the spatial attention layer.
In the image segmentation method provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network. The original structure of the residual network is not damaged, the number of parameters is effectively reduced, and an existing pre-trained model can be reused, so that the probability of segmentation errors is effectively reduced at little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented region are alleviated.
Optionally, according to the image segmentation method provided by the present invention, before the step of inputting the image to be segmented into the image segmentation model and obtaining the segmented target image output by the image segmentation model, the method further includes: training the image segmentation model;
the training of the image segmentation model specifically includes:
training the image segmentation model using the sample image set;
based on a preset loss function, obtaining a gradient for a network parameter of the image segmentation model by using a back propagation algorithm; the preset loss function consists of a cross entropy loss function and a dice loss function;
updating the network parameters of the image segmentation model based on the gradient, and performing iterative training on the image segmentation model based on the updated network parameters until the image segmentation model converges.
Specifically, before using the image segmentation model, the image segmentation model needs to be trained in advance, and the step of training the image segmentation model specifically includes:
and training an image segmentation model by using the sample image set, setting corresponding labels for all images in the sample image set, and performing image segmentation. And inputting the segmented sample image into an image segmentation network for deep learning.
During iterative learning, gradients of the network parameters of the image segmentation model are obtained with a back-propagation algorithm based on the preset loss function, the network parameters are updated along those gradients, and the image segmentation model is trained iteratively with the updated parameters until it converges.
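For illustration only, a minimal training loop along these lines might look like the sketch below; the optimizer choice, learning rate, and number of epochs are assumptions, not details specified by this disclosure.

```python
# Illustrative training loop sketch; the optimizer, learning rate, epoch count and data
# loader are assumptions, not specified by this disclosure.
import torch

def train(model, loader, loss_fn, epochs: int = 50, lr: float = 1e-3, device: str = "cpu"):
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        model.train()
        for images, masks in loader:                   # masks: per-pixel class labels
            images, masks = images.to(device), masks.to(device)
            logits = model(images)
            loss = loss_fn(logits, masks)              # e.g. cross-entropy + Dice (see below)
            optimizer.zero_grad()
            loss.backward()                            # back-propagation yields the gradients
            optimizer.step()                           # gradient update of the parameters
```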
It should be noted that, because class imbalance is common in medical images, training may be dominated by the class with more pixels, making it hard for the network to learn the features of smaller objects and thereby reducing its effectiveness.
In the invention, the preset loss function is composed of a cross-entropy loss (CE Loss) and a Dice loss (Dice Loss). The Dice loss is added as a regularization term and summed, with weighting, with the cross-entropy loss, which effectively suppresses the problems caused by class imbalance in the sample data set and effectively improves image segmentation precision.
The Dice coefficient originates from binary classification and essentially measures the overlap of two samples. The index ranges from 0 to 1, where 1 indicates complete overlap. The factor 2 in the numerator of Dice exists because the denominator counts the elements common to the two sets twice.
To obtain a loss function that can be minimized, 1 − Dice is simply used. This loss function is called the Soft Dice Loss, and it uses the prediction probabilities directly instead of thresholding them or converting them to binary masks.
For each class of mask, the Dice loss is calculated as Dice Loss = 1 − (2 · Σ_pixels y_pred · y_true) / (Σ_pixels y_pred + Σ_pixels y_true), where y_pred is the predicted value of the sample, y_true is the ground-truth value of the sample, and Σ_pixels denotes a pixel-wise summation. The Dice losses of the individual classes are summed and averaged to obtain the final Dice loss.
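A minimal sketch of such a combined loss, assuming logits of shape (N, C, H, W) and integer label maps of shape (N, H, W), is given below; the relative weight of the two terms and the smoothing constant are illustrative assumptions.

```python
# Illustrative combined loss: cross-entropy plus soft Dice as a regularizing term.
# The per-term weight and the smoothing constant are assumptions.
import torch
import torch.nn.functional as F

def soft_dice_loss(logits: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """logits: (N, C, H, W); target: (N, H, W) integer class labels."""
    num_classes = logits.shape[1]
    probs = torch.softmax(logits, dim=1)                 # prediction probabilities, no threshold
    onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    inter = (probs * onehot).sum(dim=(0, 2, 3))          # pixel-wise sums per class
    union = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    dice = (2 * inter + eps) / (union + eps)
    return (1 - dice).mean()                             # average the per-class Dice losses

def combined_loss(logits, target, dice_weight: float = 1.0):
    return F.cross_entropy(logits, target) + dice_weight * soft_dice_loss(logits, target)
```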
In the image segmentation method provided by the invention, the image to be segmented is input into the image segmentation model, the target feature map corresponding to the image to be segmented is determined based on the attention mechanism and the basic-feature-extraction residual network, and the segmented target image is determined based on the target feature map and the pyramid scene analysis network. Important features in the feature map to be identified are emphasized and unnecessary features are suppressed through the attention mechanism and the residual network, and the ability to capture global features is improved through the pyramid scene analysis network. The original structure of the residual network is not damaged, the number of parameters is effectively reduced, and an existing pre-trained model can be reused, so that the probability of segmentation errors is effectively reduced at little additional cost, the segmentation precision of the model is improved, and blurred edges of the segmented region are alleviated.
The model of the invention is explained below with reference to the specific image processing steps and the changes in image dimensions:
a picture with an input dimension of (H × W × C) is, for example, input dimensions of: 768 × 1024 × 3. The image is resized without distortion to obtain an image with dimensions 354 × 473 × 3. The obtained image was filled with gray bars into a square image, and an image having dimensions of 473 × 473 × 3 was obtained.
After activation(normalization(convolution(3 → 64))), with ReLU as the activation, an image of dimensions 1 × 64 × 237 × 237 is obtained. After activation(normalization(convolution(64 → 128))), an image of dimensions 1 × 128 × 237 × 237 is obtained; after the maximum pooling layer (maxpool), an image of dimensions 1 × 128 × 119 × 119 is obtained.
The obtained image is input into the channel attention module. After convolution(activation(convolution(mean pooling(x)))), a feature map of dimensions 1 × 128 × 1 × 1 is obtained; after convolution(activation(convolution(max pooling(x)))), another feature map of dimensions 1 × 128 × 1 × 1 is obtained. The two are added and passed through a sigmoid activation function to output a channel attention map Mc of dimensions 1 × 128 × 1 × 1.
The feature map Mc × x (dimensions 1 × 128 × 119 × 119) is input into the spatial attention module. After taking the mean along the channel direction (avg), a feature map of dimensions 1 × 1 × 119 × 119 is obtained; after taking the maximum along the channel direction (max), another feature map of dimensions 1 × 1 × 119 × 119 is obtained. The two are concatenated into a feature map of dimensions 1 × 2 × 119 × 119, whose dimensions are adjusted by a convolution to give a 1 × 1 × 119 × 119 feature map. Passing this through the sigmoid activation function yields the output spatial attention map Ms of dimensions 1 × 1 × 119 × 119.
The feature map of Ms × X (dimension 1 × 128 × 119 × 119) is used as the next Layer input, and the output is input into the channel attention module after being processed by Layer1 (dimension 1 × 256 × 119 × 119), Layer2 (dimension 1 × 512 × 60 × 60), Layer3 (dimension 1 × 1024 × 30 × 30), and Layer4 (dimension 1 × 2048 × 30 × 30).
After convolution(activation(convolution(mean pooling(x)))) in the channel attention module, a feature map of dimensions 1 × 2048 × 1 × 1 is obtained; after convolution(activation(convolution(max pooling(x)))), another feature map of dimensions 1 × 2048 × 1 × 1 is obtained. The two are added and passed through a sigmoid activation function to output a channel attention map Mc1 of dimensions 1 × 2048 × 1 × 1.
The feature map Mc1 × X (dimensions 1 × 2048 × 30 × 30) is input into the spatial attention module. After taking the mean along the channel direction (avg), a feature map of dimensions 1 × 1 × 30 × 30 is obtained; after taking the maximum along the channel direction (max), another feature map of dimensions 1 × 1 × 30 × 30 is obtained. The two are concatenated into a feature map of dimensions 1 × 2 × 30 × 30, whose dimensions are adjusted by a convolution to give a 1 × 1 × 30 × 30 feature map. Passing this through the sigmoid activation function yields the output spatial attention map Ms of dimensions 1 × 1 × 30 × 30.
Semantic segmentation is then performed based on the pyramid scene analysis network, with the channel numbers adapted accordingly (for example, 320 is replaced by 2048 and 80 by 512, so that a feature map of dimensions [1, 2048, 30, 30] is processed), and the resulting output is [1, 3, 473, 473]. The final output is a (3, 473, 473) prediction, from which each pixel point is then classified.
It should be noted that the above specific application process of the image segmentation method is given only as a concrete example to help explain the present invention; in practice, the parameters and specific structural details of the model may be adaptively adjusted according to actual requirements, which is not limited by the present invention.

Fig. 5 is a schematic structural diagram of an image segmentation system provided by the present invention. As shown in fig. 5, the present invention further provides an image segmentation system, which includes: an image segmentation unit;
the image segmentation unit 510 is configured to input an image to be segmented into an image segmentation model, so as to obtain a segmented target image output by the image segmentation model;
the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
Specifically, the image segmentation unit 510 is configured to input an image to be segmented into a pre-trained image segmentation model, where the image segmentation model is configured to determine a target feature map corresponding to the image to be segmented based on an attention mechanism and a Residual Network (ResNet), and determine and output the segmented target image based on the target feature map and a Pyramid Scene Parsing Network (PSPNet).
For example: fig. 2 is an ultrasonic image of hydronephrosis of a patient according to the present invention, and fig. 3 is the target image obtained by segmenting that ultrasonic image. The ultrasonic image of the patient's hydronephrosis (fig. 2) is collected as the image to be segmented and input into the pre-trained kidney ultrasonic image segmentation model, and the segmented target image is determined. As shown in fig. 3, the segmented target image separates the water accumulation area from the kidney area: the green area is the water accumulation area and the red area is the kidney area. The contours of the water accumulation area and the kidney area are divided effectively.
Before the image to be segmented is segmented, the image segmentation model needs to be trained in advance. The image segmentation model is obtained by training on a sample image set: each image in the sample training set is assigned a corresponding label, and the model is trained based on the labels and the images. The choice of the specific sample data set, the setting of the labels and the sample training method may be adjusted according to the actual situation, which is not limited by the present invention.
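For illustration only, a training loop along these lines could combine cross entropy and Dice terms as sketched below; the optimizer, learning rate, epoch count and equal weighting of the two loss terms are assumptions rather than values given in the text:

```python
import torch
import torch.nn.functional as F

def dice_loss(logits, target, num_classes=3, eps=1e-6):
    """Soft Dice loss averaged over classes; target holds integer class labels per pixel."""
    probs = logits.softmax(dim=1)
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
    return 1.0 - ((2 * inter + eps) / (denom + eps)).mean()

def train(model, loader, epochs=50, lr=1e-3, device="cuda"):
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, labels in loader:            # labels: N x H x W integer masks
            images, labels = images.to(device), labels.to(device)
            logits = model(images)               # N x num_classes x H x W
            loss = F.cross_entropy(logits, labels) + dice_loss(logits, labels)
            optimizer.zero_grad()
            loss.backward()                      # back propagation computes the gradients
            optimizer.step()                     # update the network parameters
```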
Secondly, in this scheme, the network structure of the model can be improved on the basis of the existing residual error network and pyramid scene analysis network by adding an attention mechanism. Other adjustments can be made according to actual needs, which are not limited by the present invention.
In addition, it is to be understood that the application of the present invention is not limited to the segmentation of hydronephrosis ultrasonic images; it may also be applied to the segmentation of other medical images (e.g., CT, MR, PET) to aid in the diagnosis of other conditions or the detection of physiological changes. By adjusting the sample image set, the method may likewise be adapted to fields other than the medical field, and the present invention is not limited in this respect.
The image segmentation system inputs an image to be segmented into an image segmentation model, determines a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network for basic feature extraction, and determines the segmented target image based on the target feature map and a pyramid scene analysis network. The attention mechanism and the residual error network focus on the important features in the feature map to be identified and suppress unnecessary features, while the pyramid scene analysis network improves the ability to capture global features. With very little added overhead, this reduces the probability of segmentation errors, improves the segmentation precision of the model, and alleviates blurring at the edges of the segmented region.
It should be noted that the image segmentation system provided in the embodiment of the present invention is used for executing the image segmentation method described above; its specific implementation is consistent with that of the method and is not described here again.
Fig. 6 is a schematic physical structure diagram of an electronic device provided by the present invention. As shown in fig. 6, the electronic device may include: a processor 610, a communication interface 620, a memory 630 and a communication bus 640, where the processor 610, the communication interface 620 and the memory 630 communicate with each other through the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform the image segmentation method described above, including: inputting an image to be segmented into an image segmentation model to obtain the segmented target image output by the image segmentation model; the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
In addition, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
In another aspect, an embodiment of the present invention further provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, enable the computer to execute the image segmentation method provided by the above method embodiments, the method including: inputting an image to be segmented into an image segmentation model to obtain the segmented target image output by the image segmentation model; the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
In yet another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the image segmentation method provided in the foregoing embodiments, the method including: inputting an image to be segmented into an image segmentation model to obtain the segmented target image output by the image segmentation model; the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
The above-described embodiments of the apparatus are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. An image segmentation method, comprising:
inputting an image to be segmented into an image segmentation model to obtain a segmented target image output by the image segmentation model;
the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
2. The image segmentation method according to claim 1, wherein the image segmentation model comprises: a feature extraction layer and an image segmentation layer;
correspondingly, the inputting of the image to be segmented into the image segmentation model to obtain the segmented target image output by the image segmentation model specifically comprises:
inputting the image to be segmented into the feature extraction layer, and determining a target feature map corresponding to the image to be segmented based on the attention mechanism and the residual error network;
inputting the target feature map into the image segmentation layer, determining global features aggregated by the feature pyramid based on the pyramid scene analysis network, and determining the segmented target image based on the aggregated global features.
3. The image segmentation method according to claim 2, wherein the feature extraction layer includes: a first attention layer, a depth residual layer, and a second attention layer;
correspondingly, the inputting the image to be segmented into the feature extraction layer, and determining the target feature map corresponding to the image to be segmented based on the attention mechanism and the residual error network specifically include:
inputting the image to be segmented into a first attention layer, and determining a first feature map based on the attention mechanism;
inputting the first feature map into the depth residual layer, and determining a second feature map based on a residual network;
inputting the second feature map into the second attention layer, and determining a target feature map based on the attention mechanism.
4. The image segmentation method according to any one of claims 1 to 3, wherein the first attention layer and the second attention layer each include: a convolution attention mechanism module.
5. The image segmentation method according to claim 3,
the first attention layer includes: a first channel attention layer, a first spatial attention layer, and a first fusion layer;
correspondingly, the inputting the image to be segmented into a first attention layer, and determining a first feature map based on the attention mechanism specifically includes:
inputting the image to be segmented into the first channel attention layer, and determining a first channel attention map;
inputting the first channel attention map into the first spatial attention layer, determining a first spatial attention map;
inputting the first channel attention map and the first spatial attention map into the first fusion layer, and determining a first feature map.
6. The image segmentation method according to claim 3,
the second attention layer includes: a second channel attention layer, a second spatial attention layer, and a second fusion layer;
correspondingly, the inputting the second feature map into the second attention layer and determining a target feature map based on the attention mechanism specifically include:
inputting the second feature map into the second channel attention layer, and determining a second channel attention map;
inputting the second channel attention map into the second spatial attention layer, and determining a second spatial attention map;
inputting the second channel attention map and the second spatial attention map into the second fusion layer, and determining a target feature map.
7. The image segmentation method according to any one of claims 1 to 3, wherein before the inputting of the image to be segmented into the image segmentation model to obtain the segmented target image output by the image segmentation model, the method further comprises: training the image segmentation model;
the training of the image segmentation model specifically includes:
training the image segmentation model using the sample image set;
obtaining gradients of the network parameters of the image segmentation model by using a back propagation algorithm based on a preset loss function, wherein the preset loss function consists of a cross entropy loss function and a Dice loss function;
updating the network parameters of the image segmentation model based on the gradient, and performing iterative training on the image segmentation model based on the updated network parameters until the image segmentation model converges.
8. An image segmentation system, comprising:
the image segmentation unit is used for inputting an image to be segmented into an image segmentation model to obtain a segmented target image output by the image segmentation model;
the image segmentation model is obtained by training a sample image set; the image segmentation model is used for determining a target feature map corresponding to the image to be segmented based on an attention mechanism and a residual error network, and determining the segmented target image based on the target feature map and a pyramid scene analysis network.
9. An electronic device, comprising a memory and a processor, wherein the processor and the memory communicate with each other via a bus; the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the image segmentation method of any of claims 1 to 7.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the image segmentation method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110600755.5A CN113409326B (en) | 2021-05-31 | 2021-05-31 | Image segmentation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113409326A true CN113409326A (en) | 2021-09-17 |
CN113409326B CN113409326B (en) | 2024-07-16 |
Family
ID=77675455
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110600755.5A Active CN113409326B (en) | 2021-05-31 | 2021-05-31 | Image segmentation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113409326B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109872306A (en) * | 2019-01-28 | 2019-06-11 | 腾讯科技(深圳)有限公司 | Medical image cutting method, device and storage medium |
CN110163878A (en) * | 2019-05-28 | 2019-08-23 | 四川智盈科技有限公司 | A kind of image, semantic dividing method based on dual multiple dimensioned attention mechanism |
CN110188765A (en) * | 2019-06-05 | 2019-08-30 | 京东方科技集团股份有限公司 | Image, semantic parted pattern generation method, device, equipment and storage medium |
CN111832570A (en) * | 2020-07-02 | 2020-10-27 | 北京工业大学 | Image semantic segmentation model training method and system |
Also Published As
Publication number | Publication date |
---|---|
CN113409326B (en) | 2024-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109584254B (en) | Heart left ventricle segmentation method based on deep full convolution neural network | |
CN111784671B (en) | Pathological image focus region detection method based on multi-scale deep learning | |
CN113516659B (en) | Medical image automatic segmentation method based on deep learning | |
KR20230059799A (en) | A Connected Machine Learning Model Using Collaborative Training for Lesion Detection | |
CN115578404B (en) | Liver tumor image enhancement and segmentation method based on deep learning | |
CN110648331B (en) | Detection method for medical image segmentation, medical image segmentation method and device | |
CN111091575B (en) | Medical image segmentation method based on reinforcement learning method | |
CN110570394A (en) | medical image segmentation method, device, equipment and storage medium | |
CN112750137A (en) | Liver tumor segmentation method and system based on deep learning | |
CN117115184A (en) | Training method and segmentation method of medical image segmentation model and related products | |
CN117437423A (en) | Weak supervision medical image segmentation method and device based on SAM collaborative learning and cross-layer feature aggregation enhancement | |
CN112101456A (en) | Attention feature map acquisition method and device and target detection method and device | |
CN113313728B (en) | Intracranial artery segmentation method and system | |
CN112651960A (en) | Image processing method, device, equipment and storage medium | |
CN111126424A (en) | Ultrasonic image classification method based on convolutional neural network | |
CN113327221B (en) | Image synthesis method, device, electronic equipment and medium for fusing ROI (region of interest) | |
CN116580187A (en) | Knee joint image segmentation method and device based on artificial intelligence and electronic equipment | |
CN113409326B (en) | Image segmentation method and system | |
CN113379770B (en) | Construction method of nasopharyngeal carcinoma MR image segmentation network, image segmentation method and device | |
CN117392040A (en) | Standard section identification method, system, device and storage medium | |
CN113239978B (en) | Method and device for correlation of medical image preprocessing model and analysis model | |
US12093831B2 (en) | Image processing apparatus, neural network training method, and image processing method | |
CN117274282B (en) | Medical image segmentation method, system and equipment based on knowledge distillation | |
CN115239962B (en) | Target segmentation method and device based on deep large receptive field space attention | |
CN113421270B (en) | Method, system, device, processor and storage medium for realizing medical image domain adaptive segmentation based on single-center calibration data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||