CN115564953A - Image segmentation method, device, equipment and storage medium - Google Patents

Image segmentation method, device, equipment and storage medium

Info

Publication number
CN115564953A
Authority
CN
China
Prior art keywords
pulse
image segmentation
layer
feature map
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211161006.8A
Other languages
Chinese (zh)
Inventor
蒿杰
周怡
孙亚强
赵美花
许天赐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xintiao Technology Guangzhou Co ltd
Guangdong Institute of Artificial Intelligence and Advanced Computing
Original Assignee
Xintiao Technology Guangzhou Co ltd
Guangdong Institute of Artificial Intelligence and Advanced Computing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xintiao Technology Guangzhou Co ltd, Guangdong Institute of Artificial Intelligence and Advanced Computing filed Critical Xintiao Technology Guangzhou Co ltd
Priority to CN202211161006.8A priority Critical patent/CN115564953A/en
Publication of CN115564953A publication Critical patent/CN115564953A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image segmentation method, an image segmentation device, image segmentation equipment and a storage medium, wherein the method comprises the following steps: acquiring a three-dimensional image to be segmented; performing pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; and inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model. The image segmentation pulse model is obtained by performing multi-scale feature fusion training on a pulse sequence sample formed by pulse coding an original three-dimensional image and an image segmentation label of the original three-dimensional image. According to the method, the pulse sequence of the three-dimensional image to be segmented is segmented and predicted by the pulse model; because neurons in the pulse model are in an active state only when receiving or emitting spike signals, the time consumption of deep-learning neural networks can be reduced, and small targets in feature maps of different scales can be accurately segmented by the image segmentation pulse model obtained through multi-scale feature fusion training, so that the efficiency and the accuracy of three-dimensional image segmentation are improved.

Description

Image segmentation method, device, equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image segmentation method, an image segmentation apparatus, an image segmentation device, and a storage medium.
Background
Three-dimensional volume image segmentation in medical images is mainly used for segmenting organs, tumors, blood vessels and other regions in three-dimensional images, which is of great help for disease diagnosis, monitoring and formulating corresponding treatment plans.
Currently, image segmentation of three-dimensional volume images in medical images is mainly performed through manual segmentation and CNN-based deep learning network models. However, the manual segmentation process is tedious, time-consuming and labor-intensive, and easily introduces human segmentation errors, which results in low image segmentation accuracy; in addition, semantic segmentation of three-dimensional images through CNN deep learning network models is inefficient due to the redundancy of the deep learning network models.
Disclosure of Invention
The invention provides an image segmentation method, an image segmentation device, image segmentation equipment and a storage medium, and aims to solve the technical problems of low efficiency and accuracy of three-dimensional image segmentation.
The invention provides an image segmentation method, which comprises the following steps:
acquiring a three-dimensional image to be segmented;
carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
Optionally, according to an image segmentation method provided by the present invention, the image segmentation pulse model includes an encoding module, a decoding module and a segmentation output module, where:
the coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer;
the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer;
the output of the last decoding unit is used as the input of the segmentation output module.
Optionally, according to an image segmentation method provided by the present invention, the inputting the pulse sequence to an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model includes:
inputting the pulse sequence into a first pulse convolution layer in the first coding unit to obtain a coding feature map output by the first pulse convolution layer;
performing downsampling processing on the coding feature map through a pulse downsampling layer in a first coding unit to obtain the downsampling feature map, and taking the downsampling feature map as the input of a next coding unit until a coding feature map output by a first pulse convolution layer in a last coding unit is obtained;
fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit which is at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
performing upsampling processing on the coding feature map output by the last coding unit through a pulse upsampling layer in the first decoding unit to obtain a first upsampling feature map;
performing feature splicing on the first fused feature map and the first up-sampling feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
performing convolution processing on the first splicing feature map through a second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of a next decoding unit;
fusing the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit which is at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
performing up-sampling processing on the decoding feature map output by the first decoding unit through a pulse up-sampling layer in the next decoding unit to obtain a second up-sampling feature map;
performing feature splicing on the second fused feature map and the second up-sampling feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
performing convolution processing on the second spliced feature map through a second pulse convolution layer in the next decoding unit to obtain a decoding feature map output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map, until the decoding feature map output by the last decoding module is obtained;
and inputting the decoding feature map of the last decoding module into the segmentation output module to obtain an image segmentation result output by the segmentation output module.
Optionally, according to an image segmentation method provided by the present invention, the first pulse convolution layer includes a first convolution layer, a first normalization layer, and a first pulse emission layer that are cascaded;
the pulse down-sampling layer comprises a second convolution layer and a second pulse transmitting layer which are cascaded;
the pulse up-sampling layer comprises a cascaded deconvolution layer and a third pulse emission layer;
the second pulse convolution layer comprises a third convolution layer, a second normalization layer and a fourth pulse emission layer which are cascaded.
Optionally, according to an image segmentation method provided by the present invention, the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer, and a feature dot-product layer;
the obtaining of the fused feature map by fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through the multi-scale feature fusion layer in the first decoding unit includes:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature maps output by the coding units at the same depth level through a fifth convolution layer to obtain a second convolution feature map;
adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map;
performing convolution processing on the target feature map through the third pulse convolution layer to obtain an attention feature map;
and performing dot product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot-product layer to obtain the fusion feature map.
Optionally, according to an image segmentation method provided by the present invention, the image segmentation pulse model is obtained by training based on the following steps:
acquiring a plurality of original three-dimensional images subjected to data enhancement processing;
respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples corresponding to preset time step lengths;
and performing iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
Optionally, according to an image segmentation method provided by the present invention, the performing multi-scale feature fusion iterative training on a pulse model to be trained based on each pulse sequence sample and an image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model includes:
inputting the pulse sequence sample into the pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained;
calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label;
and updating parameters of the pulse model to be trained by using a gradient substitution algorithm based on the model loss value obtained by each iteration to obtain the image segmentation pulse model.
The present invention also provides an image segmentation apparatus, comprising:
the acquisition module is used for acquiring a three-dimensional image to be segmented;
the pulse coding module is used for carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
the image segmentation module is used for inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by carrying out multi-scale feature fusion training on the basis of a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the image segmentation method as described in any of the above when executing the program.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements an image segmentation method as described in any of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the image segmentation method as described in any one of the above.
According to the image segmentation method, device, equipment and storage medium, the three-dimensional image to be segmented is coded to form a pulse sequence, and the pulse sequence is segmented and predicted by using a pulse model. Because neurons in the pulse model are in an active state only when receiving or sending spike signals, the time consumption of a common deep learning neural network is greatly reduced and the efficiency of three-dimensional image segmentation is improved; moreover, the image segmentation pulse model obtained through multi-scale feature fusion training can accurately segment small targets in feature maps of different scales, so that the accuracy of three-dimensional image segmentation is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of an image segmentation method provided by the present invention;
FIG. 2 is a schematic structural diagram of an image segmentation pulse model provided by the present invention;
FIG. 3 is a schematic structural diagram of a multi-scale feature fusion layer in an image segmentation pulse model provided by the present invention;
FIG. 4 is a schematic structural diagram of an image segmentation apparatus provided in the present invention;
fig. 5 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
The terminology used in the one or more embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the invention. As used in one or more embodiments of the present invention, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present invention refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used herein to describe various information in one or more embodiments of the present invention, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present invention. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining", depending on the context.
Exemplary embodiments of the present invention will be described in detail below with reference to fig. 1 to 3.
Fig. 1 is a schematic flow chart of an image segmentation method provided by the present invention, and as shown in fig. 1, the image segmentation method includes:
step 11, acquiring a three-dimensional image to be segmented;
it should be noted that the three-dimensional image to be segmented is a 3D medical image, and the dimensions of the three-dimensional image to be segmented include the depth, length, width, and number of channels of the image.
Step 12, performing pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
It should be noted that the pulse coding process includes Poisson coding; Poisson coding is rate-based coding and can encode an image into a discrete pulse sequence.
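For illustration only (not part of the patent disclosure), a minimal Poisson (Bernoulli rate) coding sketch is given below; the tensor layout, the number of time steps, and the helper name are assumptions.

```python
import torch

def poisson_encode(image: torch.Tensor, num_steps: int) -> torch.Tensor:
    """Rate-based Poisson coding sketch: the normalized voxel intensity is used as
    the firing probability at each time step, yielding a binary pulse sequence.

    image: [C, D, H, W] three-dimensional volume with values in [0, 1]
    returns: [T, C, D, H, W] pulse sequence
    """
    rates = image.clamp(0.0, 1.0).unsqueeze(0).expand(num_steps, *image.shape)
    return torch.bernoulli(rates)  # 1 where a pulse is emitted, 0 otherwise

# Illustrative usage with an assumed single-channel 128^3 volume and T = 4 time steps
volume = torch.rand(1, 128, 128, 128)
spike_seq = poisson_encode(volume, num_steps=4)  # shape [4, 1, 128, 128, 128]
```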
Step 13, inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by carrying out multi-scale feature fusion training on the basis of a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
Specifically, the pulse sequence is input to the image segmentation pulse model, and an image segmentation result is obtained according to an output result of the image segmentation pulse model. The image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image. It can be understood that the image segmentation pulse model can effectively segment the three-dimensional image to be segmented after being trained to obtain the image segmentation result of the three-dimensional image to be segmented.
It should be noted that the image segmentation pulse model is based on the encoder-decoder architecture of the 3D U-Net model, and the image segmentation pulse model is a pulse neural network model based on the form of biological neurons; the image segmentation pulse model includes an encoding module, a decoding module, and a segmentation output module.
The coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer; the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer; the output of the last decoding unit is used as the input of the segmentation output module.
As shown in fig. 2, the encoding module includes 4 encoding units and the decoding module includes 3 decoding units. The encoding feature map output by the fourth encoding unit is directly used as an input of the first decoding unit, and each encoding unit is connected to the multi-scale feature fusion layer in the decoding unit at the same depth level, wherein the third encoding unit and the first decoding unit are at the same depth level, the second encoding unit and the second decoding unit are at the same depth level, and the first encoding unit and the third decoding unit are at the same depth level.
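For illustration only, a minimal PyTorch-style sketch of this four-encoder / three-decoder layout is given below. The channel counts, kernel sizes, the hard-threshold `SpikeAct` stand-in for the LIF neurons, and the use of plain skip concatenation in place of the multi-scale feature fusion layer are all simplifying assumptions, and the temporal dynamics over the T time steps are omitted.

```python
import torch
import torch.nn as nn

class SpikeAct(nn.Module):
    """Hard-threshold stand-in for the pulse emission (LIF) layer."""
    def forward(self, x):
        return (x >= 1.0).float()

def spike_conv(in_ch, out_ch):   # stand-in for the first/second pulse convolution layer
    return nn.Sequential(nn.Conv3d(in_ch, out_ch, 3, padding=1),
                         nn.BatchNorm3d(out_ch), SpikeAct())

def spike_down(ch):              # stand-in for the pulse down-sampling layer
    return nn.Sequential(nn.Conv3d(ch, ch, 2, stride=2), SpikeAct())

def spike_up(in_ch, out_ch):     # stand-in for the pulse up-sampling layer
    return nn.Sequential(nn.ConvTranspose3d(in_ch, out_ch, 2, stride=2), SpikeAct())

class SpikingSegNet(nn.Module):
    """4 encoding units / 3 decoding units as in fig. 2; the multi-scale feature
    fusion layer is simplified to plain skip concatenation in this sketch."""
    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        chs = [32, 64, 128, 256]
        self.enc = nn.ModuleList([spike_conv(in_ch if i == 0 else chs[i - 1], c)
                                  for i, c in enumerate(chs)])
        self.down = nn.ModuleList([spike_down(c) for c in chs[:-1]])   # no down-sampling after the last encoder
        self.up = nn.ModuleList([spike_up(chs[i + 1], chs[i]) for i in (2, 1, 0)])
        self.dec = nn.ModuleList([spike_conv(chs[i] * 2, chs[i]) for i in (2, 1, 0)])
        self.head = nn.Conv3d(chs[0], num_classes, 1)

    def forward(self, x):
        skips = []
        for i, enc in enumerate(self.enc):
            x = enc(x)
            skips.append(x)
            if i < len(self.down):
                x = self.down[i](x)
        for up, dec, skip in zip(self.up, self.dec, reversed(skips[:-1])):
            x = dec(torch.cat([up(x), skip], dim=1))   # same-depth encoder output joins the decoder
        return self.head(x)
```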
Specifically, the pulse sequence is input to the first pulse convolution layer in the first coding unit, and convolution processing is performed on the pulse sequence by using the first pulse convolution layer to obtain a convolution feature map; the convolution feature map is further input to the pulse down-sampling layer in the first coding unit to obtain a down-sampling feature map output by the pulse down-sampling layer, wherein the down-sampling feature map output by the previous coding unit is used as the input of the next coding unit until the coding feature map output by the last coding unit is obtained. Further, in the first decoding unit, the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit are used as the inputs of the first decoding unit; fusion processing is performed on these two coding feature maps through the multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map; up-sampling processing is performed on the coding feature map output by the last coding unit through the pulse up-sampling layer in the first decoding unit to obtain a first up-sampling feature map; feature splicing is performed on the first fusion feature map and the first up-sampling feature map through the feature connection layer in the first decoding unit to obtain a first spliced feature map; and after the spliced feature map output by the feature connection layer is obtained, convolution processing is performed on it through the second pulse convolution layer in the first decoding unit to obtain a decoding feature map. The decoding feature map output by the previous decoding unit is then used as the input of the next decoding unit, and the process is repeated until the decoding feature map output by the last decoding unit is obtained; the decoding feature map output by the last decoding unit is further input into the segmentation output module to obtain the image segmentation result output by the segmentation output module.
According to the above scheme, the three-dimensional image to be segmented is coded to form a pulse sequence, and the pulse sequence is segmented and predicted by using the pulse model. Because neurons in the pulse model are in an active state only when receiving or sending spike signals, the time consumption of a common deep learning neural network is greatly reduced and the efficiency of segmenting three-dimensional images is improved; in addition, the image segmentation pulse model obtained by multi-scale feature fusion training can accurately segment small targets in feature maps of different scales, so that the accuracy of segmenting three-dimensional images is improved.
In one embodiment, step 13 of inputting the pulse sequence into the image segmentation pulse model to obtain the image segmentation result output by the image segmentation pulse model includes:
step 131, inputting the pulse sequence into the first pulse convolution layer in the first coding unit to obtain a coding characteristic diagram output by the first pulse convolution layer;
specifically, the pulse sequence is input to a first pulse convolution layer in the first coding unit, and the first pulse convolution layer performs convolution processing on the pulse sequence to obtain a coding characteristic diagram output by the first pulse convolution layer.
It should be noted that the first pulse convolution layer includes a first convolution layer, a first normalization layer, and a first pulse emission layer that are cascaded; preferably, the number of first pulse convolution layers is 2. The first convolution layer is a 3D convolution layer, the first normalization layer is a BatchNorm normalization layer, and the first pulse emission layer is a parametric LIF neuron model established based on a neuron dynamics equation. The LIF neuron model accumulates membrane potential in an integral manner, and the membrane potential does not decay exponentially over time when there is no input; when the accumulated charge reaches a certain degree, that is, when the membrane potential reaches a preset threshold, the neuron generates and emits a pulse, and the membrane potential is then reset.
The neuron dynamics equation is specifically as follows:

$V[t] = V[t-1] + \frac{1}{\tau}\left(X[t] - \left(V[t-1] - V_{reset}\right)\right)$

wherein,

$\frac{1}{\tau} = \mathrm{Sigmoid}(w) = \frac{1}{1 + e^{-w}}$

w is a learnable parameter, $V[t]$ represents the membrane potential of the LIF neuron model at time t, $X[t]$ represents the input of the LIF neuron model at time t, and $V_{reset}$ represents the reset membrane potential.
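For illustration only, a minimal stateful implementation of a neuron following the update above might look as follows; the class name, threshold value, and hard-threshold firing (with no surrogate gradient) are assumptions rather than the patent's code.

```python
import torch
import torch.nn as nn

class PLIFNeuron(nn.Module):
    """Parametric LIF sketch: a learnable scalar w sets the membrane time constant
    via 1/tau = sigmoid(w); the membrane potential integrates the input and is
    reset to v_reset after a spike."""
    def __init__(self, v_threshold: float = 1.0, v_reset: float = 0.0, w_init: float = 0.0):
        super().__init__()
        self.w = nn.Parameter(torch.tensor(w_init))
        self.v_threshold, self.v_reset = v_threshold, v_reset
        self.v = None                                   # membrane potential state

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.v is None:
            self.v = torch.full_like(x, self.v_reset)
        tau_inv = torch.sigmoid(self.w)
        # V[t] = V[t-1] + (1/tau) * (X[t] - (V[t-1] - V_reset))
        self.v = self.v + tau_inv * (x - (self.v - self.v_reset))
        spike = (self.v >= self.v_threshold).float()    # emit a pulse at threshold
        self.v = (1.0 - spike) * self.v + spike * self.v_reset   # reset where a pulse fired
        return spike

    def reset(self):
        self.v = None
```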
Step 132, performing downsampling processing on the coding feature map through a pulse downsampling layer in a first coding unit to obtain the downsampling feature map, and using the downsampling feature map as the input of a next coding unit until a coding feature map output by a first pulse convolution layer in a last coding unit is obtained;
it should be noted that the pulse downsampling layer includes a second convolution layer and a second pulse emission layer that are cascaded, where the second convolution layer is a 3D convolution layer, and the second pulse emission layer has substantially the same structure as the first pulse emission layer in step 131, and is not described herein again.
It can be understood that, if the size of the coded feature map is 32 × 128 × 128 × 128, the coded feature map is downsampled by the pulse downsampling layer in the first coding unit to obtain a 32 × 64 × 64 × 64 downsampled feature map, so as to reduce the length, the width, and the depth in the feature map, and then the 32 × 64 × 64 × 64 feature map is used as the input of the first pulse convolution layer in the next coding unit, and so on until the coded feature map output by the first pulse convolution layer in the last coding unit is obtained.
Step 133, performing fusion processing on the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
step 134, performing upsampling processing on the coding feature map output by the last coding unit through a pulse upsampling layer in the first decoding unit to obtain a first upsampling feature map;
it should be noted that, since the feature map obtained by upsampling the coding feature map output by the last coding unit is different in size from the coding feature map output by the coding unit at the same depth level as the first decoding unit, feature concatenation cannot be directly performed. In this embodiment, a multi-scale feature fusion layer is provided, and feature fusion processing is performed on the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit.
Specifically, the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer and a feature dot-product layer, wherein the third pulse convolution layer includes a sixth convolution layer and a fifth pulse emission layer that are cascaded, and the fourth convolution layer, the fifth convolution layer and the sixth convolution layer are all 3D convolution layers. The fourth convolution layer performs convolution processing on the coding feature map output by the last coding unit to obtain a first convolution feature map; the fifth convolution layer performs convolution processing on the coding feature map output by the coding unit at the same depth level as the first decoding unit to obtain a second convolution feature map; the feature fusion layer then adds the first convolution feature map and the second convolution feature map to obtain a target feature map; further, the third pulse convolution layer performs convolution processing on the target feature map to obtain an attention feature map; and finally, dot product processing is performed on the attention feature map and the coding feature map output by the last coding unit to obtain the fusion feature map, thereby realizing the fusion of feature maps of different scales.
Additionally, the coding feature map output by the last coding unit is input to the pulse up-sampling layer in the first decoding unit, so as to perform up-sampling processing on the coding feature map output by the last coding unit through the pulse up-sampling layer to obtain a first up-sampling feature map. The pulse up-sampling layer includes a cascaded deconvolution layer and a third pulse emission layer, wherein the deconvolution layer is a 3D deconvolution layer, and the structure and function of the third pulse emission layer are substantially the same as those of the first pulse emission layer in step 131, which are not described herein again.
Additionally, as for the execution order of step 133 and step 134, step 133 may be executed before step 134, or step 134 may be executed before step 133, which is not limited herein.
Step 135, performing feature splicing on the first fused feature map and the first upsampled feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
It can be understood that, if a first fused feature map with a size of 128 × 32 × 32 × 32 is obtained through the fusion processing of the multi-scale feature fusion layer and a first up-sampling feature map with a size of 128 × 32 × 32 × 32 is obtained through the up-sampling processing of the pulse up-sampling layer, then a first spliced feature map with a size of 256 × 32 × 32 × 32 is obtained after feature splicing.
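For illustration, this feature splicing amounts to concatenation along the channel dimension; the batch dimension shown below is an added assumption.

```python
import torch

fused = torch.rand(1, 128, 32, 32, 32)       # first fused feature map, [N, C, D, H, W]
upsampled = torch.rand(1, 128, 32, 32, 32)   # first up-sampling feature map
spliced = torch.cat([fused, upsampled], dim=1)
print(spliced.shape)                          # torch.Size([1, 256, 32, 32, 32])
```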
Step 136, performing convolution processing on the first splicing feature map through the second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of the next decoding unit;
It should be noted that the second pulse convolution layer includes a third convolution layer, a second normalization layer, and a fourth pulse emission layer that are cascaded; preferably, in the model network structure, the number of second pulse convolution layers is 2. The third convolution layer is a 3D convolution layer, and the structure and function of the fourth pulse emission layer are substantially the same as those of the first pulse emission layer in step 131, which are not described herein again. Specifically, the first spliced feature map is input into the second pulse convolution layer in the first decoding unit, so that convolution processing is performed on the first spliced feature map by using the second pulse convolution layer to obtain the decoding feature map output by the first decoding unit.
Step 137, performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
step 138, performing upsampling processing on the decoding characteristic diagram output by the first decoding unit through a pulse upsampling layer in the next decoding unit to obtain a second upsampling characteristic diagram;
step 139, performing feature splicing on the second fused feature map and the second upsampled feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
step 1310, performing convolution processing on the second splicing feature map through a second pulse convolution layer in the next decoding unit to obtain a decoding feature map output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map until the decoding feature map output by the last decoding module is obtained;
specifically, the decoding feature map output by the first decoding unit is input into the next decoding unit, so that the decoding feature map and the coding feature map output by the coding unit at the same depth level as the next decoding unit are fused by the multi-scale feature fusion layer in the next decoding unit to obtain a second fused feature map, and the decoding feature map is upsampled by the pulse upsampling layer in the next decoding unit, where it should be noted that the specific implementation processes of steps 137 to steps 1310 to 136 in this embodiment are substantially the same as those of steps 1310 to 136, that is, the decoding process of each decoding unit is substantially the same, and no further description is given here until the decoding feature map output by the last decoding module is obtained.
Step 1311, inputting the decoding feature map of the last decoding module to the segmentation output module, and obtaining an image segmentation result output by the segmentation output module.
Specifically, the decoding feature map of the last decoding module is input to the segmentation output module to obtain the image segmentation result output by the segmentation output module.
According to the above scheme, three-dimensional image segmentation is realized based on the form of biological neurons; the neurons are in an active state only when receiving or sending spike signals, which effectively reduces the time consumption of a common deep learning neural network, and multi-scale feature fusion is introduced in the decoding process, so that small targets in feature maps of different scales can be accurately segmented and the accuracy of three-dimensional image segmentation is improved.
In an embodiment, the fusing, by a multi-scale feature fusion layer in a first decoding unit, the encoding feature map output by the last encoding unit and the encoding feature map output by the encoding unit at the same depth level as the first decoding unit to obtain a fused feature map includes:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature map output by the coding unit at the same depth level through the fifth convolution layer to obtain a second convolution feature map; adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map; performing convolution processing on the target feature map through the third pulse convolution layer to obtain an attention feature map; and performing dot product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot-product layer to obtain the fusion feature map.
It should be noted that fig. 3 is a schematic structural diagram of the multi-scale feature fusion layer in the image segmentation pulse model provided by the present invention. As shown in fig. 3, the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer, and a feature dot-product layer. Specifically, the coding feature map output by the last coding unit is input into the fourth convolution layer, and convolution processing is performed on it by using the fourth convolution layer to obtain a first convolution feature map; the coding feature map output by the coding unit at the same depth level is input into the fifth convolution layer, and convolution processing is performed on it by using the fifth convolution layer to obtain a second convolution feature map. The first convolution feature map and the second convolution feature map are then input into the feature fusion layer, and summation processing is performed on them by using the feature fusion layer to obtain a target feature map. Convolution processing is further performed on the target feature map by using the third pulse convolution layer to obtain an attention feature map, and dot product processing is performed on the attention feature map and the coding feature map output by the last coding unit, so that the attention feature map re-weights the image features to obtain the fusion feature map.
According to the above scheme, the coding feature maps and decoding feature maps of different scales are fused through the multi-scale feature fusion layer in the decoding process, so that small targets in feature maps of different scales can be accurately segmented, and the accuracy of image segmentation is improved.
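For illustration only, a sketch of the multi-scale feature fusion layer described above is given below; the kernel sizes, the stride-2 convolution used to align the two spatial resolutions, the intermediate channel count, and the hard-threshold spike stand-in are all assumptions, since the patent text does not fix them.

```python
import torch
import torch.nn as nn

class Spike(nn.Module):
    """Hard-threshold stand-in for the pulse emission layer."""
    def forward(self, x):
        return (x >= 1.0).float()

class MultiScaleFusion(nn.Module):
    """Sketch of the multi-scale feature fusion layer of fig. 3: the deep coding
    feature map and the same-depth coding feature map are each convolved, summed,
    passed through a pulse convolution to obtain an attention feature map, and the
    attention feature map re-weights the deep coding feature map by dot product."""
    def __init__(self, deep_ch: int, skip_ch: int, mid_ch: int = 64):
        super().__init__()
        self.conv4 = nn.Conv3d(deep_ch, mid_ch, kernel_size=1)            # "fourth convolution layer"
        self.conv5 = nn.Conv3d(skip_ch, mid_ch, kernel_size=1, stride=2)  # "fifth convolution layer" (stride aligns resolutions, an assumption)
        self.pulse_conv = nn.Sequential(                                  # "third pulse convolution layer"
            nn.Conv3d(mid_ch, deep_ch, kernel_size=1), Spike())

    def forward(self, deep, skip):
        target = self.conv4(deep) + self.conv5(skip)   # feature fusion layer: element-wise sum
        attention = self.pulse_conv(target)            # attention feature map
        return deep * attention                        # feature dot-product layer: re-weighting

# Illustrative shapes: deep = fourth-encoder output, skip = third-encoder output
fusion = MultiScaleFusion(deep_ch=256, skip_ch=128)
out = fusion(torch.rand(1, 256, 16, 16, 16), torch.rand(1, 128, 32, 32, 32))
print(out.shape)   # torch.Size([1, 256, 16, 16, 16])
```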
In one embodiment, the image segmentation pulse model is obtained based on training as follows: acquiring a plurality of original three-dimensional images subjected to data enhancement processing; respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples corresponding to preset time step lengths; and performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
The iterative training of the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model comprises the following steps: inputting the pulse sequence sample into the pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained; calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label; and updating parameters of the pulse model to be trained by utilizing a gradient substitution algorithm based on the model loss value obtained by each iteration to obtain the image segmentation pulse model.
Specifically, a plurality of sample three-dimensional images are obtained, and data enhancement processing is performed on the sample three-dimensional images to obtain each original three-dimensional image, wherein the dimension of each original three-dimensional image can be represented as [C, D, H, W], C represents the number of channels of the original three-dimensional image, D represents the depth of the original three-dimensional image, H represents the length of the original three-dimensional image, and W represents the width of the original three-dimensional image; the data enhancement processing includes processing modes such as flipping the image up and down, flipping the image left and right, and random cropping. Further, the following step is performed on each original three-dimensional image: pulse coding processing is performed on the original three-dimensional image to obtain a plurality of pulse sequences with spatio-temporal information within preset time steps, for example, pulse sequences within T preset time steps are obtained, where the dimension of the pulse sequence can be represented as [T, C, D, H, W].
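For illustration only, a sketch of constructing such a training sample is given below, reusing the Bernoulli-based rate coding idea shown earlier; the augmentation choices, crop size, number of time steps, and helper names are assumptions and are not taken from the patent.

```python
import torch

def augment(volume: torch.Tensor) -> torch.Tensor:
    """Data-enhancement sketch: random up-down / left-right flips and a random crop.
    volume: [C, D, H, W]"""
    if torch.rand(()) < 0.5:
        volume = volume.flip(dims=[2])          # up-down flip (H axis)
    if torch.rand(()) < 0.5:
        volume = volume.flip(dims=[3])          # left-right flip (W axis)
    c, d, h, w = volume.shape
    cd, ch, cw = d // 2, h // 2, w // 2         # assumed crop size: half of each spatial dimension
    zd = torch.randint(0, d - cd + 1, (1,)).item()
    zh = torch.randint(0, h - ch + 1, (1,)).item()
    zw = torch.randint(0, w - cw + 1, (1,)).item()
    return volume[:, zd:zd + cd, zh:zh + ch, zw:zw + cw]

def make_pulse_sample(volume: torch.Tensor, num_steps: int) -> torch.Tensor:
    """Pulse coding over T preset time steps: returns a [T, C, D, H, W] pulse sequence sample."""
    volume = augment(volume).clamp(0.0, 1.0)
    return torch.bernoulli(volume.unsqueeze(0).expand(num_steps, *volume.shape))

sample = make_pulse_sample(torch.rand(1, 64, 64, 64), num_steps=4)
print(sample.shape)   # torch.Size([4, 1, 32, 32, 32]) under the assumed half-size crop
```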
Further, inputting each pulse sequence into a pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained, and further calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label, wherein the model loss value calculation formula is as follows:
$L_{DICE}(A, B) = 1 - \frac{2\left|A \cap B\right|}{\left|A\right| + \left|B\right|}$
wherein A represents the prediction segmentation result, B represents the image segmentation label, and $L_{DICE}(A, B)$ represents the model loss value. In another possible implementation manner, the pulse sequences corresponding to a batch of original three-dimensional images are input to the pulse model to be trained for iterative training, where the input dimension may be represented as [T, N, C, D, H, W], and N represents the number of samples of the original three-dimensional images input into the pulse model to be trained each time.
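As an illustrative reference implementation of this loss (the smoothing term eps and the batch-wise mean are implementation assumptions not stated in the patent):

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss: L_DICE(A, B) = 1 - 2|A∩B| / (|A| + |B|), computed on
    per-voxel probabilities and averaged over the batch."""
    pred = pred.flatten(1)
    target = target.flatten(1)
    intersection = (pred * target).sum(dim=1)
    return (1.0 - (2.0 * intersection + eps)
            / (pred.sum(dim=1) + target.sum(dim=1) + eps)).mean()
```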
On this basis, in other embodiments, the model loss value may be calculated by setting a loss function according to actual requirements, which is not specifically limited herein. After the model loss value is calculated, back propagation is performed; in consideration of the non-differentiability of the pulse emission function, a gradient substitution (surrogate gradient) algorithm is used in the back propagation process to update the model parameters in the pulse model to be trained, for example, the gradient values at the corresponding positions are replaced by the gradients of functions such as sigmoid and tanh, and then the next round of training is performed. In the training process, it is determined whether the updated pulse model to be trained meets a preset training end condition; if so, the updated pulse model to be trained is used as the image segmentation pulse model, and if not, the model continues to be trained, wherein the preset training end condition includes loss convergence, reaching a maximum iteration threshold, and the like.
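For illustration of the gradient-substitution idea only, a sigmoid-derivative surrogate is sketched below; the scale factor alpha and the exact surrogate function are assumptions, not the patent's stated choice.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; sigmoid-derivative surrogate gradient in
    the backward pass, so that the non-differentiable pulse emission function can be
    trained by back propagation."""
    @staticmethod
    def forward(ctx, v, alpha: float = 4.0):
        ctx.save_for_backward(v)
        ctx.alpha = alpha
        return (v >= 0.0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        sig = torch.sigmoid(ctx.alpha * v)
        return grad_output * ctx.alpha * sig * (1.0 - sig), None

# Illustrative usage: spikes = SurrogateSpike.apply(membrane_potential - v_threshold)
```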
According to the scheme, the loss value of the image segmentation pulse model can be controlled within a preset range by training the image segmentation pulse model, so that the accuracy of image segmentation of the image segmentation pulse model can be improved.
The following describes the image segmentation apparatus provided by the present invention, and the image segmentation apparatus described below and the image segmentation method described above may be referred to in correspondence with each other.
Fig. 4 is a schematic structural diagram of an image segmentation apparatus provided by the present invention, and as shown in fig. 4, the apparatus includes:
an obtaining module 41, configured to obtain a three-dimensional image to be segmented;
the pulse coding module 42 is configured to perform pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
an image segmentation module 43, configured to input the pulse sequence to an image segmentation pulse model, so as to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
The image segmentation apparatus further includes:
the image segmentation pulse model comprises an encoding module, a decoding module and a segmentation output module, wherein:
the coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer;
the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer;
the output of the last decoding unit is used as the input of the segmentation output module.
The image segmentation module 43 is further configured to:
inputting the pulse sequence into a first pulse convolution layer in the first coding unit to obtain a coding characteristic diagram output by the first pulse convolution layer;
the coding feature map is down-sampled through a pulse down-sampling layer in a first coding unit to obtain the down-sampling feature map, and the down-sampling feature map is used as the input of the next coding unit until the coding feature map output by the first pulse convolution layer in the last coding unit is obtained;
fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
performing up-sampling processing on the coding feature map output by the last coding unit through a pulse up-sampling layer in the first decoding unit to obtain a first up-sampling feature map;
performing feature splicing on the first fused feature map and the first up-sampling feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
performing convolution processing on the first splicing feature map through a second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of a next decoding unit;
fusing the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit which is at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
performing up-sampling processing on the decoding feature map output by the first decoding unit through a pulse up-sampling layer in the next decoding unit to obtain a second up-sampling feature map;
performing feature splicing on the second fused feature map and the second up-sampling feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
performing convolution processing on the second spliced feature map through a second pulse convolution layer in the next decoding unit to obtain a decoding feature map output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map, until the decoding feature map output by the last decoding module is obtained;
and inputting the decoding feature map of the last decoding module into the segmentation output module to obtain an image segmentation result output by the segmentation output module.
The image segmentation apparatus further includes:
the multi-scale feature fusion layer comprises a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer and a feature dot-product layer.
The image segmentation module 43 is further configured to:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature maps output by the coding units at the same depth level through a fifth convolution layer to obtain a second convolution feature map;
adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map;
performing convolution processing on the target feature map through the third pulse convolution layer to obtain an attention feature map;
and performing dot product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot-product layer to obtain the fusion feature map.
The image segmentation apparatus further includes:
the first pulse convolution layer comprises a first convolution layer, a first normalization layer and a first pulse emission layer which are cascaded;
the pulse down-sampling layer comprises a second convolution layer and a second pulse transmitting layer which are cascaded;
the pulse up-sampling layer comprises a cascaded deconvolution layer and a third pulse emission layer;
the second pulse convolution layer comprises a third convolution layer, a second normalization layer and a fourth pulse emission layer which are cascaded.
The image segmentation apparatus further includes:
acquiring a plurality of original three-dimensional images subjected to data enhancement processing;
respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples corresponding to preset time step lengths;
and performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
The image segmentation apparatus further includes:
inputting the pulse sequence sample into the pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained;
calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label;
and updating parameters of the pulse model to be trained by using a gradient substitution algorithm based on the model loss value obtained by each iteration to obtain the image segmentation pulse model.
It should be noted that, the apparatus provided in the embodiment of the present invention can implement all the method steps implemented by the method embodiment and achieve the same technical effect, and detailed descriptions of the same parts and beneficial effects as the method embodiment in this embodiment are omitted here.
Fig. 5 is a schematic structural diagram of an electronic device provided in the present invention, and as shown in fig. 5, the electronic device may include: a processor (processor) 510, a memory (memory) 520, a communication Interface (Communications Interface) 530, and a communication bus 540, wherein the processor 510, the memory 520, and the communication Interface 530 communicate with each other via the communication bus 540. Processor 510 may invoke logic instructions in memory 520 to perform an image segmentation method comprising: acquiring a three-dimensional image to be segmented; carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model; the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
In addition, the logic instructions in the memory 520 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the image segmentation method provided above, the method comprising: acquiring a three-dimensional image to be segmented; carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model; wherein the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample obtained by pulse coding an original three-dimensional image and an image segmentation label of the original three-dimensional image.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium and that, when executed by a processor, causes the computer to perform the image segmentation method provided above, the method comprising: acquiring a three-dimensional image to be segmented; carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model; wherein the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample obtained by pulse coding an original three-dimensional image and an image segmentation label of the original three-dimensional image.
The above-described embodiments of the apparatus are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An image segmentation method, comprising:
acquiring a three-dimensional image to be segmented;
carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
wherein the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample obtained by pulse coding an original three-dimensional image and an image segmentation label of the original three-dimensional image.
2. The image segmentation method of claim 1, wherein the image segmentation pulse model comprises an encoding module, a decoding module, and a segmentation output module, wherein:
the coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer;
the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer;
the output of the last decoding unit is used as the input of the segmentation output module.
3. The image segmentation method according to claim 2, wherein the inputting the pulse sequence to an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model comprises:
inputting the pulse sequence into a first pulse convolution layer in a first coding unit to obtain a coding feature map output by the first pulse convolution layer;
performing down-sampling processing on the coding feature map through a pulse down-sampling layer in the first coding unit to obtain a down-sampled feature map, and taking the down-sampled feature map as the input of a next coding unit until a coding feature map output by the first pulse convolution layer in the last coding unit is obtained;
fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
performing up-sampling processing on the coding feature map output by the last coding unit through a pulse up-sampling layer in the first decoding unit to obtain a first up-sampling feature map;
performing feature splicing on the first fusion feature map and the first up-sampling feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
performing convolution processing on the first splicing feature map through a second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of a next decoding unit;
fusing the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
performing up-sampling processing on the decoding feature map output by the first decoding unit through a pulse up-sampling layer in the next decoding unit to obtain a second up-sampling feature map;
performing feature splicing on the second fusion feature map and the second up-sampling feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
performing convolution processing on the second splicing feature map through a second pulse convolution layer in the next decoding unit to obtain a decoding feature map output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map, until a decoding feature map output by the last decoding unit is obtained;
and inputting the decoding feature map output by the last decoding unit into the segmentation output module to obtain an image segmentation result output by the segmentation output module.
4. The image segmentation method according to claim 3, wherein the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer, and a feature dot-product layer;
the fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through the multi-scale feature fusion layer in the first decoding unit to obtain the first fusion feature map comprises:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature map output by the coding unit at the same depth level through the fifth convolution layer to obtain a second convolution feature map;
adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map;
performing convolution processing on the target feature map through the third pulse convolution layer to obtain an attention feature map;
and performing dot product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot-product layer to obtain the first fusion feature map.
5. The image segmentation method according to claim 2,
the first pulse convolution layer comprises a first convolution layer, a first normalization layer and a first pulse emission layer which are cascaded;
the pulse down-sampling layer comprises a second convolution layer and a second pulse emission layer which are cascaded;
the pulse up-sampling layer comprises a cascaded deconvolution layer and a third pulse emission layer;
the second pulse convolution layer comprises a third convolution layer, a second normalization layer and a fourth pulse emission layer which are cascaded.
6. The image segmentation method according to claim 2, wherein the image segmentation pulse model is trained based on the following steps:
acquiring a plurality of original three-dimensional images subjected to data augmentation processing;
respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples corresponding to a preset number of time steps;
and performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
7. The image segmentation method according to claim 6, wherein the performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model comprises:
inputting the pulse sequence sample into the pulse model to be trained to obtain a predicted segmentation result output by the pulse model to be trained;
calculating a model loss value based on the predicted segmentation result and the image segmentation label;
and updating parameters of the pulse model to be trained by using a gradient substitution algorithm based on the model loss value obtained in each iteration to obtain the image segmentation pulse model.
8. An image segmentation apparatus, comprising:
the acquisition module is used for acquiring a three-dimensional image to be segmented;
the pulse coding module is used for carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
the image segmentation module is used for inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
wherein the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample obtained by pulse coding an original three-dimensional image and an image segmentation label of the original three-dimensional image.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the image segmentation method according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the image segmentation method according to any one of claims 1 to 7.
CN202211161006.8A 2022-09-22 2022-09-22 Image segmentation method, device, equipment and storage medium Pending CN115564953A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211161006.8A CN115564953A (en) 2022-09-22 2022-09-22 Image segmentation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211161006.8A CN115564953A (en) 2022-09-22 2022-09-22 Image segmentation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115564953A true CN115564953A (en) 2023-01-03

Family

ID=84740675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211161006.8A Pending CN115564953A (en) 2022-09-22 2022-09-22 Image segmentation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115564953A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117853507A (en) * 2024-03-06 2024-04-09 阿里巴巴(中国)有限公司 Interactive image segmentation method, device, storage medium and program product

Similar Documents

Publication Publication Date Title
CN111832570B (en) Image semantic segmentation model training method and system
CN111914559B (en) Text attribute extraction method and device based on probabilistic graphical model and computer equipment
CN113743417B (en) Semantic segmentation method and semantic segmentation device
CN110859642B (en) Method, device, equipment and storage medium for realizing medical image auxiliary diagnosis based on AlexNet network model
JP7536893B2 (en) Image Processing Using Self-Attention Based Neural Networks
US20240135610A1 (en) Image generation using a diffusion model
US20220188605A1 (en) Recurrent neural network architectures based on synaptic connectivity graphs
CN114708465B (en) Image classification method and device, electronic equipment and storage medium
CN116958557B (en) Three-dimensional indoor scene semantic segmentation method based on residual impulse neural network
US20240169500A1 (en) Image and object inpainting with diffusion models
CN115564953A (en) Image segmentation method, device, equipment and storage medium
CN118015283B (en) Image segmentation method, device, equipment and storage medium
CN116912299A (en) Medical image registration method, device, equipment and medium of motion decomposition model
US20220414886A1 (en) Semantic image segmentation using contrastive channels
CN111582449B (en) Training method, device, equipment and storage medium of target domain detection network
US20240169567A1 (en) Depth edges refinement for sparsely supervised monocular depth estimation
US20240169622A1 (en) Multi-modal image editing
CN111931841A (en) Deep learning-based tree processing method, terminal, chip and storage medium
CN108376283B (en) Pooling device and pooling method for neural network
US20230004791A1 (en) Compressed matrix representations of neural network architectures based on synaptic connectivity
CN114615505A (en) Point cloud attribute compression method and device based on depth entropy coding and storage medium
CN114413910B (en) Visual target navigation method and device
CN110930391A (en) Method, device and equipment for realizing medical image auxiliary diagnosis based on VggNet network model and storage medium
US20240169604A1 (en) Text and color-guided layout control with a diffusion model
CN113642627B (en) Deep learning-based image and decision multi-source heterogeneous information fusion identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination