CN115564953A - Image segmentation method, device, equipment and storage medium - Google Patents
- Publication number: CN115564953A (application CN202211161006.8A)
- Authority: CN (China)
- Prior art keywords: pulse, image segmentation, layer, feature map, coding
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06N3/08—Learning methods (computing arrangements based on biological models; neural networks)
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention provides an image segmentation method, an image segmentation device, image segmentation equipment and a storage medium. The method comprises the following steps: acquiring a three-dimensional image to be segmented; performing pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; and inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model. The image segmentation pulse model is obtained through multi-scale feature fusion training based on pulse sequence samples formed by pulse coding of an original three-dimensional image and on the image segmentation label of the original three-dimensional image. According to the method, the pulse sequence of the three-dimensional image to be segmented is segmented and predicted by a pulse model; because the neurons in the pulse model are active only when receiving or sending spike signals, the time consumed by deep learning with a neural network is reduced, and the image segmentation pulse model obtained through multi-scale feature fusion training can accurately segment small targets in feature maps of different scales, so the efficiency and the accuracy of three-dimensional image segmentation are both improved.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image segmentation method, an image segmentation apparatus, an image segmentation device, and a storage medium.
Background
Three-dimensional volume image segmentation in medical imaging is mainly used to segment organs, tumors, blood vessels and other regions in three-dimensional images, which greatly assists disease diagnosis, monitoring and the formulation of corresponding treatment plans.
Currently, three-dimensional volume images in medical imaging are segmented mainly by manual segmentation or by CNN-based deep learning network models. However, manual segmentation is tedious, time-consuming and labor-intensive, and is prone to human segmentation errors, which leads to low image segmentation accuracy. In addition, the redundancy of deep learning network models makes semantic segmentation of three-dimensional images with a CNN-based model inefficient.
Disclosure of Invention
The invention provides an image segmentation method, an image segmentation device, image segmentation equipment and a storage medium, and aims to solve the technical problems of low efficiency and accuracy of three-dimensional image segmentation.
The invention provides an image segmentation method, which comprises the following steps:
acquiring a three-dimensional image to be segmented;
carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
Optionally, according to an image segmentation method provided by the present invention, the image segmentation pulse model includes an encoding module, a decoding module and a segmentation output module, where:
the coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer;
the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer;
the output of the last decoding unit is used as the input of the segmentation output module.
Optionally, according to an image segmentation method provided by the present invention, the inputting the pulse sequence to an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model includes:
inputting the pulse sequence into a first pulse convolution layer in the first coding unit to obtain a coding characteristic diagram output by the first pulse convolution layer;
performing downsampling processing on the coding feature map through a pulse downsampling layer in a first coding unit to obtain the downsampling feature map, and taking the downsampling feature map as the input of a next coding unit until a coding feature map output by a first pulse convolution layer in a last coding unit is obtained;
fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit which is at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
performing upsampling processing on the coding feature map output by the last coding unit through a pulse upsampling layer in the first decoding unit to obtain a first upsampling feature map;
performing feature splicing on the first fused feature map and the first up-sampling feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
performing convolution processing on the first splicing feature map through a second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of a next decoding unit;
fusing the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit which is at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
performing up-sampling processing on the decoding characteristic diagram output by the first decoding unit through a pulse up-sampling layer in the next decoding unit to obtain a second up-sampling characteristic diagram;
performing feature splicing on the second fused feature map and the second up-sampling feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
performing convolution processing on the second splicing characteristic diagram through a second pulse convolution layer in the next decoding unit to obtain a decoding characteristic diagram output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature graph output by the first decoding unit and the coding feature graph output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature graph until the decoding feature graph output by the last decoding module is obtained;
and inputting the decoding feature map of the last decoding module into the segmentation output module to obtain an image segmentation result output by the segmentation output module.
Optionally, according to an image segmentation method provided by the present invention, the first pulse convolution layer includes a first convolution layer, a first normalization layer, and a first pulse emission layer that are cascaded;
the pulse down-sampling layer comprises a second convolution layer and a second pulse transmitting layer which are cascaded;
the pulse up-sampling layer comprises a cascaded deconvolution layer and a third pulse emission layer;
the second pulse convolution layer comprises a third convolution layer, a second normalization layer and a fourth pulse emission layer which are cascaded.
Optionally, according to an image segmentation method provided by the present invention, the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer, and a feature dot-product layer;
the obtaining of the fused feature map by fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through the multi-scale feature fusion layer in the first decoding unit includes:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature maps output by the coding units at the same depth level through a fifth convolution layer to obtain a second convolution feature map;
adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map;
performing convolution processing on the target characteristic diagram through the third pulse convolution layer to obtain an attention characteristic diagram;
and performing dot-product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot-product layer to obtain the fusion feature map.
Optionally, according to an image segmentation method provided by the present invention, the image segmentation pulse model is obtained by training based on the following steps:
acquiring a plurality of original three-dimensional images subjected to data enhancement processing;
respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples corresponding to preset time step lengths;
and performing iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
Optionally, according to an image segmentation method provided by the present invention, the performing multi-scale feature fusion iterative training on a pulse model to be trained based on each pulse sequence sample and an image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model includes:
inputting the pulse sequence sample into the pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained;
calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label;
and updating parameters of the pulse model to be trained by using a gradient substitution algorithm based on the model loss value obtained by each iteration to obtain the image segmentation pulse model.
The present invention also provides an image segmentation apparatus, comprising:
the acquisition module is used for acquiring a three-dimensional image to be segmented;
the pulse coding module is used for carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
the image segmentation module is used for inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by carrying out multi-scale feature fusion training on the basis of a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the image segmentation method as described in any of the above when executing the program.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements an image segmentation method as described in any of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the image segmentation method as described in any one of the above.
According to the image segmentation method, device, equipment and storage medium, the three-dimensional image to be segmented is encoded into a pulse sequence so that the pulse sequence can be segmented and predicted by a pulse model. Because the neurons in the pulse model are active only when receiving or sending spike signals, the time consumed by an ordinary deep learning neural network is greatly reduced and the efficiency of three-dimensional image segmentation is improved; furthermore, the image segmentation pulse model obtained through multi-scale feature fusion training can accurately segment small targets in feature maps of different scales, which improves the accuracy of three-dimensional image segmentation.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of an image segmentation method provided by the present invention;
FIG. 2 is a schematic structural diagram of an image segmentation pulse model provided by the present invention;
FIG. 3 is a schematic structural diagram of a multi-scale feature fusion layer in an image segmentation pulse model provided by the present invention;
FIG. 4 is a schematic structural diagram of an image segmentation apparatus provided in the present invention;
fig. 5 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
The terminology used in the one or more embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the invention. As used in one or more embodiments of the present invention, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present invention refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used herein to describe various information in one or more embodiments of the present invention, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second and, similarly, a second may also be referred to as a first without departing from the scope of one or more embodiments of the present invention. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining", depending on the context.
Exemplary embodiments of the present invention will be described in detail below with reference to fig. 1 to 3.
Fig. 1 is a schematic flow chart of an image segmentation method provided by the present invention. As shown in fig. 1, the image segmentation method includes: step 11, acquiring a three-dimensional image to be segmented; step 12, performing pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; and step 13, inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model.
It should be noted that the three-dimensional image to be segmented is a 3D medical image, and the dimensions of the three-dimensional image to be segmented include the depth, length, width and number of channels of the image.
It should be noted that the pulse coding process includes Poisson coding; Poisson coding is rate-based coding and can encode an image into a discrete pulse sequence.
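As a minimal illustrative sketch (not the patent's actual implementation), rate-based Poisson coding can be realized by comparing normalized voxel intensities against uniform random numbers at every time step; the function name, tensor shapes and number of time steps below are assumptions for illustration:

```python
import torch

def poisson_encode(image: torch.Tensor, num_steps: int) -> torch.Tensor:
    """Rate-based Poisson coding: a voxel whose normalized intensity is p
    emits a spike at each time step with probability p.

    image: float tensor of shape [C, D, H, W] with values scaled to [0, 1]
    returns: binary spike tensor of shape [T, C, D, H, W]
    """
    image = image.clamp(0.0, 1.0)
    # Draw independent uniform noise for every time step and threshold it.
    noise = torch.rand(num_steps, *image.shape)
    return (noise < image).float()

# Example: encode a single-channel 128x128x128 volume into T = 8 time steps.
volume = torch.rand(1, 128, 128, 128)
spike_train = poisson_encode(volume, num_steps=8)   # shape [8, 1, 128, 128, 128]
```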
The image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by pulse coding of an original three-dimensional image and an image segmentation label of the original three-dimensional image.
Specifically, the pulse sequence is input to the image segmentation pulse model, and an image segmentation result is obtained according to an output result of the image segmentation pulse model. The image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image. It can be understood that the image segmentation pulse model can effectively segment the three-dimensional image to be segmented after being trained to obtain the image segmentation result of the three-dimensional image to be segmented.
It should be noted that the image segmentation pulse model is built on the encoder-decoder architecture of the 3D U-net model; it is a spiking (pulse) neural network model based on the form of biological neurons, and it includes an encoding module, a decoding module and a segmentation output module.
The coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer; the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer; the output of the last decoding unit is used as the input of the segmentation output module.
As shown in fig. 2, the encoding module includes 4 coding units and the decoding module includes 3 decoding units. The coding feature map output by the fourth coding unit is directly used as an input of the first decoding unit, and each coding unit is connected to the multi-scale feature fusion layer of the decoding unit at the same depth level, where the third coding unit and the first decoding unit are at the same depth level, the second coding unit and the second decoding unit are at the same depth level, and the first coding unit and the third decoding unit are at the same depth level.
Specifically, the pulse sequence is input to the first pulse convolution layer in the first coding unit, which performs convolution processing on the pulse sequence to obtain a convolution feature map. The convolution feature map is then input to the pulse downsampling layer in the first coding unit to obtain the downsampled feature map output by that layer, and the downsampled feature map output by each coding unit is used as the input of the next coding unit, until the coding feature map output by the last coding unit is obtained. In the first decoding unit, the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit are used as inputs: the multi-scale feature fusion layer in the first decoding unit fuses these two feature maps to obtain a first fused feature map; the pulse upsampling layer in the first decoding unit upsamples the coding feature map output by the last coding unit to obtain a first upsampled feature map; and the feature connection layer in the first decoding unit splices the first fused feature map and the first upsampled feature map to obtain a first spliced feature map. After the spliced feature map output by the feature connection layer is obtained, the second pulse convolution layer in the first decoding unit performs convolution processing on it to obtain a decoding feature map. The decoding feature map output by each decoding unit is used as the input of the next decoding unit, and this process is repeated until the decoding feature map output by the last decoding unit is obtained; the decoding feature map output by the last decoding unit is then input into the segmentation output module to obtain the image segmentation result output by the segmentation output module.
According to the above scheme, the three-dimensional image to be segmented is encoded into a pulse sequence, and the pulse sequence is segmented and predicted using a pulse model. Because the neurons in the pulse model are active only when receiving or sending spike signals, the time consumed by an ordinary deep learning neural network is greatly reduced and the efficiency of three-dimensional image segmentation is improved; moreover, the image segmentation pulse model obtained through multi-scale feature fusion training can accurately segment small targets in feature maps of different scales, which improves the accuracy of three-dimensional image segmentation.
In one embodiment, step 13 of inputting the pulse sequence into the image segmentation pulse model to obtain the image segmentation result output by the image segmentation pulse model includes the following steps:
step 131, inputting the pulse sequence into the first pulse convolution layer in the first coding unit to obtain a coding characteristic diagram output by the first pulse convolution layer;
specifically, the pulse sequence is input to a first pulse convolution layer in the first coding unit, and the first pulse convolution layer performs convolution processing on the pulse sequence to obtain a coding characteristic diagram output by the first pulse convolution layer.
It should be noted that the first pulse convolution layer includes a first convolution layer, a first normalization layer and a first pulse emission layer that are cascaded; preferably, the number of first pulse convolution layers is 2. The first convolution layer is a 3D convolution layer, the first normalization layer is a BatchNorm normalization layer, and the first pulse emission layer is a LIF neuron model with a learnable parameter, built on neuron dynamics equation modeling. The LIF neuron model accumulates membrane potential in an integrating manner, and the membrane potential does not decay exponentially over time when there is no input; when enough charge has accumulated, that is, when the membrane potential reaches a preset threshold, the neuron generates and emits a pulse, after which the membrane potential is reset.
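A minimal PyTorch sketch of such a pulse convolution layer is given below, assuming a hypothetical LIFNeuron module whose update rate w is learnable; the kernel size, threshold and reset values are illustrative assumptions rather than the patent's exact settings, and the firing step is shown forward-only (training would use the gradient substitution described later):

```python
import torch
import torch.nn as nn

class LIFNeuron(nn.Module):
    """Minimal leaky integrate-and-fire layer (forward pass only).
    The update rate w is a learnable parameter; threshold and reset
    values are illustrative assumptions."""
    def __init__(self, threshold: float = 1.0, v_reset: float = 0.0):
        super().__init__()
        self.w = nn.Parameter(torch.tensor(0.5))
        self.threshold = threshold
        self.v_reset = v_reset
        self.v = None                                   # membrane potential state

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.v is None or self.v.shape != x.shape:
            self.v = torch.full_like(x, self.v_reset)
        # Integrate the input into the membrane potential.
        self.v = self.v + self.w * (x - (self.v - self.v_reset))
        spikes = (self.v >= self.threshold).float()
        # Reset the potential of neurons that fired.
        self.v = torch.where(spikes.bool(), torch.full_like(self.v, self.v_reset), self.v)
        return spikes

class PulseConvLayer(nn.Module):
    """First pulse convolution layer: 3D convolution -> BatchNorm -> spike emission."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.norm = nn.BatchNorm3d(out_ch)
        self.fire = LIFNeuron()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fire(self.norm(self.conv(x)))
```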
The neuron dynamics equation (given as a formula image in the original publication) relates the membrane potential V[t] of the LIF neuron model at time t to the input X[t] of the LIF neuron model at time t, the learnable parameter w, and the reset membrane potential V_reset.
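For orientation only, a commonly used parametric LIF update written with the symbols defined above is sketched below; since the patent's formula is not reproduced in this text, the exact form, including the threshold variable V_th and the split into charging, firing and resetting steps, should be read as an assumption:

```latex
% Charging: the input is integrated into the membrane potential, scaled by the learnable parameter w
H[t] = V[t-1] + w \left( X[t] - \left( V[t-1] - V_{\mathrm{reset}} \right) \right)
% Firing: a pulse is emitted once the potential reaches the threshold V_{th}
S[t] = \Theta\left( H[t] - V_{th} \right)
% Resetting: neurons that fired return to the reset membrane potential
V[t] = (1 - S[t])\, H[t] + S[t]\, V_{\mathrm{reset}}
```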
Step 132, performing downsampling processing on the coding feature map through a pulse downsampling layer in a first coding unit to obtain the downsampling feature map, and using the downsampling feature map as the input of a next coding unit until a coding feature map output by a first pulse convolution layer in a last coding unit is obtained;
it should be noted that the pulse downsampling layer includes a second convolution layer and a second pulse emission layer that are cascaded, where the second convolution layer is a 3D convolution layer, and the second pulse emission layer has substantially the same structure as the first pulse emission layer in step 131, and is not described herein again.
It can be understood that, if the size of the coding feature map is 32 × 128 × 128 × 128, the coding feature map is downsampled by the pulse downsampling layer in the first coding unit to obtain a 32 × 64 × 64 × 64 downsampled feature map, thereby reducing the length, width and depth of the feature map; the 32 × 64 × 64 × 64 feature map is then used as the input of the first pulse convolution layer in the next coding unit, and so on, until the coding feature map output by the first pulse convolution layer in the last coding unit is obtained.
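A minimal sketch of that spatial size change, assuming the downsampling convolution is a stride-2 3D convolution that keeps the channel count (the kernel size is an illustrative assumption, and the spike-emission layer that follows it is omitted):

```python
import torch
import torch.nn as nn

# Strided 3D convolution used for pulse downsampling: it halves depth, length
# and width while keeping the channel count; the second pulse emission layer
# described above would follow it.
downsample = nn.Conv3d(in_channels=32, out_channels=32, kernel_size=2, stride=2)

feature = torch.rand(1, 32, 128, 128, 128)   # [N, C, D, H, W]
print(downsample(feature).shape)             # torch.Size([1, 32, 64, 64, 64])
```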
Step 133, performing fusion processing on the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
step 134, performing upsampling processing on the coding feature map output by the last coding unit through a pulse upsampling layer in the first decoding unit to obtain a first upsampling feature map;
it should be noted that, since the feature map obtained by upsampling the coding feature map output by the last coding unit is different in size from the coding feature map output by the coding unit at the same depth level as the first decoding unit, feature concatenation cannot be directly performed. In this embodiment, a multi-scale feature fusion layer is provided, and feature fusion processing is performed on the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit.
Specifically, the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer and a feature dot-product layer, where the third pulse convolution layer includes a sixth convolution layer and a fifth pulse emission layer that are cascaded, and the fourth, fifth and sixth convolution layers are all 3D convolution layers. The fourth convolution layer performs convolution processing on the coding feature map output by the last coding unit to obtain a first convolution feature map, and the fifth convolution layer performs convolution processing on the coding feature map output by the coding unit at the same depth level as the first decoding unit to obtain a second convolution feature map. The feature fusion layer then adds the first convolution feature map and the second convolution feature map to obtain a target feature map, the third pulse convolution layer performs convolution processing on the target feature map to obtain an attention feature map, and the feature dot-product layer performs dot-product processing on the attention feature map and the coding feature map output by the last coding unit, thereby fusing feature maps of different scales to obtain the fused feature map.
Additionally, the coding feature map output by the last coding unit is input to the pulse upsampling layer in the first decoding unit, which performs upsampling processing on it to obtain a first upsampled feature map. The pulse upsampling layer includes a cascaded deconvolution layer and a third pulse emission layer, where the deconvolution layer is a 3D deconvolution layer, and the third pulse emission layer has substantially the same structure and function as the first pulse emission layer in step 131, which is not repeated here.
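A minimal sketch of the upsampling step, assuming a stride-2 3D transposed convolution doubles each spatial dimension and assuming the last coding unit outputs a 256-channel 16 × 16 × 16 feature map (both channel counts are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Pulse upsampling via a 3D transposed (de)convolution that doubles depth,
# length and width; the third pulse emission layer described above would follow it.
upsample = nn.ConvTranspose3d(in_channels=256, out_channels=128, kernel_size=2, stride=2)

coarse = torch.rand(1, 256, 16, 16, 16)      # coding feature map of the last coding unit (assumed size)
print(upsample(coarse).shape)                # torch.Size([1, 128, 32, 32, 32])
```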
In addition, steps 133 and 134 may be executed in either order: step 133 first and then step 134, or step 134 first and then step 133, which is not limited here.
Step 135, performing feature splicing on the first fused feature map and the first upsampled feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
It can be understood that, if a first fused feature map of size 128 × 32 × 32 × 32 is obtained through the fusion processing of the multi-scale feature fusion layer and a first upsampled feature map of size 128 × 32 × 32 × 32 is obtained through the upsampling processing of the pulse upsampling layer, then a first spliced feature map of size 256 × 32 × 32 × 32 is obtained after feature splicing.
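A one-line check of that channel arithmetic, with the tensor sizes taken from the example above (a batch dimension is added for illustration):

```python
import torch

fused = torch.rand(1, 128, 32, 32, 32)           # first fused feature map
upsampled = torch.rand(1, 128, 32, 32, 32)       # first upsampled feature map
spliced = torch.cat([fused, upsampled], dim=1)   # concatenate along the channel axis
print(spliced.shape)                             # torch.Size([1, 256, 32, 32, 32])
```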
Step 136, performing convolution processing on the first splicing feature map through the second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of the next decoding unit;
it should be noted that the second pulse convolution layers include a third convolution layer, a second normalization layer, and a fourth pulse emission layer, which are cascaded, and preferably, in the model network structure, the number of the second pulse convolution layers is 2, where the third convolution layer is a 3D convolution layer, and the structural functions of the fourth pulse emission layer and the first pulse emission layer in step 131 are substantially the same, and are not described herein again. Specifically, the first splicing feature map is input into a second pulse convolution layer in a first decoding unit, so that the first splicing feature map is subjected to convolution processing by using the second pulse convolution layer, and the decoding feature map output by the first decoding unit is obtained.
Step 137, performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
step 138, performing upsampling processing on the decoding characteristic diagram output by the first decoding unit through a pulse upsampling layer in the next decoding unit to obtain a second upsampling characteristic diagram;
step 139, performing feature splicing on the second fused feature map and the second upsampled feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
step 1310, performing convolution processing on the second splicing feature map through a second pulse convolution layer in the next decoding unit to obtain a decoding feature map output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map until the decoding feature map output by the last decoding module is obtained;
Specifically, the decoding feature map output by the first decoding unit is input into the next decoding unit, so that the multi-scale feature fusion layer in the next decoding unit fuses this decoding feature map with the coding feature map output by the coding unit at the same depth level as the next decoding unit to obtain a second fused feature map, and the pulse upsampling layer in the next decoding unit upsamples the decoding feature map. It should be noted that the specific implementation of steps 137 to 1310 in this embodiment is substantially the same as that of steps 133 to 136, that is, the decoding process of each decoding unit is substantially the same, and is not repeated here; this continues until the decoding feature map output by the last decoding module is obtained.
Step 1311, inputting the decoding feature map of the last decoding module to the segmentation output module, and obtaining an image segmentation result output by the segmentation output module.
Specifically, the decoding feature map of the last decoding module is input to the segmentation output module to obtain the image segmentation result output by the segmentation output module.
According to the above scheme, three-dimensional image segmentation is realized based on the form of biological neurons, which are active only when receiving or sending spike signals, so the time consumed by an ordinary deep learning neural network is effectively reduced; multi-scale feature fusion is introduced in the decoding process, so that small targets in feature maps of different scales can be accurately segmented, which improves the accuracy of three-dimensional image segmentation.
In an embodiment, the fusing, by a multi-scale feature fusion layer in a first decoding unit, the encoding feature map output by the last encoding unit and the encoding feature map output by the encoding unit at the same depth level as the first decoding unit to obtain a fused feature map includes:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature map output by the coding unit at the same depth level through the fifth convolution layer to obtain a second convolution feature map; adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map; performing convolution processing on the target feature map through the third pulse convolution layer to obtain an attention feature map; and performing dot-product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot-product layer to obtain the fused feature map.
It should be noted that fig. 3 is a schematic structural diagram of the multi-scale feature fusion layer in the image segmentation pulse model provided by the present invention. As shown in fig. 3, the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer and a feature dot-product layer. Specifically, the coding feature map output by the last coding unit is input into the fourth convolution layer, which performs convolution processing on it to obtain a first convolution feature map, and the coding feature map output by the coding unit at the same depth level is input into the fifth convolution layer, which performs convolution processing on it to obtain a second convolution feature map. The first and second convolution feature maps are then input into the feature fusion layer, which adds them to obtain a target feature map; the third pulse convolution layer performs convolution processing on the target feature map to obtain an attention feature map; and the feature dot-product layer performs dot-product processing on the attention feature map and the coding feature map output by the last coding unit, so that the attention feature map re-weights the image features, yielding the fused feature map.
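A minimal PyTorch sketch of such an attention-style fusion layer is given below. It assumes both inputs have already been brought to the same spatial size, uses 1 × 1 × 1 convolutions, re-weights the decoder-side input with the attention map, and stands the pulse emission in with a hard threshold; all of these choices, along with the module and parameter names and the channel counts, are illustrative assumptions rather than the patent's exact design:

```python
import torch
import torch.nn as nn

def spike(x: torch.Tensor, threshold: float = 1.0) -> torch.Tensor:
    """Stand-in for the pulse emission layer: hard-threshold firing (forward only)."""
    return (x >= threshold).float()

class MultiScaleFusion(nn.Module):
    """Illustrative multi-scale feature fusion layer: convolve each input,
    add them, pass the sum through a spiking convolution to get an attention
    map, then re-weight one input by an element-wise (dot) product."""
    def __init__(self, dec_ch: int, enc_ch: int, mid_ch: int):
        super().__init__()
        self.conv_dec = nn.Conv3d(dec_ch, mid_ch, kernel_size=1)   # "fourth" convolution layer
        self.conv_enc = nn.Conv3d(enc_ch, mid_ch, kernel_size=1)   # "fifth" convolution layer
        self.attn_conv = nn.Conv3d(mid_ch, dec_ch, kernel_size=1)  # convolution of the third pulse convolution layer

    def forward(self, dec_feat: torch.Tensor, enc_feat: torch.Tensor) -> torch.Tensor:
        target = self.conv_dec(dec_feat) + self.conv_enc(enc_feat)  # feature fusion layer (addition)
        attention = spike(self.attn_conv(target))                   # attention feature map
        return dec_feat * attention                                 # feature dot-product layer

# Example with matching spatial sizes (channel counts are arbitrary assumptions).
fusion = MultiScaleFusion(dec_ch=64, enc_ch=32, mid_ch=16)
out = fusion(torch.rand(1, 64, 16, 16, 16), torch.rand(1, 32, 16, 16, 16))
print(out.shape)   # torch.Size([1, 64, 16, 16, 16])
```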
According to the above scheme, coding feature maps and decoding feature maps of different scales are fused through the multi-scale feature fusion layer during decoding, so that small targets in feature maps of different scales can be accurately learned and segmented, which improves the accuracy of image segmentation.
In one embodiment, the image segmentation pulse model is obtained based on training as follows: acquiring a plurality of original three-dimensional images subjected to data enhancement processing; respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples corresponding to preset time step lengths; and performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
The iterative training of the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model comprises the following steps: inputting the pulse sequence sample into the pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained; calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label; and updating parameters of the pulse model to be trained by utilizing a gradient substitution algorithm based on the model loss value obtained by each iteration to obtain the image segmentation pulse model.
Specifically, a plurality of sample three-dimensional images are acquired and data enhancement processing is performed on them to obtain the original three-dimensional images. The dimensions of each original three-dimensional image can be expressed as [C, D, H, W], where C denotes the number of channels, D the depth, H the length and W the width of the image; the data enhancement processing includes up-down flipping, left-right flipping, random cropping and other processing modes. Further, the following is performed for each original three-dimensional image: pulse coding processing is performed on the original three-dimensional image to obtain a pulse sequence with spatio-temporal information within a preset number of time steps, for example a pulse sequence over T preset time steps, whose dimensions can be expressed as [T, C, D, H, W].
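A minimal sketch of preparing one pulse-sequence sample of shape [T, C, D, H, W]; the flip-based enhancement, the volume size and the number of time steps T are illustrative assumptions:

```python
import torch

T = 8
volume = torch.rand(1, 64, 128, 128)          # original image, [C, D, H, W], values in [0, 1]

# Simple data enhancement: random up-down and left-right flips.
if torch.rand(1).item() < 0.5:
    volume = torch.flip(volume, dims=[-2])    # up-down flip
if torch.rand(1).item() < 0.5:
    volume = torch.flip(volume, dims=[-1])    # left-right flip

# Rate-based pulse coding repeated over T preset time steps.
sample = (torch.rand(T, *volume.shape) < volume).float()
print(sample.shape)                           # torch.Size([8, 1, 64, 128, 128])
```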
Further, each pulse sequence is input into the pulse model to be trained to obtain the prediction segmentation result output by the pulse model to be trained, and a model loss value is then calculated based on the prediction segmentation result and the image segmentation label, where the model loss value is calculated with the Dice loss:
L_DICE(A, B) = 1 - 2|A ∩ B| / (|A| + |B|)
where A denotes the prediction segmentation result, B denotes the image segmentation label, and L_DICE(A, B) denotes the model loss value. In another possible implementation, the pulse sequences corresponding to a batch of original three-dimensional images are input into the pulse model to be trained for iterative training; in this case the input dimensions can be expressed as [T, N, C, D, H, W], where N denotes the number of original three-dimensional image samples input into the pulse model to be trained each time.
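A minimal soft Dice loss sketch consistent with the definition above; the smoothing term eps and the reduction over all voxels are implementation details assumed here, not specified by the description:

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss: L_DICE(A, B) = 1 - 2|A intersect B| / (|A| + |B|).

    pred:   predicted segmentation probabilities, shape [N, C, D, H, W]
    target: ground-truth segmentation labels, same shape, values in {0, 1}
    """
    intersection = (pred * target).sum()
    total = pred.sum() + target.sum()
    return 1.0 - (2.0 * intersection + eps) / (total + eps)

# Example
pred = torch.rand(2, 1, 16, 16, 16)
target = (torch.rand(2, 1, 16, 16, 16) > 0.5).float()
print(dice_loss(pred, target).item())
```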
On this basis, in other embodiments, the model loss value may be calculated with a loss function set according to actual requirements, which is not specifically limited here. After the model loss value is calculated, the forward process ends; considering that the pulse emission function is non-differentiable, a gradient substitution algorithm is used during back propagation to update the model parameters of the pulse model to be trained, for example by replacing the gradient at the corresponding position with the gradient of a sigmoid, tanh or similar function, and the next round of training is then performed. During training, it is determined whether the updated pulse model to be trained satisfies a preset training end condition; if so, the updated pulse model to be trained is taken as the image segmentation pulse model, and if not, the model continues to be trained. The preset training end condition includes loss convergence, reaching a maximum iteration threshold, and the like.
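A minimal sketch of such a gradient substitution (surrogate gradient) in PyTorch, using the derivative of a sigmoid as the replacement gradient; the steepness factor alpha, the threshold and the class name are illustrative assumptions:

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside firing in the forward pass; sigmoid-derivative surrogate
    gradient in the backward pass (gradient substitution)."""

    alpha = 4.0          # steepness of the surrogate sigmoid (assumption)
    threshold = 1.0      # firing threshold (assumption)

    @staticmethod
    def forward(ctx, membrane_potential: torch.Tensor) -> torch.Tensor:
        ctx.save_for_backward(membrane_potential)
        return (membrane_potential >= SurrogateSpike.threshold).float()

    @staticmethod
    def backward(ctx, grad_output: torch.Tensor) -> torch.Tensor:
        (membrane_potential,) = ctx.saved_tensors
        # Replace the non-differentiable step with the derivative of a sigmoid
        # centred on the firing threshold.
        sig = torch.sigmoid(SurrogateSpike.alpha * (membrane_potential - SurrogateSpike.threshold))
        return grad_output * SurrogateSpike.alpha * sig * (1.0 - sig)

# Usage inside a pulse emission layer: spikes = SurrogateSpike.apply(v)
v = torch.randn(4, requires_grad=True)
SurrogateSpike.apply(v).sum().backward()
print(v.grad)   # finite surrogate gradients instead of zeros
```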
According to the scheme, the loss value of the image segmentation pulse model can be controlled within a preset range by training the image segmentation pulse model, so that the accuracy of image segmentation of the image segmentation pulse model can be improved.
The following describes the image segmentation apparatus provided by the present invention, and the image segmentation apparatus described below and the image segmentation method described above may be referred to in correspondence with each other.
Fig. 4 is a schematic structural diagram of an image segmentation apparatus provided by the present invention, and as shown in fig. 4, the apparatus includes:
an obtaining module 41, configured to obtain a three-dimensional image to be segmented;
the pulse coding module 42 is configured to perform pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
an image segmentation module 43, configured to input the pulse sequence to an image segmentation pulse model, so as to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
The image segmentation apparatus further includes:
the image segmentation pulse model comprises an encoding module, a decoding module and a segmentation output module, wherein:
the coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer;
the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer;
the output of the last decoding unit is used as the input of the segmentation output module.
The image segmentation module 43 is further configured to:
inputting the pulse sequence into a first pulse convolution layer in the first coding unit to obtain a coding characteristic diagram output by the first pulse convolution layer;
the coding characteristic diagram is downsampled through a pulse downsampling layer in a first coding unit to obtain the downsampling characteristic diagram, and the downsampling characteristic diagram is used as the input of the next coding unit until the coding characteristic diagram output by the first pulse convolution layer in the last coding unit is obtained;
fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
performing up-sampling processing on the coding characteristic diagram output by the last coding unit through a pulse up-sampling layer in the first decoding unit to obtain a first up-sampling characteristic diagram;
performing feature splicing on the first fused feature map and the first up-sampling feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
performing convolution processing on the first splicing feature map through a second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of a next decoding unit;
fusing the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit which is at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
performing up-sampling processing on the decoding characteristic diagram output by the first decoding unit through a pulse up-sampling layer in the next decoding unit to obtain a second up-sampling characteristic diagram;
performing feature splicing on the second fused feature map and the second up-sampling feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
performing convolution processing on the second splicing characteristic diagram through a second pulse convolution layer in the next decoding unit to obtain a decoding characteristic diagram output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature graph output by the first decoding unit and the coding feature graph output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature graph until the decoding feature graph output by the last decoding module is obtained;
and inputting the decoding feature map of the last decoding module into the segmentation output module to obtain an image segmentation result output by the segmentation output module.
The image segmentation apparatus further includes:
the multi-scale feature fusion layer comprises a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer and a feature dot-product layer.
The image segmentation module 43 is further configured to:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature maps output by the coding units at the same depth level through a fifth convolution layer to obtain a second convolution feature map;
adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map;
performing convolution processing on the target characteristic diagram through the third pulse convolution layer to obtain an attention characteristic diagram;
and performing dot-product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot-product layer to obtain the fusion feature map.
The image segmentation apparatus further includes:
the first pulse convolution layer comprises a first convolution layer, a first normalization layer and a first pulse emission layer which are cascaded;
the pulse down-sampling layer comprises a second convolution layer and a second pulse transmitting layer which are cascaded;
the pulse up-sampling layer comprises a cascaded deconvolution layer and a third pulse emission layer;
the second pulse convolution layer comprises a third convolution layer, a second normalization layer and a fourth pulse emission layer which are cascaded.
The image segmentation apparatus further includes:
acquiring a plurality of original three-dimensional images subjected to data enhancement processing;
respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples corresponding to preset time step lengths;
and performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
The image segmentation apparatus further includes:
inputting the pulse sequence sample into the pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained;
calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label;
and updating parameters of the pulse model to be trained by using a gradient substitution algorithm based on the model loss value obtained by each iteration to obtain the image segmentation pulse model.
It should be noted that the apparatus provided in the embodiment of the present invention can implement all the method steps of the above method embodiment and achieve the same technical effect; parts and beneficial effects that are the same as those of the method embodiment are not described in detail again here.
Fig. 5 is a schematic structural diagram of an electronic device provided in the present invention. As shown in Fig. 5, the electronic device may include: a processor (processor) 510, a memory (memory) 520, a communication interface (Communications Interface) 530, and a communication bus 540, wherein the processor 510, the memory 520, and the communication interface 530 communicate with each other via the communication bus 540. The processor 510 may invoke logic instructions in the memory 520 to perform an image segmentation method comprising: acquiring a three-dimensional image to be segmented; carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model; the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
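As an illustration of the three-step method the processor executes, the toy sketch below pulse-encodes a small volume and runs it through a stand-in model; the single-convolution model, the four time steps and the time-averaging of the outputs are assumptions made purely to keep the example runnable.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv3d(1, 2, 3, padding=1))  # stand-in for the trained image segmentation pulse model
volume = torch.rand(1, 16, 32, 32)                    # three-dimensional image to be segmented (C, D, H, W)

# Pulse coding (rate coding over 4 time steps, as in the earlier sketch).
p = (volume - volume.min()) / (volume.max() - volume.min() + 1e-8)
pulse_seq = (torch.rand(4, *volume.shape) < p).float()

logits = model(pulse_seq)               # time steps treated as a batch by the stand-in model
seg = logits.mean(dim=0).argmax(dim=0)  # aggregate over time, pick a class per voxel
print(seg.shape)                        # torch.Size([16, 32, 32])
```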
In addition, the logic instructions in the memory 520 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program codes.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the image segmentation method provided by the above methods, the method comprising: acquiring a three-dimensional image to be segmented; carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model; the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium, the computer program, when executed by a processor, enabling the computer to perform the image segmentation method provided by the above methods, the method comprising: acquiring a three-dimensional image to be segmented; carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence; inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model; the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. An image segmentation method, comprising:
acquiring a three-dimensional image to be segmented;
carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
2. The image segmentation method of claim 1, wherein the image segmentation pulse model comprises an encoding module, a decoding module, and a segmentation output module, wherein:
the coding module comprises a plurality of cascaded coding units; each coding unit except the last coding unit comprises a first pulse convolution layer and a pulse down-sampling layer, and the last coding unit comprises the first pulse convolution layer;
the decoding module comprises a plurality of cascaded decoding units, and each decoding unit comprises a pulse up-sampling layer, a multi-scale feature fusion layer, a feature connection layer and a second pulse convolution layer;
the output of the last decoding unit is used as the input of the segmentation output module.
3. The image segmentation method according to claim 2, wherein the inputting the pulse sequence to an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model comprises:
inputting the pulse sequence into a first pulse convolution layer in the first coding unit to obtain a coding characteristic diagram output by the first pulse convolution layer;
performing down-sampling processing on the coding feature map through the pulse down-sampling layer in the first coding unit to obtain a down-sampling feature map, and taking the down-sampling feature map as the input of the next coding unit until the coding feature map output by the first pulse convolution layer in the last coding unit is obtained;
fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through a multi-scale feature fusion layer in the first decoding unit to obtain a first fusion feature map;
performing up-sampling processing on the coding feature map output by the last coding unit through a pulse up-sampling layer in the first decoding unit to obtain a first up-sampling feature map;
performing feature splicing on the first fused feature map and the first up-sampling feature map through a feature connection layer in the first decoding unit to obtain a first spliced feature map;
performing convolution processing on the first splicing feature map through a second pulse convolution layer in the first decoding unit to obtain a decoding feature map output by the first decoding unit, and taking the decoding feature map output by the first decoding unit as the input of a next decoding unit;
fusing the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through a multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map;
performing up-sampling processing on the decoding feature map output by the first decoding unit through a pulse up-sampling layer in the next decoding unit to obtain a second up-sampling feature map;
performing feature splicing on the second fused feature map and the second up-sampling feature map through a feature connection layer in the next decoding unit to obtain a second spliced feature map;
performing convolution processing on the second splicing feature map through a second pulse convolution layer in the next decoding unit to obtain a decoding feature map output by the next decoding unit;
returning to execute the step of performing fusion processing on the decoding feature map output by the first decoding unit and the coding feature map output by the coding unit at the same depth level as the next decoding unit through the multi-scale feature fusion layer in the next decoding unit to obtain a second fusion feature map, until the decoding feature map output by the last decoding unit is obtained;
and inputting the decoding feature map of the last decoding unit into the segmentation output module to obtain an image segmentation result output by the segmentation output module.
4. The image segmentation method according to claim 3, wherein the multi-scale feature fusion layer includes a fourth convolution layer, a fifth convolution layer, a feature fusion layer, a third pulse convolution layer, and a feature dot product layer;
the obtaining of the first fusion feature map by fusing the coding feature map output by the last coding unit and the coding feature map output by the coding unit at the same depth level as the first decoding unit through the multi-scale feature fusion layer in the first decoding unit includes:
performing convolution processing on the coding feature map output by the last coding unit through the fourth convolution layer to obtain a first convolution feature map; performing convolution processing on the coding feature map output by the coding unit at the same depth level through the fifth convolution layer to obtain a second convolution feature map;
adding the first convolution feature map and the second convolution feature map through the feature fusion layer to obtain a target feature map;
performing convolution processing on the target feature map through the third pulse convolution layer to obtain an attention feature map;
and performing dot product processing on the attention feature map and the coding feature map output by the last coding unit through the feature dot product layer to obtain the first fusion feature map.
5. The image segmentation method according to claim 2,
the first pulse convolution layer comprises a first convolution layer, a first normalization layer and a first pulse emission layer which are cascaded;
the pulse down-sampling layer comprises a second convolution layer and a second pulse emission layer which are cascaded;
the pulse up-sampling layer comprises a cascaded deconvolution layer and a third pulse emission layer;
the second pulse convolution layer comprises a third convolution layer, a second normalization layer and a fourth pulse emission layer which are cascaded.
6. The image segmentation method according to claim 2, wherein the image segmentation pulse model is trained based on the following steps:
acquiring a plurality of original three-dimensional images subjected to data enhancement processing;
respectively carrying out pulse coding processing on each original three-dimensional image to obtain a plurality of pulse sequence samples, each corresponding to a preset time step length;
and performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model.
7. The image segmentation method according to claim 6, wherein the performing multi-scale feature fusion iterative training on the pulse model to be trained based on each pulse sequence sample and the image segmentation label of the original three-dimensional image to obtain the image segmentation pulse model comprises:
inputting the pulse sequence sample into the pulse model to be trained to obtain a prediction segmentation result output by the pulse model to be trained;
calculating to obtain a model loss value based on the prediction segmentation result and the image segmentation label;
and updating parameters of the pulse model to be trained by using a gradient substitution algorithm based on the model loss value obtained by each iteration to obtain the image segmentation pulse model.
8. An image segmentation apparatus, comprising:
the acquisition module is used for acquiring a three-dimensional image to be segmented;
the pulse coding module is used for carrying out pulse coding processing on the three-dimensional image to be segmented to obtain a pulse sequence;
the image segmentation module is used for inputting the pulse sequence into an image segmentation pulse model to obtain an image segmentation result output by the image segmentation pulse model;
the image segmentation pulse model is obtained by performing multi-scale feature fusion training based on a pulse sequence sample formed by an original three-dimensional image through pulse coding and an image segmentation label of the original three-dimensional image.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the image segmentation method according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the image segmentation method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211161006.8A CN115564953A (en) | 2022-09-22 | 2022-09-22 | Image segmentation method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211161006.8A CN115564953A (en) | 2022-09-22 | 2022-09-22 | Image segmentation method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115564953A true CN115564953A (en) | 2023-01-03 |
Family
ID=84740675
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211161006.8A Pending CN115564953A (en) | 2022-09-22 | 2022-09-22 | Image segmentation method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115564953A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117853507A (en) * | 2024-03-06 | 2024-04-09 | 阿里巴巴(中国)有限公司 | Interactive image segmentation method, device, storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||