CN114170249A - Image semantic segmentation method based on CNUNet3+ network - Google Patents

Image semantic segmentation method based on CNUNet3+ network

Info

Publication number
CN114170249A
Authority
CN
China
Prior art keywords
node
encoder
decoder
nodes
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210118688.8A
Other languages
Chinese (zh)
Other versions
CN114170249B (en)
Inventor
Zhang Bin
Ouyang Honglin
Zhu Yingda
Liu Qisheng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202210118688.8A priority Critical patent/CN114170249B/en
Publication of CN114170249A publication Critical patent/CN114170249A/en
Application granted granted Critical
Publication of CN114170249B publication Critical patent/CN114170249B/en
Priority to NL2032957A priority patent/NL2032957B1/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10116X-ray image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The invention discloses an image semantic segmentation method based on a CNUNet3+ network, which comprises the following steps: constructing a CNUNet3+ network model, wherein the model has a U-shaped structure and adopts an encoder-decoder structure; its depth is N, with N ≥ 3; the left arm of the U-shaped structure is the encoder and the right arm is the decoder; the number of channels doubles each time the encoder goes one layer deeper; down-sampling is used between adjacent encoder nodes, up-sampling between adjacent decoder nodes, and dense connections between decoder nodes; a convolution operation is used when an encoder node and a central node are at the same depth, and a down-sampling operation when they are at different depths; the encoder nodes, central nodes and decoder nodes are all neuron nodes; the number of central nodes may range from 1 to N−2. The method is used for medical X-ray and CT image segmentation and can improve the segmentation accuracy of small targets.

Description

Image semantic segmentation method based on CNUNet3+ network
Technical Field
The invention belongs to the field of image segmentation, and relates to an image semantic segmentation method based on a CNUNet3+ network.
Background
Commonly used clinical medical images include X-ray, computed tomography (CT), magnetic resonance imaging (MRI) and ultrasound images. Compared with general natural images, medical images are both specialized and complex: the contrast between different regions is often minimal, and the images are typically grayscale. In an abdominal CT slice, organs such as the liver, spleen, kidneys and stomach have almost the same intensity and are often hard to tell apart. If a tumor appears in the liver, its intensity differs only slightly from that of the surrounding tissue; such subtle differences can be recognized by medical professionals but are difficult for laypeople to see, which is what makes the segmentation of medical images complex.
Medical image diagnosis requires identifying the patient's lesion region. In liver tumor diagnosis, for example, the tumor region in the liver must be identified and labeled in the image; this is the task of medical image segmentation. Diagnosis usually depends on the radiologist's skill and experience and therefore suffers from strong subjectivity and poor repeatability. Computer image processing technology can help doctors improve both diagnostic accuracy and reading efficiency.
With the rapid development of artificial intelligence technology, represented by deep learning, medical imaging has become an important application field of artificial intelligence. In recent years, deep learning has taken the leading position in the field of image segmentation and has become the mainstream of medical image segmentation technology.
Among deep learning algorithms, convolutional neural networks (CNNs) have long played the dominant role in image processing. In 2015, Jonathan Long et al. proposed the Fully Convolutional Network (FCN). Owing to the FCN's strong performance, many FCN-based deep learning methods have since been proposed: for example, UNet and SegNet, which adopt the encoder-decoder scheme; DANet and CCNet, which incorporate attention mechanisms; and DeepLab, which employs dilated (atrous) convolution.
Among medical image segmentation methods, the UNet network is the most classical. In 2015, Olaf Ronneberger et al. proposed the UNet network. Its encoding part extracts high-level features, its decoding part restores the spatial information of the image, and skip connections, in the spirit of ResNet, combine high-level and low-level features. UNet segments images relatively accurately with a simple network structure, and it has been widely used in medical image segmentation research.
Numerous variants of the original UNet network have since emerged. Among these variants, the most important are the UNet++ and UNet3+ networks.
In 2018, Zongwei Zhou et al. proposed the UNet++ network. UNet++ extracts different features from sub-networks of different depths and combines them with skip connections of different lengths, drawing on the DenseNet design. UNet++ has also been widely used in medical image segmentation research.
In 2020, Huimin Huang et al. proposed the UNet3+ network. In the UNet3+ structure diagram, the left arm represents the encoding process and the right arm the decoding process; each circle represents a neuron node, and the number below a circle is the node's number of output channels. During encoding, the number of output channels doubles with each additional layer of depth.
UNet3+ performs better than other medical image segmentation methods. However, because it has too few neuron nodes, its segmentation of fine targets is unsatisfactory when the number of training samples is small.
Disclosure of Invention
To solve these problems, the invention provides an image semantic segmentation method based on a CNUNet3+ network. The method is intended for medical image segmentation, in particular of X-ray and CT images, and can segment tiny lesion regions, thereby assisting radiologists in diagnosing a patient's condition. It specifically comprises the following steps:
constructing a CNUNet3+ network model, wherein the CNUNet3+ network model has a U-shaped structure and adopts an encoder-decoder structure; its depth is N, where N is a positive integer greater than or equal to 3; the left arm of the U-shaped structure is the encoder and the right arm is the decoder; the number of channels doubles each time the encoder goes one layer deeper; down-sampling is used between adjacent encoder nodes, up-sampling between adjacent decoder nodes, and dense connections between decoder nodes; a central node is inserted in each of the first through L-th layers, with 1 ≤ L ≤ N−2; a convolution operation is used when an encoder node and a central node are at the same depth, and a down-sampling operation when they are at different depths; a convolution operation is used when a central node and a decoder node are at the same depth, and a down-sampling operation when they are at different depths; convolution is used between encoder and decoder nodes in layers without a central node; and the encoder nodes, central nodes and decoder nodes are all neuron nodes;
inputting training set data into the CNUNet3+ network model and training the CNUNet3+ network model, wherein the training set data comprise original images and images labeled with segmentation results as training set samples;
preprocessing the images to be segmented so that they all have the same size;
and segmenting the images to be segmented with the trained CNUNet3+ network model.
Further, the central node is computed as:

$X_{Ce}^{i} = H\left(\left[D(X_{En}^{1}), \ldots, D(X_{En}^{i-1}), C(X_{En}^{i})\right]\right)$

wherein $X_{Ce}^{i}$ represents the central node at the i-th layer; $X_{En}^{i}$ represents the encoder node at the i-th layer; M represents the number of central nodes; $D(\cdot)$ represents the down-sampling operation; $C(\cdot)$ represents the convolution operation; $[\cdot]$ represents the concatenation operation; $H(\cdot)$ represents the hybrid integration operation, which first performs a convolution, then a batch normalization operation, and then ELU activation; and in the formula $i = 1, 2, \ldots, M$.
Further, the CNUNet3+ network model employs an ELU activation function.
Further, the down-sampling operation uses max pooling.
Further, the up-sampling operation uses bilinear interpolation.
Compared with the UNet3+ network, the invention adds several central nodes, remedying UNet3+'s shortage of nodes, and substantially changes how the network nodes are connected. Compared with the UNet++ network, the number of nodes, and hence the number of parameters, is greatly reduced. When the method is used to segment medical X-ray and CT images, it still performs well when the number of samples is small.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed in their description are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a CNUNet3+ network model in which the encoder is five layers deep and there are three central nodes;
Fig. 2 is a CNUNet3+ network model in which the encoder is five layers deep and there are two central nodes;
Fig. 3 is a CNUNet3+ network model in which the encoder is five layers deep and there is only one central node;
Fig. 4 is a CNUNet3+ network model in which the encoder is four layers deep and there are two central nodes;
Fig. 5 is a CNUNet3+ network model in which the encoder is four layers deep and there is only one central node.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The image semantic segmentation method based on the CNUNet3+ network provided by the invention can be used when the number of samples is small, and comprises the following steps:
first, a CNUNet3+ network model is constructed.
The CNUNet3+ network model has a U-shaped structure and adopts an encoder-decoder structure; its depth is N, where N is a positive integer greater than or equal to 3. The left arm of the U-shaped structure is the encoder and the right arm is the decoder. The number of channels doubles each time the encoder goes one layer deeper. Down-sampling is used between adjacent encoder nodes, up-sampling between adjacent decoder nodes, and dense connections between decoder nodes. A central node is inserted in each of the first through L-th layers, with 1 ≤ L ≤ N−2. A convolution operation is used when an encoder node and a central node are at the same depth, and a down-sampling operation when they are at different depths; likewise, a convolution operation is used when a central node and a decoder node are at the same depth, and a down-sampling operation when they are at different depths. Convolution is used between encoder and decoder nodes in layers that have no central node. The encoder nodes, central nodes and decoder nodes are all neuron nodes.
Second, the network is trained.
Training set data are input into the CNUNet3+ network model to train it; the training set data comprise original images and images labeled with segmentation results as training set samples.
Third, the images to be segmented are preprocessed.
All images to be segmented are resized to the same size; if the number of input pictures is too small, data augmentation is performed.
Fourth, the trained CNUNet3+ network model is used to segment the images to be segmented.
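As an illustration only, the four steps might be organized as follows in PyTorch; the CNUNet3Plus class, the training hyperparameters and the 256 × 256 target size are hypothetical placeholders rather than values prescribed by the invention.

    import torch
    import torch.nn as nn
    from torchvision import transforms

    # Step 3 (preprocessing): force every image to one common size.
    preprocess = transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.ToTensor(),
    ])

    # Step 1: construct the model (depth N = 5 with three central nodes, as in Fig. 1).
    # CNUNet3Plus is a hypothetical implementation of the model described above.
    model = CNUNet3Plus(depth=5, num_central_nodes=3, in_channels=1, num_classes=2)

    # Step 2: train on pairs of original images and labeled segmentation masks.
    def train(model, loader, epochs=50):      # loader yields (image, mask) batches
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        loss_fn = nn.CrossEntropyLoss()       # mask holds per-pixel class labels
        for _ in range(epochs):
            for image, mask in loader:
                opt.zero_grad()
                loss_fn(model(image), mask).backward()
                opt.step()

    # Step 4: segment a new image with the trained model.
    @torch.no_grad()
    def segment(model, image):
        model.eval()
        return model(preprocess(image).unsqueeze(0)).argmax(dim=1)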
One embodiment of the invention uses the CNUNet3+ network model shown in Fig. 1. The circles in Fig. 1 represent neuron nodes, and the number below each circle is that node's number of output channels; for example, the number "64" below node $X_{En}^{1}$ indicates 64 output channels. The neuron nodes $X_{En}^{1}$, $X_{En}^{2}$, $X_{En}^{3}$, $X_{En}^{4}$ and $X_{En}^{5}$ form the left arm, and the left arm is the encoding (Encode) process. $X_{En}^{1}$ is connected to $X_{En}^{2}$, and the number of output channels doubles from 64 to 128; $X_{En}^{2}$ is connected to $X_{En}^{3}$, and the number of output channels doubles from 128 to 256; $X_{En}^{3}$ is connected to $X_{En}^{4}$, and the number of output channels doubles from 256 to 512; $X_{En}^{4}$ is connected to $X_{En}^{5}$, and the number of output channels doubles again from 512 to 1024.

The nodes $X_{En}^{5}$, $X_{De}^{4}$, $X_{De}^{3}$, $X_{De}^{2}$ and $X_{De}^{1}$ form the right arm, and the right arm is the decoding (Decode) process. $X_{En}^{5}$ is connected to $X_{De}^{4}$, and the number of output channels changes from 1024 to 320; from $X_{De}^{4}$ onward the number of output channels no longer changes and remains 320. The left arm and the right arm form the basic structure of UNet and of UNet3+, and likewise the main structure of the CNUNet3+ network of the invention.
The invention makes extensive use of down-sampling (Down Sampling) and up-sampling (Up Sampling). The arrows running downward between layers of different depths are all down-sampling operations: for example, from node $X_{En}^{1}$ to node $X_{Ce}^{2}$; from nodes $X_{En}^{1}$ and $X_{En}^{2}$ to node $X_{Ce}^{3}$; and from nodes $X_{Ce}^{1}$, $X_{Ce}^{2}$ and $X_{Ce}^{3}$ to node $X_{De}^{4}$, all of which are down-sampled.
The arrows running upward between layers of different depths are all up-sampling operations: for example, from node $X_{En}^{5}$ to node $X_{De}^{4}$; from nodes $X_{En}^{5}$ and $X_{De}^{4}$ to node $X_{De}^{3}$; from nodes $X_{En}^{5}$, $X_{De}^{4}$ and $X_{De}^{3}$ to node $X_{De}^{2}$; and from nodes $X_{En}^{5}$, $X_{De}^{4}$, $X_{De}^{3}$ and $X_{De}^{2}$ to node $X_{De}^{1}$, all of which are up-sampled.

The arrows between nodes at the same depth represent convolution operations, neither up-sampling nor down-sampling.
The central node is computed as:

$X_{Ce}^{i} = H\left(\left[D(X_{En}^{1}), \ldots, D(X_{En}^{i-1}), C(X_{En}^{i})\right]\right), \quad i = 1, 2, \ldots, M$

where $X_{Ce}^{i}$ represents the intermediate (central) neuron node at the i-th layer; $X_{En}^{i}$ represents the encoder node at the i-th layer; M is the number of central nodes; $D(\cdot)$ represents the down-sampling (Down Sampling) operation; $C(\cdot)$ represents the convolution (Convolution) operation; $[\cdot]$ represents the concatenation operation; and $H(\cdot)$ represents the hybrid integration operation, which first performs a convolution, then a batch normalization (Batch Normalization) operation, and then ELU (Exponential Linear Unit) activation.
The ReLU activation function is commonly used in convolutional neural networks, but neurons using ReLU tend to "die" during training, after which they can never be activated again. To avoid this problem, the invention preferably adopts the ELU activation function. The activation function here may be replaced by other activation functions, such as the ReLU, Leaky ReLU, PReLU, Softplus, Swish and GELU functions.
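For concreteness, the following is a minimal PyTorch sketch of the hybrid integration operation $H(\cdot)$ (convolution, then batch normalization, then ELU); the kernel size and channel arguments are illustrative assumptions, and the activation parameter shows how ELU could be swapped for one of the alternatives named above.

    import torch.nn as nn

    class HybridBlock(nn.Module):
        """Sketch of H: convolution -> batch normalization -> ELU activation."""
        def __init__(self, in_ch, out_ch, activation=nn.ELU):
            super().__init__()
            self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = activation()  # ELU by default; ReLU, PReLU, GELU, ... also fit

        def forward(self, x):
            return self.act(self.bn(self.conv(x)))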
When i = 3, the formula above gives the computation of central node $X_{Ce}^{3}$:

$X_{Ce}^{3} = H\left(\left[D(X_{En}^{1}), D(X_{En}^{2}), C(X_{En}^{3})\right]\right)$

When i = 2, it gives the computation of central node $X_{Ce}^{2}$:

$X_{Ce}^{2} = H\left(\left[D(X_{En}^{1}), C(X_{En}^{2})\right]\right)$

When i = 1, it gives the computation of central node $X_{Ce}^{1}$:

$X_{Ce}^{1} = H\left(C(X_{En}^{1})\right)$
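These three formulas can be assembled mechanically: pool each shallower encoder feature down to the i-th layer's resolution, convolve the same-depth feature, concatenate, and apply $H(\cdot)$. The sketch below assumes the HybridBlock defined earlier and max-pool down-sampling; the channel widths are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CentralNode(nn.Module):
        """Sketch of X_Ce^i = H([D(X_En^1), ..., D(X_En^{i-1}), C(X_En^i)])."""
        def __init__(self, encoder_channels, i, out_ch):
            super().__init__()
            self.i = i
            # C: convolution applied to the same-depth encoder feature X_En^i
            self.same_depth_conv = nn.Conv2d(encoder_channels[i - 1],
                                             encoder_channels[i - 1], 3, padding=1)
            # H: conv -> batch norm -> ELU over the concatenated branches
            self.H = HybridBlock(sum(encoder_channels[:i]), out_ch)

        def forward(self, encoder_feats):  # encoder_feats[k] holds X_En^{k+1}
            # D: each shallower feature is max-pooled down to layer i's resolution
            parts = [F.max_pool2d(encoder_feats[k], kernel_size=2 ** (self.i - 1 - k))
                     for k in range(self.i - 1)]
            parts.append(self.same_depth_conv(encoder_feats[self.i - 1]))
            return self.H(torch.cat(parts, dim=1))

    # For i = 3 with encoder widths (64, 128, 256, 512, 1024) this reproduces
    # X_Ce^3 = H([D(X_En^1), D(X_En^2), C(X_En^3)]).
    ce3 = CentralNode([64, 128, 256, 512, 1024], i=3, out_ch=256)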
in the invention, a maximum convergence method (Max clustering) is used for carrying out down-sampling operation. The maximum convergence method can be used to make the resolution of the images of adjacent layers different by 2 times, and when the resolution is reduced, the resolution is divided by 2. The difference of the image resolutions is 4 times every other layer, and the resolution is divided by 4 when the sampling is reduced; the resolution of two layers of images is different by 8 times, and the resolution is divided by 8 when the sampling is reduced.
For example, in Fig. 1, suppose the original picture size is 256 × 256; the picture size at the input node $X_{En}^{1}$ is then 256 × 256. Down-sampling from $X_{En}^{1}$ to $X_{En}^{2}$ divides the picture size by 2, so the picture size at $X_{En}^{2}$ is 128 × 128; down-sampling from $X_{En}^{2}$ to $X_{En}^{3}$ divides the picture size by 2, so the picture size at $X_{En}^{3}$ is 64 × 64; down-sampling from $X_{En}^{3}$ to $X_{En}^{4}$ divides the picture size by 2, so the picture size at $X_{En}^{4}$ is 32 × 32; and down-sampling from $X_{En}^{4}$ to $X_{En}^{5}$ divides the picture size by 2, so the picture size at $X_{En}^{5}$ is 16 × 16.
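This size chain is easy to verify with max pooling (a quick, illustrative check):

    import torch
    import torch.nn.functional as F

    x = torch.randn(1, 64, 256, 256)         # feature map at X_En^1: 256 x 256
    for name in ("X_En^2", "X_En^3", "X_En^4", "X_En^5"):
        x = F.max_pool2d(x, kernel_size=2)   # each deeper layer halves the size
        print(name, tuple(x.shape[-2:]))     # (128,128), (64,64), (32,32), (16,16)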
Similarly, for the down-sampling from $X_{Ce}^{3}$ to $X_{De}^{4}$: if the picture size at $X_{Ce}^{3}$ is 64 × 64, the picture size is divided by 2 during down-sampling, giving 32 × 32 at $X_{De}^{4}$. For the down-sampling from $X_{Ce}^{2}$ to $X_{De}^{4}$: if the picture size at $X_{Ce}^{2}$ is 128 × 128, the picture size is divided by 4, giving 32 × 32 at $X_{De}^{4}$. For the down-sampling from $X_{Ce}^{1}$ to $X_{De}^{4}$: if the picture size at $X_{Ce}^{1}$ is 256 × 256, the picture size is divided by 8, giving 32 × 32 at $X_{De}^{4}$.
Pictures in the same layer must have the same size before they can be concatenated. Suppose the picture size at node $X_{De}^{4}$ is 32 × 32: up-sampling from $X_{En}^{5}$ to $X_{De}^{4}$ yields pictures of size 32 × 32; down-sampling from $X_{Ce}^{3}$ to $X_{De}^{4}$ also yields 32 × 32; down-sampling from $X_{Ce}^{2}$ to $X_{De}^{4}$ also yields 32 × 32; and down-sampling from $X_{Ce}^{1}$ to $X_{De}^{4}$ also yields 32 × 32. The features arriving at $X_{De}^{4}$ from different layers thus all have size 32 × 32, so they can be concatenated with one another.
The decoder nodes of the CNUNet3+ network are computed analogously, where $U(\cdot)$ represents the up-sampling (Up Sampling) operation: each decoder node concatenates, and then passes through $H(\cdot)$, the features of the central nodes (convolved at the same depth, down-sampled from shallower depths), the up-sampled features of all deeper decoder nodes, and the up-sampled feature of the deepest encoder node. For the embodiment of Fig. 1, for example,

$X_{De}^{4} = H\left(\left[D(X_{Ce}^{1}), D(X_{Ce}^{2}), D(X_{Ce}^{3}), U(X_{En}^{5})\right]\right)$
The invention uses bilinear interpolation (Bilinear Interpolation) for the up-sampling (Up Sampling) operation. With bilinear interpolation, the images of adjacent layers differ in resolution by a factor of 2, so up-sampling by one layer multiplies the resolution by 2; images two layers apart differ by a factor of 4, so the resolution is multiplied by 4 when up-sampling across two layers; and images three layers apart differ by a factor of 8, so the resolution is multiplied by 8 when up-sampling across three layers.
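In PyTorch, for instance, these factor-of-2, 4 and 8 bilinear up-samplings correspond to F.interpolate calls (shapes illustrative):

    import torch
    import torch.nn.functional as F

    x = torch.randn(1, 320, 32, 32)   # e.g. a 32 x 32 feature map at X_De^4
    up2 = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
    up4 = F.interpolate(x, scale_factor=4, mode="bilinear", align_corners=False)
    up8 = F.interpolate(x, scale_factor=8, mode="bilinear", align_corners=False)
    print(up2.shape[-2:], up4.shape[-2:], up8.shape[-2:])  # 64x64, 128x128, 256x256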
Taking Fig. 1 as an example, suppose the picture size at node $X_{En}^{5}$ is 16 × 16. Up-sampling from $X_{En}^{5}$ to $X_{De}^{4}$ multiplies the picture size by 2, so the picture size at $X_{De}^{4}$ is 32 × 32; up-sampling from $X_{De}^{4}$ to $X_{De}^{3}$ multiplies the picture size by 2, so the picture size at $X_{De}^{3}$ is 64 × 64; up-sampling from $X_{De}^{3}$ to $X_{De}^{2}$ multiplies the picture size by 2, so the picture size at $X_{De}^{2}$ is 128 × 128; and up-sampling from $X_{De}^{2}$ to $X_{De}^{1}$ multiplies the picture size by 2, so the picture size at $X_{De}^{1}$ is 256 × 256.
Likewise, for the up-sampling from $X_{De}^{2}$ to $X_{De}^{1}$: if the picture size at $X_{De}^{2}$ is 128 × 128, the picture size is multiplied by 2 during up-sampling, giving 256 × 256 at $X_{De}^{1}$. For the up-sampling from $X_{De}^{3}$ to $X_{De}^{1}$: if the picture size at $X_{De}^{3}$ is 64 × 64, the picture size is multiplied by 4, giving 256 × 256 at $X_{De}^{1}$. For the up-sampling from $X_{De}^{4}$ to $X_{De}^{1}$: if the picture size at $X_{De}^{4}$ is 32 × 32, the picture size is multiplied by 8, giving 256 × 256 at $X_{De}^{1}$.
Pictures in the same layer must have the same size before they can be concatenated. Suppose the picture size at node $X_{De}^{1}$ is 256 × 256: up-sampling from $X_{En}^{5}$ to $X_{De}^{1}$ yields pictures of size 256 × 256; up-sampling from $X_{De}^{4}$ to $X_{De}^{1}$ also yields 256 × 256; up-sampling from $X_{De}^{3}$ to $X_{De}^{1}$ also yields 256 × 256; and up-sampling from $X_{De}^{2}$ to $X_{De}^{1}$ also yields 256 × 256. The features arriving at $X_{De}^{1}$ from different nodes thus all have size 256 × 256, so they can be concatenated with one another.
Fig. 2 shows the CNUNet3+ network model used in another embodiment. Unlike Fig. 1, which has three central nodes, Fig. 2 has two. As before, the number below each circle is the number of channels.
For this embodiment, the computation formula of the central nodes remains

$X_{Ce}^{i} = H\left(\left[D(X_{En}^{1}), \ldots, D(X_{En}^{i-1}), C(X_{En}^{i})\right]\right), \quad i = 1, 2, \ldots, M$

The computation of the decoder nodes, however, differs slightly, since only the two central nodes $X_{Ce}^{1}$ and $X_{Ce}^{2}$ now feed the decoder.
Fig. 3 is the CNUNet3+ network model diagram of another embodiment of the invention. Unlike Figs. 1 and 2, Fig. 3 has only a single central node, and the computation formula of each node changes accordingly.
The depth of the CNUNet3+ network model may also vary. When the objects being processed are not very complicated, or to increase processing speed, the encoding depth can be reduced from the five layers shown in Figs. 1, 2 and 3 to four layers.
Fig. 4 shows a CNUNet3+ network model according to another embodiment of the invention, in which the encoding depth is four layers and there are two central nodes.
Fig. 5 shows a CNUNet3+ network model according to another embodiment of the invention, in which the encoding depth is four layers and there is one central node.
The encoder may also be reduced to three layers of depth when the pictures being processed are small or fast processing is required.
When the pictures being processed are large, the encoder can be extended to six layers of depth; with a six-layer encoder, the number of central nodes may be 1, 2, 3 or 4.
When the pictures are larger still, the encoder depth can be increased to seven layers or more, up to several tens of layers. In general, assuming the encoder depth is N, the number of central nodes may range from 1 to N−2.
While embodiments in accordance with the invention have been described above, these embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments described. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. The invention is limited only by the claims and their full scope and equivalents.

Claims (5)

1. An image semantic segmentation method based on a CNUNet3+ network, the method comprising:
constructing a CNUNet3+ network model, wherein the CNUNet3+ network model has a U-shaped structure and adopts an encoder-decoder structure; its depth is N, where N is a positive integer greater than or equal to 3; the left arm of the U-shaped structure is the encoder and the right arm is the decoder; the number of channels doubles each time the encoder goes one layer deeper; down-sampling is used between adjacent encoder nodes, up-sampling between adjacent decoder nodes, and dense connections between decoder nodes; a central node is inserted in each of the first through L-th layers, with 1 ≤ L ≤ N−2; a convolution operation is used when an encoder node and a central node are at the same depth, and a down-sampling operation when they are at different depths; a convolution operation is used when a central node and a decoder node are at the same depth, and a down-sampling operation when they are at different depths; convolution is used between encoder and decoder nodes in layers without a central node; and the encoder nodes, central nodes and decoder nodes are all neuron nodes;
inputting training set data into the CNUNet3+ network model and training the CNUNet3+ network model, wherein the training set data comprise original images and images labeled with segmentation results as training set samples;
preprocessing the images to be segmented so that they all have the same size; and
segmenting the images to be segmented with the trained CNUNet3+ network model.
2. The image semantic segmentation method according to claim 1, wherein the central node is computed as:

$X_{Ce}^{i} = H\left(\left[D(X_{En}^{1}), \ldots, D(X_{En}^{i-1}), C(X_{En}^{i})\right]\right)$

wherein $X_{Ce}^{i}$ represents the central node at the i-th layer; $X_{En}^{i}$ represents the encoder node at the i-th layer; M represents the number of central nodes; $D(\cdot)$ represents the down-sampling operation; $C(\cdot)$ represents the convolution operation; $[\cdot]$ represents the concatenation operation; $H(\cdot)$ represents the hybrid integration operation, which first performs a convolution, then a batch normalization operation, and then ELU activation; and in the formula $i = 1, 2, \ldots, M$.
3. The image semantic segmentation method according to claim 1, wherein the CNUNet3+ network model employs an ELU activation function.
4. The image semantic segmentation method according to claim 1, wherein the down-sampling operation uses max pooling.
5. The image semantic segmentation method according to claim 1, wherein the up-sampling operation uses bilinear interpolation.
CN202210118688.8A 2022-02-08 2022-02-08 Image semantic segmentation method based on CNUNet3+ network Active CN114170249B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210118688.8A CN114170249B (en) 2022-02-08 2022-02-08 Image semantic segmentation method based on CNUNet3+ network
NL2032957A NL2032957B1 (en) 2022-02-08 2022-09-05 SEMANTIC IMAGE SEGMENTATION METHOD BASED ON CNUNet3+

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210118688.8A CN114170249B (en) 2022-02-08 2022-02-08 Image semantic segmentation method based on CNUNet3+ network

Publications (2)

Publication Number Publication Date
CN114170249A true CN114170249A (en) 2022-03-11
CN114170249B CN114170249B (en) 2022-04-19

Family

ID=80489504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210118688.8A Active CN114170249B (en) 2022-02-08 2022-02-08 Image semantic segmentation method based on CNUNet3+ network

Country Status (2)

Country Link
CN (1) CN114170249B (en)
NL (1) NL2032957B1 (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200074271A1 (en) * 2018-08-29 2020-03-05 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing a multi-resolution neural network for use with imaging intensive applications including medical imaging
CN110516740A (en) * 2019-08-28 2019-11-29 电子科技大学 A kind of fault recognizing method based on Unet++ convolutional neural networks
CN113240691A (en) * 2021-06-10 2021-08-10 南京邮电大学 Medical image segmentation method based on U-shaped network
CN113762264A (en) * 2021-08-26 2021-12-07 南京航空航天大学 Multi-encoder fused multispectral image semantic segmentation method
CN113689419A (en) * 2021-09-03 2021-11-23 电子科技大学长三角研究院(衢州) Image segmentation processing method based on artificial intelligence

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUIMIN HUANG ET AL.: "UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation", ICASSP *

Also Published As

Publication number Publication date
NL2032957B1 (en) 2024-01-29
CN114170249B (en) 2022-04-19
NL2032957A (en) 2023-08-11

Similar Documents

Publication Publication Date Title
CN111627019B (en) Liver tumor segmentation method and system based on convolutional neural network
Zhao et al. SCOAT-Net: A novel network for segmenting COVID-19 lung opacification from CT images
CN113674253A Rectal cancer CT image automatic segmentation method based on U-Transformer
CN111368849A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112396605B (en) Network training method and device, image recognition method and electronic equipment
CN114066866A (en) Medical image automatic segmentation method based on deep learning
CN113205524B (en) Blood vessel image segmentation method, device and equipment based on U-Net
CN115375711A (en) Image segmentation method of global context attention network based on multi-scale fusion
CN115471470A (en) Esophageal cancer CT image segmentation method
CN113436173A (en) Abdomen multi-organ segmentation modeling and segmentation method and system based on edge perception
CN112381846A (en) Ultrasonic thyroid nodule segmentation method based on asymmetric network
Sun et al. COVID-19 CT image segmentation method based on swin transformer
CN114612662A (en) Polyp image segmentation method based on boundary guidance
Ruan et al. An efficient tongue segmentation model based on u-net framework
CN114170249B (en) Image semantic segmentation method based on CNUNet3+ network
CN111507950B (en) Image segmentation method and device, electronic equipment and computer-readable storage medium
Wang et al. Accurate lung nodule segmentation with detailed representation transfer and soft mask supervision
CN113538363A (en) Lung medical image segmentation method and device based on improved U-Net
CN117934824A (en) Target region segmentation method and system for ultrasonic image and electronic equipment
CN111755131A (en) COVID-19 early screening and severity degree evaluation method and system based on attention guidance
CN116168052A (en) Gastric cancer pathological image segmentation method combining self-adaptive attention and feature pyramid
CN115908438A (en) CT image focus segmentation method, system and equipment based on deep supervised ensemble learning
CN115965785A (en) Image segmentation method, device, equipment, program product and medium
CN115690115A (en) Lung medical image segmentation method based on reconstruction pre-training
CN114119558A (en) Method for automatically generating nasopharyngeal carcinoma image diagnosis structured report

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant