CN109886986B - Dermatoscope image segmentation method based on multi-branch convolutional neural network

Info

Publication number: CN109886986B
Application number: CN201910062500.0A
Authority: CN (China)
Other versions: CN109886986A
Original language: Chinese (zh)
Inventors: Xie Fengying (谢凤英), Yang Jiawen (杨加文), Jiang Zhiguo (姜志国)
Assignee (current and original): Beihang University
Legal status: Active (granted)
Prior art keywords: branch, image, layer, network, loss
Classifications: Image Analysis; Image Processing

Abstract

The invention discloses a dermatoscope image segmentation method based on a multi-branch convolutional neural network, comprising the following steps: first, collecting training samples; second, augmenting the images; third, designing a multi-branch convolutional neural network model; fourth, training the multi-branch convolutional network; fifth, generating a lesion probability map; and sixth, obtaining the segmentation result. The advantages and effects of the invention are: tailored to the characteristics of dermoscopy image data, the training data set is effectively expanded with corresponding image transformations, so that network training is effective and generalization is strong; the convolutional neural network contains multiple branches and fuses rich semantic and detail information, so that compared with an ordinary network it recovers lesion edges better and yields more accurate lesion segmentation results; and the invention is a fully automatic segmentation scheme that only requires the dermoscopy image to be segmented as input and automatically produces its segmentation result without additional processing, making it efficient and simple.

Description

Dermatoscope image segmentation method based on multi-branch convolutional neural network
Technical Field
The invention belongs to the field of computer-aided diagnosis, and particularly relates to a dermatoscope image segmentation method based on a multi-branch convolutional neural network.
Background
Melanocytic skin lesions may be benign or malignant; malignant melanoma is extremely harmful and can readily lead to death if the patient is not treated promptly at an early stage. For cutaneous melanoma, the most effective treatment is early detection followed by excision of the lesion. The dermatoscope, also called an epiluminescence microscope, produces dermoscopy images of high resolution and clarity. Automatic diagnosis of dermoscopy images avoids diagnostic errors caused by physician subjectivity. When diagnosing skin lesions, the shape of the region and its boundary are important diagnostic cues; dermoscopy image segmentation, which extracts an accurate lesion region, is therefore an important step in automatic computer-aided diagnosis. However, segmentation remains challenging, because different lesions vary greatly in shape, color, and other attributes, and because hairs, air bubbles, and other artifacts frequently interfere during image acquisition.
Current dermoscopy image segmentation methods fall mainly into edge-, threshold-, or region-based methods and supervised-learning-based methods. Edge-based methods compute the intensity gradient with classical operators such as Sobel, Laplacian, or Canny and extract regions of large gradient change as lesion boundaries; they work well on images with clear lesion boundaries and little other interference. Threshold-based methods exploit the fact that the lesion color usually differs from the surrounding skin and set one or more thresholds to separate the regions; they are computationally simple, but the thresholds are hard to choose. Region-based methods merge adjacent similar pixels or sub-regions by region growing until the final lesion region is obtained; they suit images whose lesion interior is homogeneous. Supervised-learning methods design relevant features by hand, or mine latent features automatically from data, and then train a classifier to decide whether a sub-region or pixel belongs to the lesion or to normal skin; they depend heavily on feature design and selection and adapt poorly to complex dermoscopy images.
Convolutional neural networks are widely used in image processing and perform excellently in tasks such as object classification, object detection, and object segmentation. They learn high-level features automatically from training data and are highly adaptable. The invention designs a brand-new convolutional neural network model for dermoscopy image segmentation. The model has multiple branches: the lower branches extract detail information and the higher branches extract semantic information; each branch computes its own loss and is trained by back-propagation so that every branch extracts effective features; finally the branch feature maps are fused and upsampled to obtain an accurate dermoscopy segmentation result.
Disclosure of Invention
The invention aims to provide a dermatoscope image segmentation method based on a multi-branch convolutional neural network that automatically and accurately extracts the lesion region, supports the subsequent stages of a dermoscopy diagnosis system, and improves the accuracy of lesion diagnosis. By learning high-level features automatically from the raw image data, the method is robust to interference such as hairs and bubbles in the image, and by fusing multi-branch features it outputs lesion segmentation results with accurate boundaries.
The dermatoscope image segmentation method based on a multi-branch convolutional neural network of the invention comprises the following steps:
Step one: training sample collection
The dermoscopy images used in the invention come from an internationally published data set comprising 2750 original dermoscopy images (2000 used for training, 750 for validation and testing), with lesion ground-truth maps manually annotated by professional dermatologists. Each ground truth is a binary image in which 1 denotes the lesion region and 0 denotes the healthy skin region. The lesions in the data set vary greatly in shape, color, texture, and location, and image resolutions range from 542 × 718 to 2848 × 4288. For convenience of processing, the original images and lesion ground-truth maps are uniformly resized to 512 × 512.
Step two: image augmentation
Generally, the more neurons a convolutional neural network has, the larger its capacity, meaning the model can fit more complex mapping relationships and thus perform well on complex tasks. With more neurons, however, the number of parameters in the network grows greatly, and if there is not enough training data, the subsequent training easily overfits. Therefore, to train a good network, the invention expands the training images in several ways.
Considering the different shooting angles present at acquisition time, each original training image is flipped horizontally and vertically and rotated by 90, 180, and 270 degrees. In addition, since some dermoscopy images were observed to have black borders on the top, bottom, left, and right, each original training image is also translated by 25 pixels in each of the four directions. The lesion ground-truth map undergoes the same transformations as its original image. In total, the 2000 original training images are expanded to 20000 training images.
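The tenfold expansion can be sketched as follows with Pillow; the helper name, the black fill for the translated borders, and applying the identical calls to the ground-truth mask are our assumptions rather than the patent's code.

```python
from PIL import Image

def augment(image):
    """Return the 10 variants of one 512 x 512 training image (original,
    2 flips, 3 rotations, 4 translations); the mask gets the same calls."""
    variants = [image]
    variants.append(image.transpose(Image.FLIP_LEFT_RIGHT))   # horizontal flip
    variants.append(image.transpose(Image.FLIP_TOP_BOTTOM))   # vertical flip
    for angle in (90, 180, 270):                              # rotations
        variants.append(image.rotate(angle))
    for dx, dy in ((25, 0), (-25, 0), (0, 25), (0, -25)):     # 25-pixel shifts
        # The affine maps output (x, y) to input (x - dx, y - dy);
        # pixels shifted in from outside the frame are filled with black.
        variants.append(image.transform(image.size, Image.AFFINE,
                                        (1, 0, -dx, 0, 1, -dy)))
    return variants                                           # 10 per original
```

Applied to 2000 originals, this yields the 20000 training images stated above.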
Step three: multi-branch convolutional neural network model design
In a deep neural network model, the shallow layers extract detail information such as edges and textures, so the shallow feature maps carry much of the lesion boundary information in a dermoscopy image, while the deep layers integrate the low-level features extracted by the shallow layers into semantically rich high-level features that ease classification. For the dermoscopy segmentation task, rich semantic information is needed to classify lesion and background accurately, and detail information is needed to extract the lesion boundary precisely. The invention therefore constructs a convolutional neural network with an encoder-decoder architecture, in which the encoder extracts semantic and detail information and the decoder restores the feature-map size to produce the final segmentation result. In addition, 4 branches are constructed at different layers of the encoder to extract features, and the features of the different branches are finally fused to obtain an accurate dermoscopy segmentation result. The specific design is as follows:
1. Encoder structure design: in an ordinary convolutional network, each convolutional layer usually receives only the output of the preceding layer as input, so information from earlier layers is not well exploited. In the convolutional network constructed here, the output of each convolutional layer within a convolution block is fed as input to the subsequent convolutional layers, so the low-level features already learned are fully reused later, the layers avoid learning duplicate features, information redundancy is reduced, and the network becomes easier to train. The overall framework of the encoder is: input image → convolutional layer → pooling layer → 1st convolution block → 1st pooling block → 2nd convolution block → 2nd pooling block → 3rd convolution block. The details are as follows:
① The input image first passes through 1 convolutional layer (kernel size 7 × 7, stride 2, output dimension 24) and then 1 max-pooling layer (kernel size 3 × 3, stride 2) before entering the 1st convolution block. Denoting the input image by Input, the 7 × 7 convolutional layer by Conv7×7, the max-pooling layer by MaxPool, the batch-normalization layer by BN, and the rectified-linear-unit layer by ReLU, this stem can be expressed as: Input → Conv7×7 → BN → ReLU → MaxPool. Since both Conv7×7 and MaxPool have stride 2, the resolution of the output feature map is 1/4 that of the input image; it is then fed to the 1st convolution block.
② The 1st, 2nd, and 3rd convolution blocks each consist of 6 convolutional layers. Denoting by Conv3×3 a convolutional layer with kernel size 3 × 3, stride 1, and output dimension 12, each layer in a convolution block has the structure BN → ReLU → Conv3×3. The specific structure of the convolution block designed by the invention is shown in Fig. 1.
③ The 1st and 2nd pooling blocks perform dimensionality reduction. Within a convolution block all feature maps have the same size; to shrink the feature maps so that higher-level semantic features can be extracted, and to reduce their dimension so that the number of parameters stays small, a pooling block is designed after each of the 1st and 2nd convolution blocks. Denoting the input feature map by X (with N channels), a convolutional layer with kernel size 3 × 3, stride 1, and output dimension N/2 by Conv3×3, and a max-pooling layer with kernel size 2 × 2 and stride 2 by MaxPool, the pooling block has the structure: X → BN → ReLU → Conv3×3 → MaxPool. After the pooling block, the channel count, width, and height of the feature maps output by the 1st and 2nd convolution blocks are all halved.
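To make the block structure concrete, here is a minimal PyTorch sketch of the convolution block and pooling block under the stated sizes (6 layers per block, 12 output channels per layer, channel-halving pooling block); the class names and the dense concatenation pattern are our reading of "the output of each convolutional layer ... is the input of the subsequent convolutional layer", not code from the patent.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Six BN -> ReLU -> Conv3x3 layers; each layer sees all earlier outputs."""
    def __init__(self, in_channels, growth=12, num_layers=6):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            channels = in_channels + i * growth
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth, kernel_size=3, padding=1, bias=False),
            ))
        self.out_channels = in_channels + num_layers * growth

    def forward(self, x):
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)   # reuse all earlier features
        return x

class PoolBlock(nn.Module):
    """BN -> ReLU -> Conv3x3 (channels halved) -> 2x2 max pooling."""
    def __init__(self, in_channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, in_channels // 2, kernel_size=3,
                      padding=1, bias=False),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.out_channels = in_channels // 2

    def forward(self, x):
        return self.block(x)
```

Under these sizes, the stem's 24-channel output grows to 96 channels after block 1 (24 + 6 × 12), is halved to 48 by the first pooling block, reaches 120 after block 2, is halved to 60, and ends at 132 after block 3.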
2. Multi-branch structure design: in the convolutional neural network designed here, 4 branches are constructed at different layers to extract features, and the features of the different branches are finally fused by concatenation, as shown by the 1st through 4th branches in Fig. 2. Each branch extracts information through 1 convolutional layer (kernel size 1 × 1, stride 1, output dimension 2); the lower branches mainly extract detail information and the higher branches mainly extract semantic information. Because of downsampling in the network, the branch output feature maps differ in size: the 2nd branch's output feature map has 1/2 the width and height of the 1st branch's, and the outputs of the 3rd and 4th branches have 1/4. Bilinear interpolation is therefore used to upsample the outputs of the 2nd, 3rd, and 4th branches to the size of the 1st branch's feature map before concatenation. The fused feature map is then input to the decoder on the backbone.
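The branch-and-fuse step can be sketched as below; the channel counts at the four tap points are fixed by Fig. 2 rather than the text, so the sketch leaves them as a parameter, and the module name is hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchFusion(nn.Module):
    """Four 1x1, 2-channel branch convolutions, bilinear upsampling of the
    smaller maps to the first branch's size, then channel concatenation."""
    def __init__(self, branch_channels):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(c, 2, kernel_size=1) for c in branch_channels])

    def forward(self, feats):
        # feats: feature maps tapped at the four encoder depths (Fig. 2)
        target = feats[0].shape[-2:]              # 1st branch spatial size
        outs = []
        for conv, f in zip(self.branches, feats):
            out = conv(f)
            if out.shape[-2:] != target:          # branches 2-4 are smaller
                out = F.interpolate(out, size=target, mode="bilinear",
                                    align_corners=False)
            outs.append(out)
        return torch.cat(outs, dim=1)             # 4 branches x 2 = 8 channels
```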
3. Decoder structure design: since the fused feature map in the encoder has 1/4 the spatial resolution of the original input image, a decoder is designed to restore the spatial size of the feature map. It consists mainly of 2 convolutional layers and 1 four-fold upsampling layer. Denoting the input fused feature map by X, the 1st convolutional layer (kernel size 1 × 1, stride 1, output dimension 4) by Conv1×1_1, the 2nd convolutional layer (kernel size 1 × 1, stride 1, output dimension 2) by Conv1×1_2, and the 4× bilinear-interpolation upsampling layer by Upsample, the decoder has the structure: X → BN → ReLU → Conv1×1_1 → ReLU → Conv1×1_2 → Upsample.
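A matching decoder sketch follows; with four 2-channel branches, the fused input has 8 channels, an assumption consistent with Conv1×1_1 reducing the dimension to 4.

```python
import torch.nn as nn

class Decoder(nn.Module):
    """X -> BN -> ReLU -> Conv1x1_1 -> ReLU -> Conv1x1_2 -> 4x Upsample."""
    def __init__(self, in_channels=8):                  # 4 branches x 2 channels
        super().__init__()
        self.block = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, 4, kernel_size=1),   # Conv1x1_1
            nn.ReLU(inplace=True),
            nn.Conv2d(4, 2, kernel_size=1),             # Conv1x1_2
            nn.Upsample(scale_factor=4, mode="bilinear",
                        align_corners=False),           # back to input size
        )

    def forward(self, x):
        return self.block(x)
```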
Step four: multi-branch convolutional network training
The feature map output by the decoder is passed through a Softmax function to obtain a lesion probability map, which is then compared with the ground-truth segmentation through a cross-entropy function to compute the loss. The loss is back-propagated through the network to obtain the gradients of the parameters, which are adjusted by gradient descent so that the loss decreases and the network improves. The cross-entropy loss is computed as follows:
$$\mathrm{Loss} = -\frac{1}{W \times H}\sum_{i=1}^{W}\sum_{j=1}^{H}\Bigl[y_{ij}\log p_{ij} + (1-y_{ij})\log(1-p_{ij})\Bigr]$$
where W and H are the width and height of the segmentation ground truth, $y_{ij}$ denotes the true class of pixel $(i, j)$ (1 for lesion, 0 for background), and $p_{ij}$ denotes the probability that pixel $(i, j)$ is a lesion.
To ensure that the information extracted by the 4 branches is valid, the same method as above is applied to the output feature map of each branch to compute the branch losses $\mathrm{Loss}_1$, $\mathrm{Loss}_2$, $\mathrm{Loss}_3$, and $\mathrm{Loss}_4$. Because the branch feature maps do not directly produce the final segmentation result, each branch loss is multiplied by a coefficient of 0.5 before being added to the segmentation loss on the backbone to obtain the total loss, so as to avoid excessive influence on segmentation accuracy; the network parameters are actually obtained by back-propagation training on this total loss. Specifically:
$$\mathrm{Loss}_{\mathrm{all}} = \mathrm{Loss} + 0.5\times(\mathrm{Loss}_1 + \mathrm{Loss}_2 + \mathrm{Loss}_3 + \mathrm{Loss}_4)$$
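Under these definitions, the total loss can be sketched as below. `F.cross_entropy` applies log-Softmax internally, which matches the Softmax-plus-cross-entropy description; upsampling each branch map to the mask size before computing its loss is our assumption, since the patent only states that the branch losses use "the same method".

```python
import torch.nn.functional as F

def total_loss(main_logits, branch_logits, target):
    """main_logits: (B, 2, H, W); branch_logits: list of 4 (B, 2, h, w)
    maps; target: (B, H, W) long tensor with 1 = lesion, 0 = background."""
    loss = F.cross_entropy(main_logits, target)   # backbone segmentation loss
    branch_loss = 0.0
    for logits in branch_logits:
        logits = F.interpolate(logits, size=target.shape[-2:],
                               mode="bilinear", align_corners=False)
        branch_loss = branch_loss + F.cross_entropy(logits, target)
    return loss + 0.5 * branch_loss               # Loss_all
```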
step five: lesion distribution probability map generation
The network designed by the invention takes one dermoscopy image as input and outputs a lesion probability map. To improve robustness, before being fed into the network the image to be segmented is flipped horizontally and vertically and rotated by 90, 180, and 270 degrees; together with the original image, 6 images in total are passed through the segmentation network, yielding 6 lesion probability maps. Each probability map is then inverse-transformed according to the transformation of its input image, and finally the 6 probability maps are averaged to give the final lesion probability map.
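This test-time augmentation can be sketched as follows; the (B, C, H, W) tensor layout, the model returning (backbone logits, branch logits), and the Softmax over two output channels are assumptions consistent with the network design above.

```python
import torch

def predict_with_tta(model, image):
    """image: (1, 3, 512, 512) tensor; returns the averaged lesion probability map."""
    transforms = [
        (lambda t: t, lambda t: t),                                        # original
        (lambda t: torch.flip(t, [-1]), lambda t: torch.flip(t, [-1])),    # horizontal flip
        (lambda t: torch.flip(t, [-2]), lambda t: torch.flip(t, [-2])),    # vertical flip
        (lambda t: torch.rot90(t, 1, [-2, -1]),
         lambda t: torch.rot90(t, -1, [-2, -1])),                          # 90 degrees
        (lambda t: torch.rot90(t, 2, [-2, -1]),
         lambda t: torch.rot90(t, -2, [-2, -1])),                          # 180 degrees
        (lambda t: torch.rot90(t, 3, [-2, -1]),
         lambda t: torch.rot90(t, -3, [-2, -1])),                          # 270 degrees
    ]
    probs = []
    model.eval()
    with torch.no_grad():
        for fwd, inv in transforms:
            logits, _ = model(fwd(image))                # backbone output only
            prob = torch.softmax(logits, dim=1)[:, 1:2]  # channel 1 = lesion
            probs.append(inv(prob))                      # undo the transform
    return torch.stack(probs).mean(dim=0)

# Step six then thresholds the averaged map: mask = predict_with_tta(model, x) > 0.5
```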
Step six: segmentation result obtaining
After the lesion probability map is obtained, the threshold is set to 0.5: pixels with probability above 0.5 are taken as lesion pixels and the rest as background skin, finally yielding the segmentation result of the image to be segmented.
The dermatoscope image segmentation method based on a multi-branch convolutional neural network of the invention has the following advantages and effects:
(1) Tailored to the characteristics of dermoscopy image data, the invention expands the training data set effectively with corresponding image transformations, ensuring effective network training and strong generalization.
(2) The convolutional neural network designed by the invention contains multiple branches and fuses rich semantic and detail information, so that compared with an ordinary network it recovers lesion edges better and yields more accurate lesion segmentation results.
(3) The invention is a fully automatic segmentation scheme: only the dermoscopy image to be segmented needs to be input, and the scheme automatically produces its segmentation result without additional processing, efficiently and simply.
Drawings
Fig. 1 is a schematic diagram of a convolution block structure in a network designed by the present invention.
Fig. 2 is a schematic diagram of a network structure designed by the present invention.
Fig. 3 is a flow chart of the implementation of the present invention.
Detailed Description
For a better understanding of the technical aspects of the present invention, reference will now be made in detail to the embodiments of the present invention as illustrated in the accompanying drawings.
The invention is implemented under the PyTorch deep-learning framework; the network structure diagram and the flow chart of the invention are shown in Fig. 2 and Fig. 3, respectively. The computer configuration is: Intel Core i5-6600K processor, 16 GB memory, NVIDIA GeForce GTX 1080 graphics card, and the Ubuntu 16.04 operating system. The dermatoscope image segmentation method based on a multi-branch convolutional neural network specifically comprises the following steps:
Step 1: dermoscopy image training sample collection and processing
A dermoscopy image data set comprising 2750 original dermoscopy images (2000 for training and 750 for validation and testing), together with lesion ground-truth maps manually annotated by professional dermatologists, was downloaded from the International Skin Imaging Collaboration (ISIC) Challenge official website.
Step 2: training image processing
The original dermoscopy images and segmentation ground-truth maps are first uniformly resized to 512 × 512, and the images are then transformed. Considering the different shooting angles present at acquisition time, each original training image is flipped horizontally and vertically and rotated by 90, 180, and 270 degrees. In addition, since some dermoscopy images were observed to have black borders on the top, bottom, left, and right, each original image is also translated by 25 pixels in each of the four directions. The lesion ground-truth maps undergo the same transformations. In total, the 2000 original training images are expanded to 20000 training images.
Step 3: multi-branch convolutional network structure design
The structure of the multi-branch convolutional network designed by the invention is shown in Fig. 2, where the BN and ReLU layers are omitted for simplicity. Following this structure, the network is written as a Module under the PyTorch deep-learning framework and consists mainly of the following three parts:
1. Encoder structure: the overall framework of the encoder is: input image → convolutional layer → pooling layer → 1st convolution block → 1st pooling block → 2nd convolution block → 2nd pooling block → 3rd convolution block. The details are as follows:
① The input image first passes through 1 convolutional layer (kernel size 7 × 7, stride 2, output dimension 24) and then 1 max-pooling layer (kernel size 3 × 3, stride 2) before entering the 1st convolution block. Denoting the input image by Input, the 7 × 7 convolutional layer by Conv7×7, the max-pooling layer by MaxPool, the batch-normalization layer by BN, and the rectified-linear-unit layer by ReLU, this stem can be expressed as: Input → Conv7×7 → BN → ReLU → MaxPool. Since both Conv7×7 and MaxPool have stride 2, the resolution of the output feature map is 1/4 that of the input image; it is then fed to the 1st convolution block.
② The 1st, 2nd, and 3rd convolution blocks each consist of 6 convolutional layers. Denoting by Conv3×3 a convolutional layer with kernel size 3 × 3, stride 1, and output dimension 12, each layer in a convolution block has the structure BN → ReLU → Conv3×3. The specific structure of the convolution block designed by the invention is shown in Fig. 1.
③ The 1st and 2nd pooling blocks perform dimensionality reduction. Within a convolution block all feature maps have the same size; to shrink the feature maps so that higher-level semantic features can be extracted, and to reduce their dimension so that the number of parameters stays small, a pooling block is placed after each of the 1st and 2nd convolution blocks. Denoting the input feature map by X (with N channels), a convolutional layer with kernel size 3 × 3, stride 1, and output dimension N/2 by Conv3×3, and a max-pooling layer with kernel size 2 × 2 and stride 2 by MaxPool, the pooling block has the structure: X → BN → ReLU → Conv3×3 → MaxPool. After the pooling block, the channel count, width, and height of the feature maps output by the 1st and 2nd convolution blocks are all halved.
2. Multi-branch structure: in the convolutional neural network designed by the invention, 4 branches are constructed at different layers to extract features, and the features of the different branches are finally fused by concatenation, as shown by the 1st through 4th branches in Fig. 2. Each branch extracts information through 1 convolutional layer (kernel size 1 × 1, stride 1, output dimension 2); the lower branches mainly extract detail information and the higher branches mainly extract semantic information. Because of downsampling in the network, the branch output feature maps differ in size: the 2nd branch's output feature map has 1/2 the width and height of the 1st branch's, and the outputs of the 3rd and 4th branches have 1/4. Bilinear interpolation is therefore used to upsample the outputs of the 2nd, 3rd, and 4th branches to the size of the 1st branch's feature map before concatenation. The fused feature map is then input to the decoder on the backbone.
3. Decoder structure: since the fused feature map in the encoder has 1/4 the spatial resolution of the original input image, a decoder is designed to restore the spatial size of the feature map. It consists mainly of 2 convolutional layers and 1 four-fold upsampling layer. Denoting the input fused feature map by X, the 1st convolutional layer (kernel size 1 × 1, stride 1, output dimension 4) by Conv1×1_1, the 2nd convolutional layer (kernel size 1 × 1, stride 1, output dimension 2) by Conv1×1_2, and the 4× bilinear-interpolation upsampling layer by Upsample, the decoder has the structure: X → BN → ReLU → Conv1×1_1 → ReLU → Conv1×1_2 → Upsample.
Step 4: multi-branch convolutional network training
The feature map output by the decoder is passed through a Softmax function to obtain a lesion probability map, which is then compared with the ground-truth segmentation through a cross-entropy function to compute the loss. The loss is back-propagated through the network to obtain the gradients of the parameters, which are adjusted by gradient descent so that the loss decreases and the network improves. The cross-entropy loss is computed as follows:

$$\mathrm{Loss} = -\frac{1}{W \times H}\sum_{i=1}^{W}\sum_{j=1}^{H}\Bigl[y_{ij}\log p_{ij} + (1-y_{ij})\log(1-p_{ij})\Bigr]$$

where W and H are the width and height of the segmentation ground truth, $y_{ij}$ denotes the true class of pixel $(i, j)$ (1 for lesion, 0 for background), and $p_{ij}$ denotes the probability that pixel $(i, j)$ is a lesion.

To ensure that the information extracted by the 4 branches is valid, the same method as above is applied to the output feature map of each branch to compute the branch losses $\mathrm{Loss}_1$, $\mathrm{Loss}_2$, $\mathrm{Loss}_3$, and $\mathrm{Loss}_4$. Because the branch feature maps do not directly produce the final segmentation result, each branch loss is multiplied by a coefficient of 0.5 before being added to the segmentation loss on the backbone to obtain the total loss, so as to avoid excessive influence on segmentation accuracy; the network parameters are actually obtained by back-propagation training on this total loss. Specifically:

$$\mathrm{Loss}_{\mathrm{all}} = \mathrm{Loss} + 0.5\times(\mathrm{Loss}_1 + \mathrm{Loss}_2 + \mathrm{Loss}_3 + \mathrm{Loss}_4)$$
in the training process, the method adopts an Adam random gradient descent method to train the network, the initial learning rate is set to be 0.0005, the number of each batch is set to be 8, and the maximum training round number is set to be 200. And when the total loss on the verification set is no longer in a continuous descending trend, stopping the network training in advance to avoid overfitting.
Step 5: multi-branch convolutional network image testing
The network designed by the invention takes one dermoscopy image as input and outputs a lesion probability map. To improve robustness, before being fed into the network the image to be segmented is flipped horizontally and vertically and rotated by 90, 180, and 270 degrees; together with the original image, 6 images in total are passed through the segmentation network, yielding 6 lesion probability maps. Each probability map is then inverse-transformed according to the transformation of its input image, and finally the 6 probability maps are averaged to give the final lesion probability map.
After the lesion probability map is obtained, the threshold is set to 0.5: pixels with probability above 0.5 are taken as lesion pixels and the rest as background skin, yielding the segmentation image of the test image.
Within the overall segmentation framework, the transformation of the test images, the corresponding inverse transformation of the probability maps, and their fusion and thresholding are all handled automatically in code. In actual testing, therefore, a dermoscopy image is input and the model directly outputs the final segmentation result image.

Claims (1)

1. A dermatoscope image segmentation method based on a multi-branch convolutional neural network, characterized by comprising the following steps:
step one: training sample collection
The dermoscopy images are derived from an internationally published data set comprising 2750 original dermoscopy images, 2000 of which are used for training and 750 for validation and testing; the lesion ground-truth maps are manually annotated by professional dermatologists, each ground truth being a binary image in which 1 denotes the lesion region and 0 denotes the healthy skin region; for convenience of processing, the original images and lesion ground-truth maps are uniformly resized to 512 × 512;
step two: image augmentation
flipping each original training image horizontally and vertically and rotating it by 90, 180, and 270 degrees; translating each original training image by 25 pixels in each of four directions; transforming the lesion ground-truth map together with its original image; and finally expanding the 2000 original training images into 20000 training images;
step three: multi-branch convolutional neural network model design
constructing a convolutional neural network with an encoder-decoder architecture, in which the encoder extracts semantic and detail information and the decoder restores the feature-map size to obtain the final segmentation result; in addition, constructing 4 branches at different layers of the encoder to extract features, and finally fusing the features of the different branches to obtain an accurate dermoscopy image segmentation result;
step four: multi-branch convolutional network training
the feature map output by the decoder is passed through a Softmax function to obtain a lesion probability map, which is then compared with the ground-truth segmentation through a cross-entropy function to compute the loss; the loss is back-propagated through the network to obtain the gradients of the parameters, which are adjusted by gradient descent so that the loss decreases and the network improves; the cross-entropy loss is computed as follows:

$$\mathrm{Loss} = -\frac{1}{W \times H}\sum_{i=1}^{W}\sum_{j=1}^{H}\Bigl[y_{ij}\log p_{ij} + (1-y_{ij})\log(1-p_{ij})\Bigr]$$

where W and H are the width and height of the segmentation ground truth, $y_{ij}$ denotes the true class of pixel $(i, j)$ (1 for lesion, 0 for background), and $p_{ij}$ denotes the probability that pixel $(i, j)$ is a lesion;

in order to ensure that the information extracted by the 4 branches is valid, the same method as above is applied to the output feature map of each branch to compute the respective branch losses $\mathrm{Loss}_1$, $\mathrm{Loss}_2$, $\mathrm{Loss}_3$, and $\mathrm{Loss}_4$; because the branch feature maps do not directly produce the final segmentation result, each branch loss is multiplied by a coefficient of 0.5 before being added to the segmentation loss on the backbone to obtain the total loss, so as to avoid excessive influence on segmentation accuracy; the parameters in the network are finally obtained by back-propagation training on this total loss; specifically:

$$\mathrm{Loss}_{\mathrm{all}} = \mathrm{Loss} + 0.5\times(\mathrm{Loss}_1 + \mathrm{Loss}_2 + \mathrm{Loss}_3 + \mathrm{Loss}_4)$$
step five: lesion distribution probability map generation
the network designed by the invention takes one dermoscopy image as input and outputs a lesion probability map; to improve robustness, before being fed into the network, the image to be segmented is flipped horizontally and vertically and rotated by 90, 180, and 270 degrees; together with the original image, 6 images in total are passed through the segmentation network, yielding 6 lesion probability maps; each probability map is then inverse-transformed according to the transformation of its input image, and finally the 6 probability maps are averaged to give the final lesion probability map;
step six: segmentation result obtaining
after the lesion probability map is obtained, the threshold is set to 0.5: pixels with probability above 0.5 are taken as lesion pixels and the rest as background skin, finally yielding the segmentation result of the image to be segmented;
the general framework of the encoder is: input image → convolutional layer → pooling layer → 1st convolution block → 1st pooling block → 2nd convolution block → 2nd pooling block → 3rd convolution block, the details being as follows:
the image input to the encoder first passes through 1 convolutional layer with kernel size 7 × 7, stride 2, and output dimension 24, then 1 max-pooling layer with kernel size 3 × 3 and stride 2, and finally enters the 1st convolution block; denoting the input image by Input, the 7 × 7 convolutional layer by Conv7×7, the max-pooling layer by MaxPool, the batch-normalization layer by BN, and the rectified-linear-unit layer by ReLU, this can be expressed as: Input → Conv7×7 → BN → ReLU → MaxPool; since both Conv7×7 and MaxPool have stride 2, the resolution of the output feature map is 1/4 that of the input image, and it is then fed to the 1st convolution block;
the 1st, 2nd, and 3rd convolution blocks each consist of 6 convolutional layers; denoting by Conv3×3 a convolutional layer with kernel size 3 × 3, stride 1, and output dimension 12, each layer in a convolution block has the structure BN → ReLU → Conv3×3;
the 1st and 2nd pooling blocks perform dimensionality reduction; a pooling block is placed after each of the 1st and 2nd convolution blocks; denoting the input feature map by X (with N channels), a convolutional layer with kernel size 3 × 3, stride 1, and output dimension N/2 by Conv3×3, and a max-pooling layer with kernel size 2 × 2 and stride 2 by MaxPool, the pooling block has the structure: X → BN → ReLU → Conv3×3 → MaxPool; after the pooling block, the channel count, width, and height of the feature maps output by the 1st and 2nd convolution blocks are all 1/2 of their original values;
the multi-branch structure is specifically designed as follows: in the convolutional neural network, 4 branches are constructed at different layers to extract features, and the features of the different branches are finally fused by concatenation; each branch extracts information through 1 convolutional layer with kernel size 1 × 1, stride 1, and output dimension 2, the lower branches extracting detail information and the higher branches extracting semantic information; in addition, because of downsampling in the network the branch output feature maps differ in size: the 2nd branch's output feature map has 1/2 the width and height of the 1st branch's, and the outputs of the 3rd and 4th branches have 1/4; bilinear interpolation is therefore used to upsample the outputs of the 2nd, 3rd, and 4th branches to the size of the 1st branch's feature map before concatenation; the fused feature map is input to the decoder on the backbone;
the decoder comprises 2 convolutional layers and 1 four-fold upsampling layer, the input fused feature map being denoted by X; Conv1×1_1 denotes the 1st convolutional layer, with kernel size 1 × 1, stride 1, and output dimension 4; Conv1×1_2 denotes the 2nd convolutional layer, with kernel size 1 × 1, stride 1, and output dimension 2; Upsample denotes a 4× bilinear-interpolation upsampling layer; the decoder has the structure: X → BN → ReLU → Conv1×1_1 → ReLU → Conv1×1_2 → Upsample.
Priority application

Application Number  Priority Date  Filing Date  Title
CN201910062500.0A (en)  2019-01-23  2019-01-23  Dermatoscope image segmentation method based on multi-branch convolutional neural network (filed by Beihang University; granted as CN109886986B, Active)

Publications (2)

Publication Number  Publication Date
CN109886986A (en)   2019-06-14
CN109886986B (en)   2020-09-08


Legal Events

Code  Description
PB01  Publication
SE01  Entry into force of request for substantive examination
GR01  Patent grant