CN113705630B - Skin lesion image classification method - Google Patents
- Publication number: CN113705630B
- Application number: CN202110911205.5A
- Authority: CN (China)
- Prior art keywords: lesion, image block, skin lesion, data set, skin
- Prior art date: 2021-08-10
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/23 — Clustering techniques
- G06F18/253 — Fusion techniques of extracted features
- G06N3/045 — Neural networks: combinations of networks
- G06N3/048 — Neural networks: activation functions
- G06N3/08 — Neural networks: learning methods
- G06T7/0012 — Biomedical image inspection
- G06T7/62 — Analysis of geometric attributes of area, perimeter, diameter or volume
- G06T7/73 — Determining position or orientation of objects or cameras using feature-based methods
- G06T2207/20132 — Image cropping
- G06T2207/30088 — Skin; Dermal
- G06T2207/30096 — Tumor; Lesion
Abstract
The invention discloses a skin lesion image classification method comprising the following steps: performing center cropping on the skin lesion image to be classified to obtain a first image block; extracting features from the first image block with the upper branch network of a neural network model to obtain a first feature vector; deriving lesion region coordinates from the first feature vector; cropping the first image block according to the lesion region coordinates to obtain a second image block; extracting features from the second image block with the lower branch network of the neural network model to obtain a second feature vector; and fusing the first and second feature vectors to obtain the skin lesion category prediction probability, from which the skin lesion category of the image to be classified is obtained. The method judges the skin lesion category of a skin lesion image rapidly, objectively, and accurately.
Description
Technical Field
The invention relates to a skin lesion image classification method, and belongs to the technical field of image processing.
Background
Skin cancer is among the cancers that most threaten human life, and cutaneous melanoma is one of its major categories. Because cutaneous melanoma looks like an ordinary mole at an early stage, it is difficult for a layperson to recognize with the naked eye. Skin lesions such as cutaneous melanoma can be identified by processing skin images: at present, most hospitals use a dermatoscope to magnify the local skin and eliminate surrounding interference before acquiring skin images, from which a professional dermatologist judges the skin lesion category.
Deep neural networks bring hope for judging skin lesion categories accurately and quickly. However, the lesion regions in the skin images of existing skin lesion datasets are relatively small, vary in size, and show subtle appearance differences; moreover, real-world datasets are often class-imbalanced. These characteristics seriously degrade the efficiency and accuracy of neural-network-based skin lesion image classification systems.
Disclosure of Invention
To improve the efficiency and accuracy of neural-network-based skin lesion image classification, the invention provides a skin lesion image classification method in which a neural network model based on a multi-scale double-layer attention mechanism predicts skin lesion probabilities, from which an accurate and reliable skin lesion category is finally obtained, improving both the efficiency and the accuracy of skin lesion image classification.
In order to solve the technical problems, the invention adopts the following technical means:
the invention provides a skin lesion image classification method, which comprises the following steps:
performing center cropping on the skin lesion image to be classified to obtain a first image block;
performing feature extraction on the first image block by adopting an upper branch network of a trained neural network model to obtain a first feature vector, wherein the neural network model adopts a neural network model based on a multi-scale double-layer attention mechanism;
positioning a lesion region according to the first feature vector by utilizing a lesion positioning structure in the upper branch network;
cropping the first image block according to the positioned lesion region to obtain a second image block;
extracting features of the second image block by using a lower branch network of the trained neural network model to obtain a second feature vector;
feature fusion is carried out on the first feature vector and the second feature vector by utilizing a feature fusion structure in the lower branch network, so that a fusion vector is obtained;
processing the fusion vector by using a softmax activation function in an output layer of the lower branch network to obtain the skin lesion category prediction probability;
classifying the skin lesion images to be classified according to the skin lesion class prediction probability.
Further, the upper branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a lesion positioning structure, and a cropping-and-scaling structure; the lower branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a feature fusion structure, and an output layer. The feature extraction structure comprises a convolution layer and a plurality of attention residual unit learning structures ARL, and the lesion positioning structure comprises a hidden layer and an output layer.
Further, the method for obtaining the first feature vector comprises the following steps:
inputting the first image block into the convolution layer of the feature extraction structure of the upper branch network and obtaining an intermediate vector $X_1$ through a ReLU nonlinear activation function;

using the several attention residual unit learning structures ARL to convolve, normalize, and downsample the intermediate vector $X_1$, obtaining an output vector $y$;

processing the output vector $y$ with a global average pooling layer to obtain the first feature vector $F_1$ corresponding to the first image block.
Further, the method for positioning the lesion area according to the first feature vector by using the lesion positioning structure in the upper branch network comprises the following steps:
inputting the first feature vector $F_1$ into the hidden layer of the lesion positioning structure and obtaining the hidden-layer state $g$ with a ReLU nonlinear activation function:

$$g = \mathrm{ReLU}(U_3 F_1 + b_3) \tag{1}$$

where $U_3$ is the parameter matrix of the hidden layer and $b_3$ is the bias term of the hidden layer;

according to the hidden-layer state $g$, obtaining the lesion region coordinates of the first image block with a sigmoid nonlinear activation function in the output layer of the lesion positioning structure:

$$[t_x, t_y, t_l] = n \cdot \mathrm{sigmoid}(U_4 g + b_4) \tag{2}$$

where $t_x$ is the abscissa of the center of the lesion region, $t_y$ is the ordinate of the center, $t_l$ is the radius of the lesion region, $n$ is the side length of the first image block, $U_4$ is the parameter matrix of the output layer, and $b_4$ is the bias term of the output layer.
Further, the method for acquiring the second image block comprises the following steps:
obtaining the vertex coordinates of the cropping region in the first image block from the lesion region coordinates: the upper-left corner is $(t_{x(tl)}, t_{y(tl)})$, the lower-left corner is $(t_{x(tl)}, t_{y(br)})$, the upper-right corner is $(t_{x(br)}, t_{y(tl)})$, and the lower-right corner is $(t_{x(br)}, t_{y(br)})$, where $t_{x(tl)} = t_x - t_l$, $t_{y(tl)} = t_y - t_l$, $t_{x(br)} = t_x + t_l$, $t_{y(br)} = t_y + t_l$;

cropping the first image block according to the vertex coordinates of the cropping region to obtain the cropped image corresponding to the first image block;

scaling the cropped image according to the side length of the first image block to obtain the second image block:

$$X^{amp}_{(i,j)} = \sum_{\alpha=0}^{1}\sum_{\beta=0}^{1} \left|1-\alpha-\{i/\lambda\}\right| \cdot \left|1-\beta-\{j/\lambda\}\right| \cdot X^{att}_{(h,w)} \tag{3}$$

where $X^{amp}_{(i,j)}$ is the pixel value in row $i$, column $j$ of the second image block, $X^{att}_{(h,w)}$ is the pixel value in row $h$, column $w$ of the cropped image, $\alpha = h - [i/\lambda]$, $\beta = w - [j/\lambda]$, $\lambda = n/(2t_l)$ is the scaling factor, $[\cdot]$ is the rounding function, $\{\cdot\}$ is the fractional-part function, $h \in [t_{x(tl)}, t_{x(br)}]$, $w \in [t_{y(tl)}, t_{y(br)}]$, and $i, j \in \{1, 2, \dots, n\}$.
Further, the training method of the neural network model comprises the following steps:
obtaining a skin lesion dataset comprising a plurality of sample images under a plurality of skin lesion categories;
performing center cropping on each sample image in the skin lesion dataset to obtain first sample image blocks, and forming the preprocessed skin lesion dataset from all the first sample image blocks;
dividing the preprocessed skin lesion data set into a plurality of category data sets according to skin lesion categories, and dividing each category data set into a plurality of sub-category data sets according to image correlation;
extracting features of each first sample image block in each subclass data set by using a feature extraction structure of the upper branch network to obtain a first feature vector, and obtaining a first lesion class prediction probability by using an auxiliary output layer;
positioning a lesion region of each first sample image block according to the first feature vector by utilizing a lesion positioning structure of the upper branch network;
cropping each first sample image block according to the positioned lesion region to obtain a second sample image block;
extracting the characteristics of each second sample image block by utilizing a characteristic extraction structure of the lower branch network to obtain a second characteristic vector, and obtaining a second lesion category prediction probability by utilizing an auxiliary output layer;
carrying out feature fusion on the first feature vector and the second feature vector by utilizing a feature fusion structure to obtain a fusion vector;
processing the fusion vector by using a softmax activation function in the output layer to obtain the skin lesion category prediction probability;
based on the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability, training the parameters of the neural network model with the ranking loss function and the weighted loss function, and obtaining the trained neural network model through iterative convergence.
Further, assuming the preprocessed skin lesion dataset contains $d$ category datasets, with $Z = 1, 2, \dots, d$, the method for decomposing the $Z$-th category dataset into several subclass datasets according to image correlation is as follows:

(1) applying grayscale processing to each first sample image block in the $Z$-th category dataset to obtain the grayscale image corresponding to that first sample image block;

(2) randomly selecting one first sample image block from the $Z$-th category dataset as the initial cluster center $c_{Z1}$;

(3) computing, from the grayscale images, the distance from each first sample image block in the $Z$-th category dataset to the initial cluster center $c_{Z1}$, and from these distances the probability that each first sample image block is selected as the next cluster center:

$$P(z_k) = \frac{D(z_k)^2}{\sum_{k=1}^{K} D(z_k)^2} \tag{4}$$

where $z_k$ is the $k$-th first sample image block in the $Z$-th category dataset, $P(z_k)$ is the probability that $z_k$ is selected as the next cluster center, $D(z_k)$ is the distance from $z_k$ to the initial cluster center $c_{Z1}$, and $k = 1, 2, \dots, K$, with $K$ the number of first sample image blocks in the $Z$-th category dataset;

(4) generating a random number uniformly in the interval $[0,1]$; when the random number falls in $\left(\sum_{k=1}^{r} P(z_k), \sum_{k=1}^{r+1} P(z_k)\right]$, selecting the $(r+1)$-th first sample image block in the $Z$-th category dataset as the second cluster center $c_{Z2}$, where $r = 1, 2, \dots, K-1$;

(5) repeating step (4) until $N$ cluster centers have been selected from the $Z$-th category dataset: $c_{Z1}, c_{Z2}, \dots, c_{ZN}$;

(6) computing the Hamming distance from each first sample image block in the $Z$-th category dataset to the $N$ cluster centers, and assigning each first sample image block to its nearest cluster center, yielding $N$ clusters;

(7) recalculating the centers of the $N$ clusters:

$$c_{Zv}^{p+1} = \frac{1}{\left|S_{Zv}^{p}\right|} \sum_{z \in S_{Zv}^{p}} z \tag{5}$$

where $c_{Zv}^{p+1}$ is the center of the $v$-th cluster at iteration $p+1$, $S_{Zv}^{p}$ is the $v$-th cluster at iteration $p$, $\left|S_{Zv}^{p}\right|$ is its number of samples, and $v = 1, 2, \dots, N$;

(8) repeating steps (6) and (7) until, in every cluster, the centers of two successive iterations satisfy $c_{Zv}^{p+1} = c_{Zv}^{p}$ (to within a preset tolerance), giving the $N$ final clusters, each cluster representing one subclass dataset.
Further, the method for training the parameters of the neural network model with the ranking loss function and the weighted loss function is as follows:

extracting, according to the true skin lesion category of each sample image in the skin lesion dataset, a first probability $p_1$, a second probability $p_2$, and a third probability $p_3$ from the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability, respectively;

fixing the network parameters of the lesion positioning structure and, according to the first probability $p_1$, second probability $p_2$, and third probability $p_3$, optimizing the other network parameters of the neural network model with a weighted loss function $LF$ (formula (6)), which weights each sample's loss according to the size of its category dataset, where $H$ is the number of sample images in the skin lesion dataset, $\rho_Z$ is the number of first sample image blocks in the $Z$-th category dataset, $\gamma$ is a manually set hyperparameter, and $Z = 1, 2, \dots, d$, with $d$ the number of category datasets in the skin lesion dataset;

fixing the other network parameters of the neural network model and optimizing the network parameters of the lesion positioning structure with the ranking loss function $L_{rank}(p_1, p_2)$:

$$L_{rank}(p_1, p_2) = \max(0,\; p_1 - p_2 + margin) \tag{7}$$

where $margin$ is a small preset constant.
Further, the first lesion category prediction probability is calculated as:

$$P_1 = \mathrm{softmax}(U_2 F_1 + b_2) \tag{8}$$

where $U_2$ is the parameter matrix of the auxiliary output layer, $F_1$ is the first feature vector, and $b_2$ is the bias term of the auxiliary output layer.
Adopting the above technical means yields the following advantages:
the invention provides a skin lesion image classification method, which utilizes a neural network model based on a multi-scale double-layer attention mechanism to process a skin lesion image to be detected and classify, predicts the probability of the skin lesion class of the skin lesion image to be detected, and thus obtains the skin lesion class of the skin lesion image to be classified. Before the classification is started, the method of the invention performs central cutting on the skin lesion image to be classified, enlarges the skin lesion area in the image and unifies the image size, thereby being beneficial to the subsequent image processing and feature recognition. In the process of image processing by utilizing the neural network module, the method utilizes the feature extraction structure formed by the attention residual error learning block and the lesion positioning structure based on the attention mechanism to ensure that the neural network is highly concentrated in the skin lesion area during feature extraction, thereby greatly reducing the influence on the prediction of the neural network caused by the too small skin lesion area in the skin lesion image to be classified and improving the accuracy of classifying the skin lesion image.
During model training, the method not only unifies the sizes of the sample images in the skin lesion dataset but also performs class decomposition on the categories of the dataset, dividing it into several subclass datasets to form a new data distribution; this extracts the hidden fine-grained information within each category of images and improves training. In addition, the weighted loss function resolves the data imbalance that class decomposition may introduce, greatly improving the sensitivity and specificity of the skin lesion image classification method.
The method does not depend on manual operation, improves the efficiency of classifying the skin lesion images, and can rapidly, objectively and accurately judge the skin lesion types of the skin lesion images.
Drawings
FIG. 1 is a flow chart showing the steps of a method for classifying skin lesions according to the present invention;
FIG. 2 is a network architecture diagram of a neural network model in an embodiment of the invention;
FIG. 3 is a schematic diagram of an attention residual unit learning structure ARL according to an embodiment of the present invention;
FIG. 4 is a training flow chart of a neural network model according to an embodiment of the present invention;
FIG. 5 is a flow chart of the decomposition of a skin lesion dataset in an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings:
the invention provides a skin lesion image classification method, which is shown in fig. 1, and specifically comprises the following steps:
and step A, performing center clipping on the skin lesion image to be classified to obtain a first image block, wherein the size of the first image block is n x 3, n is the side length of the first image block, and 3 represents three channels of RGB images.
And B, performing feature extraction on the first image block by adopting an upper branch network of the trained neural network model to obtain a first feature vector, wherein the neural network model adopts a neural network model based on a multi-scale double-layer attention mechanism.
And C, positioning a lesion region according to the first feature vector by utilizing a lesion positioning structure in the upper branch network.
And D, cutting the first image block according to the positioned lesion area to obtain a second image block.
And E, extracting the characteristics of the second image block by using a lower branch network of the trained neural network model to obtain a second characteristic vector.
And F, carrying out feature fusion on the first feature vector and the second feature vector by utilizing a feature fusion structure in the lower branch network to obtain a fusion vector.
And G, processing the fusion vector by using a softmax activation function in the output layer of the lower branch network to obtain the skin lesion type prediction probability.
And step H, classifying the skin lesion images to be classified according to the skin lesion class prediction probability.
In the embodiment of the present invention, as shown in fig. 2, the network structure of the neural network model can be divided into an upper branch network and a lower branch network. The upper branch network mainly comprises a feature extraction structure, an auxiliary output layer, a lesion positioning structure (LLN, Lesion Location Network), and a cropping-and-scaling structure; the lower branch network mainly comprises a feature extraction structure, an auxiliary output layer, a feature fusion structure, and an output layer. The feature extraction structure comprises a convolution layer, several attention residual unit learning structures ARL (Attention Residual Learning), and a global average pooling layer GAP (Global Average Pooling). The auxiliary output layer is a fully connected layer. The lesion positioning structure comprises a hidden layer and an output layer, each of which is a single fully connected layer.
In the embodiment of the present invention, the specific operation of step B is as follows:
Step B01: input the first image block of dimension $n \times n \times 3$ into the convolution layer of the feature extraction structure of the upper branch network, obtaining an intermediate vector $X_1$ through a ReLU nonlinear activation function; $X_1$ has dimension $n_1 \times n_1 \times D_1$, where $n_1$ is the side length of $X_1$ and $D_1$ is the number of convolution kernels of the convolution layer of the feature extraction structure.
Step B02: the several attention residual unit learning structures ARL in the feature extraction structure are connected in sequence; after the intermediate vector $X_1$ is input into the first ARL, the ARLs convolve, normalize, and downsample $X_1$ to produce the output vector $y$.
As shown in fig. 3, each attention residual unit learning structure ARL comprises a 1×1 convolution layer, a batch normalization layer, and a nonlinear activation layer; a 3×3 convolution layer, a batch normalization layer, and a nonlinear activation layer; a 1×1 convolution layer, a batch normalization layer, and a nonlinear activation layer; and a downsampling layer (a 1×1 convolution layer).
Taking the first ARL as an example: the intermediate vector $X_1$ enters the first ARL and, after passing through its convolution layers, the third (1×1) convolution outputs a vector $Q$ of dimension $n' \times n' \times D'$, where $n'$ is the side length of $Q$ and $D'$ is the number of kernels of the third 1×1 convolution. The first ARL then normalizes $Q$ to obtain the attention matrix $M[Q]$. Meanwhile, $X_1$ is downsampled by the downsampling layer (the 1×1 convolution layer) to the same dimension as $Q$, and the downsampled $X_1$ is multiplied pixel-wise with $M[Q]$. Finally, $X_1$, $Q$, and $M[Q] \cdot X_1$ are added pixel-wise to give the output vector of the first ARL: $y_1 = X_1 + Q + \mu \cdot M[Q] \cdot X_1$, where $\mu$ is a parameter learned automatically by the neural network; the dimension of $y_1$ is still $n' \times n' \times D'$.
After the intermediate vector passes through the last ARL, the output vector $y$ is obtained, with dimension $n_2 \times n_2 \times D$, where $n_2$ is the side length of $y$ and $D$ is the number of kernels of the third 1×1 convolution in the last ARL.
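A minimal PyTorch sketch of the ARL block described above is given below. The channel widths, the stride of the downsampling path, the spatial softmax used to produce the normalized attention matrix $M[Q]$, and the zero initialization of $\mu$ are all assumptions; the patent fixes only the layer sequence and the overall form $y_1 = X_1 + Q + \mu \cdot M[Q] \cdot X_1$.

```python
import torch
import torch.nn as nn

class ARL(nn.Module):
    """Attention residual unit learning structure (a sketch): y = x + Q + mu * M[Q] * x."""
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(  # 1x1 -> 3x3 -> 1x1, each followed by BN and activation
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
        self.down = nn.Conv2d(in_ch, out_ch, 1, stride=stride)  # downsampling layer (1x1 conv)
        self.mu = nn.Parameter(torch.zeros(1))  # mu, learned automatically by the network

    def forward(self, x):
        q = self.body(x)                                    # vector Q
        m = torch.softmax(q.flatten(2), dim=-1).view_as(q)  # M[Q]: assumed spatial softmax
        x_d = self.down(x)                                  # x brought to Q's dimension
        return x_d + q + self.mu * m * x_d                  # pixel-level sum
```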
Step B03: process the output vector $y$ with the global average pooling layer GAP to obtain the first feature vector $F_1$ of the first image block; $F_1$ has dimension $1 \times 1 \times D$.
In the embodiment of the present invention, the specific operation of step C is as follows:
Step C01: input the first feature vector $F_1$ into the hidden layer of the lesion positioning structure and obtain the hidden-layer state $g$ with a ReLU nonlinear activation function:

$$g = \mathrm{ReLU}(U_3 F_1 + b_3) \tag{9}$$

where $U_3$ is the parameter matrix of the hidden layer and $b_3$ is the bias term of the hidden layer.

Step C02: from the hidden-layer state $g$, obtain the lesion region coordinates of the first image block with a sigmoid nonlinear activation function in the output layer of the lesion positioning structure; the lesion region is circular, and its coordinates are:

$$[t_x, t_y, t_l] = n \cdot \mathrm{sigmoid}(U_4 g + b_4) \tag{10}$$

where $t_x$ is the abscissa of the center of the lesion region, $t_y$ is the ordinate of the center, $t_l$ is the radius of the lesion region, $n$ is the side length of the first image block, $U_4$ is the parameter matrix of the output layer, and $b_4$ is the bias term of the output layer.

In formula (10), the values produced by the sigmoid nonlinear activation function lie between 0 and 1; to obtain true coordinate values they must be scaled up, which is why $\mathrm{sigmoid}(U_4 g + b_4)$ is multiplied by $n$.
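Formulas (9) and (10) amount to two single fully connected layers; a minimal PyTorch sketch of the lesion positioning structure follows, in which the hidden dimension is an assumed hyperparameter.

```python
import torch
import torch.nn as nn

class LesionLocationNetwork(nn.Module):
    """Lesion positioning structure (LLN), a sketch: two fully connected layers."""
    def __init__(self, feat_dim, hidden_dim, n):
        super().__init__()
        self.n = n                                     # side length of the first image block
        self.hidden = nn.Linear(feat_dim, hidden_dim)  # parameters U3, b3
        self.out = nn.Linear(hidden_dim, 3)            # parameters U4, b4

    def forward(self, f1):
        g = torch.relu(self.hidden(f1))                # g = ReLU(U3 F1 + b3), formula (9)
        return self.n * torch.sigmoid(self.out(g))     # [tx, ty, tl], formula (10)
```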
In the embodiment of the present invention, the specific operation of step D is as follows:
Step D01: the cropping region is rectangular, and its four vertex coordinates in the first image block follow from the lesion region coordinates: the upper-left corner is $(t_{x(tl)}, t_{y(tl)})$, the lower-left corner is $(t_{x(tl)}, t_{y(br)})$, the upper-right corner is $(t_{x(br)}, t_{y(tl)})$, and the lower-right corner is $(t_{x(br)}, t_{y(br)})$, where $t_{x(tl)} = t_x - t_l$, $t_{y(tl)} = t_y - t_l$, $t_{x(br)} = t_x + t_l$, $t_{y(br)} = t_y + t_l$.

Step D02: crop the first image block according to the vertex coordinates of the cropping region to obtain the cropped image $X^{att}$ corresponding to the first image block.

Step D03: scale the cropped image according to the side length of the first image block to obtain the second image block $X^{amp}$, a three-dimensional vector of size $n \times n \times 3$; the scaling is expressed as:

$$X^{amp}_{(i,j)} = \sum_{\alpha=0}^{1}\sum_{\beta=0}^{1} \left|1-\alpha-\{i/\lambda\}\right| \cdot \left|1-\beta-\{j/\lambda\}\right| \cdot X^{att}_{(h,w)} \tag{11}$$

where $X^{amp}_{(i,j)}$ is the pixel value in row $i$, column $j$ of the second image block, $X^{att}_{(h,w)}$ is the pixel value in row $h$, column $w$ of the cropped image, $\alpha = h - [i/\lambda]$, $\beta = w - [j/\lambda]$, $\lambda = n/(2t_l)$ is the scaling factor, $[\cdot]$ is the rounding function, $\{\cdot\}$ is the fractional-part function, $h \in [t_{x(tl)}, t_{x(br)}]$, $w \in [t_{y(tl)}, t_{y(br)}]$, and $i, j \in \{1, 2, \dots, n\}$.
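In a practical implementation, the crop of step D02 and the bilinear zoom of formula (11) are often realized with a built-in resize. The sketch below is a forward-pass approximation under that assumption; it clamps the crop to integer bounds inside the image rather than interpolating over fractional coordinates, and it treats $t_x$/$t_y$ as the row/column of the lesion center (an assumption about the axis convention).

```python
import torch.nn.functional as F

def crop_and_zoom(x, tx, ty, tl, n):
    """Crop the square region [tx - tl, tx + tl] x [ty - tl, ty + tl] from a
    1 x 3 x n x n image tensor and rescale it back to n x n (bilinear)."""
    r0, r1 = int(max(tx - tl, 0)), int(min(tx + tl, n))  # clamp to image bounds
    c0, c1 = int(max(ty - tl, 0)), int(min(ty + tl, n))
    patch = x[:, :, r0:r1, c0:c1]                        # cropped image X_att
    return F.interpolate(patch, size=(n, n),             # second image block X_amp
                         mode="bilinear", align_corners=False)
```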
In step E of the present invention, the second image block $X^{amp}$ of dimension $n \times n \times 3$ is input into the feature extraction structure of the lower branch network, whose convolution layer, several attention residual unit learning structures ARL, and global average pooling layer GAP extract features from $X^{amp}$ to obtain the second feature vector $F_2$ of dimension $1 \times 1 \times D$. The specific operation is identical to step B.
In step F, the first feature vector $F_1$ and the second feature vector $F_2$ are spliced by concatenation to obtain the fusion vector $F = [F_1; F_2]$, a three-dimensional vector of size $1 \times 1 \times 2D$.
In step G, the fusion vector $F$ is input into the output layer of the lower branch network and processed by the softmax activation function, which outputs the skin lesion category prediction probability $P_3$ of dimension $1 \times s$, where $s$ is the number of subclasses in the neural network model; $P_3$ contains $s$ values, each the probability that the skin lesion image to be classified belongs to one subclass.
In step H, the skin lesion image to be classified is classified according to the skin lesion category prediction probability $P_3$, yielding its skin lesion category. Specifically, within $P_3$ the prediction probabilities of all subclasses belonging to the same category are added, giving the true probability of each category for the skin lesion image to be classified; the category corresponding to the maximum value is taken as the final skin lesion category.

Suppose the skin lesion image to be classified may belong to two categories, A and B, and the neural network model has decomposed category B into three subclasses B1, B2, and B3. After the image is input into the model, the model outputs a $1 \times 4$ vector $[0.4, 0.1, 0.4, 0.1]$, whose four entries are the prediction probabilities of A, B1, B2, and B3 respectively. Because category B was decomposed, its probability is the sum of the last three values, i.e. $0.1 + 0.4 + 0.1 = 0.6$, giving true probabilities $[0.4, 0.6]$ for the image. Since the prediction probability of category B (0.6) is greater than that of category A (0.4), the skin lesion image to be classified is assigned to category B.
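The aggregation in step H is a sum over subclass indices; a sketch reproducing the A/B example above follows, where the subclass-to-category map is an assumption about how the decomposed labels would be stored.

```python
import torch

def aggregate_subclass_probs(p3, category_of):
    """Sum the subclass prediction probabilities in p3 (1 x s) back into their
    original categories; category_of[i] is the category index of subclass i."""
    probs = torch.zeros(max(category_of) + 1)
    for s_idx, c_idx in enumerate(category_of):
        probs[c_idx] += p3[0, s_idx]
    return probs

p3 = torch.tensor([[0.4, 0.1, 0.4, 0.1]])                # predictions for A, B1, B2, B3
true_probs = aggregate_subclass_probs(p3, [0, 1, 1, 1])  # -> tensor([0.4, 0.6])
pred = int(torch.argmax(true_probs))                     # 1, i.e. category B
```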
In the method of the present invention, training of the neural network model is also required, as shown in fig. 4, and the training method includes the following steps:
step 1, obtaining a skin lesion data set with d types of samples, wherein the skin lesion data set comprises a plurality of sample images under each skin lesion type.
Because the sample images differ in size and their lesion regions are relatively small, the invention preprocesses the skin lesion dataset as follows:

Step 101: center-crop all sample images in the skin lesion dataset to a preset size to obtain the corresponding first sample image blocks, the preset size being $n \times n \times 3$, where $n$ is the side length of a first sample image block and 3 denotes the three channels of an RGB image.

Step 102: form the preprocessed skin lesion dataset from all the first sample image blocks.
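A minimal sketch of this preprocessing with torchvision; the side length n = 448 is an assumed value, since the patent leaves n as a hyperparameter.

```python
from PIL import Image
import torchvision.transforms as T

n = 448  # assumed side length of the first sample image blocks
preprocess = T.Compose([
    T.CenterCrop(n),  # step 101: center-crop each sample image to n x n
    T.ToTensor(),     # 3 x n x n RGB tensor with values in [0, 1]
])
first_sample_block = preprocess(Image.open("sample.jpg").convert("RGB"))
```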
To make it easier to extract the hidden fine-grained information within each image category, the method applies class decomposition to the preprocessed skin lesion dataset: it divides the dataset into several category datasets according to skin lesion category, then decomposes each category dataset into several subclass datasets according to image correlation.
As shown in fig. 5, the specific operation of decomposing the Z-th category dataset into a plurality of sub-category datasets is as follows:
Step 201: apply grayscale processing to each first sample image block in the $Z$-th category dataset to obtain the corresponding grayscale image. Taking each pixel as a unit, the grayscale image can be represented as a 2-dimensional data matrix $A_{n \times n}$:

$$A_{n\times n} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \tag{12}$$

where $n$ is the number of rows and columns of the matrix $A$, $a_{ij}$ is the pixel value in row $i$, column $j$ of $A_{n \times n}$, and $Z = 1, 2, \dots, d$.
Step 202: randomly select one first sample image block from the $Z$-th category dataset as the initial cluster center $c_{Z1}$.
Step 203: using the grayscale image corresponding to each first sample image block, compute the distance $D(z_k)^2$ from each first sample image block in the $Z$-th category dataset to the initial cluster center $c_{Z1}$:

$$D(z_k)^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} \left(a_{i,j} - b_{i,j}\right)^2 \tag{13}$$

where $z_k$ is the $k$-th first sample image block in the $Z$-th category dataset, $D(z_k)$ is the distance from $z_k$ to the initial cluster center $c_{Z1}$, $a_{i,j}$ is the pixel value in row $i$, column $j$ of $c_{Z1}$, $b_{i,j}$ is the pixel value in row $i$, column $j$ of $z_k$, and $k = 1, 2, \dots, K$, with $K$ the number of first sample image blocks in the $Z$-th category dataset.

From the distances of all first sample image blocks to $c_{Z1}$, compute the probability that each first sample image block is selected as the next cluster center:

$$P(z_k) = \frac{D(z_k)^2}{\sum_{k=1}^{K} D(z_k)^2} \tag{14}$$

where $P(z_k)$ is the probability that $z_k$ is selected as the next cluster center.
Step 204: generate a random number uniformly in the interval $[0,1]$; when the random number falls in $\left(\sum_{k=1}^{r} P(z_k), \sum_{k=1}^{r+1} P(z_k)\right]$, select the $(r+1)$-th first sample image block in the $Z$-th category dataset as the second cluster center $c_{Z2}$, where $r = 1, 2, \dots, K-1$.
Step 205: repeat step 204 until $N$ cluster centers have been selected from the $Z$-th category dataset: $c_{Z1}, c_{Z2}, \dots, c_{ZN}$; $N$ is the preset number of cluster centers and is chosen so that, after decomposition, each subclass contains approximately the same number of first sample image blocks.
Step 206: compute in turn the Hamming distance $\|z_k - c_{Zv}\|_1$, $v = 1, 2, \dots, N$, from each first sample image block in the $Z$-th category dataset to the $N$ cluster centers, and assign each first sample image block to its nearest cluster center, yielding $N$ clusters, recorded as $S_{Z1}^{p}, S_{Z2}^{p}, \dots, S_{ZN}^{p}$, where $p$ denotes the $p$-th clustering iteration.
Step 207: recalculate the centers of the $N$ clusters obtained in step 206:

$$c_{Zv}^{p+1} = \frac{1}{\left|S_{Zv}^{p}\right|} \sum_{z \in S_{Zv}^{p}} z \tag{15}$$

where $c_{Zv}^{p+1}$ is the center of the $v$-th cluster at iteration $p+1$, $S_{Zv}^{p}$ is the $v$-th cluster at iteration $p$, and $\left|S_{Zv}^{p}\right|$ is its number of samples.
Step 208: repeat steps 206 and 207, continually updating the cluster assignments and cluster centers, until the distance between the new and old center of every cluster meets the requirement, i.e. the centers of two successive iterations satisfy $c_{Zv}^{p+1} = c_{Zv}^{p}$ (to within a preset tolerance). At that point the partition no longer changes; iteration ends, and the $N$ final clusters are obtained, each cluster representing one subclass dataset.
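Steps 201-208 are essentially k-means++-style seeding followed by Lloyd iterations under an L1 (Hamming-like) distance. The NumPy sketch below draws each new seed from the distances to the nearest already-chosen center, the standard k-means++ rule; reading steps 203-205 as reusing only the distances to $c_{Z1}$ is equally possible, so this detail is an assumption.

```python
import numpy as np

def decompose_category(gray_blocks, num_subclasses, max_iter=100, eps=1e-6):
    """Cluster one category of grayscale image blocks (K x n x n) into
    num_subclasses subclasses; returns a subclass label per block."""
    K = gray_blocks.shape[0]
    flat = gray_blocks.reshape(K, -1).astype(np.float64)
    centers = [flat[np.random.randint(K)]]          # initial center c_Z1
    while len(centers) < num_subclasses:            # seeding with P(z_k) ~ D(z_k)^2
        d2 = np.min([np.sum((flat - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(flat[np.random.choice(K, p=d2 / d2.sum())])
    centers = np.stack(centers)
    for _ in range(max_iter):
        dist = np.abs(flat[:, None, :] - centers[None, :, :]).sum(axis=2)  # L1
        labels = dist.argmin(axis=1)                # assign to the nearest center
        new_centers = np.stack([flat[labels == v].mean(axis=0)
                                if np.any(labels == v) else centers[v]
                                for v in range(num_subclasses)])
        if np.linalg.norm(new_centers - centers) < eps:  # centers stabilized
            return labels
        centers = new_centers
    return labels
```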
Each category dataset in the skin lesion dataset is cluster-decomposed according to the above steps, and the corresponding first sample image blocks are extracted and stored in the same folder. The folder of each subclass dataset is named in the form ⟨class name⟩_N, indicating the N subclasses corresponding to the original category; accordingly, each class label of the new dataset formed from the subclass datasets is the name of its folder. After decomposition, the new dataset contains $s$ subclass datasets in total.
Step 3: extract features from each first sample image block in each subclass dataset with the feature extraction structure of the upper branch network and apply global average pooling to obtain the first feature vector $F_1$ (the specific operation is identical to step B); $F_1$ is a $1 \times 1 \times D$ three-dimensional vector, with $D$ the number of kernels of the last convolution in the feature extraction structure.

The auxiliary output layer of the upper branch network receives $F_1$ and outputs the first lesion category prediction probability $P_1$ through a softmax activation function:

$$P_1 = \mathrm{softmax}(U_2 F_1 + b_2) \tag{16}$$

where $U_2$ is the parameter matrix of the auxiliary output layer, $F_1$ is the first feature vector, and $b_2$ is the bias term of the auxiliary output layer; $P_1$ has dimension $1 \times s$.
Step 4: position the lesion region of each first sample image block from the first feature vector with the lesion positioning structure of the upper branch network (the specific operation is identical to step C).

Step 5: crop each first sample image block according to the positioned lesion region to obtain the second sample image block (the specific operation is identical to step D).
Step 6: extract features from each second sample image block with the feature extraction structure of the lower branch network and apply global average pooling to obtain the second feature vector $F_2$ (the specific operation is identical to step E).

The auxiliary output layer of the lower branch network receives the second feature vector $F_2$ and computes the second lesion category prediction probability $P_2$ with a softmax activation function; $P_2$ has dimension $1 \times s$.

Step 7: fuse the first feature vector and the second feature vector with the feature fusion structure to obtain the fusion vector; specifically, $F_1$ and $F_2$ are spliced by concatenation, giving $F = [F_1; F_2]$.
Step 8: process the fusion vector with the softmax activation function in the output layer to obtain the skin lesion category prediction probability $P_3$ corresponding to each sample image; $P_3$ has dimension $1 \times s$. The entries of $P_3$ are compared, and the category corresponding to the maximum value is selected as the classification result of the sample image.
Step 9: based on the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability, train the parameters of the neural network model with the ranking loss function and the weighted loss function, obtaining the trained neural network model through iterative convergence. The specific operation is as follows:
Step 901: according to the true skin lesion category of each sample image in the skin lesion dataset, obtain a first probability $p_1$, a second probability $p_2$, and a third probability $p_3$ from the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability.

$P_1$, $P_2$, and $P_3$ each have dimension $1 \times s$ and contain the predicted probabilities of the $s$ subclasses. Within each of $P_1$, $P_2$, and $P_3$, the predicted probabilities of subclasses belonging to the same category are added to obtain the predicted probabilities of the original categories. According to the correct label of the sample image (the true skin lesion category as judged by an expert), the predicted probability corresponding to the true category is then read from $P_1$, $P_2$, and $P_3$ and recorded as the first probability $p_1$, second probability $p_2$, and third probability $p_3$, respectively.
Step 902, fixing network parameters of the lesion localization structure according to a first probability p 1 Second probability p 2 And a third probability p 3 And optimizing other network parameters in the neural network model by using a weighted loss function, wherein the expression of the weighted loss function is as follows:
where LF represents the weighted loss function, H is the number of sample images in the skin lesion dataset, ρ Z Representing the number of first sample image blocks in the Z-th class data set, wherein gamma is a manually set hyper-parameter, and Z=1, 2, …, d and d are the number of class data sets in the skin lesion data set;
Step 903: fix the other network parameters of the neural network model and optimize the network parameters of the lesion positioning structure with the ranking loss function $L_{rank}(p_1, p_2)$:

$$L_{rank}(p_1, p_2) = \max(0,\; p_1 - p_2 + margin) \tag{18}$$

where $margin$ is a small preset constant close to 0.
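The alternating optimization of steps 902 and 903 can be sketched as two optimizers over disjoint parameter groups. Because formula (17) is not reproduced above, the class-balanced weight passed in below — assumed to be $(H/(d\,\rho_Z))^{\gamma}$ per sample — stands in for the patent's weighted loss; the rank loss follows formula (18) directly, and the margin value is likewise assumed.

```python
import torch

def train_step(opt_other, opt_lln, p1, p2, p3, class_weight, margin=0.05):
    """One alternating update (a sketch). p1, p2, p3 hold the probabilities the two
    auxiliary outputs and the final output assign to each sample's true category;
    class_weight is the per-sample weight, assumed (H / (d * rho_Z)) ** gamma."""
    # Step 902: LLN parameters are excluded from opt_other, so only the rest move.
    weighted_loss = -(class_weight * (p1.log() + p2.log() + p3.log())).mean()
    opt_other.zero_grad()
    weighted_loss.backward(retain_graph=True)  # keep the graph for the rank loss
    opt_other.step()
    # Step 903: rank loss (18) pushes the fine-scale p2 above the coarse p1;
    # opt_lln updates only the lesion positioning parameters.
    rank_loss = torch.clamp(p1 - p2 + margin, min=0).mean()
    opt_lln.zero_grad()
    rank_loss.backward()
    opt_lln.step()
```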
The method applies deep neural networks to the classification of skin lesion images, judging the skin lesion category rapidly and accurately with the trained neural network model. The feature extraction structure built from attention residual learning blocks and the attention-based lesion positioning structure keep the network highly focused on the skin lesion region during feature extraction, greatly reducing the impact that an overly small lesion region would otherwise have on the network's predictions; the weighted loss function resolves the data imbalance that class decomposition may introduce, greatly improving the sensitivity and specificity of the skin lesion image classification method.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.
Claims (7)
1. A method for classifying skin lesions, comprising the steps of:
performing center cropping on the skin lesion image to be classified to obtain a first image block;
performing feature extraction on the first image block by adopting an upper branch network of a trained neural network model to obtain a first feature vector, wherein the neural network model adopts a neural network model based on a multi-scale double-layer attention mechanism;
positioning a lesion region according to the first feature vector by utilizing a lesion positioning structure in the upper branch network;
cropping the first image block according to the positioned lesion region to obtain a second image block;
extracting features of the second image block by using a lower branch network of the trained neural network model to obtain a second feature vector;
feature fusion is carried out on the first feature vector and the second feature vector by utilizing a feature fusion structure in the lower branch network, so that a fusion vector is obtained;
processing the fusion vector by using a softmax activation function in an output layer of the lower branch network to obtain the skin lesion category prediction probability;
classifying the skin lesion images to be classified according to the skin lesion class prediction probability;
the upper branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a lesion positioning structure, and a cropping-and-scaling structure; the lower branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a feature fusion structure, and an output layer, wherein the feature extraction structure comprises a convolution layer and a plurality of attention residual unit learning structures ARL, and the lesion positioning structure comprises a hidden layer and an output layer;
the training method of the neural network model comprises the following steps:
obtaining a skin lesion dataset comprising a plurality of sample images under a plurality of skin lesion categories;
performing center cropping on each sample image in the skin lesion dataset to obtain first sample image blocks, and forming the preprocessed skin lesion dataset from all the first sample image blocks;
dividing the preprocessed skin lesion data set into a plurality of category data sets according to skin lesion categories, and dividing each category data set into a plurality of sub-category data sets according to image correlation;
extracting features of each first sample image block in each subclass data set by using a feature extraction structure of the upper branch network to obtain a first feature vector, and obtaining a first lesion class prediction probability by using an auxiliary output layer;
positioning a lesion region of each first sample image block according to the first feature vector by utilizing a lesion positioning structure of the upper branch network;
cropping each first sample image block according to the positioned lesion region to obtain a second sample image block;
extracting the characteristics of each second sample image block by utilizing a characteristic extraction structure of the lower branch network to obtain a second characteristic vector, and obtaining a second lesion category prediction probability by utilizing an auxiliary output layer;
carrying out feature fusion on the first feature vector and the second feature vector by utilizing a feature fusion structure to obtain a fusion vector;
processing the fusion vector by using a softmax activation function in the output layer to obtain the skin lesion category prediction probability;
based on the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability, training the parameters of the neural network model with the ranking loss function and the weighted loss function, and obtaining the trained neural network model through iterative convergence.
2. The skin lesion image classification method according to claim 1, wherein the first feature vector is obtained by:

inputting the first image block into the convolution layer of the feature extraction structure of the upper branch network and obtaining an intermediate vector $X_1$ through a ReLU nonlinear activation function;

using the several attention residual unit learning structures ARL to convolve, normalize, and downsample the intermediate vector $X_1$, obtaining an output vector $y$;

processing the output vector $y$ with a global average pooling layer to obtain the first feature vector $F_1$ corresponding to the first image block.
3. The skin lesion image classification method according to claim 1, wherein the lesion region is positioned from the first feature vector with the lesion positioning structure in the upper branch network by:

inputting the first feature vector $F_1$ into the hidden layer of the lesion positioning structure and obtaining the hidden-layer state $g$ with a ReLU nonlinear activation function:

$$g = \mathrm{ReLU}(U_3 F_1 + b_3)$$

where $U_3$ is the parameter matrix of the hidden layer and $b_3$ is the bias term of the hidden layer;

according to the hidden-layer state $g$, obtaining the lesion region coordinates of the first image block with a sigmoid nonlinear activation function in the output layer of the lesion positioning structure:

$$[t_x, t_y, t_l] = n \cdot \mathrm{sigmoid}(U_4 g + b_4)$$

where $t_x$ is the abscissa of the center of the lesion region, $t_y$ is the ordinate of the center, $t_l$ is the radius of the lesion region, $n$ is the side length of the first image block, $U_4$ is the parameter matrix of the output layer, and $b_4$ is the bias term of the output layer.
4. The skin lesion image classification method according to claim 3, wherein the second image block is acquired by:

obtaining the vertex coordinates of the cropping region in the first image block from the lesion region coordinates: the upper-left corner is $(t_{x(tl)}, t_{y(tl)})$, the lower-left corner is $(t_{x(tl)}, t_{y(br)})$, the upper-right corner is $(t_{x(br)}, t_{y(tl)})$, and the lower-right corner is $(t_{x(br)}, t_{y(br)})$, where $t_{x(tl)} = t_x - t_l$, $t_{y(tl)} = t_y - t_l$, $t_{x(br)} = t_x + t_l$, $t_{y(br)} = t_y + t_l$;

cropping the first image block according to the vertex coordinates of the cropping region to obtain the cropped image corresponding to the first image block;

scaling the cropped image according to the side length of the first image block to obtain the second image block:

$$X^{amp}_{(i,j)} = \sum_{\alpha=0}^{1}\sum_{\beta=0}^{1} \left|1-\alpha-\{i/\lambda\}\right| \cdot \left|1-\beta-\{j/\lambda\}\right| \cdot X^{att}_{(h,w)}$$

where $X^{amp}_{(i,j)}$ is the pixel value in row $i$, column $j$ of the second image block, $X^{att}_{(h,w)}$ is the pixel value in row $h$, column $w$ of the cropped image, $\alpha = h - [i/\lambda]$, $\beta = w - [j/\lambda]$, $\lambda = n/(2t_l)$ is the scaling factor, $[\cdot]$ is the rounding function, $\{\cdot\}$ is the fractional-part function, $h \in [t_{x(tl)}, t_{x(br)}]$, $w \in [t_{y(tl)}, t_{y(br)}]$, and $i, j \in \{1, 2, \dots, n\}$.
5. The skin lesion image classification method according to claim 1, wherein, assuming the preprocessed skin lesion dataset contains $d$ category datasets, with $Z = 1, 2, \dots, d$, the $Z$-th category dataset is decomposed into several subclass datasets according to image correlation as follows:

(1) applying grayscale processing to each first sample image block in the $Z$-th category dataset to obtain the grayscale image corresponding to that first sample image block;

(2) randomly selecting one first sample image block from the $Z$-th category dataset as the initial cluster center $c_{Z1}$;

(3) computing, from the grayscale images, the distance from each first sample image block in the $Z$-th category dataset to the initial cluster center $c_{Z1}$, and from these distances the probability that each first sample image block is selected as the next cluster center:

$$P(z_k) = \frac{D(z_k)^2}{\sum_{k=1}^{K} D(z_k)^2}$$

where $z_k$ is the $k$-th first sample image block in the $Z$-th category dataset, $P(z_k)$ is the probability that $z_k$ is selected as the next cluster center, $D(z_k)$ is the distance from $z_k$ to the initial cluster center $c_{Z1}$, and $k = 1, 2, \dots, K$, with $K$ the number of first sample image blocks in the $Z$-th category dataset;

(4) generating a random number uniformly in $[0,1]$; when the random number falls in $\left(\sum_{k=1}^{r} P(z_k), \sum_{k=1}^{r+1} P(z_k)\right]$, selecting the $(r+1)$-th first sample image block in the $Z$-th category dataset as the second cluster center $c_{Z2}$, where $r = 1, 2, \dots, K-1$;

(5) repeating step (4) until $N$ cluster centers have been selected from the $Z$-th category dataset: $c_{Z1}, c_{Z2}, \dots, c_{ZN}$;

(6) computing the Hamming distance from each first sample image block in the $Z$-th category dataset to the $N$ cluster centers, and assigning each first sample image block to its nearest cluster center, yielding $N$ clusters;

(7) recalculating the centers of the $N$ clusters:

$$c_{Zv}^{p+1} = \frac{1}{\left|S_{Zv}^{p}\right|} \sum_{z \in S_{Zv}^{p}} z$$

where $c_{Zv}^{p+1}$ is the center of the $v$-th cluster at iteration $p+1$, $S_{Zv}^{p}$ is the $v$-th cluster at iteration $p$, $\left|S_{Zv}^{p}\right|$ is its number of samples, and $v = 1, 2, \dots, N$;

(8) repeating steps (6) and (7) until, in every cluster, the centers of two successive iterations satisfy $c_{Zv}^{p+1} = c_{Zv}^{p}$ (to within a preset tolerance), giving the $N$ final clusters, each cluster representing one subclass dataset.
6. The skin lesion image classification method according to claim 1, wherein the method of training the parameters of the neural network model using the permutation loss function and the weighted loss function comprises:
extracting, according to the real skin lesion category of each sample image in the skin lesion data set, a first probability $p_1$, a second probability $p_2$ and a third probability $p_3$ from the first lesion category prediction probability, the second lesion category prediction probability and the skin lesion category prediction probability, respectively;
fixing the network parameters of the lesion localization structure, and optimizing the other network parameters in the neural network model using a weighted loss function $LF$ of the first probability $p_1$, the second probability $p_2$ and the third probability $p_3$, where $H$ is the number of sample images in the skin lesion data set, $\rho_Z$ denotes the number of first sample image blocks in the Z-th category data set, $\gamma$ is a manually set hyperparameter, and $Z = 1, 2, \ldots, d$, with $d$ the number of category data sets in the skin lesion data set;
fixing the other network parameters in the neural network model, and optimizing the network parameters of the lesion localization structure using the permutation loss function $L_{rank}(p_1, p_2)$, whose expression is as follows:

$$L_{rank}(p_1, p_2) = \max(0,\; p_1 - p_2 + \mathrm{margin})$$

where margin is a preset small constant.
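A PyTorch-flavored sketch of this alternating optimization is given below. Since the claim does not reproduce the weighted-loss expression, `weighted_loss` uses an assumed class-frequency weighting $(1-\rho_Z/\sum_Z \rho_Z)^{\gamma}$ on a negative log-likelihood; `model`, `backbone`, `localizer`, and all other names are illustrative, not the patent's:

```python
import torch

def rank_loss(p1: torch.Tensor, p2: torch.Tensor, margin: float = 0.05) -> torch.Tensor:
    """Permutation (ranking) loss: L_rank(p1, p2) = max(0, p1 - p2 + margin)."""
    return torch.clamp(p1 - p2 + margin, min=0).mean()

def weighted_loss(p1, p2, p3, rho, labels, gamma: float = 2.0) -> torch.Tensor:
    """Assumed form: NLL of the three true-class probabilities, weighted
    per class by (1 - rho_Z / sum(rho))**gamma; rho[Z] = block count of class Z."""
    w = (1.0 - rho / rho.sum()) ** gamma
    nll = -(torch.log(p1) + torch.log(p2) + torch.log(p3))
    return (w[labels] * nll).mean()

def train_step(model, images, labels, rho, opt_backbone, opt_localizer):
    """One alternating step; model(images, labels) -> (p1, p2, p3) true-class probs."""
    # Phase 1: lesion-localization structure fixed; optimize the other
    # parameters with the weighted loss.
    for q in model.localizer.parameters():
        q.requires_grad_(False)
    for q in model.backbone.parameters():
        q.requires_grad_(True)
    p1, p2, p3 = model(images, labels)
    opt_backbone.zero_grad()
    weighted_loss(p1, p2, p3, rho, labels).backward()
    opt_backbone.step()
    # Phase 2: other parameters fixed; optimize the localization structure
    # with the permutation loss L_rank(p1, p2).
    for q in model.backbone.parameters():
        q.requires_grad_(False)
    for q in model.localizer.parameters():
        q.requires_grad_(True)
    p1, p2, _ = model(images, labels)
    opt_localizer.zero_grad()
    rank_loss(p1, p2).backward()
    opt_localizer.step()
```

The forward pass is recomputed in each phase so that gradients flow only through the parameter group currently being optimized.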
7. The skin lesion image classification method according to claim 6, wherein the first lesion category prediction probability is calculated as follows:
$$P_1 = \mathrm{softmax}(U_2 F_1 + b_2)$$

where $U_2$ is the parameter matrix of the auxiliary output layer, $F_1$ denotes the first feature vector, and $b_2$ is the bias term of the auxiliary output layer.
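As a worked illustration of this auxiliary output layer, the following NumPy snippet maps a made-up 64-dimensional first feature vector to d = 7 category probabilities; the dimensions and random weights are purely illustrative:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())          # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
F1 = rng.standard_normal(64)         # first feature vector
U2 = rng.standard_normal((7, 64))    # parameter matrix of the auxiliary output layer
b2 = np.zeros(7)                     # bias term of the auxiliary output layer

P1 = softmax(U2 @ F1 + b2)           # first lesion category prediction probability
assert np.isclose(P1.sum(), 1.0)     # a valid probability distribution over 7 classes
```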
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110911205.5A CN113705630B (en) | 2021-08-10 | 2021-08-10 | Skin lesion image classification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113705630A (en) | 2021-11-26
CN113705630B (en) | 2023-10-13
Family
ID=78651972
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110911205.5A Active CN113705630B (en) | 2021-08-10 | 2021-08-10 | Skin lesion image classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113705630B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116050503B (en) * | 2023-02-15 | 2023-11-10 | 哈尔滨工业大学 | Generalized neural network forward training method |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010133938A1 (en) * | 2009-05-22 | 2010-11-25 | Nokia Corporation | Method and apparatus for performing feature extraction using local primitive code |
WO2021082480A1 (en) * | 2019-10-29 | 2021-05-06 | 华为技术有限公司 | Image classification method and related device |
CN111178432A (en) * | 2019-12-30 | 2020-05-19 | 武汉科技大学 | Weak supervision fine-grained image classification method of multi-branch neural network model |
CN111444960A (en) * | 2020-03-26 | 2020-07-24 | 上海交通大学 | Skin disease image classification system based on multi-mode data input |
Also Published As
Publication number | Publication date |
---|---|
CN113705630A (en) | 2021-11-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Toğaçar et al. | Detection of lung cancer on chest CT images using minimum redundancy maximum relevance feature selection method with convolutional neural networks | |
Sachdeva et al. | A systematic method for breast cancer classification using RFE feature selection | |
CN108171232B (en) | Deep learning algorithm-based bacterial and viral pneumonia classification method for children | |
EP3921776B1 (en) | Method and system for classification and visualisation of 3d images | |
Demyanov et al. | Classification of dermoscopy patterns using deep convolutional neural networks | |
US20180260951A1 (en) | Deep Image-to-Image Recurrent Network with Shape Basis for Automatic Vertebra Labeling in Large-Scale 3D CT Volumes | |
CN112294341B (en) | Sleep electroencephalogram spindle wave identification method and system based on light convolutional neural network | |
CN104077742B (en) | Human face sketch synthetic method and system based on Gabor characteristic | |
CN114332572B (en) | Method for extracting breast lesion ultrasonic image multi-scale fusion characteristic parameters based on saliency map-guided hierarchical dense characteristic fusion network | |
Galshetwar et al. | Local energy oriented pattern for image indexing and retrieval | |
CN111860823A (en) | Neural network training method, neural network training device, neural network image processing method, neural network image processing device, neural network image processing equipment and storage medium | |
Zuobin et al. | Feature regrouping for cca-based feature fusion and extraction through normalized cut | |
CN113705630B (en) | Skin lesion image classification method | |
Saif et al. | Exploiting cascaded ensemble of features for the detection of tuberculosis using chest radiographs | |
Layode et al. | Deep learning based integrated classification and image retrieval system for early skin cancer detection | |
CN116228759A (en) | Computer-aided diagnosis system and apparatus for renal cell carcinoma type | |
CN108154107B (en) | Method for determining scene category to which remote sensing image belongs | |
Huang et al. | Adfa: Attention-augmented differentiable top-k feature adaptation for unsupervised medical anomaly detection | |
CN108376567B (en) | Label propagation algorithm-based clinical drug-drug adverse reaction detection method | |
Mohan et al. | Comparison of Convolutional Neural Network for Classifying Lung Diseases from Chest CT Images | |
Kumar et al. | A Robust Convolutional Neural Network Approach for Classifying Skin Cancer | |
Ramani et al. | Automatic brain tumour detection using image processing and data mining techniques | |
Fernandes et al. | Prediction of malignant lung nodules in CT scan images using cnn and feature selection algorithms | |
Chandra et al. | A Novel Framework For Brain Disease Classification Using Quantum Convolutional Neural Network | |
Song et al. | ABUSDet: A Novel 2.5 D deep learning model for automated breast ultrasound tumor detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |