CN113705630A — Skin lesion image classification method

Publication number: CN113705630A (published 2021-11-26); granted as CN113705630B (2023-10-13)
Application number: CN202110911205.5A (filed 2021-08-10; priority date 2021-08-10)
Inventors: 王玉峰, 万承北
Applicant/Assignee: Nanjing University of Posts and Telecommunications
Original language: Chinese (zh)
Legal status: Active (granted)

Classifications

    • G06F18/2415 — Classification techniques based on parametric or probabilistic models
    • G06F18/23 — Clustering techniques
    • G06F18/253 — Fusion techniques of extracted features
    • G06N3/045 — Neural network architectures; combinations of networks
    • G06N3/048 — Neural network activation functions
    • G06N3/08 — Neural network learning methods
    • G06T7/0012 — Biomedical image inspection
    • G06T7/62 — Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T7/73 — Determining position or orientation of objects using feature-based methods
    • G06T2207/20132 — Image cropping
    • G06T2207/30088 — Skin; dermal
    • G06T2207/30096 — Tumor; lesion


Abstract

The invention discloses a skin lesion image classification method comprising the following steps: performing center cropping on a skin lesion image to be classified to obtain a first image block; extracting features of the first image block with the upper branch network of a neural network model to obtain a first feature vector; obtaining the coordinates of the lesion area from the first feature vector; cropping the first image block according to the coordinates of the lesion area to obtain a second image block; extracting features of the second image block with the lower branch network of the neural network model to obtain a second feature vector; and fusing the first feature vector and the second feature vector to obtain the skin lesion category prediction probability, from which the skin lesion category of the image to be classified is obtained. The method can judge the skin lesion category of a skin lesion image quickly, objectively, and accurately.

Description

Skin lesion image classification method
Technical Field
The invention relates to a skin lesion image classification method, and belongs to the technical field of image processing.
Background
Skin cancer is among the cancers that most threaten human life, and melanoma is one of its major categories. Because melanoma closely resembles an ordinary mole at an early stage, it is difficult for a layperson to recognize with the naked eye. Skin lesions such as melanoma can be identified by processing skin images: at present, most hospitals magnify the local skin with a dermatoscope to eliminate surrounding interference, acquire skin images, and have professional dermatologists judge the skin lesion category from those images.
Deep neural networks offer hope for judging skin lesion categories accurately and quickly. However, in existing skin lesion data sets the lesion region of a skin image is relatively small, lesion regions vary in size, and their appearance differences are not obvious; at the same time, real-world data sets are often imbalanced. These characteristics seriously affect the efficiency and accuracy of neural-network-based skin lesion image classification systems.
Disclosure of Invention
To improve the efficiency and accuracy of neural-network-based skin lesion image classification, the invention provides a skin lesion image classification method that predicts skin lesion probabilities with a neural network model based on a multi-scale double-layer attention mechanism and derives an accurate, reliable skin lesion category from those probabilities, improving the efficiency and accuracy of skin lesion image classification.
In order to solve the technical problems, the invention adopts the following technical means:
the invention provides a skin lesion image classification method, which comprises the following steps:
performing center cropping on a skin lesion image to be classified to obtain a first image block;
extracting features of the first image block with the upper branch network of a trained neural network model to obtain a first feature vector, the neural network model being based on a multi-scale double-layer attention mechanism;
locating the lesion area from the first feature vector with the lesion localization structure in the upper branch network;
cropping the first image block according to the located lesion area to obtain a second image block;
extracting features of the second image block with the lower branch network of the trained neural network model to obtain a second feature vector;
fusing the first feature vector and the second feature vector with the feature fusion structure in the lower branch network to obtain a fusion vector;
processing the fusion vector with the softmax activation function in the output layer of the lower branch network to obtain the skin lesion category prediction probability;
and classifying the skin lesion image to be classified according to the skin lesion category prediction probability.
Furthermore, the upper branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a lesion localization structure, and a cropping-and-scaling structure, and the lower branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a feature fusion structure, and an output layer, wherein the feature extraction structure comprises a convolution layer and several attention residual learning (ARL) blocks, and the lesion localization structure comprises a hidden layer and an output layer.
Further, the method for obtaining the first feature vector comprises:
inputting the first image block into the convolution layer of the feature extraction structure of the upper branch network and applying the ReLU nonlinear activation function to obtain an intermediate vector X_1;
processing the intermediate vector X_1 with several attention residual learning (ARL) blocks (convolution, normalization, and downsampling) to obtain an output vector y;
processing the output vector y with the global average pooling layer to obtain the first feature vector F_1 corresponding to the first image block.
Further, the method for locating the lesion area from the first feature vector with the lesion localization structure in the upper branch network comprises the following steps:
inputting the first feature vector F_1 into the hidden layer of the lesion localization structure and obtaining the hidden-layer state g with the ReLU nonlinear activation function:

g = ReLU(U_3 F_1 + b_3)    (1)

where U_3 is the parameter matrix of the hidden layer and b_3 is the bias term of the hidden layer;
according to the hidden-layer state g, obtaining the coordinates of the lesion area of the first image block with the sigmoid nonlinear activation function in the output layer of the lesion localization structure, the coordinates of the lesion area being:

[t_x, t_y, t_l] = n * sigmoid(U_4 g + b_4)    (2)

where t_x is the abscissa of the center of the lesion area, t_y is the ordinate of the center of the lesion area, t_l is the radius of the lesion area, n is the side length of the first image block, U_4 is the parameter matrix of the output layer, and b_4 is the bias term of the output layer.
Further, the method for acquiring the second image block comprises:
obtaining the vertex coordinates of the cropping region in the first image block from the coordinates of the lesion area: the upper-left corner of the cropping region is (t_{x(tl)}, t_{y(tl)}), the lower-left corner is (t_{x(tl)}, t_{y(br)}), the upper-right corner is (t_{x(br)}, t_{y(tl)}), and the lower-right corner is (t_{x(br)}, t_{y(br)}), where t_{x(tl)} = t_x - t_l, t_{y(tl)} = t_y - t_l, t_{x(br)} = t_x + t_l, t_{y(br)} = t_y + t_l;
cropping the first image block according to the vertex coordinates of the cropping region to obtain the cropped image corresponding to the first image block;
scaling the cropped image according to the side length of the first image block to obtain the second image block:

X^{amp}_{(i,j)} = Σ_h Σ_w |1 - α - {i/λ}| · |1 - β - {j/λ}| · X^{att}_{(h,w)}    (3)

where X^{amp}_{(i,j)} is the pixel value in the i-th row and j-th column of the second image block, X^{att}_{(h,w)} is the pixel value in the h-th row and w-th column of the cropped image, α = h - [i/λ], β = w - [j/λ], λ = n/(2t_l) is the amplification factor, [·] is the rounding (integer-part) function, {·} is the fractional-part function, h ∈ [t_{x(tl)}, t_{x(br)}], w ∈ [t_{y(tl)}, t_{y(br)}], and i, j ∈ {1, 2, …, n}.
Further, the training method of the neural network model comprises the following steps:
obtaining a skin lesion data set comprising a plurality of sample images under a plurality of skin lesion categories;
performing center cropping on each sample image in the skin lesion data set to obtain a first sample image block, and forming the preprocessed skin lesion data set from all the first sample image blocks;
dividing the preprocessed skin lesion data set into a plurality of category data sets according to skin lesion category, and decomposing each category data set into a plurality of subcategory data sets according to image correlation;
performing feature extraction on each first sample image block in each subcategory data set with the feature extraction structure of the upper branch network to obtain a first feature vector, and obtaining a first lesion category prediction probability with the auxiliary output layer;
locating the lesion area of each first sample image block from the first feature vector with the lesion localization structure of the upper branch network;
cropping each first sample image block according to the located lesion area to obtain a second sample image block;
performing feature extraction on each second sample image block with the feature extraction structure of the lower branch network to obtain a second feature vector, and obtaining a second lesion category prediction probability with the auxiliary output layer;
fusing the first feature vector and the second feature vector with the feature fusion structure to obtain a fusion vector;
processing the fusion vector with the softmax activation function in the output layer to obtain the skin lesion category prediction probability;
and performing parameter training on the neural network model with the ranking loss function and the weighted loss function, based on the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability, obtaining the trained neural network model through iterative convergence.
Further, assuming there are d category data sets in the preprocessed skin lesion data set, with Z = 1, 2, …, d, the method for decomposing the Z-th category data set into a plurality of subcategory data sets according to image correlation is as follows:
(1) converting each first sample image block in the Z-th category data set to grayscale to obtain the grayscale image corresponding to the first sample image block;
(2) randomly selecting one first sample image block from the Z-th category data set as the initial cluster center c_{Z1};
(3) calculating, from the grayscale images corresponding to the first sample image blocks, the distance from each first sample image block in the Z-th category data set to the initial cluster center c_{Z1}, and from these distances the probability that each first sample image block is selected as the next cluster center:

P(z_k) = D(z_k)^2 / Σ_{k=1}^{K} D(z_k)^2    (4)

where z_k is the k-th first sample image block in the Z-th category data set, P(z_k) is the probability that z_k is selected as the next cluster center, D(z_k) is the distance from z_k to the initial cluster center c_{Z1}, k = 1, 2, …, K, and K is the number of first sample image blocks in the Z-th category data set;
(4) generating a random number uniformly in [0, 1]; when the random number falls in the interval (Σ_{k=1}^{r} P(z_k), Σ_{k=1}^{r+1} P(z_k)], selecting the (r+1)-th first sample image block in the Z-th category data set as the second cluster center c_{Z2}, where r = 1, 2, …, K-1;
(5) repeating step (4) until N cluster centers have been selected from the Z-th category data set: c_{Z1}, c_{Z2}, …, c_{ZN};
(6) calculating the Hamming distance from each first sample image block in the Z-th category data set to the N cluster centers, and assigning each first sample image block in the Z-th category data set to its nearest cluster center, obtaining N clusters;
(7) recalculating the centers of the N clusters:

c_{Zv}^{(p+1)} = (1 / |S_{Zv}^{(p)}|) Σ_{z_k ∈ S_{Zv}^{(p)}} z_k    (5)

where c_{Zv}^{(p+1)} is the center of the v-th cluster at the (p+1)-th clustering iteration, S_{Zv}^{(p)} is the v-th cluster at the p-th clustering iteration, |S_{Zv}^{(p)}| is the number of samples in the v-th cluster at the p-th clustering iteration, and v = 1, 2, …, N;
(8) repeating steps (6) and (7) until the cluster centers of two consecutive clustering iterations in every cluster satisfy ||c_{Zv}^{(p+1)} - c_{Zv}^{(p)}|| < ε, with ε a preset convergence threshold, and taking the N clusters of the final clustering, each cluster representing one subcategory data set.
Further, the method for performing parameter training on the neural network model with the ranking loss function and the weighted loss function comprises:
extracting, according to the true skin lesion category of each sample image in the skin lesion data set, the first probability p_1, second probability p_2, and third probability p_3 of the sample image from the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability;
fixing the network parameters of the lesion localization structure and, according to the first probability p_1, second probability p_2, and third probability p_3, optimizing the other network parameters of the neural network model with the weighted loss function LF (equation (6); the expression is given as an equation image in the original), where LF denotes the weighted loss function, H is the number of sample images in the skin lesion data set, ρ_Z is the number of first sample image blocks in the Z-th category data set, γ is a manually set hyper-parameter, Z = 1, 2, …, d, and d is the number of category data sets in the skin lesion data set;
fixing the other network parameters of the neural network model and optimizing the network parameters of the lesion localization structure with the ranking loss function L_rank(p_1, p_2):

L_rank(p_1, p_2) = max(0, p_1 - p_2 + margin)    (7)

where margin is a small preset constant.
Further, the first lesion category prediction probability is calculated as:

P_1 = softmax(U_2 F_1 + b_2)    (8)

where U_2 is the parameter matrix of the auxiliary output layer, F_1 is the first feature vector, and b_2 is the bias term of the auxiliary output layer.
The following advantages can be obtained by adopting the technical means:
the invention provides a skin lesion image classification method, which is characterized in that a neural network model based on a multi-scale double-layer attention mechanism is utilized to process a skin lesion image to be classified, and the skin lesion class probability of the skin lesion image to be classified is predicted, so that the skin lesion class of the skin lesion image to be classified is obtained. Before the classification is started, the method performs center cutting on the skin lesion image to be classified, enlarges the skin lesion area in the image and unifies the image size, thereby being beneficial to subsequent image processing and feature recognition. In the process of image processing by using the neural network module, the method of the invention utilizes the feature extraction structure consisting of the attention residual learning block and the lesion positioning structure based on the attention mechanism to ensure that the neural network is highly concentrated in the skin lesion area during feature extraction, thereby greatly reducing the influence on the prediction of the neural network due to the undersize skin lesion area in the skin lesion image to be classified and improving the classification accuracy of the skin lesion image.
In the model training process, the method not only unifies the size of the sample images in the skin lesion data set, but also performs class decomposition on various classes in the skin lesion data set, divides the classes into a plurality of subclass data sets, forms new data distribution, achieves the effect of extracting fine-grained information hidden in each class of the images, and improves the model training effect; in addition, the invention also solves the problem of data imbalance possibly caused by class decomposition by using a weighting loss function, and greatly improves the sensitivity and specificity of the skin lesion image classification method.
The method of the invention does not depend on manual operation, improves the efficiency of classifying the skin lesion images, and can rapidly, objectively and accurately judge the skin lesion categories of the skin lesion images.
Drawings
FIG. 1 is a flowchart of the steps of the skin lesion image classification method of the present invention;
FIG. 2 is a network structure diagram of the neural network model according to an embodiment of the present invention;
FIG. 3 is a diagram of the attention residual learning (ARL) structure according to an embodiment of the present invention;
FIG. 4 is a flowchart of the training process of the neural network model according to an embodiment of the present invention;
FIG. 5 is a flowchart of the decomposition of the skin lesion data set according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings:
the invention provides a skin lesion image classification method, as shown in fig. 1, which specifically comprises the following steps:
and A, performing center cutting on the skin lesion image to be classified to obtain a first image block, wherein the size of the first image block is n x 3, n is the side length of the first image block, and 3 represents three channels of the RGB image.
Step B: extract features of the first image block with the upper branch network of the trained neural network model to obtain a first feature vector; the neural network model is based on a multi-scale double-layer attention mechanism.
Step C: locate the lesion area from the first feature vector with the lesion localization structure in the upper branch network.
Step D: crop the first image block according to the located lesion area to obtain a second image block.
Step E: extract features of the second image block with the lower branch network of the trained neural network model to obtain a second feature vector.
Step F: fuse the first feature vector and the second feature vector with the feature fusion structure in the lower branch network to obtain a fusion vector.
Step G: process the fusion vector with the softmax activation function in the output layer of the lower branch network to obtain the skin lesion category prediction probability.
Step H: classify the skin lesion image to be classified according to the skin lesion category prediction probability.
In the embodiment of the present invention, the network structure of the neural network model is shown in fig. 2. The model divides into an upper branch network and a lower branch network: the upper branch network mainly comprises a feature extraction structure, an auxiliary output layer, a lesion localization network (LLN), and a cropping-and-scaling structure; the lower branch network mainly comprises a feature extraction structure, an auxiliary output layer, a feature fusion structure, and an output layer. The feature extraction structure comprises a convolution layer, several attention residual learning (ARL) blocks, and a global average pooling layer (GAP); the auxiliary output layer is a fully connected layer; the lesion localization structure comprises a hidden layer and an output layer, both fully connected layers.
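As a reading aid, the following PyTorch sketch mirrors this two-branch layout at module level. It is a simplified reconstruction rather than the patented implementation: the small convolutional backbone stands in for the convolution-plus-ARL feature extractor, the channel counts, hidden sizes, and names (make_backbone, LesionLocalizer, crop_and_zoom) are assumptions, and crop_and_zoom is a non-differentiable placeholder for the cropping-and-scaling structure detailed in steps D01 to D03 below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_backbone(feat_dim: int) -> nn.Module:
    # Stand-in for "convolution layer + several ARL blocks + global average pooling".
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),          # GAP -> (batch, feat_dim)
    )

class LesionLocalizer(nn.Module):
    # Hidden layer + output layer producing [tx, ty, tl], scaled by the side length n.
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.hidden = nn.Linear(feat_dim, hidden)
        self.out = nn.Linear(hidden, 3)

    def forward(self, f1, n):
        g = torch.relu(self.hidden(f1))                 # g = ReLU(U3 F1 + b3)
        return n * torch.sigmoid(self.out(g))           # [tx, ty, tl] = n * sigmoid(U4 g + b4)

def crop_and_zoom(x, coords, n):
    # Simplified, non-differentiable stand-in for the cropping-and-scaling structure.
    out = []
    for img, (tx, ty, tl) in zip(x, coords):
        l = int(tl.clamp(1, n // 2))
        cx = int(tx.clamp(l, n - l)); cy = int(ty.clamp(l, n - l))
        patch = img[:, cx - l:cx + l, cy - l:cy + l]
        out.append(F.interpolate(patch[None], size=(n, n), mode="bilinear",
                                 align_corners=False)[0])
    return torch.stack(out)

class TwoBranchSkinLesionNet(nn.Module):
    def __init__(self, feat_dim=128, num_subclasses=8, n=224):
        super().__init__()
        self.n = n
        self.upper = make_backbone(feat_dim)
        self.lower = make_backbone(feat_dim)
        self.aux_upper = nn.Linear(feat_dim, num_subclasses)   # auxiliary output layer
        self.aux_lower = nn.Linear(feat_dim, num_subclasses)
        self.localizer = LesionLocalizer(feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, num_subclasses)  # output layer

    def forward(self, x1):
        f1 = self.upper(x1)                             # first feature vector F1
        p1 = self.aux_upper(f1).softmax(1)              # first lesion-category probability P1
        coords = self.localizer(f1, self.n)             # lesion-area coordinates [tx, ty, tl]
        x2 = crop_and_zoom(x1, coords, self.n)          # second image block
        f2 = self.lower(x2)                             # second feature vector F2
        p2 = self.aux_lower(f2).softmax(1)              # second lesion-category probability P2
        fused = torch.cat([f1, f2], 1)                  # fusion vector F = [F1; F2]
        p3 = self.classifier(fused).softmax(1)          # skin lesion category probability P3
        return p1, p2, p3

model = TwoBranchSkinLesionNet()
p1, p2, p3 = model(torch.randn(2, 3, 224, 224))
```

Keeping the two backbones separate lets each branch specialize: the upper branch sees the whole first image block, while the lower branch sees only the zoomed-in lesion window.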
In the embodiment of the present invention, the specific operation of step B is as follows:
step B01, inputting the first image block with dimension n X3 into the convolution layer of the feature extraction structure of the upper branch network, and obtaining the intermediate vector X through the Relu nonlinear activation function1Intermediate vector X1Has a dimension of n1*n1*D1,n1Representing the intermediate vector X1Side length of (D)1The number of convolution kernels representing the convolution layer of the feature extraction structure.
Step B02: the several attention residual learning blocks (ARL) in the feature extraction structure are connected in sequence; after the intermediate vector X_1 is input to them, they apply convolution, normalization, downsampling, and related processing to X_1 to obtain the output vector y.
As shown in fig. 3, the attention residual learning structure ARL comprises, in order, a 1 × 1 convolution layer, a batch normalization layer, a nonlinear activation layer, a 3 × 3 convolution layer, a batch normalization layer, a nonlinear activation layer, a 1 × 1 convolution layer, a batch normalization layer, a nonlinear activation layer, and a downsampling layer (a 1 × 1 convolution layer).
Taking the first ARL as an example: the intermediate vector X_1 is input to the first ARL and first passes through its several convolution layers; after the third (1 × 1) convolution layer, a vector Q is output, of dimension n′ × n′ × D′, where n′ is the side length of Q and D′ is the number of kernels of the third 1 × 1 convolution layer. The first ARL then normalizes Q to obtain the matrix M[Q]. Meanwhile, X_1 passes through the downsampling layer (a 1 × 1 convolution layer) so that the downsampled X_1 has the same dimension as Q, and the downsampled X_1 is multiplied pixel-wise with M[Q]. Finally, X_1, Q, and M[Q]·X_1 are added pixel-wise to obtain the output vector of the first ARL, y_1 = X_1 + Q + μ·M[Q]·X_1, where μ is a parameter learned automatically by the neural network; the dimension of y_1 is still n′ × n′ × D′.
After the last ARL, the output vector y of dimension n_2 × n_2 × D is obtained, where n_2 is the side length of y and D is the number of kernels of the third 1 × 1 convolution layer in the final ARL structure.
Step B03: process the output vector y with the global average pooling layer GAP to obtain the first feature vector F_1 of the first image block; the dimension of F_1 is 1 × 1 × D.
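A minimal PyTorch rendering of one ARL block as described around fig. 3 might look as follows. The layer order (1 × 1 conv, batch normalization, activation; 3 × 3 conv, BN, activation; 1 × 1 conv, BN, activation; 1 × 1 downsampling path) follows the text; the channel counts are arbitrary, and the use of a spatial softmax for the normalization M[Q] is an assumption, since the patent only says that Q is normalized.

```python
import torch
import torch.nn as nn

class ARLBlock(nn.Module):
    """Attention residual learning block: y = X + Q + mu * M[Q] * X."""
    def __init__(self, in_ch: int, mid_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
        # Downsampling path (1x1 conv) so X matches Q in shape.
        self.downsample = nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False)
        self.mu = nn.Parameter(torch.zeros(1))  # learned scalar weight for the attention term

    def forward(self, x):
        q = self.residual(x)                                   # vector Q
        # Normalize Q into an attention map M[Q]; a spatial softmax is assumed here.
        b, c, h, w = q.shape
        m = torch.softmax(q.view(b, c, -1), dim=-1).view(b, c, h, w)
        x_ds = self.downsample(x)                              # downsampled X_1
        return x_ds + q + self.mu * m * x_ds                   # pixel-level sum

block = ARLBlock(64, 64, 256, stride=2)
y = block(torch.randn(2, 64, 56, 56))   # -> (2, 256, 28, 28)
```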
In the embodiment of the present invention, the specific operation of step C is as follows:
Step C01: input the first feature vector F_1 into the hidden layer of the lesion localization structure and obtain the hidden-layer state g with the ReLU nonlinear activation function:

g = ReLU(U_3 F_1 + b_3)    (9)

where U_3 is the parameter matrix of the hidden layer and b_3 is the bias term of the hidden layer.
Step C02: according to the hidden-layer state g, obtain the coordinates of the lesion area of the first image block with the sigmoid nonlinear activation function in the output layer of the lesion localization structure; the lesion area in this method is a circular region, and its coordinates are:

[t_x, t_y, t_l] = n * sigmoid(U_4 g + b_4)    (10)

where t_x is the abscissa of the center of the lesion area, t_y is the ordinate of the center of the lesion area, t_l is the radius of the lesion area, n is the side length of the first image block, U_4 is the parameter matrix of the output layer, and b_4 is the bias term of the output layer.
In formula (10), the values produced by the sigmoid nonlinear activation function lie between 0 and 1; to obtain real coordinate values they must be scaled up, which is why sigmoid(U_4 g + b_4) is multiplied by n.
In the embodiment of the present invention, the specific operation of step D is as follows:
step D01, the clipping region in the present invention is rectangular, and 4 vertex coordinates of the clipping region in the first image block can be obtained according to the coordinates of the lesion region: the coordinate of the upper left corner of the cutting area is (t)x(tl),ty(tl)) The coordinate of the lower left corner of the clipping region is (t)x(tl),ty(br)) The coordinate of the upper right corner of the cutting area is (t)x(br),ty(tl)) The coordinate of the lower right corner of the cutting area is (t)x(br),ty(br)), wherein ,tx(tl)=tx-tl,ty(tl)=ty-tl,tx(br)=tx+tl,ty(br)=ty+tl
D02, cutting the first image block according to the vertex coordinates of the cutting area to obtain a cutting image X corresponding to the first image blockatt
Step D03, scaling the cropped image according to the side length of the first image blockProcessing to obtain a second image block XampSecond image block XampFor a three-dimensional vector of n x 3, the scaling process is expressed as follows:
Figure BDA0003203683630000131
wherein ,
Figure BDA0003203683630000132
representing the pixel values in the ith row and jth column in the second image block,
Figure BDA0003203683630000133
denotes a pixel value on the ith row and the w column in the clipped image, and α ═ h- [ i/λ],β=w-[j/λ],
Figure BDA0003203683630000134
[·]For the rounding function, {. cndot.x(tl),tx(br)],w∈[ty(tl),ty(br)],i,j∈{1,2,…,n}。
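Concretely, steps D01 to D03 cut the 2t_l × 2t_l lesion window out of the first image block and resize it back to n × n by bilinear interpolation. The NumPy sketch below follows the reconstruction of formula (11) given above; the clamping of the window to the image bounds is an added assumption that the patent text does not spell out, and the row/column conventions are simplified.

```python
import numpy as np

def crop_and_zoom_bilinear(block: np.ndarray, tx: float, ty: float, tl: float) -> np.ndarray:
    """Crop the 2*tl lesion window from an n x n x 3 block and rescale it to n x n."""
    n = block.shape[0]
    half = min(max(1, int(round(tl))), n // 2)          # window half-size, kept inside the block
    cx = min(max(int(round(tx)), half), n - half)       # clamp the center into the image
    cy = min(max(int(round(ty)), half), n - half)
    x_att = block[cx - half:cx + half, cy - half:cy + half, :].astype(np.float64)
    m = 2 * half                                        # cropped side length
    lam = n / m                                         # amplification factor lambda = n / (2*tl)
    out = np.empty((n, n, 3))
    for i in range(n):                                  # bilinear interpolation, formula (11)
        for j in range(n):
            h, a = divmod(i / lam, 1.0)                 # integer / fractional parts of i / lambda
            w, b = divmod(j / lam, 1.0)
            h, w = int(h), int(w)
            h2, w2 = min(h + 1, m - 1), min(w + 1, m - 1)
            out[i, j] = ((1 - a) * (1 - b) * x_att[h, w] + (1 - a) * b * x_att[h, w2]
                         + a * (1 - b) * x_att[h2, w] + a * b * x_att[h2, w2])
    return out

zoomed = crop_and_zoom_bilinear(np.random.rand(224, 224, 3), tx=100.0, ty=120.0, tl=40.0)
assert zoomed.shape == (224, 224, 3)
```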
In step E of the invention, the second image block X_amp of dimension n × n × 3 is input to the feature extraction structure of the lower branch network, and the convolution layer, the several attention residual learning blocks ARL, and the global average pooling layer GAP in that structure extract features from X_amp to obtain the second feature vector F_2, of dimension 1 × 1 × D. The specific operation of step E is identical to step B.
In step F, the first feature vector F_1 and the second feature vector F_2 are spliced in concat fashion, giving after splicing the fusion vector F = [F_1; F_2], a three-dimensional vector of dimension 1 × 1 × 2D.
In step G, the fusion vector F is input to the output layer of the lower branch network, processed by the softmax activation function, and the skin lesion category prediction probability P_3 is output. P_3 has dimension 1 × s, where s is the number of subclasses in the neural network model; P_3 contains s values, each the probability that the skin lesion image to be classified belongs to one subclass.
In step H, the skin lesion image to be classified is classified according to the skin lesion category prediction probability P_3 to obtain its skin lesion category. Specifically, the prediction probabilities in P_3 of all subclasses belonging to the same category are added, giving the true probability of the image for each category, and the category corresponding to the maximum value is taken as the final skin lesion category.
Suppose the skin lesion image to be classified may belong to two categories, category A and category B, and the neural network model decomposes category B into three subclasses B1, B2, and B3. After the image is input to the neural network model, the model outputs a 1 × 4 vector [0.4, 0.1, 0.4, 0.1], whose four numbers are the prediction probabilities of A, B1, B2, and B3. Since category B was decomposed, the last three probability values are added to obtain the probability of category B, i.e., 0.1 + 0.4 + 0.1 = 0.6, giving the true probabilities [0.4, 0.6] of the image for categories A and B. Because the prediction probability of B exceeds that of A, the skin lesion image to be classified is judged to be category B.
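This worked example translates directly into a few lines of code. The sketch below assumes the subclass-to-category mapping is known from the folder names produced by class decomposition; the arrays and names are illustrative.

```python
import numpy as np

# Model output P3 over the subclasses [A, B1, B2, B3]
p3 = np.array([0.4, 0.1, 0.4, 0.1])
parent = ["A", "B", "B", "B"]               # original category of each subclass

categories = sorted(set(parent))            # ["A", "B"]
true_prob = {c: p3[[i for i, p in enumerate(parent) if p == c]].sum() for c in categories}
# true_prob == {"A": 0.4, "B": 0.6}

predicted = max(true_prob, key=true_prob.get)   # "B"
print(predicted, true_prob)
```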
In the method of the present invention, the neural network model must first be trained. As shown in fig. 4, the training method comprises the following steps:
step 1, obtaining a skin lesion data set with d types of samples, wherein each skin lesion type in the skin lesion data set comprises a plurality of sample images.
Because the sample images are different in size and the lesion area in the sample images is relatively small, the method carries out data preprocessing on the skin lesion data set, and comprises the following specific operations:
Step 101: perform center cropping on all sample images in the skin lesion data set according to a preset size to obtain the corresponding first sample image blocks; the preset size is n × n × 3, where n is the side length of a first sample image block and 3 denotes the three channels of an RGB image.
Step 102: form the preprocessed skin lesion data set from all the first sample image blocks.
Step 2: to draw out the fine-grained information hidden within each category of image, perform class decomposition on the preprocessed skin lesion data set: divide it into a plurality of category data sets according to skin lesion category, and decompose each category data set into a plurality of subcategory data sets according to image correlation.
As shown in FIG. 5, the specific operation of decomposing the Z-th category dataset into a plurality of sub-category datasets is as follows:
Step 201: convert each first sample image block in the Z-th category data set to grayscale to obtain the grayscale image corresponding to the first sample image block. Taking each pixel as a unit, the grayscale image can be represented as a 2-dimensional data matrix A_{n×n}, where n × n is the number of rows × columns of the matrix A, a_{ij} is the pixel value in the i-th row and j-th column of A_{n×n}, and Z = 1, 2, …, d.
Step 202: randomly select one first sample image block from the Z-th category data set as the initial cluster center c_{Z1}.
Step 203: using the grayscale image corresponding to each first sample image block, calculate the distance D(z_k) from each first sample image block in the Z-th category data set to the initial cluster center c_{Z1}:

D(z_k)^2 = Σ_{i=1}^{n} Σ_{j=1}^{n} (a_{i,j} - b_{i,j})^2    (13)

where z_k is the k-th first sample image block in the Z-th category data set, D(z_k) is the distance from z_k to the initial cluster center c_{Z1}, a_{i,j} is the pixel value in the i-th row and j-th column of the initial cluster center c_{Z1}, b_{i,j} is the pixel value in the i-th row and j-th column of z_k, k = 1, 2, …, K, and K is the number of first sample image blocks in the Z-th category data set.
From the distances of all first sample image blocks to c_{Z1}, calculate the probability that each first sample image block is selected as the next cluster center:

P(z_k) = D(z_k)^2 / Σ_{k=1}^{K} D(z_k)^2    (14)

where P(z_k) is the probability that z_k is selected as the next cluster center.
Step 204: generate a random number uniformly in [0, 1]; when the random number falls in the interval (Σ_{k=1}^{r} P(z_k), Σ_{k=1}^{r+1} P(z_k)], select the (r+1)-th first sample image block in the Z-th category data set as the second cluster center c_{Z2}, where r = 1, 2, …, K-1.
Step 205: repeat step 204 until N cluster centers c_{Z1}, c_{Z2}, …, c_{ZN} have been selected from the Z-th category data set, where N is the preset number of cluster centers; the chosen value of N ensures that the numbers of first sample image blocks in the subclasses after decomposition are approximately the same.
Step 206: calculate in turn the Hamming distance ||z_k - c_{Zv}||_1, v = 1, 2, …, N, from each first sample image block in the Z-th category data set to the N cluster centers, and assign each first sample image block in the Z-th category data set to its nearest cluster center, obtaining N clusters, denoted S_{Z1}^{(p)}, S_{Z2}^{(p)}, …, S_{ZN}^{(p)}, where p denotes the p-th clustering iteration.
Step 207: from the N clusters of step 206, recalculate the cluster centers:

c_{Zv}^{(p+1)} = (1 / |S_{Zv}^{(p)}|) Σ_{z_k ∈ S_{Zv}^{(p)}} z_k    (15)

where c_{Zv}^{(p+1)} is the center of the v-th cluster at the (p+1)-th clustering iteration, S_{Zv}^{(p)} is the v-th cluster at the p-th clustering iteration, and |S_{Zv}^{(p)}| is the number of samples in the v-th cluster at the p-th clustering iteration.
Step 208: repeat steps 206 and 207, continually updating the cluster memberships and cluster centers, until the distance between the new and old center of every cluster is small enough, i.e., the centers of two consecutive clustering iterations satisfy ||c_{Zv}^{(p+1)} - c_{Zv}^{(p)}|| < ε for every cluster, with ε a preset convergence threshold. At that point the partition no longer changes, the iteration ends, and the N clusters of the final clustering are obtained, each cluster representing one subcategory data set.
Each category data set in the skin lesion data set is decomposed by clustering according to the steps above, and the corresponding first sample image blocks are extracted and stored, one folder per cluster. The folder of each subcategory data set is named in the form "class-name_N", indicating the N subclasses of the original category; accordingly, each class label of the new data set formed from the subcategory data sets is the name of its folder, and the new data set contains s subcategory data sets after decomposition.
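Steps 201 to 208 amount to k-means++-style seeding (with all seeding distances measured to the first center, as in step 203) followed by iterative reassignment and mean updates. The NumPy sketch below condenses them, flattening each grayscale block into a vector; the squared-difference distance of formula (13) drives the seeding, the L1 ("Hamming") distance of step 206 drives assignment, and the stopping threshold eps is an assumed stand-in for the patent's unstated ε.

```python
import numpy as np

def decompose_category(gray_blocks, n_clusters, eps=1e-3, rng=None):
    """Split one category's grayscale blocks, shape (K, n, n), into n_clusters subclasses."""
    rng = rng if rng is not None else np.random.default_rng()
    K = gray_blocks.shape[0]
    z = gray_blocks.reshape(K, -1).astype(np.float64)        # flatten each image block

    # Steps 202-205: pick the first center at random, then seed the rest by roulette,
    # always measuring distances to the first center as in step 203.
    centers = [z[rng.integers(K)]]
    while len(centers) < n_clusters:
        d2 = ((z - centers[0]) ** 2).sum(axis=1)             # D(z_k)^2, formula (13)
        cum = np.cumsum(d2 / d2.sum())                       # cumulative P(z_k), formula (14)
        idx = min(np.searchsorted(cum, rng.random()), K - 1) # roulette selection, step 204
        centers.append(z[idx])
    c = np.stack(centers)

    # Steps 206-208: assign by L1 distance, recompute means, repeat until centers settle.
    while True:
        d = np.abs(z[:, None, :] - c[None, :, :]).sum(axis=2)   # ||z_k - c_Zv||_1
        labels = d.argmin(axis=1)
        new_c = np.stack([z[labels == v].mean(axis=0) if np.any(labels == v) else c[v]
                          for v in range(n_clusters)])          # formula (15)
        if np.abs(new_c - c).max() < eps:                       # stopping test (assumed form)
            return labels
        c = new_c

labels = decompose_category(np.random.rand(100, 32, 32), n_clusters=4)
```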
Step 3: perform feature extraction and global average pooling on each first sample image block in each subcategory data set with the feature extraction structure of the upper branch network to obtain the first feature vector F_1; the specific operation is the same as step B. F_1 has dimension 1 × 1 × D, where D is the number of kernels in the last convolution layer of the feature extraction structure.
The auxiliary output layer of the upper branch network receives F_1 and outputs the first lesion category prediction probability P_1 through the softmax activation function:

P_1 = softmax(U_2 F_1 + b_2)    (16)

where U_2 is the parameter matrix of the auxiliary output layer, F_1 is the first feature vector, b_2 is the bias term of the auxiliary output layer, and the dimension of P_1 is 1 × s.
Step 4: locate the lesion area of each first sample image block from the first feature vector with the lesion localization structure of the upper branch network; the specific operation is consistent with step C.
Step 5: crop each first sample image block according to the located lesion area to obtain the second sample image block; the specific operation is consistent with step D.
Step 6: perform feature extraction and global average pooling on each second sample image block with the feature extraction structure of the lower branch network to obtain the second feature vector F_2; the specific operation is the same as step E.
The auxiliary output layer receives the second feature vector F_2 and calculates the second lesion category prediction probability P_2 with the softmax activation function; the dimension of P_2 is 1 × s.
Step 7: fuse the first feature vector and the second feature vector with the feature fusion structure to obtain the fusion vector; specifically, F_1 and F_2 are spliced in concat fashion, giving the fusion vector F = [F_1; F_2].
Step 8: process the fusion vector with the softmax activation function in the output layer to obtain the skin lesion category prediction probability P_3 corresponding to each sample image; the dimension of P_3 is 1 × s. Compare the prediction probability values in P_3 for the different categories and select the category corresponding to the maximum value as the classification result for the sample image.
Step 9: based on the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability, perform parameter training on the neural network model with the ranking loss function and the weighted loss function, obtaining the trained neural network model through iterative convergence. The specific steps are as follows:
Step 901: according to the true skin lesion category of each sample image in the skin lesion data set, obtain the first probability p_1, second probability p_2, and third probability p_3 of the sample image from the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability.
P_1, P_2, and P_3 all have dimension 1 × s and contain the prediction probability values of the s subclasses. For each of P_1, P_2, and P_3, the prediction probabilities of the subclasses belonging to the same category are added to obtain the prediction probabilities of the categories; then, based on the correct label of the sample image (the true skin lesion category of the image as judged by an expert), the prediction probability of the true skin lesion category is found in P_1, P_2, and P_3 and recorded as the first probability p_1, second probability p_2, and third probability p_3 corresponding to the correct label.
Step 902, fix the network parameters of the lesion localization structure according to the first probability p1Second probability p2And a third probability p3And optimizing other network parameters in the neural network model by using a weighted loss function, wherein the expression of the weighted loss function is as follows:
Figure BDA0003203683630000191
where LF denotes the weighted loss function and H is the number of sample images in the skin lesion dataset, ρZTo representThe number of the first sample image blocks in the Z-th category data set, gamma is a manually set hyper-parameter, and Z is 1, 2, …, d, d is the number of category data sets in the skin lesion data set;
Step 903: fix the other network parameters of the neural network model and optimize the network parameters of the lesion localization structure with the ranking loss function L_rank(p_1, p_2):

L_rank(p_1, p_2) = max(0, p_1 - p_2 + margin)    (18)

where margin is a small preset constant close to 0.
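In code, the alternating scheme of steps 902 and 903 might be sketched as follows in PyTorch. The ranking loss is formula (18) exactly; since the weighted loss LF of formula (17) survives only as an image in the source, the class-weighted negative log-likelihood below, with per-sample weight (1/ρ_Z)^γ on the true-class probabilities p_1, p_2, p_3, is an assumed stand-in, and the margin value, the commented training loop, and the names (loader, true_class_probs) are illustrative.

```python
import torch

def ranking_loss(p1, p2, margin=0.05):
    # L_rank(p1, p2) = max(0, p1 - p2 + margin), formula (18):
    # drives the zoomed-in branch to be more confident than the full view.
    # The margin value here is chosen arbitrarily.
    return torch.clamp(p1 - p2 + margin, min=0).mean()

def weighted_loss(p1, p2, p3, rho, gamma=0.5):
    # Assumed stand-in for LF, formula (17): negative log-likelihood on the
    # true-class probabilities, down-weighting samples from large classes rho_Z.
    w = (1.0 / rho) ** gamma
    return (w * -(torch.log(p1) + torch.log(p2) + torch.log(p3))).mean()

# Alternating optimization (steps 902-903), with `model` as in the earlier skeleton:
#   opt_rest = torch.optim.SGD(rest_params, lr=1e-3)   # everything but the localizer
#   opt_loc  = torch.optim.SGD(model.localizer.parameters(), lr=1e-3)
#   for images, rho in loader:                         # rho: size of each sample's true class
#       p1, p2, p3 = true_class_probs(model, images)   # probabilities of the correct label
#       opt_rest.zero_grad(); weighted_loss(p1, p2, p3, rho).backward(); opt_rest.step()
#       p1, p2, p3 = true_class_probs(model, images)
#       opt_loc.zero_grad(); ranking_loss(p1, p2).backward(); opt_loc.step()

p1 = torch.tensor([0.30, 0.60]); p2 = torch.tensor([0.50, 0.70]); p3 = torch.tensor([0.55, 0.80])
rho = torch.tensor([120.0, 80.0])
print(ranking_loss(p1, p2).item(), weighted_loss(p1, p2, p3, rho).item())
```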
In the method, a deep neural network is applied to the field of skin lesion image classification, and the skin lesion category is judged quickly and accurately by the trained neural network model. In the concrete implementation, the method enlarges the skin lesion region through center cropping and class decomposition, unifies the sizes of the skin lesion images, and draws out the fine-grained information hidden within each category of image. The feature extraction structure composed of attention residual learning blocks and the attention-based lesion localization structure keep the network highly concentrated on the skin lesion region during feature extraction, greatly reducing the influence of an undersized lesion region on the network's predictions; the weighted loss function counters the data imbalance that class decomposition may introduce, greatly improving the sensitivity and specificity of the skin lesion image classification method.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (9)

1. A skin lesion image classification method is characterized by comprising the following steps:
performing center cropping on a skin lesion image to be classified to obtain a first image block;
extracting features of the first image block with the upper branch network of a trained neural network model to obtain a first feature vector, the neural network model being based on a multi-scale double-layer attention mechanism;
locating the lesion area from the first feature vector with the lesion localization structure in the upper branch network;
cropping the first image block according to the located lesion area to obtain a second image block;
extracting features of the second image block with the lower branch network of the trained neural network model to obtain a second feature vector;
fusing the first feature vector and the second feature vector with the feature fusion structure in the lower branch network to obtain a fusion vector;
processing the fusion vector with the softmax activation function in the output layer of the lower branch network to obtain the skin lesion category prediction probability;
and classifying the skin lesion image to be classified according to the skin lesion category prediction probability.
2. The method for classifying skin lesion images according to claim 1, wherein the upper branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a lesion localization structure, and a cropping-and-scaling structure, and the lower branch network of the neural network model comprises a feature extraction structure, an auxiliary output layer, a feature fusion structure, and an output layer, wherein the feature extraction structure comprises a convolution layer and several attention residual learning (ARL) blocks, and the lesion localization structure comprises a hidden layer and an output layer.
3. The method for classifying skin lesion images according to claim 1 or 2, wherein the method for obtaining the first feature vector comprises:
inputting the first image block into the convolution layer of the feature extraction structure of the upper branch network and applying the ReLU nonlinear activation function to obtain an intermediate vector X_1;
processing the intermediate vector X_1 with several attention residual learning (ARL) blocks (convolution, normalization, and downsampling) to obtain an output vector y;
processing the output vector y with the global average pooling layer to obtain the first feature vector F_1 corresponding to the first image block.
4. The method for classifying skin lesion images according to claim 1, wherein the method for locating the lesion area from the first feature vector with the lesion localization structure in the upper branch network comprises:
inputting the first feature vector F_1 into the hidden layer of the lesion localization structure and obtaining the hidden-layer state g with the ReLU nonlinear activation function:

g = ReLU(U_3 F_1 + b_3)

where U_3 is the parameter matrix of the hidden layer and b_3 is the bias term of the hidden layer;
according to the hidden-layer state g, obtaining the coordinates of the lesion area of the first image block with the sigmoid nonlinear activation function in the output layer of the lesion localization structure, the coordinates of the lesion area being:

[t_x, t_y, t_l] = n * sigmoid(U_4 g + b_4)

where t_x is the abscissa of the center of the lesion area, t_y is the ordinate of the center of the lesion area, t_l is the radius of the lesion area, n is the side length of the first image block, U_4 is the parameter matrix of the output layer, and b_4 is the bias term of the output layer.
5. The method for classifying skin lesion images according to claim 4, wherein the method for obtaining the second image block comprises:
obtaining the vertex coordinates of the cropping region in the first image block from the coordinates of the lesion area: the upper-left corner of the cropping region is (t_{x(tl)}, t_{y(tl)}), the lower-left corner is (t_{x(tl)}, t_{y(br)}), the upper-right corner is (t_{x(br)}, t_{y(tl)}), and the lower-right corner is (t_{x(br)}, t_{y(br)}), where t_{x(tl)} = t_x - t_l, t_{y(tl)} = t_y - t_l, t_{x(br)} = t_x + t_l, t_{y(br)} = t_y + t_l;
cropping the first image block according to the vertex coordinates of the cropping region to obtain the cropped image corresponding to the first image block;
scaling the cropped image according to the side length of the first image block to obtain the second image block:

X^{amp}_{(i,j)} = Σ_h Σ_w |1 - α - {i/λ}| · |1 - β - {j/λ}| · X^{att}_{(h,w)}

where X^{amp}_{(i,j)} is the pixel value in the i-th row and j-th column of the second image block, X^{att}_{(h,w)} is the pixel value in the h-th row and w-th column of the cropped image, α = h - [i/λ], β = w - [j/λ], λ = n/(2t_l) is the amplification factor, [·] is the rounding (integer-part) function, {·} is the fractional-part function, h ∈ [t_{x(tl)}, t_{x(br)}], w ∈ [t_{y(tl)}, t_{y(br)}], and i, j ∈ {1, 2, …, n}.
6. The method for classifying skin lesion images according to claim 1 or 2, wherein the training method of the neural network model comprises the following steps:
obtaining a skin lesion data set comprising a plurality of sample images under a plurality of skin lesion categories;
performing center cropping on each sample image in the skin lesion data set to obtain a first sample image block, and forming the preprocessed skin lesion data set from all the first sample image blocks;
dividing the preprocessed skin lesion data set into a plurality of category data sets according to skin lesion category, and decomposing each category data set into a plurality of subcategory data sets according to image correlation;
performing feature extraction on each first sample image block in each subcategory data set with the feature extraction structure of the upper branch network to obtain a first feature vector, and obtaining a first lesion category prediction probability with the auxiliary output layer;
locating the lesion area of each first sample image block from the first feature vector with the lesion localization structure of the upper branch network;
cropping each first sample image block according to the located lesion area to obtain a second sample image block;
performing feature extraction on each second sample image block with the feature extraction structure of the lower branch network to obtain a second feature vector, and obtaining a second lesion category prediction probability with the auxiliary output layer;
fusing the first feature vector and the second feature vector with the feature fusion structure to obtain a fusion vector;
processing the fusion vector with the softmax activation function in the output layer to obtain the skin lesion category prediction probability;
and performing parameter training on the neural network model with the ranking loss function and the weighted loss function, based on the first lesion category prediction probability, the second lesion category prediction probability, and the skin lesion category prediction probability, obtaining the trained neural network model through iterative convergence.
7. The method of claim 6, wherein if the pre-processed skin lesion data set has d category data sets, Z is 1, 2, …, d, then the method of decomposing the Z-th category data set into a plurality of sub-category data sets according to the image correlation comprises:
(1) carrying out gray processing on each first sample image block in the Z-th category data set to obtain a gray image corresponding to the first sample image block;
(2) randomly selecting a first sample image block from the Z category data set as an initial clustering center cZ1
(3) Calculating each first sample image block to an initial clustering center c in the Z category data set according to the gray level image corresponding to the first sample image blockZ1And calculating the probability of each first sample image block being selected as the next cluster center according to the distance, the calculation formula is as follows:
Figure FDA0003203683620000051
wherein ,zkRepresents the kth first sample image block, P (Z), in the Z category datasetk) Denotes zkProbability of being selected as next cluster center, D (z)k) Denotes zkTo the initial cluster center cZ1K is 1, 2, …, K is the number of the first sample image blocks in the Z-th class data set;
(4) in [0, 1]]Internally randomly generating a random number, and when the random number belongs to the interval
Figure FDA0003203683620000052
Then, the (r + 1) th first sample image block in the Z-th class data set is selected as a second clustering center point cZ2Wherein r is 1, 2, …, K-1;
(5) repeating the step (4) until N clustering centers are selected from the Z category data set: c. CZ1,cZ2,...,cZN
(6) Calculating the Hamming distance from each first sample image block in the Z-th category data set to N clustering centers, and dividing each first sample image block in the Z-th category data set into the N clustering centers according to the principle of closeness to obtain N clusters;
(7) recalculating the clustering centers of the N clusters, the calculation formula being as follows:

$$c_{Zv}^{(p+1)} = \frac{1}{\left| S_v^{(p)} \right|} \sum_{z \in S_v^{(p)}} z$$

wherein $c_{Zv}^{(p+1)}$ represents the clustering center of the v-th cluster at the (p+1)-th iterative clustering, $S_v^{(p)}$ represents the v-th cluster at the p-th iterative clustering, $\left| S_v^{(p)} \right|$ represents the number of samples in the v-th cluster at the p-th iterative clustering, and $v = 1, 2, \ldots, N$;
(8) repeating steps (6) and (7) until, for each cluster, the clustering centers of two consecutive iterative clusterings satisfy

$$c_{Zv}^{(p+1)} = c_{Zv}^{(p)}$$

and acquiring the N final clusters, wherein each cluster represents one subcategory data set.
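For illustration, the decomposition of claim 7 amounts to k-means++-style seeding followed by Hamming-distance clustering. The sketch below binarizes the grayscale blocks so that the Hamming distance and a majority-vote center update are well defined, and draws the N−1 remaining centers in one weighted draw instead of the repeated roulette-wheel selection of steps (4)-(5); both simplifications are assumptions, since the claim only specifies grayscale processing and the mean-style center update.

```python
import numpy as np

def binarize(gray, thresh=128):
    """Binarize a grayscale block; an assumption made so that the Hamming
    distance between blocks is well defined."""
    return (np.asarray(gray) >= thresh).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def cluster_category(blocks, n_clusters, seed=0):
    """Sketch of claim 7: seed N centers with probability proportional to the
    squared Hamming distance to the initial center, then iterate nearest-center
    assignment and center recomputation until assignments stop changing."""
    rng = np.random.default_rng(seed)
    imgs = [binarize(b) for b in blocks]
    # Steps (2)-(5): pick the initial center at random, then draw the other
    # N-1 centers weighted by squared distance to it.
    c0 = rng.integers(len(imgs))
    d2 = np.array([float(hamming(imgs[c0], im)) ** 2 for im in imgs])
    extra = rng.choice(len(imgs), size=n_clusters - 1, replace=False, p=d2 / d2.sum())
    centers = [imgs[c0]] + [imgs[i] for i in extra]
    # Steps (6)-(8): assign each block to the nearest center; recompute each
    # center as the thresholded cluster mean (the claim's mean update, kept
    # binary so the Hamming distance stays defined); stop at a fixed point.
    assign = None
    while True:
        new = np.array([int(np.argmin([hamming(im, c) for c in centers])) for im in imgs])
        if assign is not None and np.array_equal(assign, new):
            return [np.where(new == v)[0] for v in range(n_clusters)]
        assign = new
        for v in range(n_clusters):
            members = [imgs[i] for i in np.where(assign == v)[0]]
            if members:
                centers[v] = (np.mean(members, axis=0) >= 0.5).astype(np.uint8)
```

Each returned index array corresponds to one subcategory data set of the Z-th category data set.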
8. The skin lesion image classification method according to claim 6, wherein the method for performing parameter training on the neural network model by using the permutation loss function and the weighted loss function comprises:
extracting, according to the real skin lesion category of each sample image in the skin lesion data set, a first probability $p_1$, a second probability $p_2$ and a third probability $p_3$ of that category from the first lesion category prediction probability, the second lesion category prediction probability and the skin lesion category prediction probability, respectively;
fixing the network parameters of the lesion positioning structure, and optimizing the other network parameters in the neural network model by using the weighted loss function based on the first probability $p_1$, the second probability $p_2$ and the third probability $p_3$, wherein the expression of the weighted loss function is as follows:
$$LF = -\frac{1}{H} \sum_{i=1}^{H} \left( \frac{1}{\rho_{Z_i}} \right)^{\gamma} \left( \log p_1^{(i)} + \log p_2^{(i)} + \log p_3^{(i)} \right)$$

wherein $LF$ denotes the weighted loss function, $H$ is the number of sample images in the skin lesion data set, $\rho_Z$ represents the number of first sample image blocks in the Z-th category data set, $\gamma$ is a manually set hyper-parameter, $Z = 1, 2, \ldots, d$, and $d$ is the number of category data sets in the skin lesion data set;
fixing the other network parameters in the neural network model, and optimizing the network parameters of the lesion positioning structure by using the permutation loss function $L_{rank}(p_1, p_2)$, whose expression is as follows:

$$L_{rank}(p_1, p_2) = \max(0,\ p_1 - p_2 + margin)$$

wherein $margin$ is a preset small positive constant.
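A compact sketch of the alternating optimization in claim 8 follows. The exact form of the weighted loss is not fully recoverable from the text above; the $(1/\rho_Z)^{\gamma}$ class-frequency weight below merely matches the stated roles of $\rho_Z$ and $\gamma$ and should be read as an assumption. The permutation loss is taken verbatim from the claim.

```python
import torch

def weighted_loss(p1, p2, p3, labels, rho, gamma=2.0):
    """Weighted loss LF (assumed form): cross-entropy over the three
    predicted probabilities of the true class, down-weighting categories
    with many first sample image blocks via (1 / rho_Z) ** gamma."""
    idx = torch.arange(labels.numel())
    true_probs = p1[idx, labels] * p2[idx, labels] * p3[idx, labels]
    weights = (1.0 / rho[labels]) ** gamma
    return -(weights * torch.log(true_probs + 1e-12)).mean()

def permutation_loss(p1_true, p2_true, margin=0.05):
    """L_rank(p1, p2) = max(0, p1 - p2 + margin), applied to the true-class
    probabilities: it is zero only when the cropped lesion block is predicted
    at least `margin` more confidently than the full block, which pushes the
    positioning structure toward the true lesion area."""
    return torch.clamp(p1_true - p2_true + margin, min=0).mean()

# Alternating scheme of claim 8 (sketch): one optimizer covers every
# parameter except the positioning structure and minimizes the weighted
# loss; a second optimizer covers only the positioning structure and
# minimizes the permutation loss.
```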
9. The skin lesion image classification method according to claim 6 or 8, wherein the first lesion category prediction probability is calculated as follows:
$$P_1 = \mathrm{softmax}(U_2 F_1 + b_2)$$

wherein $U_2$ is the parameter matrix of the auxiliary output layer, $F_1$ represents the first feature vector, and $b_2$ is the bias term of the auxiliary output layer.
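A numerical illustration of claim 9 (the sizes are arbitrary assumptions): the auxiliary output layer is a single affine map over the first feature vector followed by softmax.

```python
import torch

U2 = torch.randn(7, 128) * 0.01   # parameter matrix of the auxiliary output layer (7 classes, 128-d F1; illustrative)
b2 = torch.zeros(7)               # bias term of the auxiliary output layer
F1 = torch.randn(128)             # first feature vector from the upper branch
P1 = torch.softmax(U2 @ F1 + b2, dim=0)
print(P1.sum())                   # ≈ 1: a valid probability distribution over lesion categories
```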
CN202110911205.5A 2021-08-10 2021-08-10 Skin lesion image classification method Active CN113705630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110911205.5A CN113705630B (en) 2021-08-10 2021-08-10 Skin lesion image classification method

Publications (2)

Publication Number Publication Date
CN113705630A true CN113705630A (en) 2021-11-26
CN113705630B CN113705630B (en) 2023-10-13

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010133938A1 (en) * 2009-05-22 2010-11-25 Nokia Corporation Method and apparatus for performing feature extraction using local primitive code
WO2021082480A1 (en) * 2019-10-29 2021-05-06 华为技术有限公司 Image classification method and related device
CN111178432A (en) * 2019-12-30 2020-05-19 武汉科技大学 Weak supervision fine-grained image classification method of multi-branch neural network model
CN111444960A (en) * 2020-03-26 2020-07-24 上海交通大学 Skin disease image classification system based on multi-mode data input

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116050503A (en) * 2023-02-15 2023-05-02 哈尔滨工业大学 Generalized neural network forward training method
CN116050503B (en) * 2023-02-15 2023-11-10 哈尔滨工业大学 Generalized neural network forward training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant