CN117292217A - Skin typing data augmentation method and system based on a generative adversarial network - Google Patents

Skin typing data augmentation method and system based on a generative adversarial network

Info

Publication number
CN117292217A
CN117292217A
Authority
CN
China
Prior art keywords
image
skin
representing
pixel
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310669194.3A
Other languages
Chinese (zh)
Inventor
华薇
舒晓红
熊丽丹
唐洁
李利
王曦
李朝霞
霍维
邹琳
汤莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
West China Hospital of Sichuan University
Original Assignee
West China Hospital of Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by West China Hospital of Sichuan University filed Critical West China Hospital of Sichuan University
Priority to CN202310669194.3A priority Critical patent/CN117292217A/en
Publication of CN117292217A publication Critical patent/CN117292217A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 Computing arrangements based on biological models
                    • G06N 3/02 Neural networks
                        • G06N 3/04 Architecture, e.g. interconnection topology
                            • G06N 3/0464 Convolutional networks [CNN, ConvNet]
                            • G06N 3/0475 Generative networks
                            • G06N 3/048 Activation functions
                        • G06N 3/08 Learning methods
                            • G06N 3/094 Adversarial learning
            • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
                • G06V 10/00 Arrangements for image or video recognition or understanding
                    • G06V 10/40 Extraction of image or video features
                        • G06V 10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
                        • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
                        • G06V 10/52 Scale-space analysis, e.g. wavelet analysis
                        • G06V 10/54 Extraction of image or video features relating to texture
                        • G06V 10/56 Extraction of image or video features relating to colour
                    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
                        • G06V 10/764 Using classification, e.g. of video objects
                        • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
                            • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
                            • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
                                • G06V 10/806 Fusion of extracted features
                        • G06V 10/82 Using neural networks
                • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
                    • G06V 2201/03 Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a skin typing data augmentation method and system based on a generative adversarial network (GAN), used to accurately classify skin pathology types. Because only a small number of skin pathology images can be acquired, training a classifier model directly on them leads to overfitting and unsatisfactory classification performance, so a GAN is used to generate realistic skin pathology images to alleviate this shortage. After the skin pathology dataset is segmented, data augmentation is performed with the GAN, and skin pathology typing is realized with a multi-level feature fusion technique. The method mainly comprises the following steps: first, skin pathology image data are collected, segmented, and manually labeled where labels are missing; then, a GAN is constructed to generate high-quality skin pathology images that expand the available dataset; finally, a multi-branch feature extraction network extracts and models features of the target image, and a support vector machine performs accurate skin typing.

Description

Skin typing data augmentation method and system based on a generative adversarial network
Technical Field
The invention relates to the technical field of data analysis and medical treatment, in particular to a method and system for skin typing data augmentation based on a generative adversarial network.
Background
Skin pathology typing is one of the important research topics in clinical dermatology. Classifying and describing skin lesions provides important references for their diagnosis and treatment. The skin pathology type can be assessed by observing multicolored patches and surface ulcers on the skin, allowing a preliminary judgment. In general, four pathological skin types can be distinguished by image recognition technology: squamous cell skin cancer, basal cell skin cancer, malignant melanoma, and Merkel cell skin cancer.
Currently, skin typing mainly relies on visual inspection. However, because skin types are specific and diverse, this traditional approach exposes many shortcomings, and the diagnoses of different doctors also vary; this subjectivity and uncertainty make manual skin typing unreliable and inaccurate.
To address these problems, more and more researchers have begun exploring computer vision techniques for skin typing. Unlike visual inspection of skin images, computer vision techniques can automatically extract features from large numbers of skin images and classify and predict them with machine learning models. This reduces the adverse effects of subjectivity and improves the accuracy and reliability of skin typing. However, in the field of skin typing, the lack of large-scale labeled datasets significantly limits the application of such techniques. In deep learning, the larger the amount of training data, the better the trained model generally performs. Therefore, in the absence of large-scale datasets, improving the accuracy and reliability of skin typing through data augmentation techniques is a current challenge.
Data augmentation techniques based on generative adversarial networks (Generative Adversarial Networks, GAN) can improve the typing performance and generalization capability of a model by generating new synthetic data samples. A GAN consists of two parts: a generator, which produces new synthetic samples by learning the distribution of the training dataset, and a discriminator, which judges whether a generated sample is authentic. By training the generator and discriminator, a GAN can produce high-quality synthetic samples, enlarging the original dataset and improving the robustness and generalization capability of the model.
Therefore, realizing skin typing with a GAN-based data augmentation technique has very important application value. Generating more synthetic data with a GAN enlarges the dataset without increasing data acquisition costs, improving the accuracy and reliability of skin typing. Analyzing and processing the skin images with computer vision techniques enables efficient classification of skin pathology types and provides important references for clinical diagnosis and treatment of skin pathologies.
Disclosure of Invention
Accordingly, one purpose of the present invention is to provide a skin typing data augmentation method based on a generative adversarial network. The method uses a GAN model to augment skin typing data and can provide diverse samples when the amount of original image data is small or the skin images are complex, assisting medical judgment and improving the efficiency of skin diagnosis.
One of the purposes of the invention is realized by the following technical scheme:
the method for enhancing skin typing data based on the antagonism generation network comprises the following steps:
step S1: collecting and preprocessing original skin image data;
step S2: constructing a generative adversarial network model to generate high-quality synthetic skin pathology image samples;
step S3: segmenting the generated skin pathology images and synthesizing them with healthy skin images to realize data augmentation;
step S4: training the generative adversarial network model to expand the scale of the original skin pathology image dataset;
step S5: constructing a multi-branch feature extraction network to extract multi-level structural features of the skin pathology images and realize skin typing.
Further, the method for acquiring and preprocessing the original skin image data specifically comprises the following steps:
step S101: obtaining skin images of multiple types from channels such as medical image databases, online skin image databases and large medical research institutions, wherein the main information of the skin images comprises: skin lesion appearance, skin lesion shape, skin lesion size, skin lesion type and skin lesion classification;
step S102: mapping the skin image to an undirected graph: each pixel is regarded as a node, adjacent pixel nodes are connected to form edges, and the weight of each edge represents the texture difference between the pixels; a gray level co-occurrence matrix P is constructed to calculate the texture difference between different pixels, as follows:

In the above formula, i denotes the current pixel, j an adjacent pixel of pixel i, (x, y) the coordinates of the current pixel i, αx and βy the coordinate offsets between the two pixels, θ the frequency with which the pixel pair (i, j) occurs in the image, W the width of the image, H the length of the image, and G_value(·) a gray-scale calculation function that returns the gray value at coordinates (x, y) in the image;
The sum of squared gray-value differences between pixels is then calculated with a texture feature function to reflect the degree of texture difference between pixels:

D(i,j) = Σ_{i,j} [G_value(x,y) − G_value(x+αx, y+βy)]² × I(G_value(x,y)=i) × I(G_value(x+αx, y+βy)=j)

In the above formula, D(·) is the difference degree calculation function and I(·) is an indicator function whose value is 1 when its argument is true and 0 otherwise;
according to a pixel texture difference measurement formula, calculating the weight of each edge in the undirected graph in the following manner:
step S103: two additional node sets, a source node set S and a sink node set T, are added to the constructed undirected graph and connected by edges to every vertex in the graph; the minimum cost of partitioning the graph nodes into the two sets is then calculated as follows:

Cost = min Σ_{(u,v)∈I, s∈S, t∈T} c(u,v)

In the above formula, min denotes the minimum value, (u, v) denotes an edge of the undirected graph, c(·) is a path flow calculation function, I denotes the skin image, S the source node set and T the sink node set; augmenting paths between the source and sink nodes are searched, and the weight of each edge on a path is added to the flow of that augmenting path; when no further augmenting path can be found, the flow of the current path has reached its maximum, and the image is segmented along the maximum-flow path to retain the pathological area in the image;
step S104: the image is completed into distorted image blocks of fixed size 512×512 by linear interpolation, compensating for the non-uniform specifications and sizes of the segmented pictures; the completion formula is:

Pixel'(x_m, y_m) = (1−U)*(1−V)*Pixel(x_m, y_m) + U*(1−V)*Pixel(x_m, y_m) + (1−U)*V*Pixel(x_m, y_m)

Pixel(x_m, y_m) = w_1*R + w_2*G + w_3*B

In the above formulas, (x_m, y_m) are the coordinates of the missing pixel, (x_0, y_0) is the coordinate origin, (x_r, y_r) are the coordinates of an intact pixel randomly selected near the missing pixel, U is the lateral offset of the missing pixel, V its longitudinal offset, Pixel'(·) the image gray-level estimation function, and Pixel(·) the image gray-level calculation function; the gray value of the target pixel is computed as a weighted average, where w_1, w_2 and w_3 are the weights of the red, green and blue components; note that the weights of the three color components can be adjusted to actual conditions and must sum to 1;
step S105: the completed image is divided into blocks by traversing it with a sliding window and cropping it into images G_path of fixed size 256×256, and corresponding labels are added to the cropped images: if the original image contains label information, labels are added according to that information; if label information is missing or omitted, specific labels are added by manual identification.
Further, constructing the generative adversarial network model to generate high-quality synthetic skin pathology image samples specifically comprises:
step S201: a generator network with a symmetric structure is constructed to learn the latent features between input images and their corresponding reference images. The generator consists of 3 convolution layers: the first contains 32 convolution kernels of size 7×7 with stride 1, the second 64 kernels of size 5×5 with stride 1, and the third 128 kernels of size 3×3 with stride 1. Next, 3 residual network units are set, each consisting of one convolution layer and one ReLU activation function; the convolution layer in each residual unit contains 128 kernels of size 3×3 with stride 1. Finally, 3 transposed convolution layers perform feature upsampling and image recovery: the first contains 128 kernels of size 3×3 with stride 1, the second 64 kernels of size 5×5 with stride 1, and the third 32 kernels of size 7×7 with stride 1. The output features of the last transposed convolution layer are fused with the input image to obtain the final output image G_gen. The convolution operation is expressed as follows:

In the above formula, the left-hand side is the feature map output by the current channel, K is the size of the convolution kernel, W the width of the image, H the height of the image, the right-hand feature map is the one input to the current channel, x and y are the coordinates of the output feature map in channel n, and w and h are the element coordinates of the convolution kernel weight matrix in channel n;
step S202: a discriminator network is constructed to extract different features from the input image and distinguish different types of skin. All convolution layers in the discriminator use convolution kernels of size 3×3, a batch normalization layer is added after each convolution layer, and the numbers of convolution kernels are 32, 64, 128, 256 and 1 in order; the first four convolution layers use the ReLU activation function and the last uses the Sigmoid activation function, outputting the probability that the corresponding input image is generated.
Further, the method for segmenting the generated skin pathology image and synthesizing the skin pathology image with the healthy skin image to realize data augmentation specifically comprises the following steps:
step S301: the image G_gen is traversed to obtain the gray value of each pixel, and the number of pixels at each gray value is counted to construct the gray histogram of image G_gen, expressed as follows:

k(l) = f(l) × m⁻¹

In the above formula, k(l) is the frequency of occurrence of pixels with gray level l in the image, f(l) the number of pixels with gray level l in the image, and m the number of pixels in the image;
step S302: the between-class variance corresponding to different gray thresholds is calculated and a threshold δ is found that maximizes the difference between the lesion texture and normal skin; according to the threshold δ, the image G_gen is converted into a binarized image G_bin, calculated as follows:

δ = max{δ_1, δ_2, …, δ_i}, i = h(l)

In the above formulas, δ_i is the threshold used to construct the binarized image from the i-th gray level, and the between-class variance of the image at threshold δ_i is evaluated from the number of pixels with gray level 0 and the number of pixels with gray level 255 under that threshold; f(l) is the number of pixels with gray level l in the image, m the number of pixels in the image, h(l) the number of gray levels in the image, max{·} the maximum operator, and δ the gray threshold corresponding to the maximum between-class variance;

step S303: in the image G_gen produced by the generator, the skin part and the pathological part differ clearly in pixel values; the binarized image G_bin is therefore used to segment G_gen into a finer pathology image G_acc, which is then input to the discriminator;
step S304: the pathology image generated by the generator network is synthesized with healthy skin; because the pixel difference between the edge region of the pathology image and the healthy skin is obvious, the edges of the pathology image are smoothed by mean filtering, using the following image-edge mean-filter smoothing formula:

G_i' = λ×G_i + (1−λ)×G_j

In the above formula, G_i' is the filtered pixel value, G_i the pixel value to be filtered, G_j the pixel value of a pixel adjacent to G_i, and λ the weight controlling the image filtering;
step S305: the synthesized image is processed at multiple scales by bilinear interpolation and an image pyramid structure is constructed, which ensures that the synthesized image retains its main pathological features; the pixel position coordinates and pixel values are calculated as follows:

In the above formula, l_i denotes the coordinates of the target point to be calculated in the image, x_src and y_src the abscissa and ordinate in the scaled image, x_dst and y_dst the abscissa and ordinate in the image before scaling, W_src the width of the scaled image, W_dst the width of the image before scaling, H_src the height of the scaled image, H_dst the height of the image before scaling, G_i the pixel value of the target point to be calculated, G_j the pixel value of a pixel adjacent to the target point, and l_j the coordinates of that adjacent pixel in the image.
Further, training the generative adversarial network model to expand the scale of the original skin pathology image dataset specifically comprises:
step S401: a residual loss function measures the dissimilarity between the generated image G_acc and the original image G_path; in the ideal case, when their pixel blocks are identical, the residual loss value is 0. The residual loss function is formulated as follows:

In the above formula, Loss_gen(·) is the residual loss function, P_{G∈Data}(G_path) the probability that the input image belongs to the real image data, s the number of images, and log(·) the logarithmic function;
step S402: a feature loss function is used to measure the feature difference between the generated image G_acc and the original image G_path; the feature loss function is formulated as follows:

Loss_dis(G_acc, G_path) = −[log(P_{G∈Data}(G_path)) + log(1 − P_{G∈Gen}(G_acc))]

In the above formula, Loss_dis(·) is the feature loss function and P_{G∈Gen}(G_acc) the probability that the input image is a sample created by the generator;
step S403: the two loss functions are combined by weighted splicing, the target skin image is mapped to a latent space, and the parameters of the generator and discriminator networks are updated with an Adam optimizer, which helps the two networks reach convergence faster during alternating training and improves the stability and reliability of the model until convergence is finally achieved. The loss splicing formula is:

Loss(G_acc, G_path) = α_1×Loss_gen(G_acc, G_path) + β_1×Loss_dis(G_acc, G_path)

In the above formula, α_1 is the parameter controlling the weight of the generator loss and β_1 the parameter controlling the weight of the discriminator loss, with α_1 + β_1 = 1.
Further, constructing a multi-branch feature extraction network to extract multi-level structural features of skin pathology images and realize skin typing, and specifically comprises the following steps:
step S501: a multi-branch feature extraction network containing 4 residual blocks is constructed, where each residual block comprises a dilated convolution layer, a batch normalization layer and an adaptive average pooling layer; the dilated convolution is calculated as follows:

In the above formula, q_i is the feature tensor of the image output by the i-th residual block, q_{i−1} the feature tensor of the image output by the (i−1)-th residual block, z the size of the convolution kernel in the dilated convolution layer, p the padding parameter, b the stride, and η the dilation rate of the dilated convolution;
step S502: the image features of different levels are fused to avoid losing important features of the skin pathology image and to improve the accuracy of skin typing; feature fusion is computed as follows:

In the above formula, Q is the feature tensor output after feature fusion, q_i the feature tensor output by the i-th residual block, and σ_i the feature weight corresponding to the i-th residual block;
step S503: the fused feature tensor Q is normalized according to the min-max principle, mapping the element values of the feature tensor into the interval [0,1]:

Q' = (Q − min(Q)) / (max(Q) − min(Q))

In the above formula, Q' is the normalized feature tensor, min(Q) the minimum of all elements in tensor Q, and max(Q) the maximum of all elements in tensor Q;
step S504: a kernel function is selected to train a support vector machine model, which classifies the feature tensor Q' to obtain the class label of the target skin image and complete the skin typing.
It is a second object of the present invention to provide a system for skin typing data augmentation based on a generative adversarial network, comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, said processor implementing the method described above when executing said computer program.
It is a further object of the invention to provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described above.
The beneficial effects of the invention are as follows:
(1) The method of the invention augments skin typing data with a generative adversarial network model; it can provide diverse samples when the amount of original image data is small or the skin images are complex, greatly reduces overfitting of the prediction model without affecting data quality, and provides effective clinical decision support for medical staff diagnosing the skin condition of patients;
(2) When segmenting the generated pathology images, the invention uses an image binarization method to cope with the large volume of generated images; binarizing the image pixels separates the skin pathology region from the healthy region accurately and intuitively, making it easier to extract the salient features of the pathology region. In addition, the binarization method has low time complexity, which helps reduce the time cost of training the generative adversarial network and ensures the timeliness of the skin typing results;
(3) When synthesizing pathology images, the invention uses mean filtering to solve the problem of unsmooth images; by averaging the neighborhood pixels around each pixel, noise pixels are effectively reduced and the whole image appears smoother, which helps the generative adversarial network produce more realistic sample images and ensures the accuracy of the subsequent skin typing;
(4) When training the generative adversarial network, the invention adds extra synthetic samples by combining the pathological skin images produced by the generator with healthy skin images; this increases the diversity of the training data, provides more sample instances for specific types of skin pathology datasets, improves the generalization ability of the network, and improves its ability to generate the corresponding skin pathology types;
(5) The invention uses a multi-branch network structure when extracting features at different levels of the image; each branch focuses on a different receptive field and can effectively capture local and global feature information in the image, and adjustable feature weights are used during feature fusion, which improves the expressive power of the image representation better than fixed weights and effectively improves the accuracy of the skin typing task.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the skin typing data augmentation method and system based on a generative adversarial network according to the present invention;
FIG. 2 is a schematic diagram of raw data of an image of skin according to an embodiment of the present invention;
FIG. 3 is a schematic view of maximum flow path image segmentation in accordance with the present invention;
FIG. 4 is a schematic diagram of a network of building generators embodying the present invention;
FIG. 5 is a schematic diagram of a network of construction discriminators embodying the present invention;
FIG. 6 is a schematic diagram of a binary image constructed in accordance with an embodiment of the present invention;
FIG. 7 is a diagram illustrating a binary image segmentation according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a smoothed composite image with mean filtering in accordance with the present invention;
FIG. 9 is a schematic diagram of scaling image features using an image pyramid strategy according to the present invention.
Detailed Description
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. It should be understood that the preferred embodiments are presented by way of illustration only and not by way of limitation.
The invention discloses a skin typing data augmentation method and system based on a generative adversarial network which, as shown in FIG. 1, comprises the following steps:
step S1: collecting and preprocessing original skin image data;
step S2: constructing a generative adversarial network model to generate high-quality synthetic skin pathology image samples;
step S3: segmenting the generated skin pathology images and synthesizing them with healthy skin images to realize data augmentation;
step S4: training the generative adversarial network model to expand the scale of the original skin pathology image dataset;
step S5: constructing a multi-branch feature extraction network to extract multi-level structural features of the skin pathology images and realize skin typing.
The specific steps of the above method will be further elucidated by means of a specific embodiment.
In this embodiment, the step S1 specifically includes the following steps:
step S101: collecting skin images of various types from channels such as medical image databases, online skin image databases and large medical research institutions, storing them in a MongoDB database and establishing a skin image database, wherein the main information of the skin images comprises: skin lesion appearance, skin lesion shape, skin lesion size, skin lesion type and skin lesion classification, as shown in fig. 2; the data acquisition channels include, but are not limited to, medical image databases, online skin image databases, large medical research institutions, etc.;
step S102: mapping the skin image to an undirected graph: each pixel is regarded as a node, adjacent pixel nodes are connected to form edges, and the weight of each edge represents the texture difference between the pixels; a gray level co-occurrence matrix P is constructed to calculate the texture difference between different pixels, as follows:

In the above formula, i denotes the current pixel, j an adjacent pixel of pixel i, (x, y) the coordinates of the current pixel i, αx and βy the coordinate offsets between the two pixels, θ the frequency with which the pixel pair (i, j) occurs in the image, W the width of the image, H the length of the image, and G_value(·) a gray-scale calculation function that returns the gray value at coordinates (x, y) in the image;
The sum of squared gray-value differences between pixels is then calculated with a texture feature function to reflect the degree of texture difference between pixels:

D(i,j) = Σ_{i,j} [G_value(x,y) − G_value(x+αx, y+βy)]² × I(G_value(x,y)=i) × I(G_value(x+αx, y+βy)=j)

In the above formula, D(·) is the difference degree calculation function and I(·) is an indicator function whose value is 1 when its argument is true and 0 otherwise;
according to a pixel texture difference measurement formula, calculating the weight of each edge in the undirected graph in the following manner:
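The co-occurrence-matrix and edge-weight formulas of step S102 appear only as images in the original publication and are not reproduced here. Purely as an illustrative Python sketch of the computation described above (not the patented formula), the fragment below counts co-occurring gray-level pairs for one assumed offset (αx, βy) and accumulates the squared gray-value difference used as the texture-difference measure; how the edge weight is finally derived from this measure is left open.

import numpy as np

def cooccurrence_and_difference(gray, dx=1, dy=0, levels=256):
    # P[i, j]: how often gray level i occurs with gray level j at offset (dx, dy).
    # D[i, j]: accumulated squared gray-value difference for that pair (step S102).
    h, w = gray.shape
    P = np.zeros((levels, levels), dtype=np.int64)
    D = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            i = int(gray[y, x])               # gray value of the current pixel
            j = int(gray[y + dy, x + dx])     # gray value of the offset neighbour
            P[i, j] += 1
            D[i, j] += float(i - j) ** 2
    return P, D

Edge weights between neighbouring pixel nodes would then be taken as some monotone function of this texture-difference measure before running the maximum-flow segmentation of step S103.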
step S103: two additional node sets, a source node set S and a sink node set T, are added to the constructed undirected graph and connected by edges to every vertex in the graph; the minimum cost of partitioning the graph nodes into the two sets is then calculated as follows:

Cost = min Σ_{(u,v)∈I, s∈S, t∈T} c(u,v)

In the above formula, min denotes the minimum value, (u, v) denotes an edge of the undirected graph, c(·) is a path flow calculation function, I denotes the skin image, S the source node set and T the sink node set; augmenting paths between the source and sink nodes are searched, and the weight of each edge on a path is added to the flow of that augmenting path; when no further augmenting path can be found, the flow of the current path has reached its maximum, and the image is segmented along the maximum-flow path to retain the pathological area in the image, as shown in fig. 3;
step S104: the image is completed into distorted image blocks of fixed size 512×512 by linear interpolation, compensating for the non-uniform specifications and sizes of the segmented pictures; the completion formula is:

Pixel'(x_m, y_m) = (1−U)*(1−V)*Pixel(x_m, y_m) + U*(1−V)*Pixel(x_m, y_m) + (1−U)*V*Pixel(x_m, y_m)

Pixel(x_m, y_m) = 0.299R + 0.587G + 0.114B

In the above formulas, (x_m, y_m) are the coordinates of the missing pixel, (x_0, y_0) is the coordinate origin, (x_r, y_r) are the coordinates of an intact pixel near the randomly selected missing pixel, U is the lateral offset of the missing pixel, V its longitudinal offset, Pixel'(·) the image gray-level estimation function, and Pixel(·) the image gray-level calculation function; the gray value of the target pixel is computed as a weighted average, with the weight of the red component R set to 0.299, the weight of the green component G to 0.587, and the weight of the blue component B to 0.114; note that the weights of the three color components can be adjusted to actual conditions and must sum to 1;
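As a rough sketch of the completion in step S104 (assuming OpenCV is available and that its bilinear resize stands in for the interpolation formula above), the fragment below converts an RGB image to gray with the weights 0.299, 0.587 and 0.114 and stretches it to a fixed 512×512 block:

import numpy as np
import cv2

def complete_to_fixed_size(rgb, size=512, w=(0.299, 0.587, 0.114)):
    # Weighted RGB-to-gray conversion: Pixel(x, y) = w1*R + w2*G + w3*B (weights sum to 1).
    r = rgb[..., 0].astype(np.float32)
    g = rgb[..., 1].astype(np.float32)
    b = rgb[..., 2].astype(np.float32)
    gray = np.clip(w[0] * r + w[1] * g + w[2] * b, 0, 255).astype(np.uint8)
    # Linear interpolation fills in missing pixels while stretching to 512x512.
    return cv2.resize(gray, (size, size), interpolation=cv2.INTER_LINEAR)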
step S105: the completed image is divided into blocks by traversing it with a sliding window and cropping it into images G_path of fixed size 256×256, and corresponding labels are added to the cropped images: if the original image contains label information, labels are added according to that information; if label information is missing or omitted, specific labels are added by manual identification. Specifically, if an original image is manually identified as squamous cell skin cancer, a skin pathology type label with the value "squamous cell skin cancer" is added to the image manually.
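A minimal sketch of the sliding-window blocking of step S105 (the patch size 256 comes from the text; the stride and the label handling are only indicative):

def crop_patches(image, patch=256, stride=256):
    # Traverse the completed image with a sliding window and collect 256x256 blocks G_path.
    patches = []
    h, w = image.shape[:2]
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            patches.append(image[top:top + patch, left:left + patch])
    return patches

# Each patch then receives a label, copied from the original annotation when present
# or added manually, e.g. "squamous cell skin cancer".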
In this embodiment, the step S2 specifically includes the following steps:
step S201: a generator network with a symmetric structure is constructed to learn the latent features between input images and their corresponding reference images. The generator consists of 3 convolution layers: the first contains 32 convolution kernels of size 7×7 with stride 1, the second 64 kernels of size 5×5 with stride 1, and the third 128 kernels of size 3×3 with stride 1. Next, 3 residual network units are set, each consisting of one convolution layer and one ReLU activation function; the convolution layer in each residual unit contains 128 kernels of size 3×3 with stride 1. Finally, 3 transposed convolution layers perform feature upsampling and image recovery: the first contains 128 kernels of size 3×3 with stride 1, the second 64 kernels of size 5×5 with stride 1, and the third 32 kernels of size 7×7 with stride 1. The output features of the last transposed convolution layer are fused with the input image to obtain the final output image G_gen, as shown in fig. 4. The convolution operation is expressed as follows:

In the above formula, the left-hand side is the feature map output by the current channel, K is the size of the convolution kernel, W the width of the image, H the height of the image, the right-hand feature map is the one input to the current channel, x and y are the coordinates of the output feature map in channel n, and w and h are the element coordinates of the convolution kernel weight matrix in channel n;
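A minimal PyTorch sketch of the generator described in step S201 is given below. The kernel counts, kernel sizes and strides follow the text; the padding values, the placement of ReLU activations in the encoder, and the final 1×1 projection used to fuse the decoder output with the input image are assumptions made so the sketch runs end to end.

import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    # One 3x3 convolution (128 kernels, stride 1) plus ReLU, with a skip connection.
    def __init__(self, ch=128):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.conv(x))

class Generator(nn.Module):
    def __init__(self, in_ch=3):
        super().__init__()
        self.enc = nn.Sequential(                               # 3 convolution layers
            nn.Conv2d(in_ch, 32, 7, 1, 3), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 5, 1, 2), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, 1, 1), nn.ReLU(inplace=True),
        )
        self.res = nn.Sequential(*[ResidualUnit(128) for _ in range(3)])
        self.dec = nn.Sequential(                               # 3 transposed convolution layers
            nn.ConvTranspose2d(128, 128, 3, 1, 1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 5, 1, 2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 7, 1, 3), nn.ReLU(inplace=True),
        )
        # Assumed 1x1 projection so the decoder output can be fused (added) with the input image.
        self.project = nn.Conv2d(32, in_ch, kernel_size=1)

    def forward(self, x):
        y = self.dec(self.res(self.enc(x)))
        return self.project(y) + x                              # fuse decoder output with the input image

# g = Generator(); out = g(torch.randn(1, 3, 256, 256))  # out: (1, 3, 256, 256)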
step S202: a discriminator network consisting of 5 convolution layers is constructed to extract different features from the input image and distinguish different types of skin. All convolution layers in the discriminator use convolution kernels of size 3×3, a batch normalization layer is added after each convolution layer, and the numbers of convolution kernels are 32, 64, 128, 256 and 1 in order; the first four convolution layers use the ReLU activation function and the last uses the Sigmoid activation function, outputting the probability that the corresponding input image is generated, as shown in fig. 5.
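A matching PyTorch sketch of the discriminator of step S202 (five 3×3 convolution layers with batch normalization and kernel counts 32, 64, 128, 256 and 1, ReLU on the first four and Sigmoid on the last); the strides and the final global average that turns the probability map into one value per image are assumptions.

import torch
import torch.nn as nn

class Discriminator(nn.Module):
    # Five 3x3 convolution layers with batch normalization; 32, 64, 128, 256 and 1 kernels.
    def __init__(self, in_ch=3):
        super().__init__()
        chs = [in_ch, 32, 64, 128, 256]
        layers = []
        for c_in, c_out in zip(chs[:-1], chs[1:]):
            layers += [nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),   # stride 2 assumed
                       nn.BatchNorm2d(c_out),
                       nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(256, 1, 3, stride=1, padding=1),
                   nn.BatchNorm2d(1),
                   nn.Sigmoid()]             # probability that the input image is generated
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).mean(dim=(1, 2, 3))   # one probability per image (assumed pooling)

# d = Discriminator(); p = d(torch.randn(4, 3, 256, 256))  # p: shape (4,)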
The step S3 specifically comprises the following steps:
step S301: the image G_gen is traversed to obtain the gray value of each pixel, and the number of pixels at each gray value is counted to construct the gray histogram of image G_gen, expressed as follows:

k(l) = f(l) × m⁻¹

In the above formula, k(l) is the frequency of occurrence of pixels with gray level l in the image, f(l) the number of pixels with gray level l in the image, and m the number of pixels in the image; for example, if the number of pixels with gray level 0 in the image is f(0) = 28 and the image contains m = 625 pixels, the frequency of occurrence of pixels with gray level 0 is k(0) = 0.0448;
step S302: the between-class variance corresponding to different gray thresholds is calculated and a threshold δ is found that maximizes the difference between the lesion texture and normal skin; according to the threshold δ, the image G_gen is converted into a binarized image G_bin, calculated as follows:

δ = max{δ_1, δ_2, …, δ_i}, i = h(l)

In the above formulas, δ_i is the threshold used to construct the binarized image from the i-th gray level, and the between-class variance of the image at threshold δ_i is evaluated from the number of pixels with gray level 0 and the number of pixels with gray level 255 under that threshold; f(l) is the number of pixels with gray level l in the image, m the number of pixels in the image, h(l) the number of gray levels in the image, max{·} the maximum operator, and δ the gray threshold corresponding to the maximum between-class variance, as shown in fig. 6;
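Steps S301 and S302 amount to building a gray histogram and scanning all thresholds for the one with the largest between-class variance; since the variance formula itself is shown only as an image above, the sketch below assumes the standard Otsu criterion.

import numpy as np

def binarize_by_max_between_class_variance(gray):
    # Gray histogram k(l) = f(l)/m and exhaustive threshold search (steps S301-S302).
    f = np.bincount(gray.ravel(), minlength=256).astype(np.float64)  # f(l): pixel counts
    k = f / f.sum()                                                  # k(l): frequencies
    best_delta, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = k[:t].sum(), k[t:].sum()            # class probabilities below/above threshold
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * k[:t]).sum() / w0      # mean gray level of the dark class
        mu1 = (np.arange(t, 256) * k[t:]).sum() / w1 # mean gray level of the bright class
        var_between = w0 * w1 * (mu0 - mu1) ** 2     # between-class variance at this threshold
        if var_between > best_var:
            best_var, best_delta = var_between, t
    return best_delta, np.where(gray >= best_delta, 255, 0).astype(np.uint8)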
step S303: in the image G_gen produced by the generator, the skin part and the pathological part differ clearly in pixel values; the binarized image G_bin is therefore used to segment G_gen into a finer pathology image G_acc, which is then input to the discriminator, as shown in fig. 7;
step S304: the pathology image generated by the generator network is synthesized with healthy skin; because the pixel difference between the edge region of the pathology image and the healthy skin is obvious, the edges of the pathology image are smoothed by mean filtering, using the following image-edge mean-filter smoothing formula:

G_i' = λ×G_i + (1−λ)×G_j

In the above formula, G_i' is the filtered pixel value, G_i the pixel value to be filtered, G_j the pixel value of a pixel adjacent to G_i, and λ the weight controlling the image filtering, set here to 0.625; specifically, if the pixel value to be filtered G_i is 220 and its adjacent pixel G_j is 176, the filtered pixel value G_i' is 203.5, as shown in fig. 8;
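The edge smoothing of step S304 reduces to a weighted average of an edge pixel with a neighbouring pixel; a minimal sketch with the λ = 0.625 used here (the choice of neighbour is an assumption):

def smooth_edge_pixel(g_i, g_j, lam=0.625):
    # G_i' = lambda * G_i + (1 - lambda) * G_j, the mean-filter smoothing of step S304.
    return lam * g_i + (1 - lam) * g_j

print(smooth_edge_pixel(220, 176))  # 203.5, matching the worked example above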
step S305: the synthesized image is processed at multiple scales by bilinear interpolation and an image pyramid structure is constructed, which ensures that the synthesized image retains its main pathological features; the pixel position coordinates and pixel values are calculated as follows:

In the above formula, l_i denotes the coordinates of the target point to be calculated in the image, x_src and y_src the abscissa and ordinate in the scaled image, x_dst and y_dst the abscissa and ordinate in the image before scaling, W_src the width of the scaled image, W_dst the width of the image before scaling, H_src the height of the scaled image, H_dst the height of the image before scaling, G_i the pixel value of the target point to be calculated, G_j the pixel value of a pixel adjacent to the target point, and l_j the coordinates of that adjacent pixel in the image, as shown in fig. 9.
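A simple sketch of the multi-scale processing of step S305, assuming OpenCV's bilinear resize stands in for the interpolation formula shown only as an image above; the scale factors are illustrative:

import cv2

def build_pyramid(image, scales=(1.0, 0.5, 0.25)):
    # Multi-scale copies of the synthesized image via bilinear interpolation (image pyramid).
    h, w = image.shape[:2]
    return [cv2.resize(image, (max(1, int(w * s)), max(1, int(h * s))),
                       interpolation=cv2.INTER_LINEAR) for s in scales]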
The step S4 specifically comprises the following steps:
step S401: a residual loss function measures the dissimilarity between the generated image G_acc and the original image G_path; in the ideal case, when their pixel blocks are identical, the residual loss value is 0. The residual loss function is formulated as follows:

In the above formula, Loss_gen(·) is the residual loss function, P_{G∈Data}(G_path) the probability that the input image belongs to the real image data, s the number of images, and log(·) the logarithmic function;
step S402: a feature loss function is used to measure the feature difference between the generated image G_acc and the original image G_path; the feature loss function is formulated as follows:

Loss_dis(G_acc, G_path) = −[log(P_{G∈Data}(G_path)) + log(1 − P_{G∈Gen}(G_acc))]

In the above formula, Loss_dis(·) is the feature loss function and P_{G∈Gen}(G_acc) the probability that the input image is a sample created by the generator;
step S403: the two loss functions are combined by weighted splicing, the target skin image is mapped to a latent space, and the parameters of the generator and discriminator networks are updated with an Adam optimizer, which helps the two networks reach convergence faster during alternating training and improves the stability and reliability of the model until convergence is finally achieved. The loss splicing formula is:

Loss(G_acc, G_path) = α_1×Loss_gen(G_acc, G_path) + β_1×Loss_dis(G_acc, G_path)

In the above formula, α_1 is the parameter controlling the weight of the generator loss, set here to 0.3, and β_1 the parameter controlling the weight of the discriminator loss, set here to 0.7, with α_1 + β_1 = 1. Specifically, step S401 gives Loss_gen(G_acc, G_path) = 26.523, step S402 gives Loss_dis(G_acc, G_path) = 1.046, and finally Loss(G_acc, G_path) = 8.689.
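A hedged PyTorch sketch of the loss splicing of steps S401–S403 with α_1 = 0.3 and β_1 = 0.7. The residual loss formula is shown only as an image above, so a pixel-wise L1 dissimilarity (which is 0 when the generated and original images coincide) is assumed for it; the feature loss follows the Loss_dis formula given in step S402.

import torch
import torch.nn.functional as F

def combined_loss(g_acc, g_path, d_real, d_fake, alpha1=0.3, beta1=0.7):
    # Weighted splicing of the two losses (step S403), alpha1 + beta1 = 1.
    loss_gen = F.l1_loss(g_acc, g_path)                                  # assumed residual term
    loss_dis = -(torch.log(d_real + 1e-8) +
                 torch.log(1.0 - d_fake + 1e-8)).mean()                  # feature loss of step S402
    return alpha1 * loss_gen + beta1 * loss_dis

# Generator and discriminator parameters are then updated alternately with Adam optimizers, e.g.:
# opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
# opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)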
The step S5 specifically comprises the following steps:
step S501: a multi-branch feature extraction network containing 4 residual blocks is constructed, where each residual block comprises one dilated convolution layer, one batch normalization layer and one adaptive average pooling layer; the dilated convolution is calculated as follows:

In the above formula, q_i is the feature tensor of the image output by the i-th residual block, q_{i−1} the feature tensor of the image output by the (i−1)-th residual block, z the size of the convolution kernel in the dilated convolution layer, p the padding parameter, b the stride, and η the dilation rate of the dilated convolution. For example, a convolution kernel of size 3×3 is selected to extract small local pathological features, the padding parameter is set to 'same' with value 1, the stride is set to 1 so that the size of the output image stays unchanged, and the dilation rates of the 4 residual blocks are set to 1, 4, 9 and 16 respectively;
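A PyTorch sketch of one residual block of step S501 (3×3 dilated convolution, batch normalization, adaptive average pooling) with the dilation rates 1, 4, 9 and 16 stated above; the channel count and the pooled output size are assumptions.

import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    # One branch block of step S501: 3x3 dilated convolution + batch normalization +
    # adaptive average pooling; the channel count (64) and pooled size (8) are assumptions.
    def __init__(self, ch=64, dilation=1, pooled=8):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, kernel_size=3, stride=1,
                              padding=dilation, dilation=dilation)   # 'same'-style padding
        self.bn = nn.BatchNorm2d(ch)
        self.pool = nn.AdaptiveAvgPool2d(pooled)

    def forward(self, x):
        y = torch.relu(self.bn(self.conv(x)) + x)    # residual connection
        return self.pool(y)                           # fixed-size feature tensor q_i

branches = nn.ModuleList(DilatedResidualBlock(dilation=d) for d in (1, 4, 9, 16))
# q_i = branches[i](features) gives the per-branch feature tensors fused in step S502.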
step S502: the image features of different levels are fused to avoid losing important features of the skin pathology image and to improve the accuracy of skin typing; feature fusion is computed as follows:

In the above formula, Q is the feature tensor output after feature fusion, q_i the feature tensor output by the i-th residual block, and σ_i the feature weight corresponding to the i-th residual block;
step S503: the fused feature tensor Q is normalized according to the min-max principle, mapping the element values of the feature tensor into the interval [0,1]:

Q' = (Q − min(Q)) / (max(Q) − min(Q))

In the above formula, Q' is the normalized feature tensor, min(Q) the minimum of all elements in tensor Q, and max(Q) the maximum of all elements in tensor Q;
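A short sketch of the weighted feature fusion and min-max normalization of steps S502–S503, assuming the fusion is a weighted sum of the branch outputs with weights σ_i:

import torch

def fuse_and_normalize(q_list, sigmas):
    # Q = sum_i sigma_i * q_i (step S502), then min-max normalization into [0, 1] (step S503).
    q = torch.stack(q_list)                                   # branch feature tensors q_i
    w = torch.tensor(sigmas, dtype=q.dtype).view(-1, *([1] * (q.dim() - 1)))
    fused = (w * q).sum(dim=0)
    q_min, q_max = fused.min(), fused.max()
    return (fused - q_min) / (q_max - q_min + 1e-8)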
Step S504: and selecting a Gaussian kernel as a kernel function of a support vector machine, learning a proper decision boundary according to the characteristics and the labels of the tensors, maximizing the interval between different categories, classifying the characteristic tensors Q' by using a trained support vector machine model to obtain category labels of the target skin image, and finishing skin typing.
It is noted that the present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and embodiments of the present invention have been described in detail with reference to specific examples, which are provided only to facilitate understanding of the method and its core ideas; those skilled in the art will appreciate that the invention can be practiced with variations in specific details and in particular embodiments or ranges of application, none of which are to be construed as limitations on the invention.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims (8)

1. A method of skin typing data augmentation based on a generative adversarial network, characterized in that the method comprises the following steps:
step S1: collecting and preprocessing original skin image data;
step S2: constructing an antagonism generation network model to generate a high-quality synthetic skin pathology image sample;
step S3: dividing the generated skin pathology image and synthesizing the skin pathology image with the healthy skin image to realize data augmentation;
step S4: training an antigen generation network model to expand the scale of the original skin pathology image dataset;
step S5: and constructing a multi-branch feature extraction network to extract multi-level structural features of the skin pathology image and realize skin typing.
2. The method of skin typing data augmentation based on a countermeasure generation network as recited in claim 1, characterized in that step S1 specifically includes:
step S101: obtaining skin images of multiple types from channels such as medical image databases, online skin image databases, and large medical research institutions, wherein the main information of the skin images comprises: the appearance, shape, size, type, and classification of skin lesions;
Step S102: mapping the skin image into an undirected graph, regarding each pixel as a node and connecting adjacent pixel nodes to form edges, with the weight of each edge representing the texture difference between the pixels; a gray-level co-occurrence matrix P is constructed to calculate the texture differences among different pixels, in the following manner:
P(i, j) = θ(i, j) / (W × H),  θ(i, j) = ∑_{x=1..W} ∑_{y=1..H} I(G_value(x, y) = i) × I(G_value(x + αx, y + βy) = j)
in the above formula, i represents the gray level of the current pixel, j represents that of the adjacent pixel, (x, y) represents the coordinates of the current pixel, αx and βy represent the coordinate offsets between the two pixels, θ represents the occurrence frequency of the pixel pair (i, j) in the image, W represents the width of the image, H represents the height of the image, and G_value(·) is the gray-scale calculation function returning the gray value at coordinates (x, y) in the image;
and the sum of squared gray-value differences between pixels is calculated with the texture feature function to reflect the degree of texture difference between pixels (a NumPy sketch of this computation follows this claim), calculated as follows:
D(i, j) = ∑_{x,y} [G_value(x, y) − G_value(x + αx, y + βy)]² × I(G_value(x, y) = i) × I(G_value(x + αx, y + βy) = j)
in the above formula, D(·) represents the difference-degree calculation function and I(·) represents the indicator function, whose value is 1 when its argument is true and 0 otherwise;
according to the pixel texture difference measure, the weight of each edge in the undirected graph is calculated from D(i, j) between the two pixels that the edge connects;
step S103: two additional node sets are added to the constructed undirected graph, namely a source node set S and a sink node set T, whose nodes are connected to each vertex of the graph to form edges, and the minimum cost of partitioning the nodes of the graph into the two sets is calculated as follows:
Cost = min ∑_{(u,v) ∈ I, u ∈ S, v ∈ T} c(u, v)
In the above formula, min denotes taking the minimum value, u and v represent any two nodes in the undirected graph, c(·) represents the path-flow (capacity) calculation function, I represents the skin image graph, S represents the source node set, and T represents the sink node set; by searching for augmenting paths between the source and sink nodes, the weight of each edge on a path is added to the flow of that augmenting path; when no further augmenting path can be found, the flow of the current path has reached its maximum, and the image is segmented according to the maximum-flow (minimum-cut) partition to retain the pathological region in the image;
step S104: the image is completed into an image block of fixed size 512 × 512 by linear interpolation, compensating for the non-uniform specifications and sizes of the segmented pictures; the completion formula is as follows:
Pixel′(x_m, y_m) = (1 − U) × (1 − V) × Pixel(x_0, y_0) + U × (1 − V) × Pixel(x_r, y_0) + (1 − U) × V × Pixel(x_0, y_r) + U × V × Pixel(x_r, y_r)
Pixel(x_m, y_m) = w_1 × R + w_2 × G + w_3 × B
in the above formulas, (x_m, y_m) represents the coordinates of the missing pixel, (x_0, y_0) represents the coordinate origin, (x_r, y_r) represents the coordinates of a complete pixel randomly selected near the missing pixel, U represents the lateral offset of the missing pixel, V represents the longitudinal offset of the missing pixel, Pixel′(·) represents the image gray-scale estimation function, Pixel(·) is the image gray-scale calculation function, a weighted average is used to calculate the gray value of the target pixel, and w_1, w_2, w_3 respectively represent the weights of the red, green and blue components; the weights of the three color components can be adjusted according to actual conditions, and their sum is 1;
step S105: the completed image is divided into blocks by traversing it in a sliding-window manner and cropping it into images G_path of fixed size 256 × 256; corresponding labels are added to the cropped images: if the original image contains label information, corresponding labels are added according to that information; if label information is missing, specific label information is added by manual identification.
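The following is a minimal NumPy sketch of the texture computation described in step S102 of this claim: it counts gray-level co-occurrences at a single offset (αx, βy) and evaluates the sum of squared gray-level differences weighted by those counts. The 8-bit gray levels, the single offset, and the function names are illustrative assumptions, not the patent's prescribed implementation.

# Minimal NumPy sketch of the step-S102 texture computation in claim 2; the 8-bit
# gray levels, single offset (alpha_x, beta_y), and names are illustrative.
import numpy as np

def gray_cooccurrence(img, alpha_x=1, beta_y=0, levels=256):
    """Count co-occurrences P[i, j] of gray-level pairs (i, j) at the given offset."""
    H, W = img.shape
    P = np.zeros((levels, levels), dtype=np.int64)
    for y in range(H - beta_y):
        for x in range(W - alpha_x):
            i = img[y, x]
            j = img[y + beta_y, x + alpha_x]
            P[i, j] += 1
    return P

def texture_difference(P):
    """Sum of squared gray-level differences weighted by co-occurrence counts,
    i.e. D(i, j) accumulated over all gray-level pairs."""
    levels = P.shape[0]
    i, j = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    return float(np.sum(((i - j) ** 2) * P))

img = np.random.default_rng(1).integers(0, 256, size=(64, 64), dtype=np.uint8)
P = gray_cooccurrence(img, alpha_x=1, beta_y=0)
print(texture_difference(P))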
3. The method of skin typing data augmentation based on a countermeasure generation network as recited in claim 1, characterized in that step S2 specifically includes:
step S201: constructing a generator network with a symmetric structure to learn the latent features between input images and the corresponding reference images (a structural sketch follows this claim); the constructed generator network consists of 3 convolution layers: the first convolution layer contains 32 convolution kernels of size 7 × 7 with the convolution stride set to 1, the second convolution layer contains 64 convolution kernels of size 5 × 5 with the stride set to 1, and the third convolution layer contains 128 convolution kernels of size 3 × 3 with the stride set to 1; then, 3 residual network units each consisting of 1 convolution layer and 1 ReLU activation function are set, where the convolution layer in a residual unit contains 128 convolution kernels of size 3 × 3 with the stride set to 1; finally, 3 transposed convolution layers are set to realize feature sampling and image recovery, where the first transposed convolution layer contains 128 convolution kernels of size 3 × 3 with the stride set to 1, the second contains 64 convolution kernels of size 5 × 5 with the stride set to 1, and the third contains 32 convolution kernels of size 7 × 7 with the stride set to 1; the output features of the last transposed convolution layer are fused with the input image to obtain the final output image G_gen; the convolution operation is expressed as follows:
F_out^n(x, y) = ∑_{w=1..K} ∑_{h=1..K} F_in^n(x + w, y + h) × k^n(w, h)
in the above formula, F_out^n(x, y) represents the feature map output by the current channel, K represents the size of the convolution kernel, W represents the width of the image, H represents the height of the image, F_in^n represents the feature map input to the current channel, x and y represent the coordinate position in the output feature map of channel n, and w and h represent the element coordinates of the convolution kernel weight matrix of channel n;
step S202: constructing a discriminator network consisting of 5 convolution layers to extract different features from the input image and identify different skin types; convolution kernels of size 3 × 3 are selected in the discriminator network, a batch normalization layer is added to each convolution layer, the ReLU activation function is used for the first four convolution layers and the Sigmoid activation function for the last convolution layer in sequence, and the network outputs the probability that the corresponding input image is a generated image.
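A structural sketch of the generator of step S201 is given below in PyTorch, an assumed framework. The kernel sizes, channel counts, strides, and the 3-residual-unit layout follow the claim, while the padding choices and the 1 × 1 fusion convolution at the output are illustrative assumptions needed to make the sketch runnable end to end.

# Structural sketch of the claim-3 generator (PyTorch assumed; padding and the
# final 1x1 fusion convolution are illustrative choices).
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """One convolution layer + ReLU with a skip connection (step S201)."""
    def __init__(self, channels=128):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return x + self.act(self.conv(x))

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                      # 3 convolution layers
            nn.Conv2d(3, 32, 7, stride=1, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 5, stride=1, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        )
        self.residuals = nn.Sequential(*[ResidualUnit(128) for _ in range(3)])
        self.decoder = nn.Sequential(                      # 3 transposed convolution layers
            nn.ConvTranspose2d(128, 128, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 5, stride=1, padding=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 7, stride=1, padding=3), nn.ReLU(inplace=True),
        )
        # the claim fuses the last transposed-convolution features with the input image;
        # a 1x1 convolution over their concatenation is one assumed realisation
        self.fuse = nn.Conv2d(32 + 3, 3, kernel_size=1)

    def forward(self, x):
        feats = self.decoder(self.residuals(self.encoder(x)))
        return self.fuse(torch.cat([feats, x], dim=1))     # final output image G_gen

print(Generator()(torch.randn(1, 3, 256, 256)).shape)      # torch.Size([1, 3, 256, 256])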
4. The method of skin typing data augmentation based on a countermeasure generation network as recited in claim 1, characterized in that step S3 specifically includes:
step S301: traversing the image G_gen to obtain the corresponding gray values, and counting the pixel frequency of each gray value to construct the gray-level histogram of the image G_gen, expressed as follows:
k(l) = f(l) / m
in the above formula, k (l) represents the frequency of occurrence of a pixel with a gray level of l in the image, f (l) represents the frequency of occurrence of a pixel with a gray level of l in the image, and m represents the number of pixels in the image;
step S302: calculating the inter-class variance corresponding to different gray-level thresholds to obtain the threshold δ that maximizes the difference between lesion texture and normal skin; based on the threshold δ, the image G_gen is converted into a binarized image G_bin (an Otsu-style thresholding sketch follows this claim), calculated as follows:
δ = arg max{σ²_{δ_1}, σ²_{δ_2}, …, σ²_{δ_i}},  i = h(l)
in the above formula, δ_i represents the candidate threshold constructed from the i-th gray level, σ²_{δ_i} represents the inter-class variance of the image at threshold δ_i, n_0^{δ_i} represents the number of pixels with gray level 0 in the image binarized at threshold δ_i, n_255^{δ_i} represents the number of pixels with gray level 255 at threshold δ_i, f(l) represents the number of pixels with gray level l in the image, m represents the number of pixels in the image, h(l) represents the number of gray levels in the image, max{·} is the maximum operator, and δ represents the gray-level threshold corresponding to the maximum inter-class variance;
step S303: the skin portion and the pathological portion of the image G_gen obtained by the generator have obvious pixel differences; the binarized image G_bin is used to segment the image G_gen to obtain a finer pathology image G_acc, which is then input into the discriminator;
step S304: the pathological image generated by the generator network is composited with healthy skin; since the pixel difference between the edge region of the pathological image and the healthy skin is obvious, the edge of the pathological image is smoothed by mean filtering; the edge mean-filtering smoothing formula is as follows:
G_i′ = λ × G_i + (1 − λ) × G_j
in the above formula, G_i′ represents the filtered pixel value, G_i represents the pixel value to be filtered, G_j represents the pixel value of a pixel adjacent to G_i, and λ represents the weight controlling the image filtering;
step S305: the composited image is processed at multiple scales by bilinear interpolation to construct an image pyramid structure, ensuring that the composited image retains its main pathological features; the pixel position coordinates and pixel values are calculated as follows:
in the above formula, l_i represents the coordinate value in the image of the target point to be calculated, x_src represents the abscissa of the scaled image, y_src represents the ordinate of the scaled image, x_dst represents the abscissa of the image before scaling, y_dst represents the ordinate of the image before scaling, W_src represents the width of the scaled image, W_dst represents the width of the image before scaling, H_src represents the height of the scaled image, H_dst represents the height of the image before scaling, G_i represents the pixel value of the target point to be calculated, G_j represents the pixel values of pixels adjacent to the target point, and l_j represents the coordinate values in the image of the pixels adjacent to the target point.
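Below is a condensed NumPy sketch of this claim: an Otsu-style search maximizes the between-class variance to pick the threshold δ (step S302), the resulting mask binarizes G_gen into G_bin, and the weighted mean filter of step S304 smooths edge pixels. The 3 × 3-style neighbourhood and the value λ = 0.7 are illustrative assumptions.

# Condensed NumPy sketch of claim 4 (steps S302-S304); the neighbourhood and the
# weight lambda = 0.7 are illustrative assumptions.
import numpy as np

def otsu_threshold(gray):
    """Return the gray level delta that maximises the between-class variance."""
    prob = np.bincount(gray.ravel(), minlength=256).astype(np.float64) / gray.size
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def smooth_edge(center, neighbour_mean, lam=0.7):
    """G_i' = lambda * G_i + (1 - lambda) * G_j, with G_j a neighbourhood mean."""
    return lam * center + (1.0 - lam) * neighbour_mean

gray = np.random.default_rng(2).integers(0, 256, (128, 128)).astype(np.uint8)
delta = otsu_threshold(gray)
mask = np.where(gray > delta, 255, 0).astype(np.uint8)      # binarised image G_bin

g = gray.astype(np.float64)
neigh = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]) / 4.0
smoothed = smooth_edge(g[1:-1, 1:-1], neigh)                # edge-smoothed pixel values
print(delta, mask.mean(), smoothed.mean())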
5. The method of skin typing data augmentation based on a countermeasure generation network as recited in claim 1, characterized in that step S4 specifically includes:
step S401: the residual loss function is used to measure the dissimilarity between the generated image G_acc and the original image G_path; ideally, the pixel distribution of the generated image is the same as that of the original image and the residual loss value is 0; the residual loss function is formulated as follows:
in the above formula, Loss_gen(·) represents the residual loss function, P_{G∈Data}(G_path) represents the probability that the input image belongs to the real image data, s represents the number of images, and log(·) represents the logarithmic function;
step S402: the feature loss function is used to discriminate the feature difference between the generated image G_acc and the original image G_path; the feature loss function is formulated as follows:
Loss_dis(G_acc, G_path) = −[log(P_{G∈Data}(G_path)) + log(1 − P_{G∈Gen}(G_acc))]
in the above formula, Loss_dis(·) represents the feature loss function, and P_{G∈Gen}(G_acc) represents the probability that the input image belongs to the samples created by the generator;
step S403: the two loss functions are weighted and spliced, the target skin image is mapped into a latent space, and the parameters of the generator network and the discriminator network are updated with the Adam optimizer, helping the two networks reach a convergence state faster during alternating training and improving the stability and reliability of the model until convergence is finally achieved (a training-step sketch follows this claim); the loss-function splicing formula is as follows:
Loss(G_acc, G_path) = α_1 × Loss_gen(G_acc, G_path) + β_1 × Loss_dis(G_acc, G_path)
in the above formula, α_1 is the parameter controlling the weight of the loss term corresponding to the generator network, β_1 is the parameter controlling the weight of the loss term corresponding to the discriminator network, and α_1 + β_1 = 1.
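The sketch below illustrates the weighted loss splicing of step S403 with PyTorch and its Adam optimizer; α_1 + β_1 = 1 as stated in the claim, while the concrete weights, the toy discriminator, and the exact form of the residual-loss term are illustrative assumptions.

# Hedged sketch of step S403 in claim 5 (PyTorch assumed). `disc` returns the
# probability that an image is a real sample; alpha1 + beta1 = 1 as in the claim,
# and the exact residual-loss term is an illustrative assumption.
import torch
import torch.nn as nn

def combined_loss(disc, real_img, fake_img, alpha1=0.6, beta1=0.4):
    eps = 1e-7
    p_real = disc(real_img).clamp(eps, 1 - eps)      # P_{G in Data}(G_path)
    p_fake = disc(fake_img).clamp(eps, 1 - eps)      # P_{G in Gen}(G_acc)
    loss_gen = -torch.log(p_real).mean()             # residual-loss term (sketch)
    loss_dis = -(torch.log(p_real) + torch.log(1 - p_fake)).mean()
    return alpha1 * loss_gen + beta1 * loss_dis      # weighted loss splicing

# Hypothetical usage with a toy discriminator and random images
disc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1), nn.Sigmoid())
optimizer = torch.optim.Adam(disc.parameters(), lr=2e-4)    # Adam, as in the claim
real, fake = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)
loss = combined_loss(disc, real, fake)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))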
6. The method of skin typing data augmentation based on a countermeasure generation network as recited in claim 1, characterized in that step S5 specifically includes:
step S501: constructing a multi-branch feature extraction network containing 4 residual blocks, where each residual block comprises a dilated (atrous) convolution layer, a batch normalization layer, and an adaptive average pooling layer; the size of the dilated-convolution output is calculated as follows:
q_i = ⌊(q_{i−1} + 2p − z − (z − 1)(η − 1)) / b⌋ + 1
in the above formula, q_i represents the size of the feature tensor output by the i-th residual block, q_{i−1} represents the size of the feature tensor output by the (i−1)-th residual block, z represents the size of the convolution kernel in the dilated convolution layer, p represents the padding parameter, b represents the moving stride, and η represents the dilation rate of the dilated convolution;
step S502: feature fusion is performed on image features of different levels to avoid losing important features of the skin pathology image and improve the accuracy of skin typing (a fusion-and-normalization sketch follows this claim); feature fusion is calculated as follows:
Q = ∑_{i=1}^{4} σ_i × Q_i
in the above formula, Q represents the feature tensor output after feature fusion, Q_i represents the feature tensor output by the i-th residual block, and σ_i represents the feature weight corresponding to the i-th residual block;
step S503: the fused feature tensor Q is normalized and encoded according to the min-max principle, mapping the range of the tensor elements into the interval [0, 1]; the normalization is calculated as follows:
Q′ = (Q − min(Q)) / (max(Q) − min(Q))
in the above formula, Q' represents the normalized characteristic tensor, min (Q) represents the minimum value of all elements in the tensor Q, and max (Q) represents the maximum value of all elements in the tensor Q;
step S504: a kernel function is selected and a support vector machine model is trained to classify the feature tensor Q′, obtaining the category label of the target skin image and completing skin typing.
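Finally, a brief PyTorch sketch of this claim: four residual blocks built around dilated convolutions, batch normalization, and adaptive average pooling produce features q_1 … q_4, which are fused with weights σ_i and min-max normalized to give Q′ for the classifier. The channel count, dilation rates, pooled size, and σ_i values are illustrative assumptions.

# Illustrative PyTorch sketch of claim 6 (steps S501-S503); channel count, dilation
# rates, pooled size, and fusion weights sigma_i are assumptions.
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """Dilated convolution + batch normalization + adaptive average pooling."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(32)

    def forward(self, x):
        return self.pool(x + self.body(x))

blocks = nn.ModuleList([DilatedResidualBlock(16, d) for d in (1, 2, 4, 8)])
sigma = torch.tensor([0.4, 0.3, 0.2, 0.1])          # feature weights sigma_i (sum to 1)

x = torch.randn(1, 16, 32, 32)
features = []
for block in blocks:
    x = block(x)                                    # q_i from q_{i-1}
    features.append(x)
Q = sum(w * q for w, q in zip(sigma, features))     # fused feature tensor Q
Q_norm = (Q - Q.min()) / (Q.max() - Q.min() + 1e-8) # min-max normalised Q'
print(Q_norm.shape, float(Q_norm.min()), float(Q_norm.max()))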
7. A system for skin typing data augmentation based on a countermeasure generation network, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the method of any one of claims 1-6.
8. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1-6.
CN202310669194.3A 2023-06-07 2023-06-07 Skin typing data augmentation method and system based on countermeasure generation network Pending CN117292217A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310669194.3A CN117292217A (en) 2023-06-07 2023-06-07 Skin typing data augmentation method and system based on countermeasure generation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310669194.3A CN117292217A (en) 2023-06-07 2023-06-07 Skin typing data augmentation method and system based on countermeasure generation network

Publications (1)

Publication Number Publication Date
CN117292217A true CN117292217A (en) 2023-12-26

Family

ID=89256000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310669194.3A Pending CN117292217A (en) 2023-06-07 2023-06-07 Skin typing data augmentation method and system based on countermeasure generation network

Country Status (1)

Country Link
CN (1) CN117292217A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117993029A (en) * 2024-04-03 2024-05-07 武昌首义学院 Satellite information and training data warehouse network safety protection method and system

Similar Documents

Publication Publication Date Title
CN108830326B (en) Automatic segmentation method and device for MRI (magnetic resonance imaging) image
WO2022012110A1 (en) Method and system for recognizing cells in embryo light microscope image, and device and storage medium
CN106327507B (en) A kind of color image conspicuousness detection method based on background and foreground information
CN106340016B (en) A kind of DNA quantitative analysis method based on microcytoscope image
CN109389129A (en) A kind of image processing method, electronic equipment and storage medium
JP2021512446A (en) Image processing methods, electronic devices and storage media
Pan et al. Cell detection in pathology and microscopy images with multi-scale fully convolutional neural networks
CN110827260B (en) Cloth defect classification method based on LBP characteristics and convolutional neural network
US10769432B2 (en) Automated parameterization image pattern recognition method
CN113096096B (en) Microscopic image bone marrow cell counting method and system fusing morphological characteristics
CN110348435A (en) A kind of object detection method and system based on clipping region candidate network
CN105389821B (en) It is a kind of that the medical image cutting method being combined is cut based on cloud model and figure
CN111444844A (en) Liquid-based cell artificial intelligence detection method based on variational self-encoder
CN108765409A (en) A kind of screening technique of the candidate nodule based on CT images
CN117541844B (en) Weak supervision histopathology full-section image analysis method based on hypergraph learning
CN115641317B (en) Pathological image-oriented dynamic knowledge backtracking multi-example learning and image classification method
Wen et al. Review of research on the instance segmentation of cell images
CN115880266B (en) Intestinal polyp detection system and method based on deep learning
CN116229205A (en) Lipstick product surface defect data augmentation method based on small sample characteristic migration
CN117292217A (en) Skin typing data augmentation method and system based on countermeasure generation network
CN117036288A (en) Tumor subtype diagnosis method for full-slice pathological image
CN114708237A (en) Detection algorithm for hair health condition
CN104751461A (en) White cell nucleus segmentation method based on histogram threshold and low rank representation
CN117830321A (en) Grain quality detection method based on image recognition
CN116883339A (en) Histopathological image cell nucleus detection method based on point supervision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination