CN117292217A - Skin typing data augmentation method and system based on countermeasure generation network - Google Patents
Skin typing data augmentation method and system based on countermeasure generation network
- Publication number
- CN117292217A CN117292217A CN202310669194.3A CN202310669194A CN117292217A CN 117292217 A CN117292217 A CN 117292217A CN 202310669194 A CN202310669194 A CN 202310669194A CN 117292217 A CN117292217 A CN 117292217A
- Authority
- CN
- China
- Prior art keywords
- image
- skin
- representing
- pixel
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 53
- 238000013434 data augmentation Methods 0.000 title claims abstract description 23
- 230000007170 pathology Effects 0.000 claims abstract description 41
- 230000004927 fusion Effects 0.000 claims abstract description 16
- 238000012549 training Methods 0.000 claims abstract description 14
- 230000008485 antagonism Effects 0.000 claims abstract description 10
- 238000000605 extraction Methods 0.000 claims abstract description 9
- 230000007547 defect Effects 0.000 claims abstract description 5
- 238000012706 support-vector machine Methods 0.000 claims abstract description 5
- 230000006870 function Effects 0.000 claims description 70
- 238000004364 calculation method Methods 0.000 claims description 27
- 230000001575 pathological effect Effects 0.000 claims description 27
- 238000010586 diagram Methods 0.000 claims description 20
- 206010040882 skin lesion Diseases 0.000 claims description 17
- 231100000444 skin lesion Toxicity 0.000 claims description 17
- 238000004590 computer program Methods 0.000 claims description 12
- 238000001914 filtration Methods 0.000 claims description 12
- 230000036074 healthy skin Effects 0.000 claims description 11
- 230000004913 activation Effects 0.000 claims description 9
- 230000003190 augmentative effect Effects 0.000 claims description 9
- 238000010606 normalization Methods 0.000 claims description 9
- 238000013507 mapping Methods 0.000 claims description 7
- 238000012545 processing Methods 0.000 claims description 7
- 239000011159 matrix material Substances 0.000 claims description 6
- 238000005259 measurement Methods 0.000 claims description 6
- 230000002194 synthesizing effect Effects 0.000 claims description 5
- 238000010276 construction Methods 0.000 claims description 4
- 238000009499 grossing Methods 0.000 claims description 4
- 238000007781 pre-processing Methods 0.000 claims description 4
- 230000011218 segmentation Effects 0.000 claims description 4
- 238000006243 chemical reaction Methods 0.000 claims description 3
- 230000037311 normal skin Effects 0.000 claims description 3
- 238000011176 pooling Methods 0.000 claims description 3
- 238000011084 recovery Methods 0.000 claims description 3
- 238000005070 sampling Methods 0.000 claims description 3
- 230000003902 lesion Effects 0.000 claims description 2
- 238000003860 storage Methods 0.000 claims description 2
- 230000003042 antagnostic effect Effects 0.000 claims 5
- 238000005516 engineering process Methods 0.000 abstract description 4
- 238000003709 image segmentation Methods 0.000 abstract description 4
- 230000000694 effects Effects 0.000 abstract description 2
- 230000000295 complement effect Effects 0.000 abstract 1
- 208000000453 Skin Neoplasms Diseases 0.000 description 5
- 201000000849 skin cancer Diseases 0.000 description 5
- 210000004027 cell Anatomy 0.000 description 4
- 238000003745 diagnosis Methods 0.000 description 4
- 230000009286 beneficial effect Effects 0.000 description 3
- 230000002708 enhancing effect Effects 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 230000001965 increasing effect Effects 0.000 description 2
- 208000025865 Ulcer Diseases 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 230000003416 augmentation Effects 0.000 description 1
- 210000000270 basal cell Anatomy 0.000 description 1
- 239000002131 composite material Substances 0.000 description 1
- 208000035250 cutaneous malignant susceptibility to 1 melanoma Diseases 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 238000007499 fusion processing Methods 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 230000036555 skin type Effects 0.000 description 1
- 231100000397 ulcer Toxicity 0.000 description 1
- 238000011179 visual inspection Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/52—Scale-space analysis, e.g. wavelet analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/54—Extraction of image or video features relating to texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a skin typing data augmentation method and system based on a generative adversarial network, used for accurately classifying skin pathology types. Because only a small number of skin pathology images can be acquired, training a classifier model directly on them tends to overfit and yields unsatisfactory classification performance, so a generative adversarial network is used to synthesize realistic skin pathology images and alleviate this shortcoming. After the skin pathology dataset is segmented, data augmentation is performed through the adversarial network, and skin pathology typing is realized with a multi-level feature fusion technique. The method mainly comprises the following steps: first, skin pathology image data are collected, segmented, and manually labelled; then, a generative adversarial network is constructed to produce high-quality skin pathology images that expand the available dataset; finally, a multi-branch feature extraction network extracts and models features of the target image, and a support vector machine realizes accurate skin typing.
Description
Technical Field
The invention relates to the technical field of data analysis and medical treatment, and in particular to a skin typing data augmentation method and system based on a generative adversarial network.
Background
Skin pathology typing is an important research topic in clinical dermatology. Classifying and describing skin lesions provides important references for their diagnosis and treatment. A skin pathology type can be preliminarily judged by observing multicoloured patches and surface ulcers on the skin. In general, four pathological skin types can be distinguished by image recognition technology: squamous cell skin cancer, basal cell skin cancer, malignant melanoma, and Merkel cell skin cancer.
Currently, skin typing is mainly performed by visual inspection. However, because skin types are specific and diverse, this traditional approach has many shortcomings, and diagnoses vary between doctors; such subjectivity and uncertainty make manual skin typing unreliable and inaccurate.
To address these problems, more and more researchers have begun exploring how to achieve skin typing with computer vision techniques. Unlike visual observation of skin images, computer vision can automatically extract features from large numbers of skin images and classify and predict them with machine learning models. This reduces the adverse effects of subjectivity and improves the accuracy and reliability of skin typing. However, the lack of large-scale labelled datasets in the skin typing field severely limits such techniques: in deep learning, the larger the amount of data, the better the trained model performs. How to improve the accuracy and reliability of skin typing through data augmentation in the absence of large-scale datasets is therefore a current challenge.
Data augmentation techniques based on generative adversarial networks (GANs) can improve the typing performance and generalization ability of a model by generating new synthetic data samples. A GAN consists of two parts: a generator, which learns the distribution of the training dataset and generates new synthetic samples, and a discriminator, which judges whether a generated sample is real. By training the generator and the discriminator, a GAN can produce high-quality synthetic samples, expanding the scale of the original dataset and improving the robustness and generalization ability of the model.
Therefore, realizing skin typing with GAN-based data augmentation has considerable application value. Generating additional synthetic data with a GAN enlarges the dataset without increasing the cost of data acquisition and improves the accuracy and reliability of skin typing. Analyzing and processing skin images with computer vision technology allows skin pathology types to be classified efficiently and provides important references for clinical diagnosis and treatment of skin pathologies.
Disclosure of Invention
Accordingly, one object of the present invention is to provide a skin typing data augmentation method based on a generative adversarial network, which uses an adversarial network model to augment skin typing data, can provide diverse samples even when the amount of original image data is small or the skin images are complex, assists medical judgment, and improves the efficiency of skin diagnosis.
One of the purposes of the invention is realized by the following technical scheme:
the method for enhancing skin typing data based on the antagonism generation network comprises the following steps:
step S1: collecting and preprocessing original skin image data;
step S2: constructing a generative adversarial network model to generate high-quality synthetic skin pathology image samples;
step S3: segmenting the generated skin pathology image and compositing it with a healthy skin image to realize data augmentation;
step S4: training the generative adversarial network model to expand the scale of the original skin pathology image dataset;
step S5: constructing a multi-branch feature extraction network to extract multi-level structural features of the skin pathology image and realize skin typing.
Further, the method for acquiring and preprocessing the original skin image data specifically comprises the following steps:
step S101: obtaining skin images of multiple types from channels such as medical image databases, online skin image databases, and large medical research institutions, wherein the main information of a skin image comprises: the appearance, shape, size, type, and classification of the skin lesion;
step S102: mapping the skin image into an undirected graph, regarding each pixel as a node and connecting pixel nodes to form edges, with the weight of each edge representing the texture difference between pixels; a gray-level co-occurrence matrix P is constructed to calculate the texture difference between different pixels, as follows:
In the above formula, i denotes the current pixel, j an adjacent pixel of pixel i, (x, y) the coordinates of the current pixel i, αx and βy the coordinate offsets between the two pixels, θ the number of occurrences of the pixel pair (i, j) in the image, W the width of the image, H the height of the image, and G_value(·) is the gray-level function returning the gray value at coordinates (x, y) in the image;
the sum of squared gray-value differences between pixels is then calculated with a texture feature function, reflecting the degree of texture difference between pixels, as follows:
D(i, j) = Σ_{i,j} [G_value(x, y) - G_value(x+αx, y+βy)]^2 × I(G_value(x, y) = i) × I(G_value(x+αx, y+βy) = j)
in the above formula, D(·) denotes the difference-degree function and I(·) an indicator function whose value is 1 when its argument is true and 0 otherwise;
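By way of illustration, a minimal NumPy sketch of the gray-level co-occurrence counting and of the squared-difference texture measure D(i, j) described above is given below; the pixel offset, the 8-bit gray range, and the function names are assumptions made for the example and are not part of the claimed method.

```python
import numpy as np

def gray_cooccurrence(img, dx=1, dy=0, levels=256):
    """Count how often the gray-level pair (i, j) occurs at offset (dx, dy)."""
    h, w = img.shape
    counts = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            counts[img[y, x], img[y + dy, x + dx]] += 1
    return counts

def texture_difference(counts):
    """D(i, j): squared gray-value difference (i - j)^2 accumulated over every occurrence of the pair (i, j)."""
    levels = counts.shape[0]
    i = np.arange(levels)[:, None]
    j = np.arange(levels)[None, :]
    return (i - j) ** 2 * counts

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    P = gray_cooccurrence(patch)
    D = texture_difference(P)
    print(P.sum(), D.shape)   # number of pixel pairs, (256, 256)
```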
according to a pixel texture difference measurement formula, calculating the weight of each edge in the undirected graph in the following manner:
step S103: two additional node sets, a source node set S and a sink node set T, are added to the constructed undirected graph and connected by edges to every vertex in the graph, and the minimum cost of dividing the nodes of the graph into two sets is calculated as follows:
Cost = min Σ_{(u,v)∈I, s∈S, t∈T} c(u, v)
in the above formula, min denotes taking the minimum value, (u, v) denotes any edge in the undirected graph, c(·) is the path-flow function, I the skin image, S the source node set, and T the sink node set; augmenting paths between the source and sink nodes are searched, the weight of each edge on a path being counted into that path's flow, and when no further augmenting path can be found the flow of the current path has reached its maximum, at which point the image is segmented along the maximum-flow cut so as to retain the pathological region of the image;
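A hedged sketch of the source/sink minimum-cut segmentation of step S103, built on networkx; the 4-neighbourhood connectivity, the capacity formula derived from the pixel difference, and the choice of seed pixels are illustrative assumptions rather than the exact construction of the patent.

```python
import networkx as nx
import numpy as np

def graph_cut_segment(img, fg_seeds, bg_seeds):
    """Split pixels into a pathology set and a background set with an s-t minimum cut."""
    h, w = img.shape
    g = nx.DiGraph()
    big = 1e9                                    # effectively infinite capacity for seed edges
    for r in range(h):
        for c in range(w):
            for dr, dc in ((0, 1), (1, 0)):      # 4-neighbourhood links
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    diff = float(img[r, c]) - float(img[rr, cc])
                    cap = 1.0 / (1.0 + diff * diff)   # similar pixels are expensive to cut apart
                    g.add_edge((r, c), (rr, cc), capacity=cap)
                    g.add_edge((rr, cc), (r, c), capacity=cap)
    for p in fg_seeds:
        g.add_edge("S", p, capacity=big)         # source terminal: pixels assumed to be lesion
    for p in bg_seeds:
        g.add_edge(p, "T", capacity=big)         # sink terminal: pixels assumed to be healthy skin
    _, (source_side, _) = nx.minimum_cut(g, "S", "T")
    mask = np.zeros_like(img, dtype=bool)
    for node in source_side:
        if node != "S":
            mask[node] = True
    return mask

if __name__ == "__main__":
    demo = np.zeros((16, 16), dtype=np.uint8)
    demo[4:12, 4:12] = 200                       # bright square standing in for a lesion
    m = graph_cut_segment(demo, fg_seeds=[(8, 8)], bg_seeds=[(0, 0)])
    print(int(m.sum()), "pixels kept as pathology")
```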
Step S104: the image is completed into a fixed-size 512×512 image block by linear interpolation, compensating for the non-uniform specifications and sizes of the segmented pictures, with the completion formula as follows:
Pixel'(x_m, y_m) = (1-U)·(1-V)·Pixel(x_m, y_m) + U·(1-V)·Pixel(x_m, y_m) + (1-U)·V·Pixel(x_m, y_m)
Pixel(x_m, y_m) = w_1·R + w_2·G + w_3·B
in the above formulas, (x_m, y_m) denotes the coordinates of a missing pixel, (x_0, y_0) the origin of coordinates, (x_r, y_r) the coordinates of a complete pixel near the randomly selected missing pixel, U the lateral offset of the missing pixel, and V its longitudinal offset; Pixel'(·) is an image gray-level estimation function and Pixel(·) an image gray-level calculation function, the gray value of the target pixel being computed as a weighted average in which w_1, w_2, and w_3 are the weights of the red, green, and blue components; the weights of the three colour components can be adjusted to the actual situation, provided they sum to 1;
step S105: the completed image is divided into blocks by traversing it with a sliding window and cropping it into fixed-size 256×256 images G_path; a corresponding label is added to each cropped image, taken from the original information if the original image carries label information, or added by manual identification if the label information is missing or incomplete.
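A sketch of the completion and block-cropping pre-processing of steps S104 and S105, assuming Pillow for the interpolated resize; the RGB weights, the sliding-window stride, and the helper names are placeholders for illustration.

```python
import numpy as np
from PIL import Image

RGB_WEIGHTS = (0.299, 0.587, 0.114)   # example weights; the patent allows them to be tuned, summing to 1

def complete_to_512(rgb_array):
    """Resize an arbitrarily sized RGB patch to 512x512 by bilinear interpolation
    and convert it to a weighted-average gray image."""
    img = Image.fromarray(rgb_array.astype(np.uint8)).resize((512, 512), Image.BILINEAR)
    rgb = np.asarray(img, dtype=np.float64)
    w1, w2, w3 = RGB_WEIGHTS
    return (w1 * rgb[..., 0] + w2 * rgb[..., 1] + w3 * rgb[..., 2]).astype(np.uint8)

def sliding_window_crop(gray512, size=256, stride=256):
    """Cut the completed image into fixed-size 256x256 blocks."""
    blocks = []
    for top in range(0, gray512.shape[0] - size + 1, stride):
        for left in range(0, gray512.shape[1] - size + 1, stride):
            blocks.append(gray512[top:top + size, left:left + size])
    return blocks

if __name__ == "__main__":
    fake = np.random.randint(0, 256, size=(300, 420, 3), dtype=np.uint8)
    patches = sliding_window_crop(complete_to_512(fake))
    print(len(patches), patches[0].shape)        # 4 patches of 256x256
```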
Further, constructing the generative adversarial network model to generate high-quality synthetic skin pathology image samples specifically comprises:
step S201: constructing a generator network with a symmetric structure to learn the potential features between input images and their corresponding reference images; the constructed generator network consists of 3 convolution layers, the first containing 32 convolution kernels of size 7×7, the second 64 kernels of size 5×5, and the third 128 kernels of size 3×3, each with the convolution stride set to 1; next, 3 residual network units are set, each consisting of one convolution layer and one ReLU activation function, the convolution layer containing 128 kernels of size 3×3 with stride 1; finally, 3 transposed convolution layers are set to realize feature up-sampling and image recovery, the first containing 128 kernels of size 3×3, the second 64 kernels of size 5×5, and the third 32 kernels of size 7×7, each with stride 1; feature fusion is then performed between the output features of the last transposed convolution layer and the input image to obtain the final output image G_gen, the convolution operation being expressed as follows:
In the above formula, K denotes the size of the convolution kernel, W the width of the image, H the height of the image, x and y the coordinate position within the output feature map of channel n, and w and h the element coordinates of the convolution-kernel weight matrix in channel n; the remaining terms are the feature map output by the current channel and the feature map input to the current channel;
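For reference, a hedged PyTorch sketch of the symmetric generator of step S201; the padding values, the input channel count, and the final tanh and 1×1 projection used to fuse the decoder output with the input image are assumptions needed to make the shapes work, since the patent only fixes the kernel sizes, kernel counts, and strides.

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """One residual unit: a 3x3 convolution (128 channels) plus ReLU, with a skip connection."""
    def __init__(self, channels=128):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return x + self.act(self.conv(x))

class Generator(nn.Module):
    def __init__(self, in_channels=1):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(in_channels, 32, 7, stride=1, padding=3), nn.ReLU(True),
            nn.Conv2d(32, 64, 5, stride=1, padding=2), nn.ReLU(True),
            nn.Conv2d(64, 128, 3, stride=1, padding=1), nn.ReLU(True),
        )
        self.residuals = nn.Sequential(*[ResidualUnit(128) for _ in range(3)])
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(128, 128, 3, stride=1, padding=1), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 5, stride=1, padding=2), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 7, stride=1, padding=3), nn.ReLU(True),
        )
        # Projection back to image channels so the decoder output can be fused (added)
        # with the input image; the patent only says "feature fusion", so this is an assumption.
        self.project = nn.Conv2d(32, in_channels, kernel_size=1)

    def forward(self, x):
        features = self.decode(self.residuals(self.encode(x)))
        return torch.tanh(self.project(features) + x)

if __name__ == "__main__":
    g = Generator()
    print(g(torch.randn(2, 1, 256, 256)).shape)   # torch.Size([2, 1, 256, 256])
```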
step S202: constructing a discriminator network that extracts different features from the input image and identifies different skin types; the convolution layers in the discriminator network all use 3×3 convolution kernels and each is followed by a batch normalization layer, the numbers of convolution kernels in the successive layers being 32, 64, 128, 256, and 1; the first four layers use a ReLU activation function, the last layer a Sigmoid activation function, and the network outputs a probability value indicating whether the corresponding input image is generated.
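A companion sketch of the five-layer discriminator of step S202; the strides used to shrink the feature map and the global average pooling before the final probability are assumptions, the patent fixing only the 3×3 kernels, the channel counts 32/64/128/256/1, the batch normalization, and the activations.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, in_channels=1):
        super().__init__()
        def block(cin, cout, stride):
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=1),
                nn.BatchNorm2d(cout),
                nn.ReLU(inplace=True),
            )
        self.features = nn.Sequential(
            block(in_channels, 32, 2),
            block(32, 64, 2),
            block(64, 128, 2),
            block(128, 256, 2),
        )
        self.head = nn.Sequential(
            nn.Conv2d(256, 1, kernel_size=3, stride=1, padding=1),
            nn.AdaptiveAvgPool2d(1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Probability that the input image is a real (not generated) skin image.
        return self.head(self.features(x)).flatten(1)

if __name__ == "__main__":
    d = Discriminator()
    print(d(torch.randn(2, 1, 256, 256)).shape)   # torch.Size([2, 1])
```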
Further, segmenting the generated skin pathology image and compositing it with a healthy skin image to realize data augmentation specifically comprises the following steps:
step S301: traversing the image G_gen to obtain the gray value of each pixel and counting the pixel frequency of each gray level so as to construct the gray-level histogram of G_gen, expressed as follows:
k(l) = f(l) × m^(-1)
in the above formula, k(l) denotes the relative frequency of pixels with gray level l in the image, f(l) the number of pixels with gray level l, and m the total number of pixels in the image;
step S302: calculating the inter-class variance corresponding to different gray-level thresholds and finding the threshold δ that maximizes the difference between the lesion texture and normal skin; based on the threshold δ, the image G_gen is converted into a binarized image G_bin, calculated as follows:
δ = max{δ_1, δ_2, …, δ_i}, i = h(l)
in the above formula, δ_i denotes the threshold for constructing a binarized image from the i-th gray level; for each candidate threshold δ_i the inter-class variance of the image is computed from the number of pixels set to gray level 0 and the number set to gray level 255 under δ_i; f(l) denotes the number of pixels with gray level l in the image, m the total number of pixels, h(l) the number of gray levels in the image, max{·} is the maximum operator, and δ is the gray-level threshold corresponding to the maximum inter-class variance;
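A NumPy sketch of steps S301 and S302: building the gray-level histogram, sweeping candidate thresholds for the largest between-class variance in the Otsu manner, and binarizing; treating values above the threshold as the lesion region is an assumption of the example.

```python
import numpy as np

def gray_histogram(img):
    """k(l): relative frequency of each gray level l in the image."""
    counts = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    return counts / img.size

def otsu_threshold(img):
    """Pick the threshold that maximises the between-class variance."""
    k = gray_histogram(img)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = k[:t].sum(), k[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * k[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * k[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(img):
    t = otsu_threshold(img)
    return np.where(img >= t, 255, 0).astype(np.uint8)

if __name__ == "__main__":
    g_gen = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)
    g_bin = binarize(g_gen)
    print(otsu_threshold(g_gen), np.unique(g_bin))
```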
step S303: in the image G_gen produced by the generator there is an obvious pixel difference between the healthy-skin part and the pathological part; the binarized image G_bin is used to segment G_gen into a finer pathology image G_acc, which is then input into the discriminator;
step S304: the pathological image generated by the generator network is composited with healthy skin; because the pixel difference between the edge region of the pathological image and the healthy skin is obvious, the edge of the pathological image is smoothed by mean filtering, with the edge mean-filter smoothing formula as follows:
G_i' = λ×G_i + (1-λ)×G_j
in the above formula, G_i' denotes the filtered pixel value, G_i the pixel value to be filtered, G_j the pixel value of a pixel adjacent to G_i, and λ the weight controlling the filtering of the image;
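A sketch of the compositing and edge smoothing of steps S303 and S304; the mask-based paste, the 3×3 neighbourhood used for the mean filter, and the restriction of the smoothing to the mask border are illustrative choices, with λ weighting the centre pixel as in the formula above.

```python
import numpy as np

def composite(pathology, healthy, mask):
    """Paste the binarized pathology region onto a healthy-skin image."""
    out = healthy.copy()
    out[mask > 0] = pathology[mask > 0]
    return out

def smooth_edges(img, mask, lam=0.625):
    """G_i' = lam * G_i + (1 - lam) * mean of the 3x3 neighbours, applied on the mask border."""
    img = img.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")
    neigh_sum = sum(padded[1 + dr:padded.shape[0] - 1 + dr, 1 + dc:padded.shape[1] - 1 + dc]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0))
    neigh_mean = neigh_sum / 8.0
    # Border of the pathology mask: mask pixels touching at least one non-mask pixel.
    m = mask > 0
    inner = m & np.roll(m, 1, 0) & np.roll(m, -1, 0) & np.roll(m, 1, 1) & np.roll(m, -1, 1)
    border = m & ~inner
    out = img.copy()
    out[border] = lam * img[border] + (1 - lam) * neigh_mean[border]
    return out.astype(np.uint8)

if __name__ == "__main__":
    healthy = np.full((64, 64), 180, dtype=np.uint8)
    lesion = np.full((64, 64), 60, dtype=np.uint8)
    mask = np.zeros((64, 64), dtype=np.uint8)
    mask[20:40, 20:40] = 255
    fused = smooth_edges(composite(lesion, healthy, mask), mask)
    print(fused.shape, fused.min(), fused.max())
```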
step S305: the composited image is processed at multiple scales with bilinear interpolation to build an image pyramid, ensuring that the composited image retains the main pathological features; the pixel position coordinates and pixel values are calculated as follows:
in the above formulas, l_i denotes the coordinates of the target point to be calculated in the image, x_src and y_src the abscissa and ordinate in the scaled image, x_dst and y_dst the abscissa and ordinate in the image before scaling, W_src and H_src the width and height of the scaled image, W_dst and H_dst the width and height of the image before scaling, G_i the pixel value of the target point to be calculated, G_j the pixel value of a pixel adjacent to the target point, and l_j the coordinates of that adjacent pixel in the image.
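A sketch of the multi-scale processing of step S305, implementing plain bilinear interpolation and stacking several scales into a pyramid; the scale factors are assumptions of the example.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resample a 2-D gray image to (out_h, out_w) with bilinear interpolation."""
    in_h, in_w = img.shape
    ys = (np.arange(out_h) + 0.5) * in_h / out_h - 0.5
    xs = (np.arange(out_w) + 0.5) * in_w / out_w - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, in_h - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, in_w - 2)
    wy = np.clip(ys - y0, 0.0, 1.0)[:, None]
    wx = np.clip(xs - x0, 0.0, 1.0)[None, :]
    a = img[y0][:, x0].astype(np.float64)          # top-left neighbours
    b = img[y0][:, x0 + 1].astype(np.float64)      # top-right neighbours
    c = img[y0 + 1][:, x0].astype(np.float64)      # bottom-left neighbours
    d = img[y0 + 1][:, x0 + 1].astype(np.float64)  # bottom-right neighbours
    top = a * (1 - wx) + b * wx
    bottom = c * (1 - wx) + d * wx
    return (top * (1 - wy) + bottom * wy).astype(img.dtype)

def image_pyramid(img, scales=(1.0, 0.5, 0.25)):
    h, w = img.shape
    return [bilinear_resize(img, int(h * s), int(w * s)) for s in scales]

if __name__ == "__main__":
    synth = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)
    for level in image_pyramid(synth):
        print(level.shape)
```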
Further, training the generative adversarial network model to expand the scale of the original skin pathology image dataset specifically comprises:
step S401: measuring with a residual loss function the dissimilarity between the generated image G_acc and the original image G_path; in the ideal case the generated image and the original image contain the same pixel blocks and the residual loss value is 0; the residual loss function is formulated as follows:
in the above formula, Loss_gen(·) denotes the residual loss function, P_G∈Data(G_path) the probability that the input image belongs to the real image data, s the number of images, and log(·) the logarithmic function;
step S402: discriminating with a feature loss function the feature difference between the generated image G_acc and the original image G_path, with the feature loss function formulated as follows:
Loss_dis(G_acc, G_path) = -[log(P_G∈Data(G_path)) + log(1 - P_G∈Gen(G_acc))]
in the above formula, Loss_dis(·) denotes the feature loss function and P_G∈Gen(G_acc) the probability that the input image belongs to a sample created by the generator;
step S403: the two loss functions are weighted and spliced, the target skin image is mapped into a latent space, and the parameters of the generator network and the discriminator network are updated with an Adam optimizer, helping the two networks reach convergence faster during alternating training and improving the stability and reliability of the model until convergence is finally achieved; the loss-function splicing formula is as follows:
Loss(G_acc, G_path) = α_1 × Loss_gen(G_acc, G_path) + β_1 × Loss_dis(G_acc, G_path)
in the above formula, α_1 is the parameter controlling the weight of the generator network's loss function and β_1 the parameter controlling the weight of the discriminator network's loss function, with α_1 + β_1 = 1.
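A hedged training-loop sketch of step S4, combining a pixel-level residual term and the adversarial term with the weights α_1 = 0.3 and β_1 = 0.7 used in the embodiment, and updating both networks alternately with Adam; the L1 form of the residual term, the learning rates, and the stand-in models in the demo are assumptions of the example.

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, g_opt, d_opt, real_batch, alpha1=0.3, beta1=0.7):
    """One alternating update of the discriminator and the generator."""
    eps = 1e-7

    # Discriminator: score real images high and generated images low.
    fake = generator(real_batch).detach()
    p_real, p_fake = discriminator(real_batch), discriminator(fake)
    d_loss = -(torch.log(p_real + eps) + torch.log(1 - p_fake + eps)).mean()
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: residual (pixel) term plus the adversarial term, weighted and summed.
    fake = generator(real_batch)
    p_fake = discriminator(fake)
    loss_gen = F.l1_loss(fake, real_batch)        # stands in for the residual dissimilarity term
    loss_dis = -torch.log(p_fake + eps).mean()    # adversarial feature term
    g_loss = alpha1 * loss_gen + beta1 * loss_dis
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

if __name__ == "__main__":
    import torch.nn as nn
    from torch.optim import Adam
    # Tiny stand-ins so the sketch runs on its own; in practice the Generator and
    # Discriminator sketches shown earlier would be used instead.
    G = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Tanh())
    D = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Sigmoid())
    g_opt, d_opt = Adam(G.parameters(), lr=2e-4), Adam(D.parameters(), lr=2e-4)
    print(train_step(G, D, g_opt, d_opt, torch.rand(2, 1, 64, 64)))
```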
Further, constructing a multi-branch feature extraction network to extract multi-level structural features of skin pathology images and realize skin typing specifically comprises the following steps:
step S501: constructing a multi-branch feature extraction network containing 4 residual blocks, where each residual block comprises a dilated convolution layer, a batch normalization layer, and an adaptive average pooling layer, with the dilated convolution calculated as follows:
in the above formula, q_i denotes the feature tensor of the image output by the i-th residual block, q_{i-1} the feature tensor output by the (i-1)-th residual block, z the size of the convolution kernel in the dilated convolution layer, p the padding parameter, b the stride, and η the dilation rate of the dilated convolution;
step S502: feature fusion is carried out on the image features of different layers, so that important features of skin pathology images are prevented from being lost, the accuracy of skin typing is improved, and the computing mode of feature fusion is as follows:
in the above formula, Q denotes the feature tensor output after feature fusion, Q_i the feature tensor output by the i-th residual block, and σ_i the feature weight corresponding to the i-th residual block;
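A PyTorch sketch of the multi-branch extractor and weighted feature fusion of steps S501 and S502; the dilation rates 1, 4, 9, and 16 follow the embodiment described later, while the stem convolution, the learnable fusion weights σ_i, and the pooled output size are assumptions.

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """3x3 dilated convolution plus batch normalization, with a residual skip connection."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3,
                              padding=dilation, dilation=dilation)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        return x + torch.relu(self.bn(self.conv(x)))

class MultiBranchExtractor(nn.Module):
    def __init__(self, in_channels=1, channels=32, dilations=(1, 4, 9, 16), pooled=8):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, channels, kernel_size=3, padding=1)
        self.branches = nn.ModuleList(DilatedResidualBlock(channels, d) for d in dilations)
        self.pool = nn.AdaptiveAvgPool2d(pooled)
        # sigma_i: one fusion weight per branch, learned here rather than fixed.
        self.sigma = nn.Parameter(torch.ones(len(dilations)) / len(dilations))

    def forward(self, x):
        x = torch.relu(self.stem(x))
        feats = [self.pool(branch(x)) for branch in self.branches]
        fused = sum(w * f for w, f in zip(torch.softmax(self.sigma, dim=0), feats))
        return fused.flatten(1)    # fused feature tensor Q, one vector per image

if __name__ == "__main__":
    net = MultiBranchExtractor()
    print(net(torch.randn(2, 1, 256, 256)).shape)   # torch.Size([2, 2048])
```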
step S503: normalizing and encoding the fused feature tensor Q according to the min-max principle, mapping the numerical range of the tensor elements into the interval [0,1], with the normalization calculated as follows:
in the above formula, Q' denotes the normalized feature tensor, min(Q) the minimum of all elements in the tensor Q, and max(Q) the maximum of all elements in the tensor Q;
step S504: selecting a kernel function, training a support vector machine model to classify the feature tensor Q', obtaining the class label of the target skin image, and completing the skin typing.
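A sketch of steps S503 and S504: min-max normalization of the fused features followed by a Gaussian (RBF) kernel support vector machine from scikit-learn; the synthetic feature matrix and the four placeholder class labels are assumptions of the example.

```python
import numpy as np
from sklearn.svm import SVC

def min_max_normalize(q):
    """Map every element of the fused feature tensor into [0, 1]."""
    q_min, q_max = q.min(), q.max()
    return (q - q_min) / (q_max - q_min + 1e-12)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder fused feature vectors (rows) for four skin pathology types (labels 0-3).
    features = min_max_normalize(rng.normal(size=(200, 2048)))
    labels = rng.integers(0, 4, size=200)
    clf = SVC(kernel="rbf", gamma="scale")       # Gaussian kernel SVM
    clf.fit(features[:160], labels[:160])
    print("held-out accuracy:", clf.score(features[160:], labels[160:]))
```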
A second object of the present invention is to provide a system for skin typing data augmentation based on a generative adversarial network, comprising a memory, a processor, and a computer program stored on the memory and capable of running on the processor, the processor implementing the method described above when executing the computer program.
It is a further object of the invention to provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described above.
The beneficial effects of the invention are as follows:
(1) The method of the invention realizes the augmentation of skin typing data with a generative adversarial network model; it can provide diversified samples when the amount of original image data is small or the skin images are complex, greatly reduces overfitting of the predictive model without compromising data quality, and provides effective clinical decision support for medical staff diagnosing a patient's skin condition;
(2) When pathological images are produced by segmentation, the invention adopts an image binarization method to deal with the large size of the generated images; binarizing the image pixels separates the skin pathology region from the healthy region accurately and intuitively, making it easier to extract the salient features of the pathological region; moreover, the binarization method has low time complexity, which helps reduce the time cost of training the generative adversarial network and ensures the timeliness of the skin typing results;
(3) When synthesizing pathological images, the invention adopts a mean filtering method to deal with image roughness; smoothing is achieved by averaging the neighbourhood pixels around each pixel, effectively reducing noisy pixels so that the whole image appears smoother, which helps the generative adversarial network produce more realistic sample images and ensures the accuracy of the subsequent skin typing;
(4) The invention adds extra synthetic samples when training the generative adversarial network: the pathological skin images produced by the generator are composited with healthy skin images into additional new samples, increasing the diversity of the training data, providing more sufficient examples for a specific type of skin pathology dataset, and improving both the generalization ability of the network and its ability to generate skin pathology types;
(5) The invention uses a multi-branch network structure when extracting features at different levels of the image; each branch focuses on a different receptive field, so local and global feature information in the image can be captured effectively, and adjustable feature weights are used during feature fusion, which improves the expressive power of the image representation better than fixed weights and effectively improves the accuracy of the skin typing task.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the skin typing data augmentation method and system based on a generative adversarial network according to the present invention;
FIG. 2 is a schematic diagram of raw data of an image of skin according to an embodiment of the present invention;
FIG. 3 is a schematic view of maximum flow path image segmentation in accordance with the present invention;
FIG. 4 is a schematic diagram of a network of building generators embodying the present invention;
FIG. 5 is a schematic diagram of a network of construction discriminators embodying the present invention;
FIG. 6 is a schematic diagram of a binary image constructed in accordance with an embodiment of the present invention;
FIG. 7 is a diagram illustrating a binary image segmentation according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a smoothed composite image with mean filtering in accordance with the present invention;
FIG. 9 is a schematic diagram of scaling image features using an image pyramid strategy according to the present invention.
Detailed Description
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. It should be understood that the preferred embodiments are presented by way of illustration only and not by way of limitation.
The invention discloses a skin typing data augmentation method and system based on a generative adversarial network which, as shown in figure 1, comprises the following steps:
step S1: collecting and preprocessing original skin image data;
step S2: constructing a generative adversarial network model to generate high-quality synthetic skin pathology image samples;
step S3: segmenting the generated skin pathology image and compositing it with a healthy skin image to realize data augmentation;
step S4: training the generative adversarial network model to expand the scale of the original skin pathology image dataset;
step S5: constructing a multi-branch feature extraction network to extract multi-level structural features of the skin pathology image and realize skin typing.
The specific steps of the above method will be further elucidated by means of a specific embodiment.
In this embodiment, the step S1 specifically includes the following steps:
step S101: collecting skin images containing various types from various channels such as a medical image database, a network skin image database, a large medical scientific research institution and the like, storing the skin images into a MongoDB database, and establishing a skin image database, wherein main information of the skin images comprises: skin lesion appearance, skin lesion shape, skin lesion size, skin lesion type, skin lesion classification, as shown in fig. 2, wherein the data acquisition channels include, but are not limited to, medical image databases, network skin image databases, large medical scientific research institutions, etc.;
step S102: mapping the skin image into an undirected graph, regarding each pixel as a node and connecting pixel nodes to form edges, with the weight of each edge representing the texture difference between pixels; a gray-level co-occurrence matrix P is constructed to calculate the texture difference between different pixels, as follows:
In the above formula, i denotes the current pixel, j an adjacent pixel of pixel i, (x, y) the coordinates of the current pixel i, αx and βy the coordinate offsets between the two pixels, θ the number of occurrences of the pixel pair (i, j) in the image, W the width of the image, H the height of the image, and G_value(·) is the gray-level function returning the gray value at coordinates (x, y) in the image;
the sum of squared gray-value differences between pixels is then calculated with a texture feature function, reflecting the degree of texture difference between pixels, as follows:
D(i, j) = Σ_{i,j} [G_value(x, y) - G_value(x+αx, y+βy)]^2 × I(G_value(x, y) = i) × I(G_value(x+αx, y+βy) = j)
in the above formula, D(·) denotes the difference-degree function and I(·) an indicator function whose value is 1 when its argument is true and 0 otherwise;
according to a pixel texture difference measurement formula, calculating the weight of each edge in the undirected graph in the following manner:
step S103: two additional node sets, a source node set S and a sink node set T, are added to the constructed undirected graph and connected by edges to every vertex in the graph, and the minimum cost of dividing the nodes of the graph into two sets is calculated as follows:
Cost = min Σ_{(u,v)∈I, s∈S, t∈T} c(u, v)
in the above formula, min denotes taking the minimum value, (u, v) denotes any edge in the undirected graph, c(·) is the path-flow function, I the skin image, S the source node set, and T the sink node set; augmenting paths between the source and sink nodes are searched, the weight of each edge on a path being counted into that path's flow, and when no further augmenting path can be found the flow of the current path has reached its maximum, at which point the image is segmented along the maximum-flow cut so as to retain the pathological region of the image, as shown in fig. 3;
Step S104: the image is completed into a fixed-size 512×512 image block by linear interpolation, compensating for the non-uniform specifications and sizes of the segmented pictures, with the completion formula as follows:
Pixel'(x_m, y_m) = (1-U)·(1-V)·Pixel(x_m, y_m) + U·(1-V)·Pixel(x_m, y_m) + (1-U)·V·Pixel(x_m, y_m)
Pixel(x_m, y_m) = 0.299R + 0.587G + 0.114B
in the above formulas, (x_m, y_m) denotes the coordinates of a missing pixel, (x_0, y_0) the origin of coordinates, (x_r, y_r) the coordinates of a complete pixel near the randomly selected missing pixel, U the lateral offset of the missing pixel, and V its longitudinal offset; Pixel'(·) is an image gray-level estimation function and Pixel(·) an image gray-level calculation function, the gray value of the target pixel being computed as a weighted average in which the weight of the red component R is set to 0.299, the weight of the green component G to 0.587, and the weight of the blue component B to 0.114; the weights of the three colour components can be adjusted to the actual situation, provided they sum to 1;
step S105: the completed image is divided into blocks by traversing it with a sliding window and cropping it into fixed-size 256×256 images G_path; a corresponding label is added to each cropped image, taken from the original information if the original image carries label information, or added by manual identification if the label information is missing or incomplete; specifically, if an original image is manually identified as squamous cell skin cancer, a skin pathology type label with the value 'squamous cell skin cancer' is manually added to the image.
In this embodiment, the step S2 specifically includes the following steps:
step S201: constructing a generator network with a symmetric structure to learn the potential features between input images and their corresponding reference images; the constructed generator network consists of 3 convolution layers, the first containing 32 convolution kernels of size 7×7, the second 64 kernels of size 5×5, and the third 128 kernels of size 3×3, each with the convolution stride set to 1; next, 3 residual network units are set, each consisting of one convolution layer and one ReLU activation function, the convolution layer containing 128 kernels of size 3×3 with stride 1; finally, 3 transposed convolution layers are set to realize feature up-sampling and image recovery, the first containing 128 kernels of size 3×3, the second 64 kernels of size 5×5, and the third 32 kernels of size 7×7, each with stride 1; feature fusion is then performed between the output features of the last transposed convolution layer and the input image to obtain the final output image G_gen, as shown in fig. 4, the convolution operation being expressed as follows:
In the above formula, K denotes the size of the convolution kernel, W the width of the image, H the height of the image, x and y the coordinate position within the output feature map of channel n, and w and h the element coordinates of the convolution-kernel weight matrix in channel n; the remaining terms are the feature map output by the current channel and the feature map input to the current channel;
step S202: constructing a discriminator network consisting of 5 convolution layers to extract different features from the input image and identify different skin types; the convolution layers in the discriminator network all use 3×3 convolution kernels and each is followed by a batch normalization layer, the numbers of convolution kernels in the successive layers being 32, 64, 128, 256, and 1; the first four layers use a ReLU activation function, the last layer a Sigmoid activation function, and the network outputs a probability value indicating whether the corresponding input image is generated, as shown in fig. 5.
The step S3 specifically comprises the following steps:
step S301: traversing the image G_gen to obtain the gray value of each pixel and counting the pixel frequency of each gray level so as to construct the gray-level histogram of G_gen, expressed as follows:
k(l) = f(l) × m^(-1)
in the above formula, k(l) denotes the relative frequency of pixels with gray level l in the image, f(l) the number of pixels with gray level l, and m the total number of pixels in the image; for example, if the number of pixels with gray level 0 is f(0) = 28 and the image contains m = 625 pixels, the relative frequency of gray level 0 is k(0) = 0.0448;
step S302: calculating the inter-class variance corresponding to different gray-level thresholds and solving for the threshold δ that maximizes the difference between the lesion texture and normal skin; based on the threshold δ, the image G_gen is converted into a binarized image G_bin, calculated as follows:
δ = max{δ_1, δ_2, …, δ_i}, i = h(l)
in the above formula, δ_i denotes the threshold for constructing a binarized image from the i-th gray level; for each candidate threshold δ_i the inter-class variance of the image is computed from the number of pixels set to gray level 0 and the number set to gray level 255 under δ_i; f(l) denotes the number of pixels with gray level l in the image, m the total number of pixels, h(l) the number of gray levels in the image, max{·} is the maximum operator, and δ is the gray-level threshold corresponding to the maximum inter-class variance, as shown in fig. 6;
Step S303: in the image G_gen produced by the generator there is an obvious pixel difference between the healthy-skin part and the pathological part; the binarized image G_bin is used to segment G_gen into a finer pathology image G_acc, which is then input into the discriminator, as shown in fig. 7;
step S304: the pathological image generated by the generator network is composited with healthy skin; because the pixel difference between the edge region of the pathological image and the healthy skin is obvious, the edge of the pathological image is smoothed by mean filtering, with the edge mean-filter smoothing formula as follows:
G_i' = λ×G_i + (1-λ)×G_j
in the above formula, G_i' denotes the filtered pixel value, G_i the pixel value to be filtered, G_j the pixel value of a pixel adjacent to G_i, and λ the weight controlling the filtering of the image, here set to 0.625; specifically, if the pixel value G_i to be filtered is 220 and its adjacent pixel value G_j is 176, the filtered pixel value G_i' is 203.5, as shown in fig. 8;
step S305: the composited image is processed at multiple scales with bilinear interpolation to build an image pyramid, ensuring that the composited image retains the main pathological features; the pixel position coordinates and pixel values are calculated as follows:
in the above formulas, l_i denotes the coordinates of the target point to be calculated in the image, x_src and y_src the abscissa and ordinate in the scaled image, x_dst and y_dst the abscissa and ordinate in the image before scaling, W_src and H_src the width and height of the scaled image, W_dst and H_dst the width and height of the image before scaling, G_i the pixel value of the target point to be calculated, G_j the pixel value of a pixel adjacent to the target point, and l_j the coordinates of that adjacent pixel in the image, as shown in fig. 9.
The step S4 specifically comprises the following steps:
step S401: measuring with a residual loss function the dissimilarity between the generated image G_acc and the original image G_path; in the ideal case the generated image and the original image contain the same pixel blocks and the residual loss value is 0; the residual loss function is formulated as follows:
in the above formula, Loss_gen(·) denotes the residual loss function, P_G∈Data(G_path) the probability that the input image belongs to the real image data, s the number of images, and log(·) the logarithmic function;
step S402: discriminating with a feature loss function the feature difference between the generated image G_acc and the original image G_path, with the feature loss function formulated as follows:
Loss_dis(G_acc, G_path) = -[log(P_G∈Data(G_path)) + log(1 - P_G∈Gen(G_acc))]
in the above formula, Loss_dis(·) denotes the feature loss function and P_G∈Gen(G_acc) the probability that the input image belongs to a sample created by the generator;
step S403: the two loss functions are weighted and spliced, the target skin image is mapped into a latent space, and the parameters of the generator network and the discriminator network are updated with an Adam optimizer, helping the two networks reach convergence faster during alternating training and improving the stability and reliability of the model until convergence is finally achieved; the loss-function splicing formula is as follows:
Loss(G_acc, G_path) = α_1 × Loss_gen(G_acc, G_path) + β_1 × Loss_dis(G_acc, G_path)
in the above formula, α_1 is the parameter controlling the weight of the generator network's loss function, here taking the value 0.3, and β_1 the parameter controlling the weight of the discriminator network's loss function, here taking the value 0.7, with α_1 + β_1 = 1; specifically, step S401 gives Loss_gen(G_acc, G_path) = 26.523, step S402 gives Loss_dis(G_acc, G_path) = 1.046, and finally Loss(G_acc, G_path) = 8.689.
The step S5 specifically comprises the following steps:
step S501: constructing a multi-branch feature extraction network containing 4 residual blocks, where each residual block comprises a dilated convolution layer, a batch normalization layer, and an adaptive average pooling layer, with the dilated convolution calculated as follows:
in the above formula, q_i denotes the feature tensor of the image output by the i-th residual block, q_{i-1} the feature tensor output by the (i-1)-th residual block, z the size of the convolution kernel in the dilated convolution layer, p the padding parameter, b the stride, and η the dilation rate of the dilated convolution; here a 3×3 convolution kernel is chosen to extract smaller local pathological features, the padding parameter is set to 'same' with a value of 1, the stride is set to 1 so that the output image size is unchanged, and the dilation rates of the 4 residual blocks take the values 1, 4, 9, and 16 respectively;
step S502: feature fusion is carried out on the image features of different layers, so that important features of skin pathology images are prevented from being lost, the accuracy of skin typing is improved, and the computing mode of feature fusion is as follows:
in the above formula, Q denotes the feature tensor output after feature fusion, Q_i the feature tensor output by the i-th residual block, and σ_i the feature weight corresponding to the i-th residual block;
step S503: and (3) carrying out normalized encoding processing on the fusion characteristic tensor Q according to a minimum maximization principle, mapping the element numerical range of the characteristic tensor into the interval of [0,1], and carrying out normalization according to the following calculation mode:
that is, Q' = (Q - min(Q)) / (max(Q) - min(Q)), where Q' represents the normalized feature tensor, min(Q) represents the minimum value of all elements in the tensor Q, and max(Q) represents the maximum value of all elements in the tensor Q;
Step S504: a Gaussian kernel is selected as the kernel function of a support vector machine, which learns a suitable decision boundary from the tensor features and labels so as to maximize the margin between different classes; the trained support vector machine model is then used to classify the feature tensor Q', obtaining the class label of the target skin image and completing the skin typing.
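Steps S503 and S504 could be sketched with NumPy and scikit-learn as follows; the flattening of the tensors and the SVM hyperparameters C and gamma are assumptions not specified above.

```python
import numpy as np
from sklearn.svm import SVC

def min_max_normalize(q: np.ndarray) -> np.ndarray:
    """Step S503: map all elements of the fused feature tensor into [0, 1]."""
    return (q - q.min()) / (q.max() - q.min() + 1e-12)

def train_and_classify(features, labels):
    """Step S504 sketch: Gaussian-kernel SVM over the normalized, flattened tensors.
    `features` (list of arrays) and `labels` are assumed inputs; C and gamma are
    assumptions rather than values given in the text."""
    X = np.stack([min_max_normalize(f).ravel() for f in features])
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")   # RBF kernel = Gaussian kernel
    clf.fit(X, labels)
    return clf, clf.predict(X)                      # class labels for the skin images
```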
It is noted that the present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and embodiments of the present invention have been described in detail with reference to specific examples, which are provided only to facilitate understanding of the method and core ideas of the present invention; likewise, those skilled in the art will appreciate that the present invention can be practiced with variations in specific details and in particular embodiments or ranges of application, and such variations are not to be construed as limitations on the invention.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.
Claims (8)
1. A method of skin typing data augmentation based on an antagonism generation network, characterized by: the method comprises the following steps:
step S1: collecting and preprocessing original skin image data;
step S2: constructing an antagonism generation network model to generate a high-quality synthetic skin pathology image sample;
step S3: segmenting the generated skin pathology image and synthesizing it with the healthy skin image to realize data augmentation;
step S4: training the antagonism generation network model to expand the scale of the original skin pathology image dataset;
step S5: and constructing a multi-branch feature extraction network to extract multi-level structural features of the skin pathology image and realize skin typing.
2. A method of skin typing data augmentation based on an antagonism generation network as recited in claim 1, wherein: the step S1 specifically includes:
step S101: obtaining skin images of multiple types from channels such as medical image databases, online skin image databases and large medical research institutions, wherein the main information of the skin images comprises: the appearance of skin lesions, the shape of skin lesions, the size of skin lesions, the type of skin lesions and the classification of skin lesions;
Step S102: mapping the skin image into an undirected graph, regarding each pixel as a node and connecting adjacent pixel nodes to form edges, the weight of each edge representing the texture difference between the pixels; a gray-level co-occurrence matrix P is constructed to calculate the texture differences between different pixels, in the following manner:
In the above formula, i represents the current pixel, j represents a pixel adjacent to pixel i, (x, y) represents the coordinates of the current pixel i, αx and βy represent the coordinate offsets between the two pixels, θ represents the occurrence frequency of the pixel pair (i, j) in the image, W represents the width of the image, H represents the height of the image, and G_value(·) is a gray-scale calculation function giving the gray value at coordinates (x, y) in the image;
The sum of squared gray-value differences between pixels is then calculated using the texture feature function, to reflect the degree of texture difference between pixels, in the following manner:

D(i, j) = Σ_{i,j} [G_value(x, y) - G_value(x + αx, y + βy)]² × I(G_value(x, y) = i) × I(G_value(x + αx, y + βy) = j)

In the above formula, D(·) represents the difference-degree calculation function, and I(·) represents the indicator function, which takes the value 1 when its condition is true and 0 otherwise;
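For illustration only, the squared gray-value-difference statistic D(i, j) described above could be computed as in the following NumPy sketch, assuming 8-bit gray values and a single offset (αx, βy):

```python
import numpy as np

def texture_difference(gray: np.ndarray, dx: int = 1, dy: int = 0) -> np.ndarray:
    """D(i, j): sum of squared gray differences over pixel pairs whose gray levels
    are (i, j) at offset (dx, dy); `gray` is assumed to hold values in 0..255."""
    h, w = gray.shape
    d = np.zeros((256, 256), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            i, j = gray[y, x], gray[y + dy, x + dx]
            d[i, j] += float(int(i) - int(j)) ** 2   # squared gray-value difference
    return d
```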
according to a pixel texture difference measurement formula, calculating the weight of each edge in the undirected graph in the following manner:
step S103: two additional node sets are added in the constructed undirected graph, namely a source node set S and a sink node set T, and are connected with each vertex in the undirected graph to form an edge, and the minimum cost for dividing the nodes in the graph into two sets is calculated by the following calculation mode:
Cost = min Σ_{(u,v)∈I, s∈S, t∈T} c(u, v)

In the above formula, min denotes taking the minimum value, u and v denote the endpoints of any edge in the undirected graph, c(·) denotes the path-flow calculation function, I denotes the skin image, S denotes the source node set, and T denotes the sink node set; by searching for augmenting paths between the source node and the sink node, the weight of each edge on a path is counted into the flow of that augmenting path, and when no further augmenting path can be found the flow of the current path has reached its maximum value, whereupon the image is segmented according to the maximum-flow path so as to retain the pathological area in the image;
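As a hedged sketch of the source/sink minimum-cut segmentation of step S103, the max-flow/min-cut computation could be delegated to networkx as below; the edge capacities are assumed to come from the texture-difference weights of step S102, and the terminal link capacities are assumptions.

```python
import networkx as nx

def min_cut_segmentation(edges, source_links, sink_links):
    """edges: {(u, v): capacity} between pixel nodes (weights from step S102);
    source_links / sink_links: {pixel: capacity} to the added S and T node sets."""
    g = nx.DiGraph()
    for (u, v), cap in edges.items():
        g.add_edge(u, v, capacity=cap)   # the undirected pixel graph is modelled
        g.add_edge(v, u, capacity=cap)   # as two opposite directed edges
    for p, cap in source_links.items():
        g.add_edge("S", p, capacity=cap)
    for p, cap in sink_links.items():
        g.add_edge(p, "T", capacity=cap)
    cost, (src_side, sink_side) = nx.minimum_cut(g, "S", "T")  # max-flow = min-cut cost
    return cost, src_side - {"S"}        # pixels kept as the pathological region
```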
step S104: and (3) complementing the image into a distorted image block with a fixed size of 512 multiplied by 512 by adopting a linear interpolation method, and compensating the defect that the specifications and the sizes of the segmented pictures are not uniform, wherein the complementing calculation formula is as follows:
Pixel'(x_m, y_m) = (1 - U) × (1 - V) × Pixel(x_0, y_0) + U × (1 - V) × Pixel(x_r, y_0) + (1 - U) × V × Pixel(x_0, y_r) + U × V × Pixel(x_r, y_r)
Pixel(x_m, y_m) = w_1 × R + w_2 × G + w_3 × B
In the above formulas, (x_m, y_m) represents the coordinates of the missing pixel, (x_0, y_0) represents the coordinate origin, (x_r, y_r) represents the coordinates of a complete pixel near the randomly selected missing pixel, U represents the lateral offset of the missing pixel, V represents the longitudinal offset of the missing pixel, Pixel'(·) represents the image gray-scale estimation function, and Pixel(·) is the image gray-scale calculation function; a weighted average is used to calculate the gray value of the target pixel, with w_1, w_2 and w_3 respectively denoting the weights of the red, green and blue components; it should be noted that the weights of the three color components can be adjusted according to the actual situation, and their total is 1;
step S105: the completed image is divided into blocks by traversing it with a sliding window and cutting it into images G_path of fixed size 256 × 256; corresponding labels are added to the cut images: if the original image contains label information, the labels are added according to that original information, and if label information is lacking or missing, specific label information is added by manual identification.
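A minimal sketch of the preprocessing in steps S104 and S105, assuming OpenCV, a non-overlapping sliding window, and that the label is simply carried alongside each 256 × 256 crop:

```python
import cv2
import numpy as np

def complete_and_tile(patch: np.ndarray, label: str):
    """Step S104: bring the segmented patch to a fixed 512x512 size via bilinear
    interpolation; step S105: slide a 256x256 window to cut fixed-size images G_path."""
    full = cv2.resize(patch, (512, 512), interpolation=cv2.INTER_LINEAR)
    tiles = []
    for y in range(0, 512, 256):          # non-overlapping stride is an assumption
        for x in range(0, 512, 256):
            tiles.append((full[y:y + 256, x:x + 256], label))  # crop plus its label
    return tiles
```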
3. A method of skin typing data augmentation based on an antagonism generation network as recited in claim 1, wherein: the step S2 specifically includes:
step S201: a generator network with a symmetrical structure is constructed to learn the latent features between input images and their corresponding reference images; the constructed generator network consists of 3 convolution layers, where the first convolution layer contains 32 convolution kernels of size 7 × 7 with the convolution stride set to 1, the second convolution layer contains 64 convolution kernels of size 5 × 5 with the convolution stride set to 1, and the third convolution layer contains 128 convolution kernels of size 3 × 3 with the convolution stride set to 1; next, 3 residual network units each consisting of 1 convolution layer and 1 ReLU activation function are set, where the convolution layer in each residual network unit contains 128 convolution kernels of size 3 × 3 with the convolution stride set to 1; finally, 3 transposed convolution layers are set to realize feature sampling and image recovery, where the first transposed convolution layer contains 128 convolution kernels of size 3 × 3 with the convolution stride set to 1, the second transposed convolution layer contains 64 convolution kernels of size 5 × 5 with the convolution stride set to 1, and the third transposed convolution layer contains 32 convolution kernels of size 7 × 7 with the convolution stride set to 1; the output features of the last transposed convolution layer are fused with the input image to obtain the final output image G_gen, and the convolution operation is expressed as follows:
In the above formula, the left-hand term denotes the feature map output by the current channel, K represents the size of the convolution kernel, W represents the width of the image, H represents the height of the image, the convolved term denotes the feature map input to the current channel, x and y represent the coordinate position in the output feature map of channel n, and w and h represent the element coordinates of the convolution kernel weight matrix in channel n;
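For illustration, the generator of step S201 could be sketched in PyTorch as follows; the padding values, the final 1 × 1 projection back to three channels, and the use of addition for the final feature fusion with the input image are assumptions not stated above.

```python
import torch
import torch.nn as nn

class ResUnit(nn.Module):
    """Residual unit of step S201: one 3x3 convolution (128 kernels, stride 1) + ReLU."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(128, 128, 3, stride=1, padding=1)

    def forward(self, x):
        return x + torch.relu(self.conv(x))

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(                 # three convolution layers
            nn.Conv2d(3, 32, 7, 1, 3), nn.ReLU(),
            nn.Conv2d(32, 64, 5, 1, 2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, 1, 1), nn.ReLU())
        self.res = nn.Sequential(ResUnit(), ResUnit(), ResUnit())
        self.decode = nn.Sequential(                 # three transposed convolutions
            nn.ConvTranspose2d(128, 128, 3, 1, 1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 5, 1, 2), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 7, 1, 3), nn.ReLU())
        self.out = nn.Conv2d(32, 3, 1)               # assumed 1x1 projection to RGB

    def forward(self, x):
        y = self.decode(self.res(self.encode(x)))
        return self.out(y) + x                       # fuse with the input image (assumed as addition)
```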
step S202: a discriminator network consisting of 5 convolution layers is constructed to extract different features from the input image and identify different types of skin; convolution kernels of size 3 × 3 are selected in the discriminator network, a batch normalization layer is added to each convolution layer, the first four convolution layers use a ReLU activation function and the last convolution layer uses a Sigmoid activation function, and the network outputs the probability that the corresponding input image is a generated image.
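Likewise, a hedged sketch of the five-layer discriminator of step S202; the channel widths, the strides, and the global average pooling used to produce a single probability per image are assumptions.

```python
import torch.nn as nn

def make_discriminator(channels=(3, 32, 64, 128, 256, 1)):
    """Step S202 sketch: five 3x3 convolution layers, each with batch normalization,
    ReLU on the first four and Sigmoid on the last; widths/strides are assumptions."""
    layers = []
    for i in range(5):
        layers.append(nn.Conv2d(channels[i], channels[i + 1], 3, stride=2, padding=1))
        layers.append(nn.BatchNorm2d(channels[i + 1]))
        layers.append(nn.ReLU() if i < 4 else nn.Sigmoid())
    # Average the per-location probabilities into one value per image (an assumption).
    return nn.Sequential(*layers, nn.AdaptiveAvgPool2d(1), nn.Flatten())
```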
4. A method of skin typing data augmentation based on an antagonism generation network as recited in claim 1, wherein: the step S3 specifically includes:
step S301: the image G_gen is traversed to obtain the corresponding gray values, and the pixel frequency of each gray value is counted to construct the gray-level histogram of image G_gen, expressed as follows:

k(l) = f(l) × m^(-1)

In the above formula, k(l) represents the frequency (proportion) of pixels with gray level l in the image, f(l) represents the number of occurrences of pixels with gray level l in the image, and m represents the number of pixels in the image;
step S302: the inter-class variances corresponding to different gray-level thresholds are calculated to obtain the threshold δ that maximizes the difference between the lesion texture and normal skin, and the image G_gen is converted into a binarized image G_bin according to the threshold δ; the calculation is as follows:

δ = max{δ_1, δ_2, …, δ_i}, i = h(l)

In the above formula, δ_i represents the threshold for constructing a binarized image based on the i-th gray level; the remaining symbols of the formula denote, respectively, the inter-class variance of the image at threshold δ_i, the number of pixels with gray level 0 at threshold δ_i, and the number of pixels with gray level 255 at threshold δ_i; f(l) represents the occurrence frequency of pixels with gray level l in the image, m represents the number of pixels in the image, h(l) represents the number of gray levels in the image, max{·} is the maximum operator, and δ represents the gray-level threshold corresponding to the maximum inter-class variance;
step S303: in the image G_gen obtained by the generator there is an obvious pixel difference between the skin part and the pathological part, so the binarized image G_bin is used to segment the image G_gen into a finer pathology image G_acc, which is then input into the discriminator;
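Steps S301 to S303 follow the familiar between-class-variance (Otsu-style) thresholding; a NumPy sketch is given below, assuming 8-bit gray values and that pixels above the threshold δ form the pathological region.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Steps S301-S302 sketch: build the gray histogram and pick the threshold
    delta maximizing the between-class variance (standard Otsu form)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2          # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def segment_pathology(g_gen_gray: np.ndarray, g_gen: np.ndarray) -> np.ndarray:
    """Step S303 sketch: binarize with delta and keep only the pathological pixels;
    which side of the threshold is 'pathology' is an assumption."""
    delta = otsu_threshold(g_gen_gray)
    g_bin = (g_gen_gray > delta).astype(np.uint8)      # binarized image G_bin
    return g_gen * g_bin[..., None]                    # finer pathology image G_acc
```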
step S304: the pathological image generated by the generator network is synthesized with the healthy skin; since the pixel difference between the edge area of the pathological image and the healthy skin is obvious, the edge of the pathological image is smoothed by mean filtering, and the edge mean-filtering smoothing formula is as follows:

G_i' = λ × G_i + (1 - λ) × G_j

In the above formula, G_i' represents the filtered pixel value, G_i represents the pixel value to be filtered, G_j represents the pixel value of a pixel adjacent to G_i, and λ represents the weight controlling the image filtering;
step S305: the synthesized image is processed in a multi-scale mode by utilizing a bilinear interpolation method, an image pyramid structure is constructed, the synthesized image can be ensured to keep main pathological features, and the position coordinates of pixels and the pixel values are calculated as follows:
In the above formulas, l_i represents the coordinate value in the image of the target point to be calculated, x_src represents the abscissa of the scaled image, y_src represents the ordinate of the scaled image, x_dst represents the abscissa of the image before scaling, y_dst represents the ordinate of the image before scaling, W_src represents the width of the scaled image, W_dst represents the width of the image before scaling, H_src represents the height of the scaled image, H_dst represents the height of the image before scaling, G_i represents the pixel value of the target point to be calculated, G_j represents the pixel value of a pixel adjacent to the target point to be calculated, and l_j represents the coordinate value in the image of a pixel adjacent to the target point to be calculated.
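Steps S304 and S305 could be sketched with OpenCV as follows; the binary pathology mask, the width of the edge band, the 3 × 3 mean filter, and the number of pyramid levels are assumptions.

```python
import cv2
import numpy as np

def blend_and_pyramid(pathology: np.ndarray, healthy: np.ndarray, mask: np.ndarray,
                      lam: float = 0.5, levels: int = 3):
    """Step S304 sketch: paste the pathology onto healthy skin and smooth the edge
    band with a lambda-weighted mean filter; step S305 sketch: build a
    bilinear-interpolation image pyramid of the composite."""
    composite = np.where(mask[..., None] > 0, pathology, healthy)
    blurred = cv2.blur(composite, (3, 3))                       # neighbourhood mean G_j
    edge = cv2.dilate(mask, np.ones((5, 5), np.uint8)) - mask   # assumed edge band
    composite = np.where(edge[..., None] > 0,
                         (lam * composite + (1 - lam) * blurred).astype(composite.dtype),
                         composite)
    pyramid = [composite]
    for _ in range(levels):
        h, w = pyramid[-1].shape[:2]
        pyramid.append(cv2.resize(pyramid[-1], (w // 2, h // 2),
                                  interpolation=cv2.INTER_LINEAR))
    return pyramid
```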
5. A method of skin typing data augmentation based on an antagonism generation network as recited in claim 1, wherein: the step S4 specifically includes:
step S401: the residual loss function is used to measure the dissimilarity between the generated image G_acc and the original image G_path; ideally the pixel patches of the generated image are the same as those of the original image and the residual loss value is 0; the residual loss function is formulated as follows:

In the above formula, Loss_gen(·) represents the residual loss function, P_{G∈Data}(G_path) represents the probability that the input image belongs to the real image data, s represents the number of images, and log(·) represents the logarithmic function;

step S402: the feature loss function is used to discriminate the feature difference between the generated image G_acc and the original image G_path; the feature loss function is formulated as follows:

Loss_dis(G_acc, G_path) = -[log(P_{G∈Data}(G_path)) + log(1 - P_{G∈Gen}(G_acc))]

In the above formula, Loss_dis(·) represents the feature loss function, and P_{G∈Gen}(G_acc) represents the probability that the input image belongs to the samples created by the generator;
step S403: the two loss functions are weighted and spliced, the target skin image is mapped to a latent space, and the parameters of the generator network and the discriminator network are updated with the Adam optimizer, which helps the two networks reach a convergence state faster during alternate training, improves the stability and reliability of the model, and finally achieves convergence; the loss function splicing formula is as follows:

Loss(G_acc, G_path) = α_1 × Loss_gen(G_acc, G_path) + β_1 × Loss_dis(G_acc, G_path)

In the above formula, α_1 represents the parameter controlling the weight of the loss function corresponding to the generator network, β_1 represents the parameter controlling the weight of the loss function corresponding to the discriminator network, and α_1 and β_1 sum to 1.
6. A method of skin typing data augmentation based on an antagonism generation network as recited in claim 1, wherein: the step S5 specifically includes:
step S501: constructing a multi-branch feature extraction network containing 4 residual blocks, wherein each residual block comprises a cavity convolution layer, a batch normalization layer and a self-adaptive average pooling layer, and the cavity convolution is calculated as follows:
In the above formula, q_i represents the feature tensor of the output image of the i-th residual block, q_{i-1} represents the feature tensor of the output image of the (i-1)-th residual block, z represents the size of the convolution kernel in the cavity convolution layer, p represents the padding parameter, b represents the stride, and η represents the dilation rate of the cavity convolution;
step S502: feature fusion is carried out on the image features of different layers, so that important features of skin pathology images are prevented from being lost, the accuracy of skin typing is improved, and the computing mode of feature fusion is as follows:
In the above formula, Q represents the feature tensor output after feature fusion, q_i represents the feature tensor output by the i-th residual block, and σ_i represents the feature weight corresponding to the i-th residual block;
step S503: the fused feature tensor Q is normalized and encoded according to the min-max principle, mapping the numerical range of the tensor elements into the interval [0, 1]; the normalization is calculated as follows:
in the above formula, Q' represents the normalized characteristic tensor, min (Q) represents the minimum value of all elements in the tensor Q, and max (Q) represents the maximum value of all elements in the tensor Q;
step S504: a kernel function is selected to train a support vector machine model, which classifies the feature tensor Q' to obtain the class label of the target skin image and complete the skin typing.
7. A system for skin typing data augmentation based on a countermeasure generation network, comprising a memory, a processor, and a computer program stored on the memory and capable of running on the processor, characterized by: the processor, when executing the computer program, implements the method of any of claims 1-6.
8. A computer-readable storage medium having stored thereon a computer program, characterized by: the computer program implementing the method according to any of claims 1-6 when executed by a processor.