CN113642385A - Deep learning-based facial nevus identification method and system - Google Patents

Deep learning-based facial nevus identification method and system Download PDF

Info

Publication number
CN113642385A
CN113642385A (application CN202110748237.8A)
Authority
CN
China
Prior art keywords
face
image
mole
nevus
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110748237.8A
Other languages
Chinese (zh)
Other versions
CN113642385B (en)
Inventor
陆华
谢柯
张华
李登旺
黄浦
许化强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University filed Critical Shandong Normal University
Priority to CN202110748237.8A priority Critical patent/CN113642385B/en
Publication of CN113642385A publication Critical patent/CN113642385A/en
Application granted granted Critical
Publication of CN113642385B publication Critical patent/CN113642385B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep learning-based facial nevus recognition method and system. A VGG16 convolutional neural network is built, its original fully connected layers are discarded, and training parameters are optimized to obtain a task-specific fully connected layer, yielding a complete model. All convolutional and fully connected layers of the model are then fine-tuned with an augmented training set, the parameters are optimized with an optimizer, and a validation set is used to verify the model's generalization ability; the model with the best parameters is selected as the final facial nevus recognition model. After face detection on an input image, the face is cropped and partitioned into overlapping blocks, and the facial nevus recognition model predicts each block separately, meeting the requirements of nevus detection.

Description

Deep learning-based facial nevus identification method and system
Technical Field
The invention belongs to the field of deep learning, and particularly relates to a method and a system for identifying facial nevus based on deep learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Driven by the continuous development and innovation of science and technology, biometric recognition has made major breakthroughs. Biometric techniques include fingerprint, retinal, iris, and face recognition. Because it is convenient, user-friendly, and accurate, face recognition is widely applied in security screening, missing-person searches, identity registration, public-space surveillance, and similar settings. With higher-resolution sensors, larger face image databases, and improved image processing and computer vision algorithms, researchers are no longer satisfied with the current state of face recognition and instead pursue more accurate, fine-grained recognition of facial micro-features. Therefore, to effectively improve face recognition and retrieval performance, research is under way on using facial micro-features (such as moles, freckles, and scars) to refine and assist face recognition.
Detection and identification of facial moles generally face three problems. First, lighting, shadow, and pose changes: on one hand, a mole in certain facial regions (such as the corner of the eye, the corner of the mouth, the nostrils, or wrinkles) can appear unclear in a photograph, making it hard to distinguish and reducing detection accuracy; on the other hand, facial expressions and changes between frontal and side views during photographing also affect mole recognition. Second, in pursuit of better-looking photos, people use beautification tools that often apply Gaussian blur, particularly to mole regions; the processed photos suffer from distortion and similar problems, which hinders mole recognition. Third, other facial features (such as pigmented spots, birthmarks, hair, and beards), worn accessories (such as nose rings, lip studs, and eyebrow piercings), and cosmetics (such as concealer and foundation) can interfere with mole detection and identification. Therefore, mole identification must consider the detailed characteristics of the mole, whose uniqueness distinguishes it from the rest of the face, as well as the stability of the identification, with repeated detection to ensure accuracy.
Disclosure of Invention
To solve the above problems, the invention provides a deep learning-based facial nevus identification method and system. A VGG16 convolutional neural network is built, its original fully connected layers are discarded, and training parameters are optimized to obtain a task-specific fully connected layer, yielding a complete model. All convolutional and fully connected layers of the model are fine-tuned with an augmented training set, the parameters are optimized with an optimizer, and a validation set is used to verify the model's generalization ability; the model with the best parameters is selected as the final facial nevus recognition model. After face detection on an input image, the face is cropped and partitioned into overlapping blocks, and the facial nevus recognition model predicts each block separately, meeting the requirements of nevus detection.
According to some embodiments, the invention adopts the following technical scheme:
the facial nevus identification method based on deep learning comprises the following steps:
acquiring a picture containing a human face;
detecting a face in the picture by using the HOG characteristics and cutting to obtain a face image;
dividing a face image into a plurality of image blocks with overlapped edges;
inputting the image blocks into a face mole identification model, predicting the mole-containing probability of each image block, and positioning the mole-containing image blocks in the face image; the acquisition process of the face nevus recognition model is as follows: extracting the features of the pictures in the enhanced training set by using a feature extraction layer of the VGG16 convolutional neural network; constructing a face nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full-connection layer; constructing the feature extraction layer and the full connection layer to form a complete face mole identification model; and fine-tuning the complete face mole recognition model to obtain a final face mole recognition model.
Further, the specific steps of fine tuning the complete face mole recognition model are as follows:
loading a complete face nevus recognition model;
fine-tuning all the convolution layers and all the connection layers in the model by using an enhanced training set, and optimizing parameters by using an optimizer;
and (5) adopting the verification set to verify data, and selecting an optimal parameter model as a final face nevus recognition model.
Further, the enhanced training set is obtained by sequentially applying normalization preprocessing, image rotation, horizontal shift, vertical shift, random shear transformation, random zoom, random horizontal flipping, and pixel filling to the images in the training set.
Further, a preprocessing operation is performed before the image block is input into the face mole recognition model, and the preprocessing operation specifically includes:
adjusting the size of an image block;
converting the adjusted image blocks into arrays;
adding a batch dimension at position 0 of the array and normalizing the data.
Further, the specific steps of detecting the face in the picture by using the HOG features are as follows:
converting the picture to grayscale, and then normalizing the color space of the picture with a gamma transform;
calculating the horizontal gradient, vertical gradient, gradient direction, and magnitude of each pixel in the picture using first-order differences;
dividing the picture into a plurality of small squares, dividing the gradient direction of each small square into a plurality of direction blocks, and calculating the number of different gradient directions to obtain the feature vector of each small square;
forming a sliding window by a plurality of adjacent small squares, and connecting the feature vectors of all the small squares in the sliding window in series to obtain the HOG feature of the sliding window;
scanning the picture by using a sliding window, setting scanning step length, performing sliding scanning in a mode of overlapping a plurality of pixels, collecting HOG characteristics of all sliding windows connected in series to obtain HOG characteristic vectors of the face in the picture;
and inputting the obtained HOG feature vector into an SVM model to obtain the face in the picture.
Further, the specific steps of cutting to obtain the face image are as follows:
extracting a plurality of key points of the face in the picture;
dividing a plurality of key points into different key point sets;
and calculating the cutting position of the face image based on the key point set, and cutting to obtain the face image.
Deep learning based facial nevus recognition system, comprising:
the data acquisition module is used for acquiring a picture containing a human face;
the face image cutting module is used for detecting the face in the picture by using the HOG characteristics and cutting the face to obtain a face image;
the overlapping and blocking module is used for dividing the face image into a plurality of image blocks with overlapped edges;
the face mole identification module is used for inputting the image blocks into the face mole identification model, predicting the mole-containing probability of each image block and positioning the mole-containing image block in the face image; the acquisition process of the face nevus recognition model is as follows: extracting the features of the pictures in the enhanced training set by using a feature extraction layer of the VGG16 convolutional neural network; constructing a face nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full-connection layer; constructing the feature extraction layer and the full connection layer to form a complete face mole identification model; and fine-tuning the complete face mole recognition model to obtain a final face mole recognition model.
An electronic device comprising a memory and a processor, and computer instructions stored on the memory and executed on the processor, the computer instructions, when executed by the processor, performing the steps of the method of the first aspect.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method of the first aspect.
The invention has the beneficial effects that:
1. Based on HOG features and the overlapping blocking technique, the method automatically establishes a facial nevus image database and enriches the database samples.
2. The invention adopts data augmentation, i.e., the original data set undergoes image transformations such as cropping, scaling, rotation, translation, and flipping; the slightly altered images are treated as additional data samples, which increases the number of samples and effectively expands the data set.
3. The invention adopts the convolutional base of the VGG16 model. The VGG16 network has about 138 million parameters; it is a very deep, large-scale network that maintains high classification accuracy and shows very good robustness in transfer learning.
4. By fine-tuning the five convolutional modules of the VGG16 convolutional base together with the fully connected layer, the invention effectively suppresses overfitting, accelerates model convergence, simplifies model training, and effectively improves model accuracy.
5. The invention considers the particularity of nevi: with uniform blocking, a nevus may be split across two blocks and destroyed, so overlapping blocking is adopted to ensure the nevus recognition rate.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and together with the description serve to explain, not to limit, the invention.
Fig. 1 is a flowchart of a method for identifying a facial mole based on deep learning according to the present invention;
FIG. 2 is a flowchart of the face detection based on HOG feature according to the present invention;
fig. 3(a) is an original image for detecting a human face based on HOG features;
FIG. 3(b) is a facial feature map of the present application based on HOG features;
FIG. 3(c) is a face image cropped by the present application;
FIG. 4 is an image of the present application after overlapping blocking;
fig. 5(a) is a sample library containing nevi;
fig. 5(b) is a non-nevus sample library;
FIG. 6 is a flow chart of the present invention for initializing the VGG16 model;
FIG. 7 is a diagram of the image dimensions as the image passes through the layers of the customized neural network of the present invention;
FIG. 8 is a diagram of a post-fine-tuning VGG16 model of the present invention;
FIG. 9 is a neural network diagram of the VGG16 fine tuning model of the present invention;
FIG. 10(a) is a graph of model accuracy obtained for the model after fine tuning in accordance with the present invention;
FIG. 10(b) shows the loss curve results obtained for the model after fine tuning according to the present invention;
FIG. 11 is a confusion matrix for a fine-tuning model test set according to the present invention;
fig. 12(a) shows, in the detection flow for predicting face moles, the detected face with the facial features marked by dots;
fig. 12(b) is the face image obtained after cropping in the detection flow for predicting face moles;
fig. 12(c) is the face image obtained after overlapping blocking in the detection flow for predicting face moles;
fig. 12(d) shows the predicted face mole locations in the detection flow for predicting face moles;
fig. 13(a) is an original image for detecting a single-mole picture using the MoleRec.h5 model;
fig. 13(b) is an effect diagram of detecting a single-mole picture using the MoleRec.h5 model;
fig. 14(a) is an original image for detecting a multi-mole picture using the MoleRec.h5 model;
fig. 14(b) is an effect diagram of detecting a multi-mole picture using the MoleRec.h5 model.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
The deep-learning-based facial mole identification method provided by this embodiment proceeds as follows. First, a database composed of mole-containing and mole-free pictures is established. Second, a VGG16 convolutional neural network is built; its original fully connected layers are discarded, and training parameters are optimized to obtain a task-specific fully connected layer, yielding a complete model. The built VGG16 model is initialized, and on that basis the five convolutional modules and the task-specific fully connected layer are fine-tuned to obtain a more accurate convolutional neural network model. Third, the deep learning network is trained on the augmented training set and the preprocessed validation set to obtain a trained model, which is then evaluated on the test set database to verify its generalization ability. Finally, moles in the face are detected: after face detection on a mole-containing face image, the face is cropped and partitioned into blocks, and the trained fine-tuned model predicts each block separately, meeting the requirements of mole detection. In this way, the method effectively detects facial moles and helps improve the identification performance of mole recognition systems.
A method for recognizing a facial mole based on deep learning, as shown in fig. 1, specifically includes the following steps:
the method comprises the following steps: the method comprises the following steps of constructing a mole image database, namely establishing a database consisting of mole-containing pictures and mole-free pictures, wherein the database has two sources: firstly, detecting a face by using HOG characteristics and combining a dlib library, and performing face cutting and overlapping partitioning to obtain an area block containing face nevus and an area block containing no nevus; and secondly, searching, downloading and cutting the mole-containing picture from the network. The first database establishment procedure is mainly described below:
S101: performing face detection by using the HOG features in combination with the dlib library, and performing face cropping and overlapping blocking;
as shown in fig. 2, the specific steps of detecting a face in a picture by using the HOG features are as follows:
(1) inputting a color face image, and setting the face sliding window to 196 × 196;
(2) converting the image to grayscale, i.e., treating the image as a two-dimensional function of (x, y);
(3) normalizing the color space of the input image with a gamma transform:
H(x, y) = H(x, y)^γ

this adjusts the image contrast, reduces the influence of local shadows and illumination, and suppresses noise interference;
(4) computing the image gradient with first-order differences to obtain the face contour information and further reduce illumination interference. Let the horizontal gradient of pixel (x, y) be Hx(x, y), the vertical gradient Hy(x, y), the gradient direction θ(x, y), and the magnitude H(x, y); the calculation formulas are:

Hx(x, y) = H(x + 1, y) - H(x - 1, y)

Hy(x, y) = H(x, y + 1) - H(x, y - 1)

H(x, y) = √(Hx(x, y)² + Hy(x, y)²)

θ(x, y) = arctan(Hy(x, y) / Hx(x, y))
(5) dividing the picture into a plurality of small squares, dividing the gradient direction of each small square into a plurality of direction blocks, and counting the different gradient directions to obtain the feature vector of each small square; preferably, the image is divided into small squares (cells) of 16 × 16 pixels, and the gradient information of each 16 × 16 cell is accumulated into a 9-bin histogram, i.e., the 360° of gradient directions in each cell is divided into 9 direction blocks of 40° each; counting the gradients falling into each direction block yields a 9-dimensional feature vector per cell;
(6) forming a sliding window by a plurality of adjacent small squares, and connecting the feature vectors of all the small squares in the sliding window in series to obtain the HOG feature of the sliding window; preferably, every four adjacent cells (i.e. every 2 × 2 cells) form a sliding window (block), and the 36-dimensional HOG features of the block can be obtained by connecting 9-dimensional feature vectors of all cells in the block in series;
(7) scanning the picture by using a sliding window, setting scanning step length, performing sliding scanning in a mode of overlapping a plurality of pixels, collecting HOG characteristics of all sliding windows connected in series to obtain HOG characteristic vectors of the face in the picture; preferably, the input image is scanned by using blocks, that is, the scanning step length is set to 8 pixels, and sliding scanning is performed by overlapping 8 pixels, so that the HOG features of all blocks connected in series are collected, and the HOG feature vector of the face in the image is obtained;
(8) inputting the obtained HOG feature vector into an SVM model to obtain the face in the picture, i.e., predicting the HOG features of the face to be recognized with the pretrained SVM model in the dlib library.
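As an illustrative sketch, the above pipeline corresponds to dlib's pretrained frontal face detector, which implements exactly this HOG-plus-linear-SVM scheme; the Python sketch below assumes OpenCV for image loading, and the file name is illustrative:

```python
import cv2
import dlib

# dlib's pretrained frontal face detector implements the HOG + linear SVM
# pipeline of steps (1)-(8); it builds the HOG pyramid and scans the
# sliding windows internally.
detector = dlib.get_frontal_face_detector()

image = cv2.imread("face.jpg")                  # illustrative file name
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # step (2): grayscale

faces = detector(gray, 1)  # second argument upsamples once for small faces
for rect in faces:
    print("face: top={}, right={}, bottom={}, left={}".format(
        rect.top(), rect.right(), rect.bottom(), rect.left()))
```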
S102: carrying out face cutting and overlapping blocking, and specifically comprising the following steps:
firstly, extracting a plurality of key points of the face in the picture; preferably, 68 facial landmarks are extracted with the shape_predictor_68_face_landmarks.dat model in the dlib library, the position of the face is determined from these 68 key points, and four different key point sets are defined, i.e., the key points are divided into different sets, specifically:
X_brow: the abscissas of the 11 key points on the eyebrows;

Y_brow: the ordinates of the 11 key points on the eyebrows;

X_chin: the abscissas of the 16 key points on the chin;

Y_chin: the ordinates of the 16 key points on the chin;
fig. 3 shows the 68 facial landmarks extracted with the shape_predictor_68_face_landmarks.dat model of the dlib library after the face has been detected with the HOG features combined with dlib, i.e., after the flow of fig. 2;
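A minimal Python sketch of this landmark extraction, assuming the standard dlib 68-point layout (jaw points 0..15 and eyebrow points 17..27 in zero-based indexing are used here to match the 16-point and 11-point set sizes given above; the exact index sets are an assumption):

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

image = cv2.imread("face.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 1):
    shape = predictor(gray, rect)
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    # Assumed index ranges: 11 eyebrow points (17..27) and 16 jaw/chin
    # points (0..15), matching the set sizes given in the text.
    brow_x = [x for x, y in points[17:28]]
    brow_y = [y for x, y in points[17:28]]
    chin_x = [x for x, y in points[0:16]]
    chin_y = [y for x, y in points[0:16]]
```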
secondly, calculating the clipping position of the face image based on the key point set, specifically, calculating top, bottom, left and right of the face through the following formulas:
[The formulas for top, bottom, left, and right are rendered as images in the source; they compute the face boundaries from the four key point sets, where x1 and x2 are the sixth and third points of the eyebrow abscissa set X_brow.]
outputting the accurate position of the face in the image through the top, bottom, left and right coordinates, cutting the face to obtain a face image, and printing the position and the size of the face;
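Since the boundary formulas themselves are only available as images in the source, the following sketch is a hypothetical reconstruction that derives the four boundaries from the landmark sets; the margin choice is an assumption, not the patent's formula:

```python
# Hypothetical crop computation (continuing from the landmark sketch above);
# the true top/bottom/left/right formulas are images in the source and not
# recoverable here, so the forehead margin below is an illustrative assumption.
top = max(min(brow_y) - (max(chin_y) - min(brow_y)) // 4, 0)
bottom = max(chin_y)
left = max(min(chin_x), 0)
right = max(chin_x)

face = image[top:bottom, left:right]
print("face position:", (top, bottom, left, right), "size:", face.shape)
```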
The cropped face image is then overlap-blocked, i.e., divided into a plurality of image blocks with overlapping edges according to the set block size and stride. Preferably, the abscissa and ordinate strides are both set to 15 pixels; the ordinate moves first, and when it reaches the end of a line, the abscissa advances by its stride, until the whole image is blocked. For example, the first block is image[0:19, 0:19] and the second is image[15:34, 0:19].
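A minimal sketch of this overlapping blocking, matching the 19 × 19 block size and 15-pixel stride of the example:

```python
def overlap_blocks(face, block=19, step=15):
    """Split a face image into edge-overlapping blocks.

    The first index moves first, as in the example in the text; with
    block=19 and step=15, adjacent blocks share a 4-pixel overlap, so a
    mole lying on a block boundary is still fully contained in some block.
    """
    blocks = []
    for j in range(0, face.shape[1] - block + 1, step):
        for i in range(0, face.shape[0] - block + 1, step):
            blocks.append(face[i:i + block, j:j + block])
    return blocks
```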
S103: displaying and storing the block image, specifically as follows:
(1) displaying the block image of S102;
(2) establishing a folder to uniformly store the block images in a PNG format;
(3) classifying the images into mole-containing and mole-free categories.
Fig. 4 shows the cropped face image after the overlapping blocking processing.
The database of fig. 5 comes from two sources: first, detecting faces with HOG features combined with the dlib library, then cropping and overlap-blocking to obtain region blocks with and without facial moles; second, searching for, downloading, and cropping mole-containing pictures from the Internet. In total, 1,200 samples were established and divided into a training set of 600 samples, a validation set of 300 samples, and a test set of 300 samples, with no overlap among the sets. Mole-containing and mole-free pictures are treated as positive and negative samples respectively, in equal numbers within each set.
Step two: building a VGG16 model, wherein the specific process is as follows:
S201: discarding the fully connected layers of the VGG16 convolutional neural network, and using the feature extraction layers to extract and store features of the training, validation, and test set pictures;
S202: constructing a facial nevus recognition classifier, loading the stored features, and training and optimizing the parameters to obtain a specific fully connected layer;
S203: combining the feature extraction layers and the specific fully connected layer to form a complete face mole recognition model.
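In Keras terms (which the fit_generator usage in the later embodiment suggests), steps S201 to S203 can be sketched as follows; note that the stock Keras VGG16 uses max pooling, whereas the embodiment below describes average pooling, and the ImageNet weights are an assumption:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# S201: the VGG16 feature extraction layers; include_top=False discards the
# original fully connected layers.
conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(32, 32, 3))

# S202-S203: the task-specific fully connected head on top of the conv base.
model = models.Sequential([
    conv_base,
    layers.Flatten(),                       # 512 x 1 x 1 feature map -> 1 x 512
    layers.Dense(256, activation="relu"),   # Fc6: 1 x 256 output
    layers.Dense(1, activation="sigmoid"),  # one predicted mole probability
])
```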
Step three: the VGG16 model is initialized, as shown in fig. 6, in the following specific process:
S301: initializing the VGG16 network model as follows:
(1) setting the size of the input color training pictures to 32 × 32 × 3;
(2) the images pass through the network layers shown in fig. 7 in sequence:
Conv1_1: the dimension of the convolution kernel is 3 × 3 × 3, the stride is 1, Padding is 1, and the dimension of the output feature map is 64 × 32 × 32;
convolution kernel calculation formula:
Wout = (Win - Wfilter + 2P) / S + 1

Hout = (Hin - Hfilter + 2P) / S + 1

where Wout is the output image width, Hout the output image height, Win the input image width, Hin the input image height, Wfilter the convolution kernel width, Hfilter the convolution kernel height, P (i.e., Padding) the number of pixel layers padded at the image border, and S (i.e., Stride) the step size. For Conv1_1, for example, Wout = (32 - 3 + 2 × 1) / 1 + 1 = 32.
Conv1_2: the dimension of the convolution kernel is 3 × 3 × 64, the stride is 1, Padding is 1, and the dimension of the output feature map is 64 × 32 × 32.
Avgpool1: an average pooling layer with a 2 × 2 sliding window, stride 2, and window dimension 2 × 2 × 64; the dimension of the output feature map is 64 × 16 × 16.
The pooling layer calculation formulas are:

Wout = (Win - Wfilter) / S + 1

Hout = (Hin - Hfilter) / S + 1

Conv2_1: the dimension of the convolution kernel is 3 × 3 × 64, the stride is 1, and Padding is 1. The dimension of the output feature map is 128 × 16 × 16.
Conv2_2: the dimension of the convolution kernel is 3 × 3 × 128, the stride is 1, and Padding is 1. The dimension of the output feature map is 128 × 16 × 16.
Avgpool2: an average pooling layer with a 2 × 2 sliding window, stride 2, and window dimension 2 × 2 × 128. The dimension of the output feature map is 128 × 8 × 8.
Conv3_1: the dimension of the convolution kernel is 3 × 3 × 128, the stride is 1, Padding is 1, and the dimension of the output feature map is 256 × 8 × 8.
Conv3_2: the dimension of the convolution kernel is 3 × 3 × 256, the stride is 1, Padding is 1, and the dimension of the output feature map is 256 × 8 × 8.
Conv3_3: the dimension of the convolution kernel is 3 × 3 × 256, the stride is 1, Padding is 1, and the dimension of the output feature map is 256 × 8 × 8.
Avgpool3: an average pooling layer with a 2 × 2 sliding window, stride 2, and window dimension 2 × 2 × 256; the dimension of the output feature map is 256 × 4 × 4.
Conv4_1: the dimension of the convolution kernel is 3 × 3 × 256, the stride is 1, Padding is 1, and the dimension of the output feature map is 512 × 4 × 4.
Conv4_2: the dimension of the convolution kernel is 3 × 3 × 512, the stride is 1, Padding is 1, and the dimension of the output feature map is 512 × 4 × 4.
Conv4_3: the dimension of the convolution kernel is 3 × 3 × 512, the stride is 1, Padding is 1, and the dimension of the output feature map is 512 × 4 × 4.
Avgpool4: an average pooling layer with a 2 × 2 sliding window, stride 2, and window dimension 2 × 2 × 512; the dimension of the output feature map is 512 × 2 × 2.
Conv5_1: the dimension of the convolution kernel is 3 × 3 × 512, the stride is 1, Padding is 1, and the dimension of the output feature map is 512 × 2 × 2.
Conv5_2: the dimension of the convolution kernel is 3 × 3 × 512, the stride is 1, Padding is 1, and the dimension of the output feature map is 512 × 2 × 2.
Conv5_3: the dimension of the convolution kernel is 3 × 3 × 512, the stride is 1, Padding is 1, and the dimension of the output feature map is 512 × 2 × 2.
Avgpool5: an average pooling layer with a 2 × 2 sliding window, stride 2, and window dimension 2 × 2 × 512; the dimension of the output feature map is 512 × 1 × 1.
Flatten: flattening the multidimensional input to one dimension, i.e., the 512 × 1 × 1 input feature map is flattened into 1 × 512 data.
Fc6: a fully connected layer outputting data of dimension 1 × 256.
Sigmoid: passing the output of the fully connected layer through a Sigmoid function to obtain a single predicted value.
S302: performing data enhancement operation on the data set, specifically as follows:
performing data enhancement operation on the training set to obtain an enhanced training set:
(1) carrying out normalization preprocessing on an input data image;
(2) image rotation, namely rotating the image, and setting the deflection angle to be 10 degrees;
(3) horizontally shifting the image, namely horizontally shifting the image, wherein the shift position is 0.05 cm;
(4) vertically shifting the image, namely vertically shifting the image, wherein the shift position is 0.05 cm;
(5) randomly and staggeredly cutting the change image, and setting the change angle to be 0.05 degrees;
(6) randomly zooming the image, and setting a zooming range of 0.05 cm;
(7) randomly and horizontally turning the picture, namely randomly and horizontally turning the picture;
(8) pixel filling, namely filling pixels, adopts a nearest neighbor filling method.
Data augmentation is not applied to the validation and test sets; only the image normalization preprocessing is performed on them.
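A sketch of the S302 augmentation settings expressed as Keras ImageDataGenerator arguments; mapping the listed offsets onto these fractional ranges is an assumption:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings of S302 expressed as ImageDataGenerator arguments;
# the fractional interpretation of the 0.05 offsets is an assumption.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # (1) normalization
    rotation_range=10,        # (2) rotation, deflection angle 10 degrees
    width_shift_range=0.05,   # (3) horizontal shift
    height_shift_range=0.05,  # (4) vertical shift
    shear_range=0.05,         # (5) random shear
    zoom_range=0.05,          # (6) random zoom
    horizontal_flip=True,     # (7) random horizontal flip
    fill_mode="nearest",      # (8) nearest-neighbour pixel filling
)

# Validation and test sets are only normalized, as noted above.
test_datagen = ImageDataGenerator(rescale=1.0 / 255)
```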
The database samples are then prepared for training: the image size is unified to 32 × 32, the sample batch_size is set to 32, and labels are returned as binary arrays, which improves the network's generalization ability.
S303: training a database sample, and performing the following operations on a training set, a verification set and a test set:
(1) giving a data sample file path;
(2) the uniform image size is 32 × 32;
(3) setting sample data batch _ size to 32;
(4) return to binary label array form because the loss function used is a binary cross entropy function.
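A sketch of S303 using the generators from S302; the directory layout (one subfolder per class) is an assumption:

```python
# Loading the database samples (S303); the directory layout is assumed,
# with one subfolder per class (mole / non-mole).
train_generator = train_datagen.flow_from_directory(
    "data/train",            # (1) data sample file path (illustrative)
    target_size=(32, 32),    # (2) unify the image size to 32 x 32
    batch_size=32,           # (3) sample batch_size of 32
    class_mode="binary",     # (4) binary labels for binary cross-entropy
)
validation_generator = test_datagen.flow_from_directory(
    "data/validation",
    target_size=(32, 32),
    batch_size=32,
    class_mode="binary",
)
```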
Step four: fine-tuning the complete face mole recognition model; on the basis of the initialized VGG16 model, the five convolutional modules of the VGG16 convolutional base and the fully connected layer are fine-tuned, as shown in fig. 8. The specific process is as follows:
S401: loading the complete face mole recognition model, i.e., the VGG16 convolutional base and the fully connected layers;
S402: fine-tuning (Fine-tune) all convolutional layers and all fully connected layers with the training data set and the validation set, and optimizing the parameters with the RMSprop optimizer (Root Mean Square Propagation); the training set is divided into n batches of m samples each, and the learning rate lr is set to 1e-5, as follows:
An optimizer is used to optimize the parameters; specifically, the gradients of the weight W and the bias b are smoothed with an exponentially weighted average of their squares. In the t-th iteration:

S_dW = β·S_dW + (1 - β)·dW²

S_db = β·S_db + (1 - β)·db²

W = W - lr·dW / (√S_dW + ε)

b = b - lr·db / (√S_db + ε)
where S_dW and S_db are the gradient momenta accumulated by the loss function over the previous t - 1 iterations, β is the exponential decay rate of the gradient accumulation, and ε is a smoothing term, typically taken as 10⁻⁸.
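Transcribed directly into code, the update for W reads as follows (a sketch; β = 0.9 and ε = 1e-8 are the common defaults, lr = 1e-5 as set in S402):

```python
import numpy as np

def rmsprop_step(W, dW, S_dW, lr=1e-5, beta=0.9, eps=1e-8):
    # Exponentially weighted average of the squared gradient, then the
    # smoothed parameter update, exactly as in the formulas above.
    S_dW = beta * S_dW + (1 - beta) * dW ** 2
    W = W - lr * dW / (np.sqrt(S_dW) + eps)
    return W, S_dW
```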
S403: verifying data by adopting a verification set, namely training a model by utilizing a fit _ generator function, specifically as follows:
(1) setting step _ per _ epoch to 20, i.e., 20 generators need to be executed in each epoch to produce data;
(2) setting the epochs to 50, namely training is iterated for 50 times;
(3) judging whether the model is over-fitted;
(4) set validation _ steps to 10, i.e., specify that the generator of the validation set returns 10 times to validate the data.
S404: selecting an optimal parameter model and storing the model, namely a MoleRec.h5 model, namely a final face nevus recognition model; in the network after fine tuning, namely a face mole recognition model, Sigmoid output is used in classification, and a binary cross entropy function is used as a loss function; and finally, outputting the model accuracy and the loss curve result, and drawing a confusion matrix.
Fig. 9 shows a structure diagram of the neural network of the VGG16 model obtained after fine tuning.
Fig. 10 shows results on the macOS platform (8 GB memory): the fine-tuned model was trained for 50 rounds at about 27 s per round, roughly 22 minutes in total. The training set accuracy reaches 100% with a loss value of 0.0006, and the validation set accuracy reaches 98.26% with a loss value of 0.1334.
Fig. 11 shows the fine-tuned model tested on the test set. In the confusion matrix, 0 denotes a mole and 1 a non-mole; columns represent predicted values and rows represent actual categories. Thus (0,0) denotes a mole correctly identified as a mole, (0,1) a mole misidentified as a non-mole, (1,0) a non-mole misidentified as a mole, and (1,1) a non-mole correctly identified. 99% of moles were successfully identified and 1% misidentified as non-moles; 5% of non-moles were misidentified as moles and 95% correctly identified. Averaging the two correct-identification rates gives an overall recognition rate of 97.0% on the test set.
Step five: as shown in fig. 12, the specific process of predicting a picture is as follows:
S501: acquiring pictures containing human faces, detecting the faces contained in a single picture, and marking the facial features with dots, specifically as follows:
(1) acquiring a picture containing a human face; in this implementation, a test set picture file is loaded as numpy data;
(2) detecting the face in the acquired picture using the HOG features in combination with the dlib library (the method is the same as S101 in step one);
(3) converting the generated array containing the face positions into an image, and looping over all detected faces;
(4) printing the top, right, bottom, and left position information of each face;
(5) locating the facial features according to the face position, extracting a plurality of key points of the face in the picture, and marking the facial features with dots.
S502: dividing the key points into different key point sets, calculating the cropping position of the face image based on the key point sets, and cropping the face image on the basis of S501.
S503: overlap-blocking the face image, i.e., dividing it into a plurality of image blocks with overlapping edges according to the set block size and stride, where the stride is smaller than the block size. Blocking usually uses uniform partitioning, but given the particularity of nevi, uniform partitioning may split a nevus across two blocks and destroy it; overlapping blocking is therefore adopted to ensure the nevus recognition rate. The specific steps of overlap-blocking the face image are as follows:
(1) resizing the face image to 200 × 200 × 3;
(2) setting the block size and the abscissa and ordinate strides, both strides being smaller than the block's side length; for example, both strides are 15 pixels and the block size is 19 × 19;
(3) moving by the set strides, i.e., when the ordinate stride reaches the end of a line, the abscissa advances by its stride, until the whole image is blocked; the first block is image[0:19, 0:19], and the second is image[15:34, 0:19];
(4) and displaying the blocked image.
S504: predicting the face image with the face mole recognition model (the MoleRec.h5 model), as follows:
(1) resizing each image block to 32 × 32 × 3;
(2) converting the adjusted block image into an array X;
(3) adding a batch dimension at position 0 of array X and normalizing the data;
(4) inputting a test array X, predicting the mole position through a trained fine tuning model (namely a final face mole recognition model), and outputting the prediction probability.
(5) setting the threshold to 0.5: if the predicted value of a block is less than 0.5, the block is judged to be a mole and framed in the picture as such.
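A sketch of S504 under the same assumptions (OpenCV resizing, Keras model loading); per step (5), a predicted value below 0.5 marks the block as a mole:

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("MoleRec.h5")

def block_contains_mole(block_img):
    x = cv2.resize(block_img, (32, 32)).astype("float32")  # (1) resize to 32 x 32 x 3
    x = np.expand_dims(x, axis=0) / 255.0  # (2)-(3) to array, add batch dim, normalize
    prob = float(model.predict(x)[0][0])   # (4) output the prediction probability
    return prob < 0.5                      # (5) below the 0.5 threshold -> mole
```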
Fig. 13 is an example of a face mole prediction performed on a picture containing a single mole, and the specific detection steps are the same as those in fig. 12; fig. 14 is an example of predicting a face mole for a picture with many moles, and the specific detection steps are the same as those in fig. 12.
The method builds a VGG16 convolutional neural network, discards its original fully connected layers, and optimizes training parameters to obtain a task-specific fully connected layer, yielding a complete model. The built VGG16 model is initialized, and on that basis the five convolutional modules and the task-specific fully connected layer are fine-tuned to obtain a more accurate convolutional neural network model. The deep learning network is then trained on the augmented training set and the preprocessed validation set to obtain a trained model, which is tested on the test set database to verify its generalization ability. Finally, moles in the face are detected: after face detection on a mole-containing image, the face is cropped and blocked, and the trained fine-tuned model predicts each block separately, meeting the requirements of mole detection.
Example 2
The present embodiment provides a facial nevus recognition system based on deep learning, including:
the data acquisition module is used for acquiring a picture containing a human face;
the face image cutting module is used for detecting the face in the picture by using the HOG characteristics and cutting the face to obtain a face image;
the overlapping and blocking module is used for dividing the face image into a plurality of image blocks with overlapped edges;
the face mole identification module is used for inputting the image blocks into the face mole identification model, predicting the mole-containing probability of each image block and positioning the mole-containing image block in the face image; the acquisition process of the face nevus recognition model is as follows: extracting the features of the pictures in the enhanced training set by using a feature extraction layer of the VGG16 convolutional neural network; constructing a face nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full-connection layer; constructing the feature extraction layer and the full connection layer to form a complete face mole identification model; and fine-tuning the complete face mole recognition model to obtain a final face mole recognition model.
Example 3
The present embodiment also provides an electronic device, which includes a memory, a processor, and computer instructions stored in the memory and executed on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the method of embodiment 1.
Example 4
The present embodiment also provides a computer-readable storage medium for storing computer instructions, which when executed by a processor, perform the steps of the method of embodiment 1.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The method for recognizing the facial nevus based on deep learning is characterized by comprising the following steps:
acquiring a picture containing a human face;
detecting a face in the picture by using the HOG characteristics and cutting to obtain a face image;
dividing a face image into a plurality of image blocks with overlapped edges;
inputting the image blocks into a face mole identification model, predicting the mole-containing probability of each image block, and positioning the mole-containing image blocks in the face image; the acquisition process of the face nevus recognition model is as follows: extracting the features of the pictures in the enhanced training set by using a feature extraction layer of the VGG16 convolutional neural network; constructing a face nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full-connection layer; constructing the feature extraction layer and the full connection layer to form a complete face mole identification model; and fine-tuning the complete face mole recognition model to obtain a final face mole recognition model.
2. The method for recognizing the facial nevus based on deep learning of claim 1, wherein the fine-tuning of the complete facial nevus recognition model comprises the following specific steps:
loading a complete face nevus recognition model;
fine-tuning all the convolution layers and all the connection layers in the model by using an enhanced training set, and optimizing parameters by using an optimizer;
and (5) adopting the verification set to verify data, and selecting an optimal parameter model as a final face nevus recognition model.
3. The method according to claim 1, wherein the enhanced training set is obtained by sequentially applying normalization preprocessing, image rotation, horizontal shift, vertical shift, random shear transformation, random zoom, random horizontal flipping, and pixel filling to the images in the training set.
4. The method according to claim 1, wherein the preprocessing is performed before the image block is input into the face mole recognition model, and the method specifically includes:
adjusting the size of an image block;
converting the adjusted image blocks into arrays;
adding a batch dimension at position 0 of the array and normalizing the data.
5. The method for recognizing the nevus facialis based on deep learning of claim 1, wherein the specific steps of detecting the face in the picture by using the HOG features are as follows:
converting the picture to grayscale, and then normalizing the color space of the picture with a gamma transform;
calculating the horizontal gradient, vertical gradient, gradient direction, and magnitude of each pixel in the picture using first-order differences;
dividing the picture into a plurality of small squares, dividing the gradient direction of each small square into a plurality of direction blocks, and calculating the number of different gradient directions to obtain the feature vector of each small square;
forming a sliding window by a plurality of adjacent small squares, and connecting the feature vectors of all the small squares in the sliding window in series to obtain the HOG feature of the sliding window;
scanning the picture by using a sliding window, setting scanning step length, performing sliding scanning in a mode of overlapping a plurality of pixels, collecting HOG characteristics of all sliding windows connected in series to obtain HOG characteristic vectors of the face in the picture;
and inputting the obtained HOG feature vector into an SVM model to obtain the face in the picture.
6. The method for recognizing the nevus facialis based on the deep learning of claim 1, wherein the step of cropping the face image comprises the following steps:
extracting a plurality of key points of the face in the picture;
dividing a plurality of key points into different key point sets;
and calculating the cutting position of the face image based on the key point set, and cutting to obtain the face image.
7. Facial nevus recognition system based on degree of deep learning, its characterized in that includes:
the data acquisition module is used for acquiring a picture containing a human face;
the face image cutting module is used for detecting the face in the picture by using the HOG characteristics and cutting the face to obtain a face image;
the overlapping and blocking module is used for dividing the face image into a plurality of image blocks with overlapped edges;
the face mole identification module is used for inputting the image blocks into the face mole identification model, predicting the mole-containing probability of each image block and positioning the mole-containing image block in the face image; the acquisition process of the face nevus recognition model is as follows: extracting the features of the pictures in the enhanced training set by using a feature extraction layer of the VGG16 convolutional neural network; constructing a face nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full-connection layer; constructing the feature extraction layer and the full connection layer to form a complete face mole identification model; and fine-tuning the complete face mole recognition model to obtain a final face mole recognition model.
8. The system of claim 7, wherein the enhanced training set is obtained by sequentially applying normalization preprocessing, image rotation, horizontal shift, vertical shift, random shear transformation, random zoom, random horizontal flipping, and pixel filling to the images in the training set.
9. An electronic device comprising a memory and a processor and computer instructions stored on the memory and executable on the processor, the computer instructions when executed by the processor performing the steps of the method of any of claims 1 to 6.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method of any one of claims 1 to 6.
CN202110748237.8A 2021-07-01 2021-07-01 Facial nevus recognition method and system based on deep learning Active CN113642385B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110748237.8A CN113642385B (en) 2021-07-01 2021-07-01 Facial nevus recognition method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110748237.8A CN113642385B (en) 2021-07-01 2021-07-01 Facial nevus recognition method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN113642385A true CN113642385A (en) 2021-11-12
CN113642385B (en) 2024-03-15

Family

ID=78416504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110748237.8A Active CN113642385B (en) 2021-07-01 2021-07-01 Facial nevus recognition method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN113642385B (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107886064A (en) * 2017-11-06 2018-04-06 安徽大学 A kind of method that recognition of face scene based on convolutional neural networks adapts to
AU2019101222A4 (en) * 2019-10-05 2020-01-16 Feng, Yuyao MR A Speaker Recognition System Based on Deep Learning

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445925A (en) * 2022-04-11 2022-05-06 深圳市润璟元信息科技有限公司 Facial recognition intelligent attendance system capable of being automatically loaded and deleted
CN114445925B (en) * 2022-04-11 2022-07-22 深圳市润璟元信息科技有限公司 Facial recognition intelligent attendance system capable of being automatically loaded and deleted
CN115358952A (en) * 2022-10-20 2022-11-18 福建亿榕信息技术有限公司 Image enhancement method, system, equipment and storage medium based on meta-learning

Also Published As

Publication number Publication date
CN113642385B (en) 2024-03-15

Similar Documents

Publication Publication Date Title
US11657525B2 (en) Extracting information from images
CN106372581B (en) Method for constructing and training face recognition feature extraction network
US11941918B2 (en) Extracting information from images
JP4946730B2 (en) Face image processing apparatus, face image processing method, and computer program
KR100724932B1 (en) apparatus and method for extracting human face in a image
Huang et al. Robust face detection using Gabor filter features
JP5121506B2 (en) Image processing apparatus, image processing method, program, and storage medium
CN105740780B (en) Method and device for detecting living human face
US8744144B2 (en) Feature point generation system, feature point generation method, and feature point generation program
JP6112801B2 (en) Image recognition apparatus and image recognition method
Zhou et al. Histograms of categorized shapes for 3D ear detection
Anila et al. Simple and fast face detection system based on edges
Bouchaffra et al. Structural hidden Markov models for biometrics: Fusion of face and fingerprint
CN107918773B (en) Face living body detection method and device and electronic equipment
CN113642385A (en) Deep learning-based facial nevus identification method and system
CN111274883B (en) Synthetic sketch face recognition method based on multi-scale HOG features and deep features
CN110334566B (en) OCT (optical coherence tomography) internal and external fingerprint extraction method based on three-dimensional full-convolution neural network
Zuobin et al. Feature regrouping for cca-based feature fusion and extraction through normalized cut
CN110574036A (en) Detection of nerves in a series of echographic images
CN113822157A (en) Mask wearing face recognition method based on multi-branch network and image restoration
CN110503157B (en) Image steganalysis method of multitask convolution neural network based on fine-grained image
CN111126169A (en) Face recognition method and system based on orthogonalization graph regular nonnegative matrix decomposition
CN116342968B (en) Dual-channel face recognition method and device
Aran et al. A review on methods and classifiers in lip reading
Paul et al. Anti-Spoofing Face-Recognition Technique for eKYC Application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant