CN113642385B - Facial nevus recognition method and system based on deep learning - Google Patents
- Publication number
- CN113642385B (application CN202110748237.8A)
- Authority
- CN
- China
- Prior art keywords
- face
- image
- facial
- nevus
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a facial nevus recognition method and system based on deep learning. A VGG16 convolutional neural network is built, its original fully connected layers are discarded, and training parameters are optimized to obtain a task-specific fully connected layer, yielding a complete model. All convolutional layers and fully connected layers in the model are then fine-tuned with an augmented training set, parameters are optimized with an optimizer, and a validation set is used to verify the generalization ability of the model; the model with the best parameters is selected as the final facial nevus recognition model. After face detection is performed on an input image, the face is cropped and partitioned into blocks, and each block is predicted separately with the facial nevus recognition model, thereby meeting the requirement of nevus detection.
Description
Technical Field
The invention belongs to the field of deep learning, and particularly relates to a facial nevus recognition method and system based on deep learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Driven by continuous technological development and innovation, biometric recognition has made great breakthroughs. Biometric techniques include fingerprint, retinal, iris, and face recognition. Thanks to its friendliness, convenience, and accuracy, face recognition is widely applied in security screening, missing-person searches, identity registration, public-place monitoring, and similar settings. Higher-resolution sensors, growing face-image databases, and improved image-processing and computer-vision algorithms mean that people are no longer satisfied with the current state of face recognition, but pursue more accurate, refined recognition of facial micro-features. Therefore, to effectively improve face recognition and retrieval performance, researchers have begun to study facial micro-features (such as nevi, freckles, and scars) to refine and assist face recognition technology.
Detection and recognition of facial nevi generally face three major problems. First, lighting and pose changes: on the one hand, weak illumination and shadow at capture time can make moles in certain facial regions (such as eye corners, mouth corners, nostrils, or wrinkles) unclear and hard to distinguish, reducing detection accuracy; on the other hand, changes in facial expression and in pose (frontal versus profile views) during capture also affect nevus recognition. Second, people apply beautification tools in pursuit of better-looking photos; these tools often perform Gaussian blurring, particularly on moles, and the processed photos suffer distortion and other problems that prevent moles from being identified. Third, other facial features (such as stains, birthmarks, hair, and beards), worn items (such as nose rings, lip studs, and eyebrow ornaments), and cosmetics (such as concealer, pressed powder, and facial creams) can interfere with mole detection and identification. Therefore, two details must be considered when identifying a nevus: first, its uniqueness, so that it can be distinguished from the rest of the face; second, its stability, verified by repeated detection to ensure recognition accuracy.
Disclosure of Invention
In order to solve the above problems, the invention provides a facial nevus recognition method and system based on deep learning. A VGG16 convolutional neural network is built, its original fully connected layers are discarded, and training parameters are optimized to obtain a task-specific fully connected layer, yielding a complete model. All convolutional layers and fully connected layers in the model are then fine-tuned with an augmented training set, parameters are optimized with an optimizer, and a validation set is used to verify the generalization ability of the model; the model with the best parameters is selected as the final facial nevus recognition model. After face detection is performed on an input image, the face is cropped and partitioned into blocks, and each block is predicted separately with the facial nevus recognition model, thereby meeting the requirement of nevus detection.
According to some embodiments, the present invention employs the following technical solutions:
a facial mole recognition method based on deep learning, comprising:
acquiring a picture containing a human face;
detecting a face in the picture by using the HOG characteristics and cutting to obtain a face image;
dividing a face image into a plurality of image blocks with overlapped edges;
inputting the image blocks into a facial mole recognition model, predicting mole-containing probability of each image block, and positioning the positions of the mole-containing image blocks in a facial image; the face nevus recognition model is obtained through the following steps: performing feature extraction on the pictures in the enhanced training set by utilizing a feature extraction layer of the VGG16 convolutional neural network; constructing a facial nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full connection layer; building the feature extraction layer and the full connection layer to form a full face nevus recognition model; and fine-tuning the complete facial mole recognition model to obtain a final facial mole recognition model.
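The per-block prediction and localization step can be sketched as follows (a minimal illustration; the 0.5 decision threshold and the data layout are assumptions, not values from the patent):

```python
def locate_mole_blocks(block_probs, threshold=0.5):
    """Return positions of image blocks whose predicted mole-containing
    probability reaches the threshold. `block_probs` holds (position,
    probability) pairs as produced by the recognition model; the 0.5
    threshold is an illustrative assumption."""
    return [pos for pos, p in block_probs if p >= threshold]

# toy per-block predictions for a face split into three blocks
hits = locate_mole_blocks([((0, 0), 0.02), ((15, 0), 0.91), ((15, 15), 0.67)])
```

Only the blocks at (15, 0) and (15, 15) exceed the threshold and are reported as mole locations.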
Further, the specific steps of fine-tuning the complete facial mole recognition model are as follows:
loading a complete facial mole recognition model;
fine tuning all convolution layers and all connection layers in the model by using the enhanced training set, and optimizing parameters by adopting an optimizer;
and adopting verification data of a verification set to select an optimal parameter model as a final facial mole recognition model.
Furthermore, the enhanced training set is obtained by sequentially applying normalization preprocessing, image rotation, horizontal offset, vertical offset, random shear, random scaling, random horizontal flipping, and pixel filling to the images in the training set.
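A dependency-light sketch of a few of the listed enhancement operations (normalization, random horizontal flip, horizontal offset with zero pixel filling); in practice a framework utility such as Keras' ImageDataGenerator covers the full list, including rotation, shear, and zoom:

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(img):
    # normalization preprocessing: scale pixel values to [0, 1]
    return img.astype(np.float32) / 255.0

def random_hflip(img):
    # random horizontal flip, one of the listed enhancement operations
    return img[:, ::-1] if rng.random() < 0.5 else img

def shift_horizontal(img, frac=0.1):
    # horizontal offset, filling the vacated border with zero pixels
    shift = int(img.shape[1] * frac)
    out = np.zeros_like(img)
    out[:, shift:] = img[:, :img.shape[1] - shift]
    return out

img = np.full((4, 10), 255, dtype=np.uint8)
aug = shift_horizontal(normalize(img))
```

After a 10% horizontal offset, the leftmost column is zero-filled and the remaining pixels keep their normalized value of 1.0.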
Further, the preprocessing operation is performed before the image block is input into the facial mole recognition model, which specifically includes:
adjusting the size of the image block;
converting the adjusted image block into an array;
adding a dimension at position 0 of the array, and normalizing.
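A minimal sketch of this preprocessing, assuming a 32×32 target size (taken from the VGG16 input size described later) and using plain NumPy indexing in place of framework helpers:

```python
import numpy as np

def preprocess_block(block, size=(32, 32)):
    """Sketch of the block preprocessing: resize the block (crude
    nearest-neighbour indexing here), convert it to an array, add a
    batch dimension at position 0, and normalize. The 32x32 target
    size is an assumption."""
    h, w = block.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    resized = block[rows][:, cols]                            # nearest-neighbour resize
    arr = np.expand_dims(resized.astype(np.float32), axis=0)  # dimension at position 0
    return arr / 255.0                                        # normalization

x = preprocess_block(np.full((20, 20), 128, dtype=np.uint8))
```

The result has shape (1, 32, 32): one batch entry holding the resized, normalized block.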
Further, the specific steps of detecting the face in the picture by using the HOG feature are as follows:
after the picture is grayed, gamma conversion is adopted to normalize the color space of the picture;
calculating the transverse gradient, longitudinal gradient, gradient direction, and magnitude of each pixel in the picture using first-order differences;
dividing a picture into a plurality of small squares, dividing the gradient direction of each small square into a plurality of direction blocks, and calculating the number of different gradient directions to obtain the feature vector of each small square;
forming a sliding window by a plurality of adjacent small blocks, and connecting feature vectors of all the small blocks in the sliding window in series to obtain HOG features of the sliding window;
scanning the picture by utilizing the sliding window, setting a scanning step length, performing sliding scanning in a mode of overlapping a plurality of pixels, and collecting HOG characteristics of all the sliding windows connected in series to obtain HOG characteristic vectors of faces in the picture;
and inputting the obtained HOG feature vector into an SVM model to obtain the face in the picture.
Further, the specific steps of clipping to obtain the face image are as follows:
extracting a plurality of key points of a human face in the picture;
dividing a plurality of key points into different key point sets;
and calculating the clipping position of the face image based on the key point set, and clipping to obtain the face image.
A deep learning based facial mole recognition system, comprising:
the data acquisition module is used for acquiring pictures containing human faces;
the face image clipping module is used for detecting faces in the pictures by utilizing the HOG characteristics and clipping to obtain face images;
the overlapping block module is used for dividing the face image into a plurality of image blocks with overlapped edges;
the facial mole recognition module is used for inputting the image blocks into a facial mole recognition model, predicting mole-containing probability of each image block and positioning the positions of the mole-containing image blocks in the facial image; the face nevus recognition model is obtained through the following steps: performing feature extraction on the pictures in the enhanced training set by utilizing a feature extraction layer of the VGG16 convolutional neural network; constructing a facial nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full connection layer; building the feature extraction layer and the full connection layer to form a full face nevus recognition model; and fine-tuning the complete facial mole recognition model to obtain a final facial mole recognition model.
An electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of the method of the first aspect.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method of the first aspect.
The beneficial effects of the invention are as follows:
1. Based on HOG features and the overlapping blocking technique, the invention establishes its own facial mole image database, enriching the database samples.
2. The invention adopts data enhancement: the original data set undergoes image transformations such as cropping, scaling, rotation, translation, and flipping, and the slightly altered images are treated as distinct data samples, effectively expanding the data set.
3. The invention adopts the VGG16 convolutional base. The VGG16 network has about 138 million parameters; it is a very deep, large-scale network that maintains high classification accuracy while offering good transfer-learning robustness.
4. By fine-tuning the five convolutional modules of the VGG16 convolutional base together with the fully connected layers, the method effectively suppresses overfitting, accelerates model convergence, simplifies training, and improves model accuracy.
5. Considering the particularity of nevi, uniform (non-overlapping) segmentation could split a single nevus across two blocks; therefore overlapping segmentation is adopted to ensure the nevus recognition rate.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flow chart of a facial mole recognition method based on deep learning of the present invention;
FIG. 2 is a flow chart of face detection based on HOG features of the invention;
FIG. 3 (a) is an original image of a face detected based on HOG features;
FIG. 3 (b) is a facial plot of the present application for face detection based on HOG features;
FIG. 3 (c) is a face image cut out in the present application;
FIG. 4 is an image of the application after overlapping tiles;
fig. 5 (a) is a pool of moles-containing samples;
fig. 5 (b) is a non-nevus sample library;
FIG. 6 is a flow chart of the initialization process for the VGG16 model according to the present invention;
FIG. 7 is a representation of the size of an image through a custom neural network structural layer of the present invention;
FIG. 8 is a diagram of a fine tuned model of VGG16 of the present invention;
FIG. 9 is a neural network diagram of a VGG16 fine tuning model of the invention;
FIG. 10 (a) is the model accuracy obtained by the fine-tuned model of the invention;
FIG. 10 (b) is the loss curve obtained by the fine-tuned model of the invention;
FIG. 11 is a confusion matrix for a fine-tuning model test set of the present invention;
fig. 12 (a) is a diagram showing the detected face and facial features in the step of the detection flow for predicting a facial nevus;
fig. 12 (b) is a face image obtained after clipping in the detection flow step of predicting a face nevus;
fig. 12 (c) is a face image obtained after overlapping and blocking in the detection flow step of predicting a face nevus;
fig. 12 (d) is a predicted facial mole position map in the step of the detection flow of predicted facial mole;
FIG. 13 (a) is an original image of a detected single mole image using the MoleRec.h5 model;
FIG. 13 (b) is an effect graph of detecting single mole images using the MoleRec.h5 model;
FIG. 14 (a) is an original image of a detection of multiple moles using the MoleRec.h5 model;
fig. 14 (b) is an effect graph of detecting multi-nevi images using molerec.h5 model.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
According to the facial nevus recognition method based on deep learning of this embodiment, first, a database composed of nevus-containing and nevus-free pictures is established. Second, a VGG16 convolutional neural network is built: its original fully connected layers are discarded and training parameters are optimized to obtain a task-specific fully connected layer, yielding a complete model. The built VGG16 model is then initialized, and on that basis the five convolutional modules and the task-specific fully connected layer of the VGG16 network are fine-tuned to obtain a more accurate convolutional neural network model. Furthermore, the constructed network is trained on the data-enhanced training set and the preprocessed validation set to obtain a trained deep learning model, which is then evaluated on the test set to verify its generalization ability. Finally, nevi in the face are detected: after face detection, the face containing nevi is cropped and partitioned into blocks, and each block is predicted with the trained fine-tuned model, meeting the requirement of nevus detection. In this way, facial nevi can be effectively detected and the recognition performance of a nevus recognition system improved.
The facial nevus recognition method based on deep learning, as shown in fig. 1, specifically comprises the following steps:
step one: constructing a database of nevus images, namely, constructing a database consisting of nevus-containing pictures and nevus-free pictures, wherein the sources of the database are as follows: firstly, detecting a human face by utilizing HOG features and combining dlib library, and cutting and overlapping the human face to obtain a region block containing a human face nevus and a region block without the nevus; second, searching, downloading and clipping the nevus-containing picture from the network. The first database creation process is mainly described below:
s101: performing face detection by combining HOG features with dlib library, and performing face cutting and overlapping segmentation;
as shown in fig. 2, the specific steps for detecting a face in a picture by using HOG features are as follows:
(1) Inputting a color face image, and setting a face sliding window to 196×196;
(2) Graying the image, i.e. treating the image as a two-dimensional image of (x, y);
(3) The input image is normalized in color space by gamma transformation, which adjusts the image contrast, reduces the influence of local shadow and illumination, and suppresses noise interference;
(4) The image gradients are calculated with first-order differences to capture face contour information and further reduce illumination interference. Let the transverse gradient of pixel (x, y) be H_x(x, y), the longitudinal gradient H_y(x, y), the gradient direction θ(x, y), and the magnitude H(x, y); the calculation formulas are:
H_x(x, y) = H(x+1, y) − H(x−1, y)
H_y(x, y) = H(x, y+1) − H(x, y−1)
H(x, y) = sqrt(H_x(x, y)² + H_y(x, y)²)
θ(x, y) = arctan(H_y(x, y) / H_x(x, y))
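The gradient computation can be sketched in NumPy; the magnitude and direction follow the standard HOG definitions, and the first array index plays the role of x:

```python
import numpy as np

def gradients(H):
    """First-order differences as in the formulas above, plus the standard
    HOG magnitude and direction for the patent's H(x, y) and theta(x, y)."""
    Hx = np.zeros_like(H, dtype=float)
    Hy = np.zeros_like(H, dtype=float)
    Hx[1:-1, :] = H[2:, :] - H[:-2, :]   # H(x+1, y) - H(x-1, y)
    Hy[:, 1:-1] = H[:, 2:] - H[:, :-2]   # H(x, y+1) - H(x, y-1)
    mag = np.hypot(Hx, Hy)
    theta = np.degrees(np.arctan2(Hy, Hx)) % 360.0
    return Hx, Hy, mag, theta

# a ramp whose intensity grows by 1 per step of x has interior gradient (2, 0)
H = np.arange(5.0)[:, None] * np.ones((1, 5))
Hx, Hy, mag, theta = gradients(H)
```

For the ramp image, every interior pixel has transverse gradient 2, zero longitudinal gradient, magnitude 2, and direction 0 degrees.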
(5) Dividing a picture into a plurality of small squares, dividing the gradient direction of each small square into a plurality of direction blocks, and calculating the number of different gradient directions to obtain the feature vector of each small square; preferably, the image is divided into small squares (cells) with the size of 16×16 pixels, the gradient information of the 16×16 pixels is counted by adopting a histogram of 9 interval bins, namely 360 degrees of the gradient direction of each cell is divided into 9 direction blocks, each direction block has a range of 40 degrees, and the number of different gradient directions is calculated to obtain 9-dimensional feature vectors of each cell;
(6) Forming a sliding window by a plurality of adjacent small blocks, and connecting feature vectors of all the small blocks in the sliding window in series to obtain HOG features of the sliding window; preferably, every four adjacent cells (i.e. every 2×2 cells) are formed into a sliding window (block), and the 36-dimensional HOG feature of the block can be obtained by concatenating 9-dimensional feature vectors of all cells in the block;
(7) Scanning the picture by utilizing the sliding window, setting a scanning step length, performing sliding scanning in a mode of overlapping a plurality of pixels, and collecting HOG characteristics of all the sliding windows connected in series to obtain HOG characteristic vectors of faces in the picture; preferably, the input image is scanned by utilizing blocks, namely, the scanning step length is set to be 8 pixels, and sliding scanning is carried out in a mode of overlapping 8 pixels, so that HOG characteristics of all blocks connected in series are collected, and HOG characteristic vectors of faces in the image are obtained;
(8) And inputting the HOG feature vector into an SVM model to obtain a face in the picture, namely, predicting the HOG feature of the face to be recognized by using the SVM model trained in the dlib library.
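As a sanity check on steps (5) to (8), the dimensionality of the HOG vector fed to the SVM can be computed from the stated parameters (16-pixel cells, 9-bin histograms, 2×2-cell blocks, 8-pixel scan step):

```python
def hog_length(win, cell=16, block_cells=2, stride=8, bins=9):
    """Length of the final HOG vector implied by the steps above: cells of
    `cell` px with `bins`-bin histograms, blocks of block_cells x block_cells
    cells, and a sliding scan with the given pixel stride. Default values
    are the ones stated in the text."""
    block_px = cell * block_cells            # each block is 32 x 32 px here
    n = (win - block_px) // stride + 1       # block positions along one axis
    return n * n * bins * block_cells ** 2   # 36-dim feature per block

length = hog_length(196)   # for the 196 x 196 face window set in step (1)
```

With the patent's parameters a 196×196 window yields 21×21 block positions and a 15876-dimensional descriptor.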
S102: the face clipping and overlapping blocking are carried out, and the specific steps comprise:
firstly, extracting a plurality of key points of a face in a picture, preferably, extracting 68 face feature points by adopting a shape_predictor_68_face_landmarks.dat model in a dlib library, determining a face position according to the 68 face key points, defining four different key point sets, namely dividing the key points into different key point sets, and specifically:
an abscissa containing 11 key points on the eyebrow;
an ordinate containing 11 key points on the eyebrow;
an abscissa containing 16 keypoints on the chin;
an ordinate containing 16 keypoints on the chin;
FIG. 3 shows that after a face is detected by combining HOG features with dlib library, that is, after the flow of FIG. 2, a shape_predictor_68_face_landmarks.dat model in the dlib library is adopted to extract 68 face feature points;
secondly, calculating the clipping position of the face image based on the key point set, specifically, calculating top, bottom, left and right of the face according to the following formula:
wherein x1 and x2 are respectively the sixth point and the third point of a set;
outputting the accurate position of the face in the image through four coordinates of top, bottom, left and right, cutting the face to obtain a face image, and printing the position and the size of the face;
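Since the exact top/bottom/left/right formula is supplied only as a figure, a hypothetical stand-in using the bounding box of the key-point sets can illustrate the cropping step (the bounding-box rule and the margin value are assumptions, not the patent's formula):

```python
def crop_box(brow_xs, brow_ys, chin_xs, chin_ys, margin=10):
    """Hypothetical cropping rule: take the bounding box of the eyebrow and
    chin key-point sets defined above, expanded by a fixed margin."""
    top = min(brow_ys) - margin              # above the highest eyebrow point
    bottom = max(chin_ys) + margin           # below the lowest chin point
    left = min(brow_xs + chin_xs) - margin
    right = max(brow_xs + chin_xs) + margin
    return top, bottom, left, right

box = crop_box([40, 60, 80], [30, 28, 30], [35, 60, 85], [90, 100, 90])
```

The four returned coordinates play the role of the patent's top, bottom, left, and right values used to crop the face image.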
and performing overlapping blocking on the cropped face image, namely dividing it into a plurality of edge-overlapping image blocks according to the set block size and step length. Preferably, the step size of both the abscissa and the ordinate is set to 15 pixels; the ordinate is moved first, and when one row is finished, movement proceeds by the abscissa step until the whole image is segmented. For example, the first block is located at image[0:19, 0:19] and the second block at image[15:34, 0:19].
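A sketch of this overlapping blocking, assuming the example ranges are pixel-inclusive (so image[0:19, 0:19] is a 20×20 block, equivalent to Python's face[0:20, 0:20], and consecutive blocks overlap by 5 pixels):

```python
import numpy as np

def overlap_blocks(face, size=20, step=15):
    """Overlapping blocking with a 15-pixel step. The 20 px block size is an
    inference from the example: image[0:19, 0:19] read as an inclusive range."""
    h, w = face.shape[:2]
    out = []
    for col in range(0, w - size + 1, step):       # then advance the abscissa
        for row in range(0, h - size + 1, step):   # move the ordinate first
            out.append(((row, col), face[row:row + size, col:col + size]))
    return out

blocks = overlap_blocks(np.zeros((50, 50)))
```

On a 50×50 face this yields 9 blocks; the first two start at (0, 0) and (15, 0), matching the example's image[0:19, 0:19] and image[15:34, 0:19].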
S103: displaying and storing the segmented image, which comprises the following steps:
(1) Displaying the segmented image of the S102;
(2) Establishing a folder and uniformly storing the segmented images in a PNG format;
(3) Classification of nevus-containing images and non-nevus images.
Fig. 4 is a view showing the face image cut out by overlapping and blocking.
The database sources of fig. 5 are twofold: first, faces are detected using HOG features combined with the dlib library, then cropped and split with overlap to obtain region blocks that contain a facial nevus and region blocks that do not; second, nevus-containing pictures are retrieved, downloaded, and cropped from the Internet. The final database contains 1200 samples, divided into a training set of 600 samples, a validation set of 300 samples, and a test set of 300 samples, with no overlap between the sets. Nevus-containing and nevus-free pictures are treated as positive and negative samples, respectively, and each set contains equal numbers of positive and negative samples.
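The described split can be sketched as follows (file names are hypothetical placeholders; only the counts and the balance/no-overlap properties come from the text):

```python
import random

def split_dataset(pos, neg, seed=0):
    """Split 600 positive and 600 negative samples into 600 train / 300
    validation / 300 test, each set balanced and with no overlap."""
    rng = random.Random(seed)
    rng.shuffle(pos)
    rng.shuffle(neg)
    train = pos[:300] + neg[:300]
    val = pos[300:450] + neg[300:450]
    test = pos[450:600] + neg[450:600]
    return train, val, test

pos = [f"mole_{i}.png" for i in range(600)]      # hypothetical file names
neg = [f"clear_{i}.png" for i in range(600)]
train, val, test = split_dataset(pos, neg)
```

Each resulting set holds equal numbers of positive and negative samples, and no sample appears in two sets.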
Step two: the VGG16 model is built, and the concrete process is as follows:
s201: giving up the VGG16 convolutional neural network full-connection layer, and extracting and storing the characteristics of the training set, the verification set and the test set pictures by using the characteristic extraction layer;
s202: constructing a facial mole recognition classifier, loading the features stored in the steps, and training and optimizing parameters to obtain a specific full-connection layer;
s203: and constructing the feature extraction layer and the specific full-connection layer to form a complete facial nevus recognition model.
Step three: the VGG16 model is initialized, as shown in fig. 6, and the specific process is as follows:
s301: the VGG16 network model is initialized as follows:
(1) The size of the input training color pictures is set to 32×32×3;
(2) The images sequentially pass through the network layers as shown in fig. 7:
Conv1_1: the convolution kernel has dimensions of 3×3×3, the step size is 1, the padding is 1, and the feature map output by calculation has dimensions of 64×32×32;
Convolution output-size formula:
W_out = (W_in − W_filter + 2P)/S + 1
H_out = (H_in − H_filter + 2P)/S + 1
wherein W_out is the output image width, H_out the output image height, W_in the input image width, H_in the input image height, W_filter the convolution kernel width, H_filter the convolution kernel height, P (i.e., padding) the number of pixel layers used to pad the image edges, and S (i.e., stride) the step size.
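As a quick check, the output-size formula as a helper function; Conv1_1's 3×3 kernel with padding 1 and stride 1 preserves the 32-pixel width:

```python
def conv_out(w_in, w_filter=3, padding=1, stride=1):
    # W_out = (W_in - W_filter + 2P) / S + 1
    return (w_in - w_filter + 2 * padding) // stride + 1

width = conv_out(32)   # Conv1_1: 32 -> 32
```

The same function reproduces every convolutional output size listed for the VGG16 layers, since all use 3×3 kernels with padding 1 and stride 1.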
Conv1_2: the convolution kernel has dimensions of 3×3×64, the step size is 1, the padding is 1, and the feature map output by calculation has dimensions of 64×32×32.
Avgpool1: an average pooling layer with a 2×2 sliding window, step size 2 and sliding-window dimension 2×2×64; the output feature map dimension is 64×16×16;
The pooling layer output size formula:
W_out = (W_in − W_window) / S + 1
H_out = (H_in − H_window) / S + 1
wherein W_window and H_window are the sliding window width and height.
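The convolution and pooling output-size relations can be checked numerically against every layer dimension listed in this section; the helper names below are illustrative.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution: (W_in - W_filter + 2P) / S + 1."""
    return (size - kernel + 2 * padding) // stride + 1

def pool_out(size, window=2, stride=2):
    """Spatial output size of a pooling layer: (W_in - W_window) / S + 1."""
    return (size - window) // stride + 1

# Conv1_1 on a 32x32 input with 3x3 kernels, padding 1, stride 1:
assert conv_out(32) == 32   # spatial size preserved, matching 64x32x32
# Avgpool1 with a 2x2 window and stride 2 halves the size:
assert pool_out(32) == 16   # matching 64x16x16
```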
conv2_1: the convolution kernel has dimensions of 3×3×64, a step size of 1, and a padding of 1. The dimension of the feature map output by the calculation is 128×16×16.
Conv2_2: the convolution kernel has dimensions of 3×3×128, a step size of 1, and a padding of 1. The dimension of the feature map output by the calculation is 128×16×16.
Avgpool2: with the average pooling layer, the sliding window size is 2×2, the step size is 2, and the sliding window dimension is 2×2×128. The dimension of the feature map output by the calculation is 128×8×8.
Conv3_1: the convolution kernel has dimensions of 3×3×128, a step size of 1, a padding of 1, and a feature map output by calculation has dimensions of 256×8×8.
Conv3_2: the convolution kernel has dimensions of 3×3×256, a step size of 1, a padding of 1, and a feature map output by calculation has dimensions of 256×8×8.
Conv3_3: the convolution kernel has dimensions of 3×3×256, a step size of 1, a padding of 1, and a feature map output by calculation has dimensions of 256×8×8.
Avgpool3: the average pooling layer is adopted, the size of a sliding window is 2×2, the step size is 2, the dimension of the sliding window is 2×2×256, and the dimension of a feature map output through calculation is 256×4×4.
Conv4_1: the convolution kernel has dimensions of 3×3×256, a step size of 1, a padding of 1, and a feature map output by calculation has dimensions of 512×4×4.
Conv4_2: the convolution kernel has dimensions of 3×3×512, step size of 1, padding of 1, and feature map dimension of 512×4×4 by calculation output.
Conv4_3: the convolution kernel has dimensions of 3×3×512, step size of 1, padding of 1, and feature map dimension of 512×4×4 by calculation output.
Avgpool4: the average pooling layer is adopted, the size of a sliding window is 2×2, the step size is 2, the dimension of the sliding window is 2×2×512, and the dimension of a feature map output through calculation is 512×2×2.
Conv5_1: the convolution kernel has dimensions of 3×3×512, step size of 1, padding of 1, and feature map dimension of 512×2×2 output by calculation.
Conv5_2: the convolution kernel has dimensions of 3×3×512, step size of 1, padding of 1, and feature map dimension of 512×2×2 output by calculation.
Conv5_3: the convolution kernel has dimensions of 3×3×512, step size of 1, padding of 1, and feature map dimension of 512×2×2 output by calculation.
Avgpool5: the average pooling layer is adopted, the size of a sliding window is 2×2, the step size is 2, the dimension of the sliding window is 2×2×512, and the dimension of a feature map output through calculation is 512×1×1.
Flatten: the multidimensional input is made one-dimensional; the 512×1×1 input feature map is flattened to 1×512.
Fc6: a fully-connected layer outputting data of dimension 1×256.
Sigmoid: the output of the fully-connected layer is passed through a Sigmoid function to obtain a single predicted value.
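The layer dimensions listed above can be traced end to end: starting from a 32×32×3 input, each of the five conv/pool modules preserves the spatial size through its 3×3, stride-1, padding-1 convolutions and then halves it at the average pool, while the channel count grows to 512. A minimal trace:

```python
def trace_vgg16(size=32):
    """Trace the spatial size through the five conv/pool modules above.
    The 3x3 stride-1 padding-1 convolutions preserve spatial size;
    each 2x2 stride-2 average pool halves it."""
    channels = [64, 128, 256, 512, 512]
    dims = []
    for c in channels:
        size = (size - 2) // 2 + 1  # pooling formula with window 2, stride 2
        dims.append((c, size))
    return dims

dims = trace_vgg16()
# After Avgpool5 the feature map is 512x1x1, which Flatten turns into 1x512 for Fc6.
```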
S302: the data enhancement operation is carried out on the data set, and the data enhancement operation is concretely as follows:
data enhancement operation on the training set, obtaining an enhanced training set:
(1) Carrying out normalization pretreatment on an input data image;
(2) Image rotation, with the deflection angle set to 10 degrees;
(3) Horizontal image shift, with the shift range set to 0.05;
(4) Vertical image shift, with the shift range set to 0.05;
(5) Random image shear, with the shear angle set to 0.05;
(6) Random image zoom, with the zoom range set to 0.05;
(7) Random horizontal flipping of the picture;
(8) Pixel filling using the nearest-neighbour method.
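The listed operations correspond to standard generator-style augmentation parameters (rotation, shift, shear, zoom, flip, fill mode). To keep the sketch dependency-free, two of them, random horizontal flip and a nearest-neighbour-filled horizontal shift, are shown in plain NumPy; the function names and the 0.05 shift fraction interpretation are assumptions of this sketch.

```python
import numpy as np

def random_horizontal_flip(img, rng):
    """Augmentation (7): flip the picture horizontally with probability 0.5."""
    return img[:, ::-1] if rng.random() < 0.5 else img

def horizontal_shift(img, fraction=0.05):
    """Augmentations (3) and (8): shift by a fraction of the width and
    fill the exposed border with nearest-neighbour (edge-column) pixels."""
    shift = max(1, int(img.shape[1] * fraction))
    shifted = np.roll(img, shift, axis=1)
    shifted[:, :shift] = img[:, :1]  # replicate the original edge column
    return shifted

rng = np.random.default_rng(1)
img = np.arange(32 * 32 * 3, dtype=float).reshape(32, 32, 3)
aug = horizontal_shift(random_horizontal_flip(img, rng))
```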
Data enhancement is not applied to the verification set or the test set; only the image-normalization preprocessing is performed on them.
When training on the database samples, the image size is unified to 32×32, the sample batch_size is set to 32, and labels are returned as binary arrays, which improves the network generalization capability.
S303: training a database sample, and performing the following operations on a training set, a verification set and a test set:
(1) Giving a data sample file path;
(2) The unified image size is 32×32;
(3) Setting the sample data batch_size to be 32;
(4) Returning to the binary tag array form because the loss function used is a binary cross entropy function.
Step four: the whole facial nevus recognition model is fine-tuned, and on the basis of initializing the VGG16 model, the convolution bases and the full connection layers of five modules of the VGG16 model are fine-tuned, as shown in fig. 8, and the specific process is as follows:
s401: loading a complete facial nevus recognition model, namely loading a VGG16 model and a full connection layer;
s402: Fine-tuning is performed on all convolutional layers and all fully-connected layers using the training and validation sets. The training set is divided into n batches, each containing m samples; the learning rate lr is set to 1e-5 and the parameters are optimized with the RMSprop optimizer (Root Mean Square Propagation), as follows:
The optimizer maintains an exponentially weighted average of the squared gradients of the weight W and the bias b. In the t-th iteration:
S_dW = β·S_dW + (1 − β)·dW²
S_db = β·S_db + (1 − β)·db²
W = W − lr·dW / (√S_dW + ε)
b = b − lr·db / (√S_db + ε)
wherein S_dW and S_db are the gradient momenta accumulated by the loss function over the previous t−1 iterations, β is the gradient-accumulation exponent, and ε is a smoothing term, generally taken as 10⁻⁸.
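The RMSprop update of s402 can be sketched in NumPy for a single parameter vector. The lr of 1e-5 and ε of 10⁻⁸ follow the text; β is conventionally 0.9, which is an assumption here since the text does not give its value (the toy run below uses a larger lr only so convergence is visible in a few steps).

```python
import numpy as np

def rmsprop_step(w, grad, s, lr=1e-5, beta=0.9, eps=1e-8):
    """One RMSprop iteration: accumulate the squared-gradient average S_dW,
    then scale the step by 1 / (sqrt(S_dW) + eps)."""
    s = beta * s + (1 - beta) * grad ** 2
    w = w - lr * grad / (np.sqrt(s) + eps)
    return w, s

# Minimise f(w) = ||w||^2 / 2 (gradient is w itself) as a toy check.
w = np.array([1.0, -2.0])
s = np.zeros_like(w)
for _ in range(1000):
    w, s = rmsprop_step(w, w, s, lr=0.01)
loss = 0.5 * (w ** 2).sum()
```

Because the step is divided by the running root-mean-square of the gradient, parameters with consistently large gradients take smaller effective steps, which is what makes the small fine-tuning learning rate stable across all the layers being unfrozen.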
S403: the data is validated using a validation set, i.e., a model is trained using the fit_generator function, as follows:
(1) Setting steps_per_epoch to 20, i.e. the generator produces 20 batches of data per epoch;
(2) Setting epochs to be 50, namely training iteration 50 times;
(3) Judging whether the model is over fitted;
(4) Setting validation_steps to 10, i.e. the validation-set generator is called 10 times to produce validation data.
S404: selecting an optimal parameter model and storing the model, namely a MoleRec.h5 model, namely a final facial mole recognition model; in the finely tuned network, namely the facial mole recognition model, sigmoid output is used for classification, and a binary cross entropy function is used as a loss function; and finally outputting model accuracy and loss curve results, and drawing a confusion matrix.
Fig. 9 shows a neural network structure diagram of the VGG16 model obtained after the trimming.
Fig. 10 shows fine-tuning training on a MacOS platform (8 GB memory): the model was trained for 50 rounds of 27 s each, about 22 minutes in total. The training set accuracy reached 100% with a loss value of 0.0006, and the verification set accuracy reached 98.26% with a loss value of 0.1334.
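As a quick arithmetic check on the reported timing, 50 rounds at 27 s per round:

```python
rounds, secs_per_round = 50, 27
total_minutes = rounds * secs_per_round / 60
# 1350 s = 22.5 min, consistent with the "about 22 minutes" reported above.
```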
Fig. 11 shows the fine-tuned model tested on the test set. In the confusion matrix, 0 denotes mole and 1 denotes non-mole; columns represent predicted values and rows represent actual categories. Cell (0,0) is moles correctly identified, (0,1) is moles incorrectly identified as non-moles, (1,0) is non-moles incorrectly identified as moles, and (1,1) is non-moles correctly identified. Thus 99% of moles were successfully identified and 1% were misclassified as non-moles, while 95% of non-moles were correctly identified and 5% were misclassified as moles. Averaging the two correct-identification rates gives an overall test-set recognition rate of 97.0%.
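The 97.0% figure follows directly from the two per-class rates: since the test set has equal numbers of mole and non-mole samples (150 each, per step one), the overall accuracy is simply the mean of the diagonal of the row-normalised confusion matrix.

```python
# Confusion matrix of fig. 11 as row-normalised rates:
# rows = actual class, cols = predicted class; 0 = mole, 1 = non-mole.
cm = [[0.99, 0.01],   # moles: 99% correct, 1% misread as non-mole
      [0.05, 0.95]]   # non-moles: 5% misread as mole, 95% correct

# Equal class sizes, so overall accuracy is the mean of the diagonal:
overall = (cm[0][0] + cm[1][1]) / 2
# (0.99 + 0.95) / 2 = 0.97, the reported 97.0% test-set recognition rate.
```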
Step five: the picture is predicted, as shown in fig. 12, and the specific process is as follows:
s501: the method comprises the steps of obtaining pictures containing human faces, detecting the human faces contained in a single picture, and carrying out dotting display on the five sense organs of the human faces, wherein the specific steps are as follows:
(1) Acquiring a picture containing a human face, and in the implementation, loading a test set picture file into numpy data;
(2) And (3) detecting the face in the acquired picture by using the HOG characteristic and the dlib library (S101 of the same method as the step one).
(3) Converting the generated face-position array into image coordinates, and looping over all the found faces;
(4) Specifying and printing the top, right, bottom and left position information of each face;
(5) Locating the facial features from the face positions, extracting a plurality of key points of the facial features in the picture, and marking and displaying the features.
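Steps (1) to (5) of S501 map onto the standard dlib API: the HOG-based frontal face detector plus the 68-point shape predictor. The predictor-file path below is a placeholder, and dlib is imported lazily inside the function so that the pure helper beneath it remains usable without the library installed; this is a sketch of the API usage, not the patent's exact code.

```python
def rect_to_bbox(rect):
    """Convert a detector rectangle to (top, right, bottom, left), as in step (4)."""
    return rect.top(), rect.right(), rect.bottom(), rect.left()

def detect_faces_and_landmarks(image_path,
                               predictor_path="shape_predictor_68_face_landmarks.dat"):
    """S501 (1)-(5): load a picture, detect faces with the HOG detector,
    and return per-face bounding boxes plus 68 facial key points."""
    import dlib  # imported lazily; heavyweight optional dependency

    img = dlib.load_rgb_image(image_path)        # (1) load picture as an array
    detector = dlib.get_frontal_face_detector()  # (2) HOG-based detector
    predictor = dlib.shape_predictor(predictor_path)

    results = []
    for rect in detector(img, 1):                # (3) loop over all found faces
        box = rect_to_bbox(rect)                 # (4) top/right/bottom/left
        shape = predictor(img, rect)             # (5) 68 facial key points
        points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
        results.append((box, points))
    return results
```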
S502: dividing a plurality of key points into different key point sets; and calculating the clipping position of the face image based on the key point set, clipping the face image, and clipping the face image on the basis of S501.
S503: and overlapping and blocking the face image, and dividing the face image into a plurality of image blocks with overlapped edges according to the set image block size and the step length, wherein the step length is smaller than the image block size. The dividing process generally adopts uniform dividing, and considering the specificity of the nevus, if the uniform dividing process is carried out, the nevus is divided into two parts to be destroyed, so that overlapping dividing is adopted for ensuring the recognition rate of the nevus. The specific steps of overlapping and blocking the face image are as follows:
(1) Adjusting the size of the face image to 200×200×3;
(2) Setting the image block size and the abscissa and ordinate step lengths, both smaller than the block width and height; for example, step lengths of 15 pixels with a block size of 19×19;
(3) Moving the window by the set block size and step lengths: the ordinate advances step by step until the end of a row, then the abscissa advances by its step length, until the whole image is segmented. The first block is located at image[0:19, 0:19] and the second at image[15:34, 0:19];
(4) And displaying the segmented image.
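The overlap-blocking of S503, 19×19 blocks advanced 15 pixels at a time over a face assumed resized to 200×200×3, can be sketched directly. With step 15 the windows overlap by 4 pixels, so a nevus cut by one block boundary falls whole inside a neighbouring block.

```python
import numpy as np

def overlap_blocks(face, block=19, step=15):
    """S503: split the face image into edge-overlapping blocks.
    step < block guarantees adjacent blocks overlap by (block - step) pixels."""
    assert step < block
    h, w = face.shape[:2]
    blocks, coords = [], []
    for x in range(0, w - block + 1, step):      # abscissa advances after a column
        for y in range(0, h - block + 1, step):  # ordinate advances first, as in (3)
            blocks.append(face[y:y + block, x:x + block])
            coords.append((y, x))
    return blocks, coords

face = np.zeros((200, 200, 3))          # (1) face resized to 200x200x3
blocks, coords = overlap_blocks(face)   # (2)-(3) 19x19 blocks, step 15
# First block covers face[0:19, 0:19]; the next covers face[15:34, 0:19].
```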
S504: the face image is predicted by using a face nevus recognition model (molerec.h5 model), and the specific steps are as follows:
(1) Adjusting the size of each image block to 32×32×3;
(2) Converting the adjusted segmented image into an array X;
(3) Expanding the array X with a new dimension at position 0, and normalizing the data;
(4) Inputting a test array X, predicting the position of the nevus through a trained fine tuning model (namely a final facial nevus recognition model), and outputting the prediction probability.
(5) Setting the threshold to 0.5: if a block's predicted value is smaller than 0.5, the block is judged to contain a mole and is marked with a frame.
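The post-processing of S504, thresholding the sigmoid outputs and keeping the coordinates of mole blocks, reduces to a few lines once a trained model's probabilities are available. The probabilities below are dummy values, since running MoleRec.h5 itself requires the trained weights; recall from fig. 11 that label 0 denotes mole, so outputs below the threshold indicate moles.

```python
import numpy as np

def moles_from_predictions(probs, coords, threshold=0.5):
    """S504 (5): label 0 means 'mole', so a block whose sigmoid output falls
    below the threshold is judged to contain a mole; return its coordinates."""
    return [c for p, c in zip(probs, coords) if p < threshold]

# Dummy sigmoid outputs for four blocks (a real run would call
# model.predict on the normalised 32x32x3 batch from steps (1)-(4)).
probs = np.array([0.93, 0.12, 0.77, 0.40])
coords = [(0, 0), (15, 0), (30, 0), (45, 0)]
mole_blocks = moles_from_predictions(probs, coords)
# The blocks at (15, 0) and (45, 0) fall below 0.5 and would be framed as moles.
```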
Fig. 13 is an example of predicting a facial mole for a single mole-containing picture, and the specific detection procedure is the same as that of fig. 12; fig. 14 shows an example of predicting a facial mole for a multi-mole-containing picture, and the specific detection procedure is the same as that of fig. 12.
According to the invention, a VGG16 convolutional neural network is built, its original fully-connected layers are discarded, and training parameters are optimized to obtain a specific fully-connected layer and hence a complete model. The built VGG16 model is initialized, and on that basis the five convolution-base modules and the specific fully-connected layer of the VGG16 network are fine-tuned to obtain a more accurate convolutional neural network model. The built deep learning network is then trained on the data-enhanced training set and the preprocessed verification set to obtain a trained deep learning model, and its generalization capability is verified on the test set database. Finally, nevi in the face are detected: after face detection on a nevus-containing face image, the face is cropped and segmented, and the segmented face regions are predicted separately with the trained fine-tuned model, meeting the requirement of nevus detection.
Example 2
The present embodiment provides a facial nevus recognition system based on deep learning, including:
the data acquisition module is used for acquiring pictures containing human faces;
the face image clipping module is used for detecting faces in the pictures by utilizing the HOG characteristics and clipping to obtain face images;
the overlapping block module is used for dividing the face image into a plurality of image blocks with overlapped edges;
the facial mole recognition module is used for inputting the image blocks into a facial mole recognition model, predicting mole-containing probability of each image block and positioning the positions of the mole-containing image blocks in the facial image; the face nevus recognition model is obtained through the following steps: performing feature extraction on the pictures in the enhanced training set by utilizing a feature extraction layer of the VGG16 convolutional neural network; constructing a facial nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full connection layer; building the feature extraction layer and the full connection layer to form a complete facial nevus recognition model; and fine-tuning the complete facial nevus recognition model to obtain a final facial nevus recognition model.
Example 3
The present embodiment also provides an electronic device comprising a memory and a processor, and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of the method of embodiment 1.
Example 4
The present embodiment also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, perform the steps of the method of embodiment 1.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (8)
1. The facial nevus recognition method based on deep learning is characterized by comprising the following steps of:
acquiring a picture containing a human face;
detecting a face in the picture by using the HOG characteristics and cutting to obtain a face image;
dividing a face image into a plurality of image blocks with overlapped edges;
inputting the image blocks into a facial mole recognition model, predicting mole-containing probability of each image block, and positioning the positions of the mole-containing image blocks in a facial image; the face nevus recognition model is obtained through the following steps: performing feature extraction on the pictures in the enhanced training set by utilizing a feature extraction layer of the VGG16 convolutional neural network; constructing a facial nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full connection layer; building the feature extraction layer and the full connection layer to form a complete facial nevus recognition model; fine tuning the complete facial nevus recognition model to obtain a final facial nevus recognition model;
the specific steps of detecting the face in the picture by using the HOG features are as follows: after the picture is grayed, gamma conversion is adopted to normalize the color space of the picture; calculating the transverse gradient, the longitudinal gradient, the gradient direction and the amplitude of each pixel point in the picture by using a first-order differential equation; dividing a picture into a plurality of small squares, dividing the gradient direction of each small square into a plurality of direction blocks, and calculating the number of different gradient directions to obtain the feature vector of each small square; forming a sliding window by a plurality of adjacent small blocks, and connecting feature vectors of all the small blocks in the sliding window in series to obtain HOG features of the sliding window; scanning the picture by utilizing the sliding window, setting a scanning step length, performing sliding scanning in a mode of overlapping a plurality of pixels, and collecting HOG characteristics of all the sliding windows connected in series to obtain HOG characteristic vectors of faces in the picture; inputting the HOG feature vector into an SVM model to obtain a face in the picture;
wherein the face clipping comprises: extracting a plurality of key points of the face, dividing the key points into different key point sets, and calculating the clipping position of the face image based on the key point sets, i.e. calculating the top, bottom, left and right of the face, wherein x1 and x2 are respectively the sixth and third points of the corresponding key point set, used together with the abscissas and ordinates of the 11 key points on the eyebrows and the abscissas and ordinates of the 16 key points on the chin; the face position in the image is output through the four coordinates top, bottom, left and right, and the face is clipped to obtain a face image;
wherein the overlapping blocking process includes: and dividing the face image into a plurality of image blocks with overlapped edges according to the set image block size and step length.
2. The facial nevus recognition method based on deep learning according to claim 1, wherein the specific step of fine tuning the complete facial nevus recognition model is:
loading a complete facial mole recognition model;
fine tuning all convolution layers and all connection layers in the model by using the enhanced training set, and optimizing parameters by adopting an optimizer;
and adopting verification data of a verification set to select an optimal parameter model as a final facial mole recognition model.
3. The deep learning based facial mole recognition method according to claim 1, wherein the enhanced training set is obtained by sequentially performing normalization preprocessing, image rotation, image horizontal shift, image vertical shift, random misplacement of the image, random scaling of the image, random horizontal flip of the image, and pixel filling on the pictures in the training set.
4. The facial nevus recognition method based on deep learning according to claim 1, wherein the image block is subjected to a preprocessing operation before being input into the facial nevus recognition model, specifically comprising:
adjusting the size of the image block;
converting the adjusted image blocks into a plurality of groups;
data is added at the 0 position of the array, and normalized.
5. A facial mole recognition system based on deep learning, comprising:
the data acquisition module is used for acquiring pictures containing human faces;
the face image clipping module is used for detecting faces in the pictures by utilizing the HOG characteristics and clipping to obtain face images;
the overlapping block module is used for dividing the face image into a plurality of image blocks with overlapped edges;
the facial mole recognition module is used for inputting the image blocks into a facial mole recognition model, predicting mole-containing probability of each image block and positioning the positions of the mole-containing image blocks in the facial image; the face nevus recognition model is obtained through the following steps: performing feature extraction on the pictures in the enhanced training set by utilizing a feature extraction layer of the VGG16 convolutional neural network; constructing a facial nevus recognition classifier, and training and optimizing parameters by using the extracted features to obtain a full connection layer; building the feature extraction layer and the full connection layer to form a complete facial nevus recognition model; fine tuning the complete facial nevus recognition model to obtain a final facial nevus recognition model;
the specific steps of detecting the face in the picture by using the HOG features are as follows: after the picture is grayed, gamma conversion is adopted to normalize the color space of the picture; calculating the transverse gradient, the longitudinal gradient, the gradient direction and the amplitude of each pixel point in the picture by using a first-order differential equation; dividing a picture into a plurality of small squares, dividing the gradient direction of each small square into a plurality of direction blocks, and calculating the number of different gradient directions to obtain the feature vector of each small square; forming a sliding window by a plurality of adjacent small blocks, and connecting feature vectors of all the small blocks in the sliding window in series to obtain HOG features of the sliding window; scanning the picture by utilizing the sliding window, setting a scanning step length, performing sliding scanning in a mode of overlapping a plurality of pixels, and collecting HOG characteristics of all the sliding windows connected in series to obtain HOG characteristic vectors of faces in the picture; inputting the HOG feature vector into an SVM model to obtain a face in the picture;
wherein the face clipping comprises: extracting a plurality of key points of the face, dividing the key points into different key point sets, and calculating the clipping position of the face image based on the key point sets, i.e. calculating the top, bottom, left and right of the face, wherein x1 and x2 are respectively the sixth and third points of the corresponding key point set, used together with the abscissas and ordinates of the 11 key points on the eyebrows and the abscissas and ordinates of the 16 key points on the chin; the face position in the image is output through the four coordinates top, bottom, left and right, and the face is clipped to obtain a face image;
wherein, overlap blocking processing includes: and dividing the face image into a plurality of image blocks with overlapped edges according to the set image block size and step length.
6. The deep learning based facial mole recognition system according to claim 5, wherein the enhanced training set is obtained by sequentially performing normalization preprocessing, image rotation, image horizontal shift, image vertical shift, random misplacement of images, random scaling of images, random horizontal flip of pictures, and pixel filling on the pictures in the training set.
7. An electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of the method of any of claims 1-4.
8. A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method of any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110748237.8A CN113642385B (en) | 2021-07-01 | 2021-07-01 | Facial nevus recognition method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113642385A CN113642385A (en) | 2021-11-12 |
CN113642385B true CN113642385B (en) | 2024-03-15 |
Family
ID=78416504
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110748237.8A Active CN113642385B (en) | 2021-07-01 | 2021-07-01 | Facial nevus recognition method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113642385B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114445925B (en) * | 2022-04-11 | 2022-07-22 | 深圳市润璟元信息科技有限公司 | Facial recognition intelligent attendance system capable of being automatically loaded and deleted |
CN115358952B (en) * | 2022-10-20 | 2023-03-17 | 福建亿榕信息技术有限公司 | Image enhancement method, system, equipment and storage medium based on meta-learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107886064A (en) * | 2017-11-06 | 2018-04-06 | 安徽大学 | A kind of method that recognition of face scene based on convolutional neural networks adapts to |
AU2019101222A4 (en) * | 2019-10-05 | 2020-01-16 | Feng, Yuyao MR | A Speaker Recognition System Based on Deep Learning |
Also Published As
Publication number | Publication date |
---|---|
CN113642385A (en) | 2021-11-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||