WO2021244352A1 - Method and apparatus for determining local area that affects degree of facial aging - Google Patents


Info

Publication number: WO2021244352A1
Authority: WIPO (PCT)
Prior art keywords: facial image, facial, area, degree, apparent age
Application number: PCT/CN2021/095753
Other languages: French (fr), Chinese (zh)
Inventors: WANG Sijia (汪思佳), WANG Fudi (王馥迪), DU Siyuan (杜思源)
Original assignee: Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences (中国科学院上海营养与健康研究所)
Application filed by Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences
Publication of WO2021244352A1

Classifications

    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/08: Neural networks; learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to facial aging prediction technology, and in particular to techniques for determining the local areas that affect the degree of facial aging.
  • Facial aging refers to a complex biological process in which facial morphology and structure change over time. With improving quality of life, people are increasingly concerned about facial aging, which is not only a criterion for judging health in biomedicine but also a general concern of society. Therefore, accurately identifying the facial areas that drive aging, so as to help individuals delay or mitigate it, has great research and application value.
  • The purpose of this application is to provide a method and device for determining the local area that affects the degree of facial aging, which can accurately assess the degree of influence of local facial areas on facial aging.
  • This application discloses a method for determining the local area that affects the degree of facial aging, including: acquiring a first facial image of an object; inputting the first facial image into an apparent age prediction model of a human face to obtain a first apparent age; performing image processing on the first facial image to change a predetermined number of pixels and/or a predetermined area in it, obtaining a second facial image; and determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or regions in the second facial image on the first apparent age.
  • the image processing adopts a method selected from the following group:
  • pixel derivation method, area masking method, or a combination thereof.
  • the predetermined number of pixels are all pixels of the first facial image.
  • the predetermined area is selected from one or more of the following group:
  • eye area, cheek area, mouth area, forehead area.
  • when the image processing adopts a pixel derivation method, performing image processing on the first facial image to obtain a second facial image further includes: adding Gaussian noise to the predetermined number of pixels of the first facial image.
  • in that case, determining the degree of influence of the changed pixels and/or regions on the first apparent age further includes: using the apparent age prediction model to take derivatives over the second facial image, obtaining a derivative value for each pixel, and calculating from those derivative values the degree of influence of the changed pixels on the first apparent age.
  • when the image processing adopts a region masking method, performing image processing on the first facial image to obtain a second facial image further includes: covering the predetermined area with the average pixel value of the first facial image.
  • in that case, determining the degree of influence further includes: inputting the second facial image into the apparent age prediction model to obtain a second apparent age, comparing the second apparent age with the first apparent age, and calculating from the comparison result the degree of influence of the predetermined area on the apparent age of the human face.
  • performing image processing on the first facial image with the region masking method and covering the predetermined region with the average pixel value further includes: dividing the first facial image into a plurality of local regions and sequentially covering each local region with the average pixel value of the first facial image, obtaining a corresponding second facial image for each covered local region.
  • for example, the first facial image is divided into four local areas, the eye area, cheek area, mouth area, and forehead area, and the region masking method sequentially covers each local area with the pixel average of the first facial image, obtaining a corresponding second facial image for each covered area.
  • the apparent age prediction model is obtained by a method including the following steps: quantifying, through a facial perception experiment, the age distribution, average age, or median age of facial sample images as deep learning training labels to build a training sample set; and training a convolutional neural network model with the training sample set to obtain the apparent age prediction model.
  • the convolutional neural network model is a ResNet18 model.
  • the application also discloses a device for determining the local area that affects the degree of facial aging, including:
  • An image acquisition module for acquiring a first facial image of an object
  • An image processing module configured to perform image processing on the first facial image, and change a predetermined number of pixels and/or a predetermined area in the first facial image to obtain a second facial image;
  • An age prediction module configured to input the first facial image into an apparent age prediction model of a human face to obtain the first apparent age
  • an influence degree determination module configured to determine, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or regions in the second facial image on the first apparent age.
  • the application also discloses a device for determining the local area that affects the degree of facial aging, including:
  • Memory for storing computer executable instructions
  • the processor is used to implement the steps in the method described above when executing the computer-executable instructions.
  • the present application also discloses a computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions, and when the computer-executable instructions are executed by a processor, the steps in the method described above are implemented.
  • a facial perception experiment is selected to quantify the facial aging phenotype as a whole and, combined with deep learning and visualization methods, to locate the main areas affecting overall facial aging; this objectively and effectively evaluates the degree of impact of local facial areas on facial aging.
  • visualization methods such as pixel derivation method and/or area masking method can more accurately and objectively locate the facial area that affects facial aging, and provide a more scientific basis for decision-making in the fields of medical treatment and cosmetology.
  • for example, if one embodiment discloses the feature combination A+B+C and another discloses A+B+D+E, where features C and D are equivalent technical means playing the same role and feature E can technically be combined with feature C, then the A+B+C+D solution should not be regarded as recorded because it is technically infeasible, while the A+B+C+E solution should be deemed to have been recorded.
  • Fig. 1 is a schematic flowchart of a method for determining a local area that affects the degree of facial aging according to a first embodiment of the present application.
  • FIG. 2 shows a schematic diagram of the preprocessing flow of facial image data in an embodiment of the present application.
  • Figure A is a schematic diagram of the identification of 106 facial key points in the facial area
  • Figure B is a schematic diagram of calculating the position of the central axis of the face using a regression model
  • Figure C is a schematic diagram of rotating the facial image according to the tilt angle to align the face vertically
  • Figures D and E are schematic diagrams of cropping the image according to the mandibular point, the left and right cheek points, and the upper forehead point.
  • Fig. 3 shows a schematic diagram of facial area division according to an embodiment of the present application.
  • FIG. 4 shows a schematic diagram of the variation curve of the variance of 1000 samples with the number of evaluators in an embodiment of the present application.
  • Fig. 5 shows a schematic diagram of the deep learning, visualization and verification process in an embodiment of the present application.
  • Figure 6 shows a schematic diagram of the ResNet18 network structure in an embodiment of the present application.
  • Figure a is the basic module of the residual network, which establishes a shortcut link from input to output
  • Figure b is the network structure of ResNet18, in which the dotted lines mark connections where the number of feature channels doubles.
  • FIG. 7 shows a schematic diagram of a comparison of training effects using three different deep learning models and three different training tags in an embodiment of the present application.
  • FIG. 8 shows a schematic diagram of the division and visualization of the facial area in an embodiment of the present application.
  • Figure a is a schematic diagram of the division of the facial area, showing the four regions of the face
  • Figure b is a schematic diagram of the second facial image obtained based on the pixel derivation method
  • Figure d is the heat map of the aging degree of the four facial parts, obtained by aggregating the pixel derivation results of Figure b over each part
  • Figure c is the heat map of the aging degree of the four parts obtained from the results of the area masking method.
  • FIG. 9 shows a schematic flow chart of using the pixel derivation method in an embodiment of the present application.
  • FIG. 10 shows a schematic diagram of the process of adopting the area covering method in an embodiment of the present application.
  • Figure 11 shows a schematic diagram of the consistency check in an embodiment of the present application.
  • Figure A is an example of deep learning ranking results, and the numbers represent the order of importance of different regions
  • Figures B-D show the ranking results of the eye movement experiment (or manual evaluation), with the bold dashed box marking the main contrast area; Figure B represents the case where the order of all four regions is exactly the same.
  • Figure C shows the case where only the most important region is consistent, and Figure D the case where one method's most important region matches the other's second most important region.
  • FIG. 12 shows a schematic diagram of the structure of the device for determining the local area that affects the degree of facial aging according to the second embodiment of the present application.
  • Visualization: displaying the basis of neural network decision-making in the form of images or pictures.
  • the first embodiment of the present application relates to a method for determining the local area that affects the degree of facial aging.
  • the process is shown in Figure 1.
  • the method includes the following steps:
  • step 101: a first facial image of an object is acquired;
  • step 102: the first facial image is input into the apparent age prediction model of the human face to obtain the first apparent age;
  • step 103: image processing is performed on the first facial image, and a predetermined number of pixels and/or predetermined regions in the first facial image are changed to obtain a second facial image;
  • step 104: according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or regions in the second facial image on the first apparent age is determined.
  • the first facial image in step 101 may be a whole facial image or a partial facial image.
  • the apparent age prediction model is obtained in advance through the following steps 1 and 2:
  • in step 1, a perception experiment is used to quantify the age distribution, average age, or median age of the facial sample images as deep learning training labels, obtaining the training sample set; in step 2, the training sample set is used to train a convolutional neural network model to obtain the apparent age prediction model.
  • the age distribution is used as the training label.
  • the convolutional neural network model may be, but is not limited to, a VGG 16 model, a ResNet 18 model, or a ResNet 50 model.
  • the convolutional neural network model is a ResNet18 model.
  • the image processing may adopt a pixel derivation method, an area covering method, or a combination thereof.
  • the predetermined number of pixels are all pixels of the first facial image.
  • the predetermined area is a sub-area of m1 pixels × m2 pixels, where m1 and m2 are each independently a positive integer of 1-1000, preferably 2-500, more preferably 3-250, and most preferably 5-100.
  • the predetermined area is 0.01%-25% of the entire facial area, preferably 0.1-10%, more preferably 1-5%.
  • the predetermined area is selected from one or more of the following group: eye area, cheek area, mouth area, and forehead area.
  • step 103 can be further implemented as the following step: image processing is performed on the first facial image by using a pixel derivation method, and Gaussian noise is added to the predetermined number of pixels to obtain a second facial image. For example, random Gaussian noise may be added to all pixels of the first facial image, although this is not limiting.
  • this step 104 can be further implemented as the following steps: use the apparent age prediction model to take derivatives over the second facial image, obtaining a derivative value for each pixel of the second facial image, and calculate from those derivative values the degree of influence of the changed pixels on the first apparent age.
  • in step 1, the first facial image is divided into multiple local areas, and for each local area the sum of the derivative values of all its pixels is computed and used as the weight coefficient of that area's influence on overall facial aging; in step 2, based on each local area's influence weight coefficient, the areas are annotated on the first facial image to obtain a third facial image.
  • step 103 can be further implemented as the following step: image processing is performed on the first facial image by using a region masking method, and the predetermined area is covered with the average pixel value of the first facial image to obtain the second facial image.
  • this step 104 can be further implemented as the following steps: input the second facial image into the apparent age prediction model of the face to obtain the second apparent age, compare the second apparent age with the first apparent age, and calculate from the comparison result the degree of influence of the predetermined area on the apparent age of the human face.
  • the above-mentioned "adopting a region masking method to perform image processing on the first facial image, and using the average pixel value of the first facial image to cover the predetermined area to obtain a second facial image" is further implemented as: the first facial image is divided into a plurality of local areas, and the region masking method sequentially covers each local area with the pixel average of the first facial image, obtaining a corresponding second facial image for each covered local area.
  • for example, the first facial image can be divided into four local areas, the eye area, cheek area, mouth area, and forehead area; the region masking method then sequentially covers each local area with the pixel average of the first facial image to obtain a corresponding second facial image for each covered area.
  • after the above step of dividing the first facial image into multiple local regions and sequentially covering each with the average pixel value, the method further includes steps 1 and 2: in step 1, for each local area, the difference between its corresponding second apparent age and the first apparent age is computed and used as the weight coefficient of that area's influence on overall facial aging; in step 2, based on each area's influence weight coefficient, the areas are annotated on the first facial image to obtain a third facial image.
  • a deep learning network such as ResNet 18 is used to build a facial aging evaluation system, and a deep learning visualization method such as pixel derivation or facial masking is used to locate the main areas of facial aging.
  • the specific plan is as follows:
  • first, the face++ software was used to identify 106 facial key points in the face area (Figure 2A, https://www.faceplusplus.com/). Then, from the position coordinates of the eyes, nose, and mouth among the key points, a regression model was used to calculate the position of the central axis of the face (Figure 2B, red solid line) and its inclination angle relative to the vertical direction (Figure 2B, red dashed line). The facial image was then rotated by this tilt angle to align the face vertically (Figure 2C). Finally, the image was cropped according to the mandibular point, the left and right cheek points, and the upper forehead point; the cropping results are shown in Figures 2D and 2E. The cropped images were used for subsequent experiments and analysis.
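  • the alignment step above can be sketched as follows. This is a simplified illustration, not the disclosed implementation: it estimates the tilt angle from two eye keypoints rather than from a regression over all 106 face++ points, and the function names are assumptions.

```python
import numpy as np

def tilt_angle(left_eye, right_eye):
    """Angle (degrees) between the inter-eye line and the horizontal.

    Rotating the image by -angle makes the face vertical. Two eye
    keypoints stand in for the regression over all 106 points.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return np.degrees(np.arctan2(dy, dx))

def rotate_points(points, angle_deg, center):
    """Rotate keypoint coordinates by angle_deg around center."""
    theta = np.radians(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    return (np.asarray(points, dtype=float) - center) @ rot.T + center
```

  • in practice the same rotation would be applied to the image raster itself; rotating the keypoints by the negative tilt angle brings the two eyes onto a horizontal line, which is the alignment criterion.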
  • the evaluators used a unified display device to assess perceived age; before the experiment they did not know the ages of the samples or the age distribution of the data set. Each evaluator observed all 5,768 sample pictures and predicted and recorded an age for each sample. In addition, 1,014 sample photos were selected (500 men and 514 women); for these, the evaluator also indicated, according to the regions in Figure 3, which facial areas they focused on when judging the sample's age, ticking them in the evaluation form (an example is shown in Table 1; multiple areas can be selected).
  • simulation data: 10,000 perceived-age values were generated for 1,000 evaluators.
  • the 1,000 simulated perceived-age values were repeatedly resampled, and the variance of the perceived age was calculated for each sampling, where X_i represents the i-th sampling and n represents the number of panelists selected.
  • deep learning used the 5,768 sample photos as training data, with the perceived age evaluated in the perception experiment quantifying facial aging as the training label; deep learning visualization was then used to locate the main areas that affect facial aging.
  • the training data set has only 5,768 samples; with so little training data, learning of the network parameters suffers and overfitting is likely. Therefore, before the actual training process, we followed standard deep learning data augmentation practice and enhanced the training data set in two ways: mirroring and cropping.
  • mirror augmentation reflects a picture about the Y axis, so each original picture also yields its mirror image; crop augmentation takes five crops per picture, at the four corners and the center, producing five different sub-images. Together these expand the training data set tenfold.
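  • the tenfold augmentation described above can be sketched as follows; the array layout (H x W image) and the function name are assumptions.

```python
import numpy as np

def augment(img, crop):
    """Ten-fold augmentation: the original and its horizontal mirror,
    each cropped at the four corners and the center (crop x crop).

    `img` is an H x W (or H x W x C) array; crop must not exceed
    either spatial dimension.
    """
    h, w = img.shape[:2]
    anchors = [(0, 0), (0, w - crop), (h - crop, 0),
               (h - crop, w - crop), ((h - crop) // 2, (w - crop) // 2)]
    out = []
    for view in (img, img[:, ::-1]):          # original + Y-axis mirror
        for y, x in anchors:
            out.append(view[y:y + crop, x:x + crop])
    return out                                 # 2 views x 5 crops = 10
```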
  • the convolutional neural network models VGG 16, ResNet 18 and ResNet 50 are selected for age prediction.
  • ResNet18 is one of the five main models of the ResNet network structure, which mainly includes three parts: Input, Output and Intermediate Convolution (Stages).
  • one of the main difficulties in training a deep neural network is that as the number of layers increases, the parameters become difficult to optimize and network degradation occurs, that is, the trained model fits the data even worse than models with fewer layers.
  • ResNet establishes a shortcut link from shallow features to deep features by introducing a residual module, using the neural network to model the residual between deep and shallow features rather than the deep features directly.
  • as a result, gradients propagate back more effectively when the backpropagation algorithm optimizes the network parameters, which largely resolves the degradation problem of deep networks.
  • for comparison, this project also uses VGG16 and ResNet50 to measure training performance on the same samples.
  • the network parameters are initialized using the model pre-trained on ImageNet, the batch size is set to 64, the learning rate is set to 0.001, and a total of 100 epochs are trained.
  • Figure 6(a) is the basic module of the residual network, which establishes a shortcut link from input to output, where weight layer refers to the weight layer, and relu is an activation function.
  • Figure 6(b) shows the network structure of ResNet18, where dotted lines mark stages at which the number of feature channels doubles.
  • solid-line connections indicate the channel count is unchanged; for example, "3×3 conv, 64" means 64 convolution kernels of size 3×3 are used for convolution;
  • dashed-line connections indicate the channel count changes; for example, "3×3 conv, 128, /2" means 128 convolution kernels of size 3×3 are used, "/2" indicates the feature map is downsampled by 2, and the channel count doubles from the previous stage's 64 to 128.
  • the evaluation result distribution, evaluation median, and evaluation mean of 22 age evaluators were used as labels for deep learning model training.
  • the error variance between the predicted result and the real result is used as the effect evaluation criterion to select the model.
  • the project uses a 10-fold cross-validation method to obtain predictions for the training data set: the sample data is divided into 10 parts, each time 9 parts are used for training and the resulting model predicts the ages of the remaining part, and this is repeated over all 10 folds.
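  • the 10-fold split can be sketched as follows; only the index bookkeeping is shown (the model training itself is omitted), and the function name is an assumption.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k disjoint folds; each fold serves once
    as the held-out prediction set while the other k-1 folds train.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds if f is not folds[i] for j in f]
        splits.append((train, test))
    return splits
```

  • cycling through the k splits yields an out-of-fold prediction for every sample, which is how the predicted perceptual ages of all training samples are obtained.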
  • the deep learning model can be used to obtain the perceptual age data predicted by all samples.
  • the model evaluation uses three training labels of 22 evaluators' age distribution, age mean and age median, and three deep learning models of VGG 16, ResNet 18, and ResNet 50.
  • the correlation coefficient is calculated by Pearson's correlation coefficient, and the P values in the table are all less than 0.001.
  • the results show that the ResNet18 model with the age distribution as the training label produced the best results (average difference 2.27 years, correlation coefficient 0.96; see Figure 7).
  • the ResNet 18 model trained with the age distribution as the training label is used for the subsequent process.
  • the pixel derivation method and the area masking method are selected to realize visualization, and the face is divided into four regions: forehead, eyes, mouth and cheeks based on the facial anatomy and 106 facial feature points automatically calibrated by face++ (Figure 8a).
  • the core idea of the pixel derivation method is to take the derivative of the predicted perceived age with respect to each pixel value, and use the magnitude of the derivative to measure how important each pixel of the picture is to the perceived age.
  • because the neural network is a highly nonlinear mapping, a small number of pixels usually have very large derivatives, which makes visualization difficult; therefore random noise is added to the picture and the results of multiple derivations are averaged to obtain smoother visualization results.
  • denoting the sensitivity mask by M_c, the pixel derivation method adds noise to each pixel separately, calculates the derivative of the perceived age with respect to the noised pixel, and averages over repetitions to obtain the smoothed sensitivity:
  • M̂_c(x) = (1/n) Σ_{i=1}^{n} M_c(x + N(0, σ²))
  • where n refers to the number of calculations, x refers to the original pixel value, and N(0, σ²) is Gaussian noise.
  • in step 901, the first facial image is input to the trained ResNet18 model to output the corresponding first apparent age; in step 902, n copies of the first facial image are made and random Gaussian noise is added to each, giving n corresponding second facial images; in step 903, for each second facial image, the derivative of the trained ResNet18 model's output with respect to each pixel in the image is calculated; in step 904, the derivation results of the n second facial images are averaged and visualized (as shown in Figure 8b; the brighter the color, the more important the point); in step 905, the sum of the derivatives over each facial area (mouth, forehead, eyes, cheeks) is computed and the areas are ranked by their sums, with a higher sum indicating higher importance and aging degree, and each area is marked with a color shade indicating its aging degree (as shown in Figure 8d; the redder the color, the more aging).
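  • the pixel derivation flow of steps 901-905 can be sketched as follows. This is a sketch under stated assumptions: a generic differentiable `predict` function stands in for the trained ResNet18, and derivatives are taken by finite differences here, where automatic differentiation would be used in practice.

```python
import numpy as np

def smoothed_saliency(predict, img, n=25, sigma=0.1, eps=1e-4, seed=0):
    """Average |d age / d pixel| over n noisy copies of the image
    (steps 902-904). `predict` maps an image array to a scalar age.
    """
    rng = np.random.default_rng(seed)
    acc = np.zeros_like(img, dtype=float)
    for _ in range(n):
        noisy = img + rng.normal(0.0, sigma, img.shape)
        grad = np.zeros_like(noisy)
        for i in np.ndindex(img.shape):        # finite difference per pixel
            bump = noisy.copy()
            bump[i] += eps
            grad[i] = (predict(bump) - predict(noisy)) / eps
        acc += np.abs(grad)
    return acc / n

def region_scores(saliency, region_masks):
    """Sum the saliency inside each named region mask (step 905)."""
    return {name: float(saliency[mask].sum())
            for name, mask in region_masks.items()}
```

  • sorting `region_scores` from largest to smallest reproduces the per-region importance ranking used for the heat map of Figure 8d.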
  • the regional occlusion rule is to occlude each region with the mean value of all pixels in the data set, and evaluate the importance of each region with the prediction difference before and after the occlusion.
  • the average pixel value is used to cover each of the four regions in turn, the network then predicts the age of the covered picture, and the age difference after versus before occlusion is obtained. If the age difference is negative, the occluded area increases the overall aging of the face; if positive, the occluded area reduces it. The larger the absolute difference, the stronger the effect.
  • in step 1001, the first facial image (unoccluded) is input to the trained ResNet18 model to output the corresponding first apparent age; in step 1002, the four local areas of the first facial image (mouth, forehead, eyes, and cheeks) are each occluded and filled with the image's average pixel value to obtain the corresponding second facial images; in step 1003, these four occluded images are input into the trained ResNet18 model to obtain the corresponding second apparent ages; in step 1004, the difference between the apparent age predictions after and before occlusion is used as the aging degree of each area, and the areas are ranked accordingly (as shown in Figure 8c, where the value corresponding to the color is the predicted age difference after versus before occlusion; the stronger the blue, the more aging, and the stronger the red, the younger).
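  • the region masking flow of steps 1001-1004 can be sketched as follows; `predict` again stands in for the trained ResNet18 age model, and the function name is an assumption.

```python
import numpy as np

def occlusion_scores(predict, img, region_masks):
    """Cover each region with the image's mean pixel value, re-predict,
    and report (occluded age - original age) per region. A negative
    difference means the region was making the face look older.
    """
    base_age = predict(img)
    mean_val = img.mean()
    diffs = {}
    for name, mask in region_masks.items():
        covered = img.copy()
        covered[mask] = mean_val               # occlude one region
        diffs[name] = predict(covered) - base_age
    # rank regions from most aging (most negative diff) upward
    ranking = sorted(diffs, key=diffs.get)
    return diffs, ranking
```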
  • Eyelink 1000 eye tracker was used to observe and record the ratio of the staying time of the evaluator's gaze point in each facial area ( Figure 11) to the total gaze time, to objectively quantify the degree of influence of local facial areas on the overall facial aging evaluation.
  • this project designed three consistency test indicators based on the numerical characteristics of the different methods. These indicators, combined with the results of manual evaluation and eye movement experiments, verify the reliability of the deep learning visualization in this application.
  • the indicators are introduced as follows:
  • Top1 matching rate: the ratio of samples in which two methods agree on the most important area (Figure 11C) to the total number of samples. This ratio measures how often the two methods fully agree on the single local area with the most significant impact on overall facial aging.
  • Top2 matching rate: the ratio of samples in which either the most important areas match (Figure 11C) or the most important area of one method matches the second most important area of the other (Figure 11D), to the total number of samples. The higher the ratio, the stronger the consistency between the two methods.
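  • the matching rates can be computed as follows; the exact counting rule for Top2 is one interpretation, since the definition admits more than one reading, and the function name is an assumption.

```python
def match_rates(rank_a, rank_b):
    """Top1 / Top2 matching rates between two per-sample region rankings
    (e.g. deep learning visualization vs eye tracking). Each ranking is
    a list of region names ordered from most to least important.

    Top1: both methods name the same most important region.
    Top2: additionally counts samples whose top two regions match as an
    unordered pair (the swapped case of Figure 11D).
    """
    n = len(rank_a)
    top1 = sum(a[0] == b[0] for a, b in zip(rank_a, rank_b)) / n
    top2 = sum(a[0] == b[0] or set(a[:2]) == set(b[:2])
               for a, b in zip(rank_a, rank_b)) / n
    return top1, top2
```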
  • deep learning visualization uses two methods, pixel derivation and area masking, and the visualization effect is evaluated according to three criteria: the perfect agreement rate, the Top1 matching rate, and the Top2 matching rate.
  • the results show that, compared with the area masking method, the use of pixel derivation obtains better results.
  • the deep learning pixel derivation visualization result has the highest agreement rate with manual evaluation (0.18 vs 0.13), which is about 38% higher than the area masking method.
  • with manual evaluation, the Top1 matching rate is 0.52 and the Top2 matching rate reaches 0.89; with the eye movement experiment, the Top1 matching rate is 0.61 and the Top2 matching rate is 0.85.
  • the second embodiment of the present application relates to a device for determining the local area that affects the degree of facial aging. Its structure is shown in FIG. 12.
  • the device for determining the local area that affects the degree of facial aging includes:
  • An image acquisition module for acquiring a first facial image of an object
  • An image processing module configured to perform image processing on the first facial image, change a predetermined number of pixels and/or a predetermined area in the first facial image, to obtain a second facial image;
  • An age prediction module configured to input the first facial image into the apparent age prediction model of the human face to obtain the first apparent age
  • an influence degree determination module configured to determine, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age.
  • The first embodiment is the method embodiment corresponding to this embodiment.
  • The technical details of the first embodiment can be applied to this embodiment, and the technical details of this embodiment can also be applied to the first embodiment.
  • Each module shown in the above implementation of the device for determining the local area that affects the degree of facial aging can be realized by a program (executable instructions) running on a processor, or by specific logic circuits. If the device for determining the local area that affects the degree of facial aging in the embodiments of the present application is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • The technical solutions of the embodiments of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods in the embodiments of the present application.
  • The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), magnetic disks, optical disks, and other media that can store program code. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
  • The embodiments of the present application also provide a computer-readable storage medium in which computer-executable instructions are stored; when the computer-executable instructions are executed by a processor, the method embodiments of the present application are implemented.
  • Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and information storage can be implemented by any method or technology.
  • Information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by computing devices. As defined herein, computer-readable storage media do not include transitory media such as modulated data signals and carrier waves.
  • The embodiments of the present application also provide a device for determining a local area that affects the degree of facial aging, which includes a memory for storing computer-executable instructions and a processor; the processor is used to execute the computer-executable instructions in the memory to implement the steps in the foregoing method embodiments.
  • The processor can be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc.
  • The aforementioned memory may be read-only memory (ROM), random access memory (RAM), flash memory (Flash), a hard disk, a solid-state disk, etc.
  • The steps of the methods disclosed in the embodiments of the present invention may be directly executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • When an act is said to be performed based on a certain element, it means that the act is performed at least based on that element; this includes both performing the act based only on that element and performing it based on that element together with other elements. Expressions such as "multiple" and "a plurality of" mean two or more.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to facial aging prediction technology. Disclosed are a method and apparatus for determining a local area that affects the degree of facial aging, by which the degree of impact of a local facial area on facial aging can be accurately evaluated. The method comprises: acquiring a first facial image of an object; inputting the first facial image into an apparent age prediction model for human faces to obtain a first apparent age; performing image processing on the first facial image, changing a predetermined number of pixels and/or predetermined areas in the first facial image, to obtain a second facial image; and determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of impact of the changed pixels and/or areas in the second facial image on the first apparent age.

Description

Method and device for determining the local area that affects the degree of facial aging

Technical field
This application relates to facial aging prediction technology, and in particular to techniques for determining the local areas that affect the degree of facial aging.
Background art
Facial aging is a complex biological process in which facial morphology and structure change over time. As quality of life improves, people pay increasing attention to facial aging. Facial aging is not only a criterion for judging health status in biomedicine, but also a matter of broad social concern. Therefore, accurately identifying the facial areas that drive aging, so as to help individuals delay or improve facial aging in a more targeted way, has great research and application value.
Summary of the invention
The purpose of this application is to provide a method and device for determining the local area that affects the degree of facial aging, which can accurately assess the degree of influence of local facial areas on facial aging.
This application discloses a method for determining the local area that affects the degree of facial aging, including:
acquiring a first facial image of an object;
inputting the first facial image into an apparent age prediction model for human faces to obtain a first apparent age;
performing image processing on the first facial image, changing a predetermined number of pixels and/or a predetermined area in the first facial image, to obtain a second facial image; and
determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age.
In a preferred example, the image processing adopts a method selected from the following group:
the pixel derivation method, the area masking method, or a combination thereof.
In a preferred example, the predetermined number of pixels are all the pixels of the first facial image.
In a preferred example, the predetermined area is one or more selected from the following group:
the eye area, the cheek area, the mouth area, and the forehead area.
In a preferred example, the image processing adopts the pixel derivation method;
performing image processing on the first facial image and changing a predetermined number of pixels and/or a predetermined area in the first facial image to obtain a second facial image further includes:
performing image processing on the first facial image using the pixel derivation method, and adding Gaussian noise to the predetermined number of pixels to obtain the second facial image; and
determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age further includes:
using the apparent age prediction model to differentiate with respect to the second facial image to obtain a derivative value for each pixel of the second facial image, and calculating, based on the derivative value of each pixel, the degree of influence of the changed pixels on the first apparent age.
In a preferred example, after obtaining the derivative value for each pixel of the second facial image and calculating the degree of influence of the changed pixels on the first apparent age, the method further includes:
dividing the first facial image into multiple local areas, and summing the derivative values of all pixels in each local area as that local area's influence weight coefficient on the overall degree of facial aging; and
based on the influence weight coefficient of each local area, annotating each local area on the first facial image to obtain a third facial image.
In a preferred example, the image processing adopts the area masking method;
performing image processing on the first facial image and changing a predetermined number of pixels and/or a predetermined area in the first facial image to obtain a second facial image further includes:
performing image processing on the first facial image using the area masking method, and covering the predetermined area with the mean pixel value of the first facial image to obtain the second facial image; and
determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age further includes:
inputting the second facial image into the apparent age prediction model to obtain a second apparent age; and
comparing the second apparent age with the first apparent age, and calculating, based on the comparison result, the degree of influence of the predetermined area on the apparent age of the human face.
In a preferred example, covering the predetermined area with the mean pixel value of the first facial image to obtain the second facial image further includes:
dividing the first facial image into multiple local areas, performing image processing on the first facial image using the area masking method, and covering each local area in turn with the mean pixel value of the first facial image to obtain a corresponding second facial image for each covered local area.
In a preferred example, after obtaining the corresponding second facial image for each covered local area, the method further includes:
separately computing, for each local area, the difference between the corresponding second apparent age and the first apparent age as that local area's influence weight coefficient on the overall degree of facial aging; and
based on the influence weight coefficient of each local area, annotating each local area on the first facial image to obtain a third facial image.
In a preferred example, dividing the first facial image into multiple local areas and covering each in turn further includes:
dividing the first facial image into four local areas, namely the eye area, the cheek area, the mouth area, and the forehead area, performing image processing on the first facial image using the area masking method, and covering each local area in turn with the mean pixel value of the first facial image to obtain a corresponding second facial image for each covered local area.
In a preferred example, the apparent age prediction model is obtained by a method including the following steps:
using a perception experiment to quantify the age distribution, mean age, or median age of facial sample images as deep learning training labels, to obtain a training sample set; and
training a convolutional neural network model with the training sample set to obtain the apparent age prediction model.
In a preferred example, the convolutional neural network model is a ResNet18 model.
This application also discloses a device for determining the local area that affects the degree of facial aging, including:
an image acquisition module, configured to acquire a first facial image of an object;
an image processing module, configured to perform image processing on the first facial image and change a predetermined number of pixels and/or a predetermined area in the first facial image to obtain a second facial image;
an age prediction module, configured to input the first facial image into an apparent age prediction model for human faces to obtain a first apparent age; and
an influence degree determination module, configured to determine, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age.
This application also discloses a device for determining the local area that affects the degree of facial aging, including:
a memory for storing computer-executable instructions; and
a processor for implementing the steps in the method described above when executing the computer-executable instructions.
This application also discloses a computer-readable storage medium in which computer-executable instructions are stored; when the computer-executable instructions are executed by a processor, the steps in the method described above are implemented.
The embodiments of this application include at least the following advantages and beneficial effects:
A facial perception experiment is used to quantify the facial aging phenotype as a whole and, combined with deep learning and visualization methods, to locate the main regions affecting overall facial aging, so that the degree of influence of local facial areas on facial aging can be assessed objectively and effectively.
Further, visualization methods such as the pixel derivation method and/or the area masking method can locate the facial areas that affect facial aging more accurately and objectively, providing a more scientific basis for decision-making in fields such as medicine and cosmetology.
Further, a deep learning model such as ResNet18 is used, with the age distribution or similar quantities as training labels, to accurately simulate human perceived-age experiments and quantify the overall facial aging phenotype, so that the influence of local facial areas on facial aging can be assessed more quickly and effectively.
A large number of technical features are described in the specification of this application, distributed among the various technical solutions; listing all possible combinations of these technical features (i.e., all technical solutions) would make the specification excessively long. To avoid this problem, the technical features disclosed in the summary above, in the embodiments and examples below, and in the drawings may be freely combined with each other to form various new technical solutions (all of which are deemed to be described in this specification), unless such a combination is technically infeasible. For example, suppose one example discloses features A+B+C and another discloses features A+B+D+E, where C and D are equivalent technical means playing the same role, so only one of them can be used at a time, while feature E can technically be combined with feature C. Then the solution A+B+C+D should not be regarded as described because it is technically infeasible, whereas the solution A+B+C+E should be regarded as described.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a method for determining the local area that affects the degree of facial aging according to the first embodiment of the present application.
Fig. 2 shows the facial image data preprocessing pipeline in an embodiment of the present application. Panel A shows the identification of 106 facial key points in the facial area; panel B shows the position of the facial central axis computed with a regression model; panel C shows the facial image rotated by the tilt angle to align the face vertically; panels D and E show two images cropped according to the chin point, the left and right cheek points, and the upper forehead point.
Fig. 3 shows the facial area division in an embodiment of the present application.
Fig. 4 shows how the sampling variance over 1000 samples changes with the number of evaluators in an embodiment of the present application.
Fig. 5 shows the deep learning, visualization, and verification pipeline in an embodiment of the present application.
Fig. 6 shows the ResNet18 network structure in an embodiment of the present application. Panel a shows the basic residual network module, which establishes a shortcut connection from input to output; panel b shows the network structure of ResNet18, where the dotted lines indicate that the number of features doubles.
Fig. 7 compares the training results of three different deep learning models and three different training labels in an embodiment of the present application.
Fig. 8 shows the division and visualization of the facial area in an embodiment of the present application. Panel a shows the division of the face into four regions; panel b shows a second facial image obtained with the pixel derivation method; panel d shows the pixel derivation results of panel b aggregated into an aging-degree heat map over the four regions; panel c shows the aging-degree heat map over the four regions obtained with the area masking method.
Fig. 9 shows a schematic flowchart of the pixel derivation method in an embodiment of the present application.
Fig. 10 shows a schematic flowchart of the area masking method in an embodiment of the present application.
Fig. 11 shows the consistency check in an embodiment of the present application. Panel A is an example of a deep learning ranking result, where the numbers represent the importance order of the different regions; panels B-D show the ranking results of the eye-movement experiment (or manual evaluation), with the bold dashed boxes marking the regions being compared: panel B shows the order of all four regions fully consistent, panel C shows only the most important region consistent, and panel D shows the most important region consistent with the second most important region.
Fig. 12 shows a schematic diagram of the structure of the device for determining the local area that affects the degree of facial aging according to the second embodiment of the present application.
Detailed description
In the following description, many technical details are presented to help the reader better understand this application. However, those of ordinary skill in the art will understand that the technical solutions claimed in this application can be realized even without these technical details and with various changes and modifications based on the following embodiments.
Explanation of terms:
Visualization: presenting the basis for a neural network's decisions in the form of images or pictures.
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described in further detail below with reference to the drawings.
The first embodiment of the present application relates to a method for determining the local area that affects the degree of facial aging. Its flow is shown in Figure 1, and the method includes the following steps:
In step 101, a first facial image of an object is acquired.
In step 102, the first facial image is input into an apparent age prediction model for human faces to obtain a first apparent age.
In step 103, image processing is performed on the first facial image, and a predetermined number of pixels and/or a predetermined area in the first facial image are changed to obtain a second facial image.
In step 104, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age is determined.
Optionally, the first facial image in step 101 may be a whole facial image or a partial facial image.
Optionally, before step 102, the apparent age prediction model is obtained in advance through the following steps (1) and (2). In step (1), a perception experiment is used to quantify the age distribution, mean age, or median age of facial sample images as deep learning training labels, obtaining a training sample set. Step (2) is then executed: a convolutional neural network model is trained with the training sample set to obtain the apparent age prediction model.
Preferably, the age distribution is used as the training label in step (1).
Optionally, the convolutional neural network model may be, but is not limited to, a VGG16, ResNet18, or ResNet50 model. Preferably, the convolutional neural network model is a ResNet18 model.
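For illustration, here is a minimal sketch (our assumption, not the patent's exact procedure; the function name and age range are hypothetical) of how the three candidate training labels could be derived from raters' perceived-age guesses for one face image in the perception experiment:

```python
from collections import Counter
from statistics import mean, median

def make_labels(guesses, min_age=15, max_age=80):
    """Turn one face's rater guesses into the three candidate labels:
    a normalized age distribution over [min_age, max_age], the mean age,
    and the median age."""
    counts = Counter(guesses)
    total = len(guesses)
    # Normalized histogram over the supported age range (distribution label).
    dist = [counts.get(a, 0) / total for a in range(min_age, max_age + 1)]
    return dist, mean(guesses), median(guesses)
```

According to the document, the distribution label is the preferred choice; a network trained against it would output a probability vector over ages rather than a single scalar.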
Optionally, in step 103, the image processing may adopt the pixel derivation method, the area masking method, or a combination thereof.
Optionally, the predetermined number of pixels are all the pixels of the first facial image.
Optionally, the predetermined area is a sub-area of m1 pixels × m2 pixels, where m1 and m2 are each independently a positive integer from 1 to 1000, preferably 2-500, more preferably 3-250, and most preferably 5-100. Optionally, the predetermined area is 0.01%-25% of the entire facial area, preferably 0.1%-10%, more preferably 1%-5%.
Preferably, the predetermined area is one or more selected from the following group: the eye area, the cheek area, the mouth area, and the forehead area.
In one embodiment, step 103 can be further implemented as the following step: image processing is performed on the first facial image using the pixel derivation method, and Gaussian noise is added to the predetermined number of pixels to obtain the second facial image; for example, random Gaussian noise can be added to all pixels of the first facial image, but the method is not limited to this. Further, step 104 can be implemented as the following step: the apparent age prediction model is used to differentiate with respect to the second facial image, obtaining a derivative value for each pixel of the second facial image, and the degree of influence of the changed pixels on the first apparent age is calculated based on the derivative value of each pixel.
Optionally, after the derivative value for each pixel has been obtained and the degree of influence of the changed pixels on the first apparent age has been calculated, the method further includes the following steps (1) and (2). In step (1), the first facial image is divided into multiple local areas, and the derivative values of all pixels in each local area are summed as that local area's influence weight coefficient on the overall degree of facial aging. In step (2), based on the influence weight coefficient of each local area, each local area is annotated on the first facial image to obtain a third facial image.
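A minimal numeric sketch of this pixel derivation procedure (hypothetical code; a real implementation would differentiate the trained network automatically rather than by the finite differences used here, and the image would be a 2-D pixel array rather than a flat list):

```python
import random

def region_weights(image, predict_age, regions, sigma=0.01, eps=1e-4):
    """Per-region influence weights via the pixel derivation idea."""
    # Second facial image: the first image with Gaussian noise on every pixel.
    noisy = [p + random.gauss(0.0, sigma) for p in image]
    # Derivative of the predicted apparent age with respect to each pixel,
    # approximated here by finite differences.
    grads = []
    for i in range(len(noisy)):
        bumped = list(noisy)
        bumped[i] += eps
        grads.append((predict_age(bumped) - predict_age(noisy)) / eps)
    # Sum of the derivative values inside each local area gives that area's
    # influence weight coefficient on the overall degree of facial aging.
    return {name: sum(grads[i] for i in idx) for name, idx in regions.items()}
```

With a toy linear "model" the per-region weights recover the model's coefficients, which illustrates why regions with larger summed derivatives dominate the apparent-age prediction.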
In another embodiment, step 103 can be further implemented as the following step: image processing is performed on the first facial image using the area masking method, and the predetermined area is covered with the mean pixel value of the first facial image to obtain the second facial image. Further, step 104 can be implemented as the following steps: the second facial image is input into the apparent age prediction model for human faces to obtain a second apparent age; the second apparent age is then compared with the first apparent age, and the degree of influence of the predetermined area on the apparent age of the human face is calculated based on the comparison result.
Optionally, the masking step is further implemented as follows: the first facial image is divided into multiple local areas, image processing is performed on the first facial image using the area masking method, and each local area is covered in turn with the mean pixel value of the first facial image to obtain a corresponding second facial image for each covered local area. For example, the first facial image can be divided into four local areas, namely the eye area, the cheek area, the mouth area, and the forehead area, and each of these areas is covered in turn with the mean pixel value of the first facial image to obtain the corresponding second facial images.
Optionally, after obtaining the corresponding second facial image for each covered local area, the method further includes the following steps (1) and (2). In step (1), for each local area, the difference between the corresponding second apparent age and the first apparent age is computed as that local area's influence weight coefficient on the overall degree of facial aging. In step (2), based on the influence weight coefficient of each local area, each local area is annotated on the first facial image to obtain a third facial image.
The present invention is further described below with reference to specific embodiments. It should be understood that these embodiments are intended only to illustrate the present invention, not to limit its scope.
In this embodiment, a deep learning network such as ResNet 18 is used to build an evaluation system for facial aging, and deep learning visualization methods such as pixel derivation or facial masking are used to locate the main areas of facial aging. The specific scheme is as follows:
1. Data collection
To conduct the facial perception experiment, information such as the subject's hairstyle and clothing had to be removed from the images. An automated image preprocessing pipeline was designed according to the experimental requirements, as shown in Figure 2.
First, the Face++ software (https://www.faceplusplus.com/) was used to identify 106 facial landmarks in the facial region (Figure 2A). Then, based on the position coordinates of the eyes, nose, and mouth among these landmarks, a regression model was used to compute the position of the facial midline (Figure 2B, solid red line), along with the tilt angle between the midline and the vertical direction (Figure 2B, dashed red line). The facial image was then rotated by this tilt angle so that the face was aligned vertically (Figure 2C). Finally, the image was cropped according to the chin point, the left and right cheek points, and the upper forehead point; the final cropping results are shown in Figures 2D and 2E. The cropped images were used for the subsequent experiments and analyses.
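The rotation step above can be sketched as follows; estimating the tilt from the two eye centers alone (rather than a regression over all 106 landmarks) is a simplifying assumption for illustration, and the landmark coordinates are invented:

```python
import math

def tilt_angle_deg(left_eye, right_eye):
    """Angle (in degrees) by which the image must be rotated so that the
    inter-ocular line becomes horizontal and the facial midline vertical."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

level = tilt_angle_deg((100, 200), (180, 200))   # eyes level: 0 degrees
tilted = tilt_angle_deg((100, 200), (180, 220))  # right eye 20 px lower: ~14 degrees
```

The image would then be rotated by this angle about the face center with any image library before the cropping step.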
2. Manual evaluation of perceived age
Twenty-two evaluators (10 male and 12 female) were recruited to assess the perceived age of the samples. To minimize experimental error, all evaluators used the same display device for the assessment. Before the experiment, the evaluators knew neither the ages of the samples nor the age distribution of the data set. During the evaluation, each evaluator observed all 5,768 sample images, then predicted and recorded the age of each sample. In addition, 1,014 sample photographs (500 men, 514 women) were selected. For these 1,014 photographs, each evaluator also indicated, with reference to Figure 3, the facial areas consulted when judging a sample's age by checking them in an evaluation form; a sample form is shown in Table 1 below, in which multiple areas may be selected.
Table 1

Picture ID          Perceived age   Eyes   Mouth   Forehead   Cheeks   All regions
15HanTZ0005TB1_F
15HanTZ0010TB1_F
15HanTZ0014TB1_F
15HanTZ0022TB1_F
15HanTZ0023TB1_F

(The form is blank; evaluators fill in the perceived age and check the areas consulted.)
3. Analysis of evaluation quality
Perception experiments were used to quantify overall facial aging and to obtain high-quality training labels for deep learning.
To verify the reliability of the deep learning training labels (perceived age), a simulation analysis of the relationship between the number of evaluators and evaluation quality was first carried out. Simulation parameters were generated from the real data of the 22 evaluators to construct evaluation data containing both systematic and random errors, and the relationship between two statistics (the mean and the standard error) and the number of evaluators was studied to reflect how evaluation quality depends on the number of evaluators.
First, the arithmetic mean of the 22 evaluators' ratings of each sample was taken as the sample's true value; the difference between each evaluator's rating of each sample and the sample's true value was computed, and the arithmetic mean of these differences was taken as that evaluator's systematic error. The data obtained by subtracting the systematic error from each evaluator's rating of each sample was then taken as the evaluator's random error; the standard deviation σ_ij of each evaluator's random error was computed, assuming the random error follows N(0, σ_ij). Based on the resulting simulation parameters (sample true values, systematic errors, random errors, and so on), 10,000 simulated perceived-age data points for 1,000 evaluators were generated. Samples were then drawn from the simulated data for different numbers of evaluators (n_i = 1 to 100), and for the i-th sampling 1,000 simulated perceived-age values were repeatedly drawn in order to compute the variance of the perceived age in each case. The calculation follows formula (1) below, in which X_i denotes the simulated perceived-age value of a subject in the i-th sampling and n denotes the number of evaluators selected.
S² = (1/(n − 1)) · Σ_{i=1}^{n} (X_i − X̄)²    (1)
Figure 4 shows how the variance of the 1,000 random samplings changes with the number of evaluators. As the number of evaluators increases, the decrease in variance gradually levels off. Differentiating this curve locates the inflection point (n = 12), which gives the optimal choice for the number of evaluators: on the premise of ensuring data quality, the minimum number of evaluators is 12, and more evaluators yield better data quality. Our perception experiment used 22 evaluators, far more than this optimum, which fundamentally guarantees the quality of the perception data.
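The evaluator-count simulation can be sketched as below. The error magnitudes (systematic error SD of 2.0 years, per-rater random-error SDs of 1.0 to 4.0 years) are invented placeholders rather than the parameters actually fitted from the 22 evaluators; the sketch only reproduces the qualitative diminishing-returns curve of Figure 4:

```python
import random
import statistics

random.seed(0)
TRUE_AGE = 40.0
N_RATERS = 1000
sys_err = [random.gauss(0.0, 2.0) for _ in range(N_RATERS)]    # per-rater systematic error
rand_sd = [random.uniform(1.0, 4.0) for _ in range(N_RATERS)]  # per-rater random-error SD

def mean_rating_variance(n, reps=1000):
    """Variance, over `reps` resamples, of the mean perceived age
    produced by n randomly chosen simulated raters."""
    means = []
    for _ in range(reps):
        raters = [random.randrange(N_RATERS) for _ in range(n)]
        ratings = [TRUE_AGE + sys_err[i] + random.gauss(0.0, rand_sd[i])
                   for i in raters]
        means.append(sum(ratings) / n)
    return statistics.pvariance(means)

# Variance falls steeply at first, then flattens: diminishing returns
# from adding evaluators.
curve = {n: mean_rating_variance(n) for n in (1, 5, 12, 50)}
```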
4. Obtaining a local aging model by deep learning
As shown in Figure 5, deep learning used the 5,768 sample photographs as training data, with the perceived ages assessed in the perception experiment quantifying facial aging and serving as the training labels; deep learning visualization was then used to locate the main areas that affect facial aging.
(1) Training data set augmentation
Because the training data set contained only 5,768 samples, the limited training data would not only hamper the learning of the network parameters but also invite overfitting. Therefore, before the actual training process, the training data set was augmented in two ways, mirroring and cropping, following conventional deep learning data augmentation methods. Mirror augmentation reflects an image about the Y axis, so one original image yields two mirrored versions; crop augmentation extracts five crops, at the top left, bottom left, top right, bottom right, and center, yielding five different regional images. The training data set can thus be expanded tenfold.
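A minimal sketch of the tenfold augmentation described above; the crop size of 80% of the original image is an assumed value, as the patent does not state the crop ratio:

```python
import numpy as np

def augment(img, crop=0.8):
    """Mirror + five-crop augmentation: one image yields 10 training images."""
    h, w = img.shape[:2]
    ch, cw = int(h * crop), int(w * crop)
    corners = [(0, 0), (0, w - cw), (h - ch, 0), (h - ch, w - cw),
               ((h - ch) // 2, (w - cw) // 2)]  # TL, TR, BL, BR, center
    out = []
    for view in (img, img[:, ::-1]):            # original and its Y-axis mirror
        for top, left in corners:
            out.append(view[top:top + ch, left:left + cw])
    return out

crops = augment(np.zeros((100, 100, 3), dtype=np.uint8))  # 10 crops of 80x80
```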
(2) Deep learning model selection
The convolutional neural network models VGG 16, ResNet 18, and ResNet 50 were selected for age prediction. ResNet 18 is one of the five main models in the ResNet family; it mainly comprises three parts: the input, the output, and the intermediate convolution stages. A major difficulty in training deep neural networks is that, as the number of layers increases, the network parameters become hard to optimize and network degradation appears; that is, the trained model may fit the data even worse than a model with fewer layers. Compared with other commonly used deep learning models, ResNet introduces residual modules that establish shortcut connections from shallow features to deep features, and the network models the residual between deep and shallow features rather than the deep features themselves. Gradients therefore propagate back more effectively when the network parameters are optimized by backpropagation, which largely resolves the degradation problem of deep networks. In addition, to verify the robustness of ResNet 18, VGG 16 and ResNet 50 were used to measure the training effect on the same samples. All three networks were initialized from models pretrained on ImageNet, with a batch size of 64 and a learning rate of 0.001, and trained for 100 epochs.
Figure 6(a) shows the basic module of the residual network, which establishes a shortcut connection from input to output; "weight layer" denotes a weight layer, and ReLU is an activation function. Figure 6(b) shows the network structure of ResNet 18, in which dashed lines indicate that the number of feature channels doubles. A solid-line connection indicates identical channel counts; for example, "3×3 conv, 64" means convolution with 64 kernels of size 3×3. A dashed-line connection indicates different channel counts with doubled features; for example, "3×3 conv, 128, /2" means convolution with 128 kernels of size 3×3, where "/2" indicates that the feature channels are doubled relative to the preceding 64-channel layer.
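The residual module of Figure 6(a) can be sketched with plain matrices; with zero weights the learned residual F(x) vanishes and the block passes x through the shortcut unchanged, which illustrates why gradients flow easily through such networks (dense layers stand in for the convolutional weight layers here):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Basic residual module: two weight layers learn the residual F(x),
    and the shortcut adds the input back, giving relu(F(x) + x)."""
    f = relu(x @ w1) @ w2          # weight layer -> ReLU -> weight layer
    return relu(f + x)             # shortcut connection, then final ReLU

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))
# Zero weights => F(x) = 0, so the block reduces to the identity
# (up to the final ReLU): the shortcut path is always available.
out = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
```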
During model training, the distribution of the 22 age evaluators' ratings, the rating median, and the rating mean were each used as labels to train the deep learning model. After training, the error variance between predicted and true results was used as the criterion for selecting the model.
In addition, 10-fold cross-validation was used to obtain predictions for the training data set: the sample data was divided into 10 parts, and in each round 9 parts were used to train a model that predicts the ages of the remaining part; repeating this cycle 10 times yields deep-learning predictions of perceived age for all samples.
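The 10-fold split can be sketched as follows, illustrating that each of the 5,768 samples is held out and predicted exactly once across the 10 rounds:

```python
def kfold_indices(n_samples, k=10):
    """Split sample indices into k folds; each fold is held out once and
    predicted by a model trained on the other k-1 folds."""
    folds = [list(range(f, n_samples, k)) for f in range(k)]
    splits = []
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        splits.append((train, test))
    return splits

splits = kfold_indices(5768, k=10)
# Across the 10 rounds, every sample index appears in exactly one test fold.
covered = sorted(i for _, test in splits for i in test)
```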
(3) Comparison of deep learning model training effects
Using deep learning to simulate human assessment of facial apparent age achieved a correlation above 96%. The model evaluation used three training labels (the 22 evaluators' age distribution, age mean, and age median) and three deep learning models (VGG 16, ResNet 18, and ResNet 50).
The evaluation results are shown in Table 2 below.
Table 2

[The contents of Table 2 are reproduced as an image in the original document (PCTCN2021095753-appb-000002).]
Note: correlation coefficients were computed as Pearson correlation coefficients; all P values in the table are below 0.001.
The results show that the ResNet 18 model trained with the age distribution as the label gave the best results (mean difference 2.27 years, correlation coefficient 0.96; see Figure 7). The ResNet 18 model trained with the age distribution as the label is therefore preferably used in the subsequent steps.
5. Deep learning visualization to evaluate local facial aging
The pixel derivation method and the area masking method were each used to realize visualization, and the face was divided into four regions (forehead, eyes, mouth, and cheeks) according to facial anatomy and the 106 facial landmarks automatically calibrated by Face++ (Figure 8a).
(1) Pixel derivation method
The core idea of the pixel derivation method (SmoothGrad) is to differentiate the predicted perceived age with respect to the pixel values and to use the magnitude of the derivative to measure how important each pixel of the image is to the perceived age. Because a neural network is a highly nonlinear mapping, a small number of pixels usually have very large derivatives, which makes visualization difficult; random noise is therefore added to the image, and the results of multiple differentiations are averaged to obtain a smoother visualization. To determine the features on which the deep learning model bases its predictions, a feature importance mask of the same size as the original image can be built, whose brightness values correspond to the importance of each pixel; the whole image is called a sensitivity mask. The pixel derivation method adds noise to each pixel, then computes the derivative of the perceived age with respect to the noised pixels and uses this derivative to evaluate the importance of each pixel, according to the following formula (2), where n is the number of calculations, N(0, σ²) denotes Gaussian noise with standard deviation σ, x is the original pixel value, and M_c is the sensitivity:

M̂_c(x) = (1/n) · Σ_{i=1}^{n} M_c(x + N(0, σ²))    (2)
After the importance of each pixel is obtained, the mean of the derivatives over each facial region is computed during training, and the regions are ranked by these means to quantify the degree to which each local facial region affects facial aging. When applying the pixel derivation method, the number of calculations n was set to 10 and the standard deviation σ of the Gaussian noise was set to 0.3.
The flow is shown in Figure 9. In step 901, the first facial image is input to the trained ResNet 18 model, which outputs the corresponding first apparent age. In step 902, the first facial image is copied n times and random Gaussian noise is added to each copy, giving n corresponding second facial images. In step 903, for each second facial image, the derivative of the trained ResNet 18 model's output with respect to each pixel of the image is computed. In step 904, the derivative results of the n second facial images are averaged and visualized (as in Figure 8b, the brighter the color, the more important the point). In step 905, the sums of the derivatives over each local facial region (mouth, forehead, eyes, cheeks) are computed and the regions are ranked by these sums; the higher the sum, the more important the region and the greater its degree of aging, and each local region is annotated with a color shade indicating its degree of aging (as in Figure 8d, the redder the color, the more aged the region).
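Steps 902 to 905 can be sketched as follows; a fixed linear map stands in for the trained ResNet 18 (so its gradient with respect to the image is known in closed form), whereas in practice the per-pixel derivatives would come from backpropagation through the network:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for the trained age model: a linear map whose gradient
# with respect to the image is simply `w` (hypothetical, for illustration).
w = rng.normal(size=(8, 8))

def grad_age(img):
    return w  # d(predicted age)/d(pixels) of the linear stand-in

def smoothgrad(img, n=10, sigma=0.3):
    """Formula (2): average the model gradient over n noisy copies."""
    masks = [grad_age(img + rng.normal(0.0, sigma, img.shape))
             for _ in range(n)]
    return np.mean(masks, axis=0)

img = rng.random((8, 8))
mask = smoothgrad(img)                      # sensitivity mask (step 904)
# Step 905: sum the mask over each region and rank regions by the sum.
region_score = {"top": mask[:4].sum(), "bottom": mask[4:].sum()}
ranking = sorted(region_score, key=region_score.get, reverse=True)
```

With the linear stand-in the averaged mask equals `w` exactly; for a real CNN the averaging is what smooths out the noisy per-pixel gradients.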
(2) Area masking method
In the area masking method, each region is occluded with the mean of all pixels, and the importance of each region is evaluated by the difference between the predictions before and after occlusion. First, a perceived-age estimation model is trained on unoccluded images, and the mean of all pixels of the image is computed. For each facial partition of a sample (Figure 8a), the four regions are covered in turn with the pixel mean; each occluded image is then passed through the network to predict an age, giving the age difference between the occluded and unoccluded predictions. If the age difference is negative, the occluded region increases the overall degree of facial aging; if it is positive, the region decreases it; and the larger the difference, the stronger the effect.
The flow is shown in Figure 10. In step 1001, the first facial image (unoccluded) is input into the trained ResNet 18 model, which outputs the corresponding first apparent age. In step 1002, each of the four local regions of the first facial image (mouth, forehead, eyes, cheeks) is occluded in turn and filled with the image mean, giving the corresponding second facial images. In step 1003, the four occluded images are input into the trained ResNet 18 model to obtain the corresponding second apparent ages. In step 1004, the difference between the apparent age predictions before and after occlusion is taken as the degree of aging of each region, and the regions are ranked accordingly (as in Figure 8c, the value encoded by the color is the predicted age difference between the occluded and unoccluded images; stronger blue indicates a more aged region, stronger red a more youthful one).
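Steps 1001 to 1004 can be sketched as follows, again with a toy linear model standing in for the trained ResNet 18 and two invented rectangular regions in place of the landmark-based facial partitions:

```python
import numpy as np

rng = np.random.default_rng(7)
w = rng.normal(size=(8, 8))

def predict_age(img):
    """Toy stand-in for the trained age predictor (hypothetical)."""
    return float((w * img).sum())

img = rng.random((8, 8))                          # "first facial image"
regions = {"top": (slice(0, 4), slice(None)),     # invented partitions
           "bottom": (slice(4, 8), slice(None))}

first_age = predict_age(img)                      # step 1001
mean_val = img.mean()                             # fill value for the mask
age_diff = {}
for name, sl in regions.items():                  # steps 1002-1003
    occluded = img.copy()
    occluded[sl] = mean_val                       # cover region with pixel mean
    age_diff[name] = predict_age(occluded) - first_age
# Step 1004: a more negative difference marks a region whose appearance
# raises the predicted age more strongly.
ranking = sorted(age_diff, key=age_diff.get)
```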
6. Result verification
To verify the validity of the deep learning visualization results, perception experiment data, eye-tracking experiment data, and local facial aging score data were collected.
(1) Perception experiment data (manual evaluation)
In the perception experiment, for the 1,014 selected photographs, the 22 evaluators observed each picture, predicted the age, and recorded the facial regions on which their judgment was based. On this basis, the number of times each region was selected was counted: for each sample, a region's score is the number of evaluators who checked it, and this value quantifies the degree to which the facial region influences the assessment of facial aging.
(2) Eye-tracking data (machine measurement)
Another 20 evaluators were selected to conduct eye-tracking experiments on two-dimensional facial images of 18 subjects. An EyeLink 1000 eye tracker was used to observe and record the ratio of the dwell time of each evaluator's gaze in each facial partition (Figure 11) to the total gaze time, providing a relatively objective quantification of how local facial regions influence the overall assessment of facial aging.
(3) Consistency test method
To evaluate the consistency among the different methods (manual evaluation, eye-tracking experiment, and deep learning visualization), three consistency test indices were designed based on the numerical characteristics of the different methods. These indices, combined with the results of the manual evaluation and the eye-tracking experiments, were used to verify the reliability of the deep learning visualization of this application. The indices are as follows:
Perfect match rate: the ratio of the number of samples whose region importance rankings match exactly between two methods to the total number of samples (Figure 11B). This ratio indicates complete consistency between the two methods.
Top1 match rate: the ratio of the number of samples whose most important regions match between two methods (Figure 11C) to the total number of samples. This ratio indicates that the two methods agree completely when identifying the local region with the most significant influence on overall facial aging.
Top2 match rate: the ratio of the sum of the number of samples whose most important regions match (Figure 11C) and the number of samples whose most important region in one method matches the second most important region in the other (Figure 11D) to the total number of samples. The higher this ratio, the stronger the consistency between the two methods.
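One plausible reading of the three indices, sketched on made-up per-sample rankings (most important region first):

```python
def match_rates(rank_a, rank_b):
    """rank_a/rank_b: per-sample region rankings from two methods.
    Returns (perfect, top1, top2) match rates."""
    n = len(rank_a)
    perfect = sum(a == b for a, b in zip(rank_a, rank_b)) / n
    top1 = sum(a[0] == b[0] for a, b in zip(rank_a, rank_b)) / n
    top2 = sum(a[0] in b[:2] for a, b in zip(rank_a, rank_b)) / n
    return perfect, top1, top2

a = [["eyes", "mouth", "cheeks", "forehead"],
     ["mouth", "eyes", "cheeks", "forehead"],
     ["cheeks", "forehead", "eyes", "mouth"]]
b = [["eyes", "mouth", "cheeks", "forehead"],    # exact ranking match
     ["eyes", "mouth", "cheeks", "forehead"],    # top-1 of a in b's top-2
     ["forehead", "cheeks", "eyes", "mouth"]]    # top-1 of a in b's top-2
perfect, top1, top2 = match_rates(a, b)
```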
(4) Deep learning visualization locates the key regions that affect facial aging, and the validity of the result was verified
Deep learning visualization employed the two methods of pixel derivation and area masking, and the visualization results were evaluated against the three criteria of perfect match rate, Top1 match rate, and Top2 match rate.
The evaluation results are shown in Table 3 below.
Table 3

[The contents of Table 3 are reproduced as an image in the original document (PCTCN2021095753-appb-000005).]
The results show that pixel derivation achieved better results than the area masking method. In particular, the deep-learning pixel derivation visualization had the highest perfect match rate with the manual evaluation (0.18 vs. 0.13), an improvement of about 38% over the area masking method.
In addition, with deep-learning pixel derivation visualization, the Top1 match rate was 0.52 and the Top2 match rate reached 0.89; its Top1 match rate with the eye-tracking experiment was 0.61 and its Top2 match rate was 0.85.
These results show that the deep learning visualization is highly consistent with both manual methods, which verifies the reliability of the deep learning visualization method.
The second embodiment of the present application relates to an apparatus for determining a local area that affects the degree of facial aging. As shown in Figure 12, the apparatus includes:
an image acquisition module, configured to acquire a first facial image of a subject;
an image processing module, configured to perform image processing on the first facial image, changing a predetermined number of pixels and/or a predetermined area in the first facial image to obtain a second facial image;
an age prediction module, configured to input the first facial image into an apparent age prediction model of the human face to obtain a first apparent age; and
an influence degree determination module, configured to determine, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age.
The first embodiment is the method embodiment corresponding to this embodiment; the technical details of the first embodiment apply to this embodiment, and the technical details of this embodiment likewise apply to the first embodiment.
It should be noted that those skilled in the art will understand that the functions implemented by the modules shown in the above embodiment of the apparatus for determining a local area that affects the degree of facial aging can be understood with reference to the foregoing description of the method for determining a local area that affects the degree of facial aging. The functions of the modules shown in the above apparatus embodiment may be implemented by a program (executable instructions) running on a processor, or by specific logic circuits. If the apparatus for determining a local area that affects the degree of facial aging of the embodiments of the present application is implemented in the form of software function modules and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the method of each embodiment of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, the embodiments of the present application further provide a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method embodiments of the present application. Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible by a computing device. As defined herein, computer-readable storage media do not include transitory media, such as modulated data signals and carrier waves.
In addition, the embodiments of the present application further provide an apparatus for determining a local area that affects the degree of facial aging, which includes a memory for storing computer-executable instructions, and a processor; the processor implements the steps of the foregoing method embodiments when executing the computer-executable instructions in the memory. The processor may be a central processing unit ("CPU"), another general-purpose processor, a digital signal processor ("DSP"), an application-specific integrated circuit ("ASIC"), or the like. The aforementioned memory may be a read-only memory ("ROM"), a random access memory ("RAM"), a flash memory, a hard disk, a solid-state drive, or the like. The steps of the methods disclosed in the embodiments of the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
It should be noted that, in the application documents of this patent, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a" does not exclude the existence of additional identical elements in the process, method, article, or device that includes that element. In the application documents of this patent, performing an act "according to" an element means performing the act at least according to that element; this covers two cases: performing the act only according to that element, and performing the act according to that element together with other elements. Expressions such as "multiple" or "a plurality of" include two, and more than two, instances, times, or kinds.
All documents mentioned in this application are deemed to be incorporated into the disclosure of this application in their entirety, so that they may serve as a basis for amendment where necessary. In addition, it should be understood that the foregoing are merely preferred embodiments of this specification and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of protection of one or more embodiments of this specification.

Claims (14)

  1. A method for determining a local area that affects the degree of facial aging, comprising:
    acquiring a first facial image of a subject;
    inputting the first facial image into an apparent age prediction model for human faces to obtain a first apparent age;
    performing image processing on the first facial image to change a predetermined number of pixels and/or a predetermined area in the first facial image, so as to obtain a second facial image; and
    determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age.
  2. The method for determining a local area that affects the degree of facial aging according to claim 1, wherein the image processing adopts a method selected from the group consisting of:
    a pixel derivation method, a region masking method, or a combination thereof.
  3. The method for determining a local area that affects the degree of facial aging according to claim 1, wherein the predetermined number of pixels comprises all the pixels of the first facial image.
  4. The method for determining a local area that affects the degree of facial aging according to claim 1, wherein the predetermined area is one or more selected from the group consisting of:
    an eye area, a cheek area, a mouth area, and a forehead area.
  5. The method for determining a local area that affects the degree of facial aging according to claim 2, wherein the image processing adopts the pixel derivation method;
    the performing image processing on the first facial image to change a predetermined number of pixels and/or a predetermined area in the first facial image, so as to obtain a second facial image, further comprises:
    performing image processing on the first facial image by the pixel derivation method, adding Gaussian noise to the predetermined number of pixels to obtain the second facial image;
    and the determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age further comprises:
    differentiating the second facial image with the apparent age prediction model to obtain a derivative value for each pixel of the second facial image, and calculating, based on the derivative value of each pixel, the degree of influence of the changed pixels on the first apparent age.
  6. The method for determining a local area that affects the degree of facial aging according to claim 5, further comprising, after the differentiating of the second facial image and the calculating of the degree of influence of the changed pixels on the first apparent age:
    dividing the first facial image into a plurality of local areas, and computing, for each local area, the sum of the derivative values of all pixels in that area as the influence weight coefficient of the area on the overall degree of facial aging;
    marking each local area on the first facial image based on its influence weight coefficient, so as to obtain a third facial image.
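The pixel-derivation steps of claims 5-6 correspond to the gradient-saliency technique familiar from deep learning: differentiate the predicted age with respect to every pixel, then aggregate the derivatives per local area. A minimal NumPy sketch under stated assumptions — `toy_age_model` is a fabricated linear stand-in for the apparent age prediction model, and the 4x4 "image" and region split are illustrative placeholders, not the patented configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the apparent age prediction model: age = <W, image> + b.
# For a linear model the per-pixel derivative is exactly W.
W = rng.normal(size=(4, 4))

def toy_age_model(image):
    return float(np.sum(W * image) + 30.0)

def pixel_derivatives(image, eps=1e-4):
    """Numerically differentiate the model with respect to every pixel."""
    grads = np.zeros_like(image)
    for idx in np.ndindex(image.shape):
        bumped = image.copy()
        bumped[idx] += eps
        grads[idx] = (toy_age_model(bumped) - toy_age_model(image)) / eps
    return grads

image = rng.uniform(size=(4, 4))
grads = pixel_derivatives(image)

# Claim 6: split the face into local areas and sum each area's derivative
# values as that area's influence weight on the overall degree of aging.
regions = {"top_half": (slice(0, 2), slice(None)),
           "bottom_half": (slice(2, 4), slice(None))}
weights = {name: grads[sl].sum() for name, sl in regions.items()}
```

With a real CNN the per-pixel derivatives would come from one backward pass rather than finite differences; the per-area summation step is unchanged.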
  7. The method for determining a local area that affects the degree of facial aging according to claim 2, wherein the image processing adopts the region masking method;
    the performing image processing on the first facial image to change a predetermined number of pixels and/or a predetermined area in the first facial image, so as to obtain a second facial image, further comprises:
    performing image processing on the first facial image by the region masking method, covering the predetermined area with the pixel mean of the first facial image to obtain the second facial image;
    and the determining, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age further comprises:
    inputting the second facial image into the apparent age prediction model to obtain a second apparent age;
    comparing the second apparent age with the first apparent age, and calculating, based on the comparison result, the degree of influence of the predetermined area on the apparent age of the human face.
  8. The method for determining a local area that affects the degree of facial aging according to claim 7, wherein the performing image processing on the first facial image by the region masking method, covering the predetermined area with the pixel mean of the first facial image to obtain the second facial image, further comprises:
    dividing the first facial image into a plurality of local areas, and performing image processing on the first facial image by the region masking method, covering each local area in turn with the pixel mean of the first facial image to obtain a corresponding second facial image in which that local area is covered;
    and, after obtaining the corresponding second facial image for each covered local area, the method further comprises:
    computing, for each local area, the difference between the corresponding second apparent age and the first apparent age as the influence weight coefficient of the area on the overall degree of facial aging;
    marking each local area on the first facial image based on its influence weight coefficient, so as to obtain a third facial image.
  9. The method for determining a local area that affects the degree of facial aging according to claim 8, wherein the dividing of the first facial image into a plurality of local areas and the covering of each local area in turn with the pixel mean of the first facial image further comprise:
    dividing the first facial image into four local areas, namely an eye area, a cheek area, a mouth area, and a forehead area, and performing image processing on the first facial image by the region masking method, covering each local area in turn with the pixel mean of the first facial image to obtain a corresponding second facial image in which that local area is covered.
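The region-masking steps of claims 7-9 amount to occlusion-sensitivity analysis: cover one area with the image's pixel mean, re-predict, and take the age shift as that area's influence weight. A sketch under stated assumptions — `toy_age_model` is again a fabricated linear stand-in, and the 8x8 "face" with horizontal-band regions is an arbitrary placeholder for the real landmark-based eye/cheek/mouth/forehead split:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 8))

def toy_age_model(image):
    # Illustrative stand-in for the apparent age prediction model.
    return float(np.sum(W * image) + 30.0)

# Four local areas loosely mirroring claim 9; the coordinates are placeholders.
regions = {
    "forehead": (slice(0, 2), slice(0, 8)),
    "eyes":     (slice(2, 4), slice(0, 8)),
    "cheeks":   (slice(4, 6), slice(0, 8)),
    "mouth":    (slice(6, 8), slice(0, 8)),
}

face = rng.uniform(size=(8, 8))
first_age = toy_age_model(face)       # first apparent age
mean_pixel = face.mean()

influence = {}
for name, sl in regions.items():
    masked = face.copy()
    masked[sl] = mean_pixel           # cover the area with the pixel mean
    second_age = toy_age_model(masked)    # second apparent age
    influence[name] = second_age - first_age   # claim 8's weight coefficient
```

A negative difference means covering the area made the face look younger, i.e. the area was contributing to the aged appearance.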
  10. The method for determining a local area that affects the degree of facial aging according to any one of claims 1-9, wherein the apparent age prediction model is obtained by a method comprising the following steps:
    quantifying, through perception experiments, the age distribution, age mean, or age median of facial sample images as deep learning training labels to obtain a training sample set; and
    training a convolutional neural network model with the training sample set to obtain the apparent age prediction model.
  11. The method for determining a local area that affects the degree of facial aging according to claim 10, wherein the convolutional neural network model is a ResNet18 model.
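Claim 10 derives training labels from perception experiments rather than chronological age. A sketch of just the label-construction step under stated assumptions — the rating matrix is fabricated for illustration, and the ResNet18 training loop itself is omitted:

```python
import numpy as np

# Hypothetical perception-experiment data: rows = facial sample images,
# columns = apparent-age guesses from independent human raters.
ratings = np.array([
    [34, 38, 36, 35],
    [52, 49, 55, 50],
    [27, 25, 26, 30],
], dtype=float)

# Claim 10 allows the age distribution, the age mean, or the age median
# of each image's ratings to serve as its deep-learning training label.
mean_labels = ratings.mean(axis=1)
median_labels = np.median(ratings, axis=1)

# Distribution label: normalized histogram over integer ages 0..100
# (a soft label, usable with a KL or cross-entropy loss when training
# the CNN, instead of a plain regression target).
age_bins = np.arange(0, 101)
dist_labels = np.stack([
    np.histogram(r, bins=np.append(age_bins, 101))[0] / len(r)
    for r in ratings
])
```

Any of the three label arrays could then be paired with the facial images to fit a ResNet18 regressor or classifier head, per claim 11.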
  12. An apparatus for determining a local area that affects the degree of facial aging, comprising:
    an image acquisition module, configured to acquire a first facial image of a subject;
    an image processing module, configured to perform image processing on the first facial image to change a predetermined number of pixels and/or a predetermined area in the first facial image, so as to obtain a second facial image;
    an age prediction module, configured to input the first facial image into an apparent age prediction model for human faces to obtain a first apparent age;
    an influence degree determination module, configured to determine, according to the first apparent age, the apparent age prediction model, and the second facial image, the degree of influence of the changed pixels and/or areas in the second facial image on the first apparent age.
  13. An apparatus for determining a local area that affects the degree of facial aging, comprising:
    a memory for storing computer-executable instructions; and
    a processor, configured to implement the steps of the method according to any one of claims 1 to 11 when executing the computer-executable instructions.
  14. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 11.
PCT/CN2021/095753 2020-06-05 2021-05-25 Method and apparatus for determining local area that affects degree of facial aging WO2021244352A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010505109.6A CN113761985A (en) 2020-06-05 2020-06-05 Method and apparatus for determining local regions affecting the degree of facial aging
CN202010505109.6 2020-06-05

Publications (1)

Publication Number Publication Date
WO2021244352A1 (en)

Family

ID=78784947

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/095753 WO2021244352A1 (en) 2020-06-05 2021-05-25 Method and apparatus for determining local area that affects degree of facial aging

Country Status (2)

Country Link
CN (1) CN113761985A (en)
WO (1) WO2021244352A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1870047A (en) * 2006-06-15 2006-11-29 西安交通大学 Human face image age changing method based on average face and senile proportional image
CN107315987A (en) * 2016-04-27 2017-11-03 伽蓝(集团)股份有限公司 Assess facial apparent age, the method and its application of facial aging degree
US20170351905A1 (en) * 2016-06-06 2017-12-07 Samsung Electronics Co., Ltd. Learning model for salient facial region detection
CN108140110A (en) * 2015-09-22 2018-06-08 韩国科学技术研究院 Age conversion method based on face's each position age and environmental factor, for performing the storage medium of this method and device
CN110709856A (en) * 2017-05-31 2020-01-17 宝洁公司 System and method for determining apparent skin age

Also Published As

Publication number Publication date
CN113761985A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
Wang et al. Human visual system-based fundus image quality assessment of portable fundus camera photographs
Elangovan et al. Glaucoma assessment from color fundus images using convolutional neural network
US20190191988A1 (en) Screening method for automated detection of vision-degenerative diseases from color fundus images
RU2648836C2 (en) Systems, methods and computer-readable media for identifying when a subject is likely to be affected by medical condition
CN102567734B (en) Specific value based retina thin blood vessel segmentation method
WO2021068781A1 (en) Fatigue state identification method, apparatus and device
WO2021190656A1 (en) Method and apparatus for localizing center of macula in fundus image, server, and storage medium
Hatamizadeh et al. Deep dilated convolutional nets for the automatic segmentation of retinal vessels
CN113782184A (en) Cerebral apoplexy auxiliary evaluation system based on facial key point and feature pre-learning
CN114694236A (en) Eyeball motion segmentation positioning method based on cyclic residual convolution neural network
Li et al. BrainK for structural image processing: creating electrical models of the human head
Feng et al. Using eye aspect ratio to enhance fast and objective assessment of facial paralysis
Jiang et al. Improving the generalizability of infantile cataracts detection via deep learning-based lens partition strategy and multicenter datasets
Maillard et al. A deep residual learning implementation of metamorphosis
Muramatsu Diagnosis of glaucoma on retinal fundus images using deep learning: detection of nerve fiber layer defect and optic disc analysis
Tsietso et al. Multi-Input deep learning approach for breast cancer screening using thermal infrared imaging and clinical data
Wan et al. A novel system for measuring pterygium's progress using deep learning
Vamsi et al. Early Detection of Hemorrhagic Stroke Using a Lightweight Deep Learning Neural Network Model.
Joshi et al. Graph deep network for optic disc and optic cup segmentation for glaucoma disease using retinal imaging
CN106446805A (en) Segmentation method and system for optic cup in eye ground photo
Zhang et al. Critical element prediction of tracheal intubation difficulty: Automatic Mallampati classification by jointly using handcrafted and attention-based deep features
Trotta et al. A neural network-based software to recognise blepharospasm symptoms and to measure eye closure time
WO2021244352A1 (en) Method and apparatus for determining local area that affects degree of facial aging
Goceri et al. Automated Detection of Facial Disorders (ADFD): a novel approach based-on digital photographs
Carrasco Limeros et al. Assessing GAN-Based Generative Modeling on Skin Lesions Images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21818666

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21818666

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 15.06.2023)