CN113435428A - Photo album-based photo sticker selection method, electronic equipment and storage medium - Google Patents

Photo album-based photo sticker selection method, electronic equipment and storage medium

Info

Publication number
CN113435428A
Authority
CN
China
Prior art keywords
face
value
eye
angle
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110991809.5A
Other languages
Chinese (zh)
Other versions
CN113435428B (en)
Inventor
林鸿飞
周有喜
乔国坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Core Computing Integrated Shenzhen Technology Co ltd
Original Assignee
Shenzhen Aishen Yingtong Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Aishen Yingtong Information Technology Co Ltd filed Critical Shenzhen Aishen Yingtong Information Technology Co Ltd
Priority to CN202110991809.5A priority Critical patent/CN113435428B/en
Publication of CN113435428A publication Critical patent/CN113435428A/en
Application granted granted Critical
Publication of CN113435428B publication Critical patent/CN113435428B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a photo album-based method for selecting photo stickers, an electronic device and a storage medium. The method obtains a comprehensive face score from the face size, face similarity, face brightness, face definition and face angle in each face picture, and then selects the photo sticker according to the comprehensive face score, which reduces the chance that the selected photo sticker has a blurred face or a face that is too bright or too dark.

Description

Photo album-based photo sticker selection method, electronic equipment and storage medium
Technical Field
The application relates to the technical field of image processing, in particular to a photo album-based method for selecting a photo sticker, electronic equipment and a storage medium.
Background
A photo sticker, also called a sticker photo, is a popular form of photograph; most photo stickers are self-portrait face photos. An electronic photo album can gather photos of the same person together through a face clustering function to form a personal electronic album. In some scenarios, one photo needs to be selected from the personal electronic album and placed on the album's cover.
However, in the prior art, blurred, over-bright or over-dark faces are easily selected from the photos and used as the sticker placed on the cover of the personal photo album.
Disclosure of Invention
In view of this, to solve or mitigate the problems in the prior art, the present application provides an album-based photo sticker selection method, an electronic device and a storage medium, which can reduce the chance that the selected photo sticker has a blurred face or a face that is too bright or too dark.
In a first aspect, a method for selecting a photo sticker based on an album is provided, which includes:
acquiring a plurality of face pictures in an electronic photo album;
acquiring a face comprehensive score according to the face size, the face similarity, the face brightness, the face definition and the face angle in each face picture;
detecting whether the face in each face picture is an eye-opening face or not according to the sequence of the comprehensive face score from high to low;
when the eye-opening face is detected for the first time, extracting the eye-opening face in the corresponding face picture, and taking the eye-opening face as a sticker face.
In one embodiment, the method for selecting the photo stickers based on the photo album further includes:
when the faces in all the face pictures, examined in descending order of comprehensive face score, are detected to be closed-eye faces, extracting the face in the face picture with the highest comprehensive score as the sticker face.
In one embodiment, the obtaining of the comprehensive face score according to the face size, the face similarity, the face brightness, the face sharpness, and the face angle in each of the face pictures includes:
selecting a first face picture from the plurality of face pictures;
acquiring the size of a face frame of the first face picture, and determining the size of the largest face frame in the plurality of face pictures; dividing the size of the face frame of the first face picture by the maximum size of the face frame to obtain a face frame size ratio; multiplying the face frame size ratio by a face size weight value to obtain a face size dimension value of the first face picture;
acquiring a characteristic value of a face in the first face picture and a characteristic value of a face in each residual face picture according to a face recognition model, wherein the residual face pictures are the face pictures of the plurality of face pictures except the first face picture; acquiring an average value of similarity between the face in the first face picture and the faces in the residual face pictures according to the feature values of the faces in the first face picture and the feature values of the faces in the residual face pictures; multiplying the average value of the face similarity by a face similarity weight value to obtain a face similarity dimension value of the first face picture;
converting the face area in the first face picture into a gray image; acquiring an average value of gray points in a face area as a brightness value of the face; acquiring an absolute value of a difference between a brightness value of a human face and a preset brightness value; dividing the absolute value by the preset brightness value to obtain a brightness deviation degree; multiplying the brightness deviation degree by a face brightness weight value to obtain a face brightness dimension value of the first face picture;
acquiring a definition score value of the face in the first face picture through a definition classification model; multiplying the definition score value by a face definition weight value to obtain a face definition dimension value of the first face picture;
acquiring a left-right inclination angle, a left-right deflection angle and a pitching angle of the face in the first face picture through a face angle classification model; adding the product of the left and right deflection angles and the left and right deflection weight values, the product of the left and right inclination angles and the left and right inclination weight values, and the product of the pitching angles and the pitching weight values to obtain a face angle dimension value of the first face picture;
and adding the face size dimension value, the face similarity dimension value, the face brightness dimension value, the face definition dimension value and the face angle dimension value of the first face picture to obtain the face comprehensive score of the first face picture.
In one embodiment of the method, an open-close eye classification model is adopted to detect whether the face in each face picture is an open-eye face;
before the detecting whether the face in each of the face pictures is an eye-opening face or not, the method further comprises the step of training the eye-opening and closing classification model:
acquiring a plurality of face images as sample images, wherein the face images comprise eye-opening face images and eye-closing face images;
marking the eye-opening face image by adopting an eye-opening label, and marking the eye-closing face image by adopting an eye-closing label to obtain an eye-opening and eye-closing training image;
and training the opening and closing eye classification model by using the opening and closing eye training image, wherein when the opening and closing eye classification model is trained, the feature difference of the face images with the opening eye labels and the closing eye labels is expanded, the feature difference between the face images with the opening eye labels is reduced, and the feature difference of the face images with the closing eye labels is reduced at the same time until the loss value of the opening and closing eye classification model is smaller than a preset value.
In one embodiment, the face angle classification model includes: a left and right deflection angle model, a left and right inclination angle model and a pitching angle model;
before the left-right inclination angle, the left-right deflection angle and the pitching angle of the face in the face picture are obtained through the face angle classification model, the left-right deflection angle model, the left-right inclination angle model and the pitching angle model are respectively trained;
wherein the training step of the left and right deflection angle model comprises: acquiring a plurality of face images; labeling the face images with left-right deflection angle label values to obtain left-right deflection training images, wherein the left-right deflection angle label value is obtained by dividing the difference between the preset left-right deflection angle and the left-right deflection angle of the face in the face image by the preset left-right deflection angle; training the left and right deflection angle model through the left-right deflection training images;
the training step of the left-right inclination angle model comprises the following steps: acquiring a plurality of face images; marking the face image by adopting a left and right inclination angle marking value to obtain a left and right inclination training image, wherein the left and right inclination angle marking value is calculated by dividing the difference value between a preset left and right inclination angle and the left and right inclination angle of the face in the face image by the preset left and right inclination angle; training the left and right inclination angle model through the left and right inclination training images;
the step of training the pitch angle model comprises: acquiring a plurality of face images; labeling the face image by using a pitch angle labeling value to obtain a pitch training image, wherein the pitch angle labeling value is calculated by dividing a difference value between a preset pitch angle and a face pitch angle in the face image by the preset pitch angle; and training the pitch angle model through the pitch training image.
In one embodiment, the face size weight is 0.1, the face similarity weight is 0.2, the face brightness weight is 0.1, the face sharpness weight is 0.15, the yaw weight is 0.15, the pitch weight is 0.15.
In one embodiment of the present invention, the step of obtaining the size of the face frame of the face picture includes:
preprocessing the face picture, wherein the preprocessing comprises face righting processing and face image enhancement processing;
and inputting the preprocessed face picture into a face detection model to obtain the size of a face frame.
In one embodiment, before the inputting the preprocessed human face picture into the human face detection model, training the human face detection model is further included;
the loss function loss adopted for training the face detection model is as follows:
loss=L1+L2+L3+L4+β×L5
wherein L1 is the face bounding box coordinate offset loss, L2 is the face bounding box scaling loss, L3 is the face bounding box confidence loss, L4 is the classification loss, L5 is the ambiguity loss, and β is a preset coefficient;
wherein L5 = (1 + S(L3 + L4)) × C(Bt, Bp)
S is a sigmoid (S-type) function, S(L3 + L4) is the sigmoid value of the sum of the face bounding box confidence loss and the classification loss, Bt is the true ambiguity label, Bp is the predicted ambiguity label, C is a binary cross-entropy function, and C(Bt, Bp) is the binary cross-entropy of the true and predicted ambiguity labels.
In a second aspect, an electronic device is provided, which includes a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the steps of the method for selecting photo albums based on photo albums.
In a third aspect, one or more non-transitory readable storage media storing computer-readable instructions are provided, which when executed by one or more processors, cause the one or more processors to perform the steps of the album selection-based photo method as described above.
According to the photo album-based photo sticker selection method, the comprehensive face score is obtained from the face size, face similarity, face brightness, face definition and face angle in each face picture, and the photo sticker is selected according to the comprehensive face score, which reduces the chance that the selected photo sticker has a blurred face or a face that is too bright or too dark. Moreover, whether the face in each face picture is an eye-opening face is detected in descending order of the comprehensive face score; when an eye-opening face is detected for the first time, that face is used as the photo sticker face, and no further eye-opening detection is performed on subsequent face pictures, which reduces the time needed to select the photo sticker.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is to be understood that the drawings in the following description are illustrative only and are not restrictive of the invention.
Fig. 1 is a schematic diagram of an internal structure of an electronic device according to an embodiment of the present application.
Fig. 2 is a flowchart of a method for selecting a photo sticker based on an album in an embodiment of the present application.
Fig. 3 is another flowchart of a method for photo album-based photo sticker selection in an embodiment of the present application.
Fig. 4 is a schematic diagram of a face frame in a face picture according to an embodiment of the present application.
Fig. 5 is a schematic diagram of three dimensions, namely, a left-right deflection angle dimension y (yaw), a left-right inclination angle dimension r (roll), and a pitch dimension p (pitch), in a face picture according to an embodiment of the present application.
Fig. 6 is a schematic diagram illustrating left and right deflection of a face in a face picture according to an embodiment of the present application.
Fig. 7 is a schematic diagram illustrating a left-right inclination of a face in a face picture according to an embodiment of the present application.
Fig. 8 is a schematic view of the pitch of the face in the face picture according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Fig. 1 is a schematic diagram of the internal structure of an electronic device in one embodiment. As shown in fig. 1, the terminal includes a processor, a memory, and a network interface connected by a system bus. The processor provides computing and control capability and supports the operation of the whole electronic device. The memory is used for storing data, programs and the like, and stores at least one computer program which can be executed by the processor to implement the album-based photo sticker selection method provided by the embodiments of the present application. The memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The computer program may be executed by a processor to implement the method for selecting photo stickers based on a photo album provided by the following embodiments. The internal memory provides a cached execution environment for the operating system and the computer programs in the non-volatile storage medium. The network interface may be an Ethernet card or a wireless network card, etc., for communicating with an external electronic device.
The electronic devices described in the present application may include mobile terminals such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation device, a wearable device, a smart band, a pedometer, and fixed terminals such as a Digital TV, a desktop computer, and the like.
The following description will be given taking a mobile terminal as an example, and it will be understood by those skilled in the art that the configuration according to the embodiment of the present application can be applied to a fixed type terminal in addition to elements particularly used for mobile purposes.
Referring to fig. 2, the method for selecting a photo sticker based on an album includes:
s10, acquiring a plurality of face pictures in the electronic album;
s20, obtaining face comprehensive scores according to the face size, the face similarity, the face brightness, the face definition and the face angle in each face picture;
s30, detecting whether the face in each face picture is an eye-opening face or not according to the sequence of the comprehensive face score from high to low;
and S40, when the eye-opening face is detected for the first time, extracting the eye-opening face in the corresponding face picture, and taking the eye-opening face as a sticker face.
In the method for selecting photo stickers based on a photo album, the comprehensive face score is obtained from the face size, face similarity, face brightness, face definition and face angle in each face picture, which reduces the chance that the selected photo sticker has a blurred face or a face that is too bright or too dark. Moreover, whether the face in each face picture is an eye-opening face is detected in descending order of the comprehensive face score; when an eye-opening face is detected for the first time, that face is used as the photo sticker face, and no eye-opening detection is performed on subsequent face pictures, which reduces the time needed to select the photo sticker.
Referring to fig. 3, in an embodiment, after detecting whether the face in each face picture is an open-eye face according to the ranking of the face synthesis score from high to low, the method further includes:
and S50, when the human faces in all the human face pictures are detected to be eye-closed human faces according to the sequence of the comprehensive human face scores from high to low, extracting the human face in the human face picture with the highest comprehensive human face score as a sticker human face.
In this embodiment, if eye-opening detection has been completed on all the face pictures and no open-eye face has been found, that is, all faces are detected to be closed-eye faces, the face with the highest comprehensive score is used as the sticker face, which reduces the chance that sticker selection fails when all face pictures contain closed-eye faces.
In some embodiments, a preset composite score threshold may be set, so that when the face composite score of the detected face pictures reaches the preset composite score threshold and no open-eye face is detected yet, that is, all the faces in the face pictures with the face composite score greater than or equal to the preset composite score threshold are closed-eye faces, the face with the highest composite score is taken as the sticker face.
The album in step S10 may be an electronic album and contains a plurality of face pictures from which the photo sticker is selected. Optionally, the electronic album has a face clustering function, which can cluster pictures of the same person from pictures of different persons and put them into the electronic album; that is, through the clustering function the electronic album can be a personal electronic album composed of a plurality of face pictures of the same person. In addition, if the provided album is not an electronic album, it needs to be converted into one. The plurality of face pictures obtained from the electronic album serve as the face picture set from which the photo sticker is selected.
In step S20, a face comprehensive score is obtained according to the face size, the face similarity, the face brightness, the face sharpness, and the face angle in the face picture. The size of the face, the similarity of the face, the brightness of the face, the definition of the face and the angle of the face all affect the level of the comprehensive face score, that is, the comprehensive face score is related to the size of the face, the similarity of the face, the brightness of the face, the definition of the face and the angle of the face, and can be specifically realized by a comprehensive face score calculation method.
In one embodiment, a method for calculating a comprehensive face score is provided, and specifically, obtaining a comprehensive face score according to the face size, the face similarity, the face brightness, the face definition, and the face angle in each face picture includes:
s201, selecting a first face picture from a plurality of face pictures;
s202, obtaining the size of a face frame (the size of the face frame) of a first face picture, and determining the size of the largest face frame in a plurality of face pictures; dividing the size of the face frame of the first face picture by the maximum size of the face frame to obtain a size ratio of the face frame; multiplying the size ratio of the face frame by the face size weight value to obtain a face size dimension value of the first face picture;
s203, obtaining a characteristic value of a face in a first face picture and a characteristic value of a face in each residual face picture according to the face recognition model, wherein the residual face pictures are the residual face pictures except the first face picture in the plurality of face pictures; acquiring an average value of the similarity between the face in the first face picture and the face in each of the rest face pictures according to the feature value of the face in the first face picture and the feature values of the faces in the rest face pictures; multiplying the average value of the face similarity by the face similarity weight value to obtain a face similarity dimension value of the first face picture;
s204, converting a face area in the first face picture into a gray image; acquiring an average value of gray points in a face area as a brightness value of the face; acquiring an absolute value of a difference between a brightness value of a human face and a preset brightness value; dividing the absolute value by a preset brightness value to obtain a brightness deviation degree; multiplying the brightness deviation degree by the face brightness weight value to obtain a face brightness dimension value of the first face picture;
s205, obtaining a definition degree value of the face in the first face picture through a definition classification model; multiplying the definition degree value by a face definition weight value to obtain a face definition dimension value of the first face picture;
s206, acquiring a left-right inclination angle, a left-right deflection angle and a pitching angle of the face in the first face picture through the face angle classification model; adding the product of the left and right deflection angles and the left and right deflection weight values, the product of the left and right inclination angles and the left and right inclination weight values, and the product of the pitching angle and the pitching weight value to obtain a face angle dimension value of the first face picture;
and S207, adding the face size dimension value, the face similarity dimension value, the face brightness dimension value, the face definition dimension value and the face angle dimension value of the first face picture to obtain the face comprehensive score of the first face picture.
In step S201, the first face picture is one of the plurality of face pictures that is currently subjected to face comprehensive grading, and at this time, the other remaining face pictures that are not subjected to face comprehensive grading in the plurality of face pictures are called remaining face pictures. And grading each face picture in the plurality of face pictures according to a face comprehensive grading method of the first face picture in sequence to finish face comprehensive grading of each face picture in the plurality of face pictures.
Step S202 provides a method for obtaining a face size dimension value, where the face size dimension value is a score used for evaluating the size of a face in a comprehensive score of the face. The size dimension of the face is used as the standard for selecting the photo stickers, so that the photo stickers with large faces can be obtained by preferential screening. The larger the face size dimension value is, the larger the face is, and the better the face size dimension is. Optionally, the face size dimension value ranges from 0 to a face size weight value, for example, if the face size weight value is 0.1, the face size dimension value ranges from 0 to 0.1.
In the method for obtaining the face size dimension value, the size of the face in the picture can be represented by the size of the face frame. The frame of the face is a frame generated by the face detection model, as shown in fig. 4, the face frame 22 is located around the face 21. The size of the face frame can be understood as the size of the face frame, and can be specifically represented by the length of the side length of the frame, and the length is represented by taking pixels as a unit. In one embodiment, the step of obtaining the size of the face frame of the face picture includes:
s2021, preprocessing the face image, wherein the preprocessing comprises face righting processing and face image enhancement processing;
s2022, inputting the preprocessed human face picture into a human face detection model to obtain the size of a human face frame.
Step S2021 is to perform preprocessing on the face image, such as face straightening processing and face image enhancement processing, to reduce the problem of false face borders appearing in the face detection model. Specifically, the face centering is to obtain a face image with a correct face position; the human face image enhancement is to improve the quality of the human face image, so that the image is clearer visually and is more beneficial to the processing and recognition of a computer.
Optionally, the face righting processing specifically includes the steps of: obtaining an affine transformation matrix from a plurality of (e.g., 5) feature points of the face in the face picture and a plurality of (e.g., 5) reference feature point coordinates of a standard face; and performing rotation and translation correction on the face through the affine transformation matrix to obtain the righted face.
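As an illustration of the face righting step, a minimal sketch using OpenCV is given below; the 5 reference landmark coordinates and the 112×112 output size are assumptions for illustration, not values specified in this application.

```python
import cv2
import numpy as np

# Assumed 5 reference landmarks (eye centers, nose tip, mouth corners) of a
# 112x112 standard face; the actual reference coordinates are not given in
# this application and are used here only for illustration.
REFERENCE_POINTS = np.float32([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                               [41.5, 92.4], [70.7, 92.2]])

def align_face(image, landmarks):
    """Right the face: map its 5 detected landmarks onto the reference points."""
    src = np.float32(landmarks)
    # Similarity transform (rotation + translation + uniform scale).
    matrix, _ = cv2.estimateAffinePartial2D(src, REFERENCE_POINTS, method=cv2.LMEDS)
    return cv2.warpAffine(image, matrix, (112, 112))
```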
Optionally, the face image enhancement processing specifically includes the steps of: counting the number of pixels at each gray level in the whole image; calculating the probability distribution of each gray level; calculating the cumulative distribution probability; calculating the equalized gray value; and mapping the pixel values back to the coordinates of the original pixels.
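A minimal NumPy sketch of the histogram-equalization steps listed above (count per gray level, probability distribution, cumulative distribution, equalized value, map back), assuming an 8-bit grayscale input:

```python
import numpy as np

def equalize_histogram(gray):
    """Histogram equalization following the steps described above (8-bit input)."""
    hist = np.bincount(gray.ravel(), minlength=256)  # pixels at each gray level
    prob = hist / gray.size                          # probability of each level
    cdf = np.cumsum(prob)                            # cumulative distribution
    mapping = np.round(cdf * 255).astype(np.uint8)   # equalized gray values
    return mapping[gray]                             # map back to pixel coordinates
```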
Step S2022 obtains a face bounding box and the confidence corresponding to that box through the face detection model. The face detection model is a deep-learning neural network, for example a yolov3-based network, that can obtain face bounding boxes and confidences from face pictures.
Further, the training method of the face detection model based on yolov3 is improved, so that the face ambiguity can be scored, more accurate face ambiguity and confidence can be obtained, and the processing speed of the whole face detection model is increased. Specifically, the training set used in the training method of the face detection model increases the dimension of the ambiguity, that is, the ambiguity of the face in the pictures of the training set is marked to judge whether the face is a blurred face.
Further, the loss function loss adopted for training the face detection model is as follows:
loss=L1+L2+L3+L4+β×L5
where L1 is the face bounding box coordinate offset loss, L2 is the face bounding box scaling loss, L3 is the face bounding box confidence loss, L4 is the classification loss, L5 is the ambiguity loss, and β is a preset coefficient;
where L5 = (1 + S(L3 + L4)) × C(Bt, Bp)
S is a sigmoid (S-type) function, S(L3 + L4) is the sigmoid value of the sum of the face bounding box confidence loss and the classification loss, Bt is the true ambiguity label, Bp is the predicted ambiguity label, C is a binary cross-entropy function, and C(Bt, Bp) is the binary cross-entropy of the true and predicted ambiguity labels.
Specifically, L1 (which may be referred to as xy_loss) may be a binary cross-entropy loss designed on the offset of the face bounding box center point from its top-left grid point. L2 (which may be referred to as wh_loss) is a binary cross-entropy loss designed on the width and height of the face bounding box. L3 (which may be referred to as confidence_loss) is a binary cross-entropy loss based on obj and no_obj, computed separately for the two cases: for obj (the face bounding box has a corresponding ground-truth box) the binary cross entropy is calculated; for no_obj (the face bounding box has no corresponding ground-truth box), for example when the IoU (Intersection over Union) of the face bounding box and the ground-truth box is lower than 0.5, the binary cross entropy corresponding to no_obj is calculated. L4 (which may be referred to as class_loss) may be a classification loss based on binary cross entropy; further, for n classes, n binary cross-entropy loss functions are used.
L5 (which may be referred to as blur_loss), the ambiguity loss, is added to the loss function adopted by the face detection model. Specifically, during training the ambiguity dimension is appended to the yolov3 class-feature dimension, and the output feature map has dimension N × N × [a × (b + c + d)], where N × N is the number of grid points of the output feature map, a is the number of preset anchor boxes, b is the number of prediction box values per face bounding box, c is the prediction box confidence, and d is the number of class feature dimensions.
In the loss function adopted by the face detection model, β is a preset coefficient, also called an adjustable parameter, used to balance blur_loss against the other losses.
S denotes an S-type (sigmoid) activation function; S(L3 + L4) is the sigmoid value of the sum of the face bounding box confidence loss and the classification loss. Bt (true_blur_label) is the true ambiguity label and Bp (pred_blur_label) is the predicted ambiguity label; C is a binary cross-entropy function, specifically binary_cross_entropy, and C(Bt, Bp) is the binary cross-entropy of the true and predicted ambiguity labels. That is, when the loss is calculated for each grid point, the ambiguity loss is calculated by the following formula:
blur_loss = (1 + sigmoid(confidence_loss + class_loss)) × binary_cross_entropy(true_blur_label, pred_blur_label)
where sigmoid is the S-type activation function, sigmoid(confidence_loss + class_loss) is the sigmoid value of the sum of confidence_loss and class_loss, binary_cross_entropy is the binary cross-entropy loss function, true_blur_label is the true ambiguity label, pred_blur_label is the predicted ambiguity label, and binary_cross_entropy(true_blur_label, pred_blur_label) is the binary cross-entropy of true_blur_label and pred_blur_label.
Here confidence_loss and class_loss assist in adjusting the ambiguity loss: the sigmoid maps the value of confidence_loss + class_loss into part of the coefficient that scales the ambiguity loss, so the three are strongly correlated. It can be understood that if the confidence of the face bounding box and of the face class is low, i.e. their loss values are high, the ambiguity also obtains a large loss (not more than twice the directly calculated ambiguity loss); if the confidence of the face bounding box and of the face class is high, their loss values are low, and the ambiguity loss value is close to the direct loss binary_cross_entropy(true_blur_label, pred_blur_label), where binary_cross_entropy is the binary cross-entropy loss function used to calculate the loss between the predicted and true ambiguity labels, true_blur_label is the true ambiguity label (also called the true ambiguity score), and pred_blur_label is the predicted ambiguity label (also called the predicted ambiguity score). During training, pred_blur_label is gradually optimized toward the true ambiguity label given in the training set.
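For concreteness, a PyTorch-style sketch of the ambiguity loss L5 follows; all arguments are assumed to be torch tensors (the three loss terms as scalars, the labels as probability tensors), which is an assumption for illustration rather than the application's own code.

```python
import torch
import torch.nn.functional as F

def blur_loss(confidence_loss, class_loss, pred_blur_label, true_blur_label):
    """L5 = (1 + sigmoid(L3 + L4)) * binary_cross_entropy(true, pred)."""
    bce = F.binary_cross_entropy(pred_blur_label, true_blur_label)
    scale = 1.0 + torch.sigmoid(confidence_loss + class_loss)
    # scale lies in (1, 2), so L5 is at most twice the direct BCE, as noted above.
    return scale * bce
```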
Because the loss function of the face detection model includes the ambiguity loss, the detected face directly carries an ambiguity attribute and no separate ambiguity judgment is needed. Moreover, since reducing the ambiguity loss also requires reducing the face bounding box confidence loss and the classification loss, the parameters of the face detection model become better during training, and the confidences of the trained face bounding box and face class are more accurate.
In addition, in order to make the face detection model converge more effectively, avoid gradient explosion caused by one or more ill-scaled scoring dimensions, and reduce the workload of parameter tuning, this embodiment provides a dynamic gradient clipping method. In conventional gradient clipping, to prevent gradient explosion during back-propagation when optimizing the neural network, larger gradients are clipped: an upper gradient limit, e.g. 1, is set, and any gradient larger than 1 is forcibly set to 1 before the parameters are updated.
With such clipping, however, gradients larger than the specified threshold become indistinguishable from one another, and an improperly chosen threshold may cause convergence problems for the face detection model. That is, the gradient threshold is difficult to select.
To address this, in this embodiment the gradient is mapped into a specified range by a preset mapping function, so that differences in gradient magnitude are still reflected while gradient explosion caused by excessively large gradients is avoided.
The preset gradient mapping function h_c(z) maps the original gradient z (true_gradient) to the mapped gradient h_c (clip_gradient).
Specifically, after the first face picture is obtained, a face region in the first face picture is obtained by using a face detection model, wherein the face region is a region of the face picture corresponding to a face frame;
the face detection model is obtained by training through the following steps of:
inputting the training picture into a face detection model, and acquiring the descending gradient of the face detection model in the back propagation process;
mapping the descending gradient to a preset specified range through a mapping function to obtain a mapped descending gradient;
updating parameters of the face detection model through the mapped descending gradient;
wherein, in the gradient mapping function, z is the descent gradient, e is the base of the natural logarithm, and h_c is the mapped descent gradient.
For example, when updating the parameters of the face detection model through the mapped descending gradient, the following formula may be used:
θ(t+1) = θ(t) − η × h_c(∇θ J(θ(t); x_i))
where θ(t+1) is the updated parameter, θ(t) is the parameter before the update, η is the preset learning rate, h_c is the gradient mapping function, J(θ; x_i) is the prediction function determined by the sample data x_i, x_i is one selected piece of sample data, ∇θ is the gradient operator, ∇θ J(θ(t); x_i) denotes the partial derivative of the prediction function with respect to θ, and t is the number of updates, i.e., the number of training iterations.
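A small sketch of the update step is given below. The exact gradient mapping function is not reproduced here; np.tanh is used purely as a stand-in squashing function that, like the described mapping, bounds the gradient while preserving relative magnitudes.

```python
import numpy as np

def map_gradient(z):
    # Stand-in for the preset mapping h_c(z): an exponential-based squashing
    # function that bounds the gradient while keeping size differences visible.
    # The actual function used by the application may differ.
    return np.tanh(z)

def update_parameters(theta, gradient, learning_rate=1e-3):
    """theta(t+1) = theta(t) - eta * h_c(gradient of J at theta(t))."""
    return theta - learning_rate * map_gradient(gradient)
```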
The embodiment may further include S2023, adjusting the size of the face frame according to the confidence corresponding to the face frame, and using the adjusted size of the face frame as the size of the face frame of the face picture, for example, if the confidence is lower than the preset confidence, reducing the size of the face frame according to a preset ratio.
In step S203, the face similarity value is obtained using the face recognition model. Specifically, the feature value of the face in each face picture is extracted through the face recognition model; the face pictures are selected one by one, the features of the face in the selected picture are compared with the faces in all the remaining pictures to obtain similarity values, the obtained similarity values are averaged, and the average is multiplied by the face similarity weight value to obtain the face similarity dimension value of each picture.
The face similarity dimension value reflects how similar the face in a face picture is to the faces in the other remaining face pictures: the higher the face similarity dimension value, the more representative the face is among all the faces. For example, if most faces in a face sequence have high image quality, then a face with poor image quality will receive a lower similarity dimension value. Optionally, the face similarity dimension value ranges from 0 to the face similarity weight value; the larger the value, the better the face similarity and the higher the face image quality.
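A minimal sketch of the similarity dimension value, assuming the face recognition model outputs L2-normalized embeddings so that a dot product serves as the similarity value; the 0.2 default is the example face similarity weight value used elsewhere in this description.

```python
import numpy as np

def similarity_dimension(features, index, weight=0.2):
    """Average similarity of face `index` to the remaining faces, times the weight."""
    target = features[index]
    others = [f for i, f in enumerate(features) if i != index]
    sims = [float(np.dot(target, f)) for f in others]  # cosine similarity (normalized)
    return weight * (sum(sims) / len(sims))
```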
In step S204, the brightness of the face is scored. Specifically, the RGB image of the face region in the face picture is converted into a grayscale image using the RGB-to-grayscale conversion formula Gray = R × 0.299 + G × 0.587 + B × 0.114, and the gray values in the face region are then averaged to obtain the face brightness value. Optionally, the face brightness value ranges from 0 to 255; the smaller the value, the darker the face, and the larger the value, the brighter the face. The face region may specifically be the region corresponding to the face bounding box obtained by processing the face picture with the face detection model.
The face brightness dimension value is obtained from the face brightness value. Specifically, a preset face brightness value may represent the optimal brightness, for example 128; the absolute value of the difference between each face's brightness value and the preset value 128 is computed, that absolute value is divided by the preset brightness value 128 to obtain the brightness deviation, which indirectly reflects how far the face brightness deviates from the optimal brightness, and finally the brightness deviation is multiplied by the face brightness weight value to obtain the face brightness dimension value.
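A minimal sketch of the brightness scoring steps above, assuming a BGR face crop, the preset brightness value 128 and the example face brightness weight value 0.1:

```python
import cv2

def brightness_dimension(face_bgr, preset=128.0, weight=0.1):
    """Brightness deviation from the preset value, times the face brightness weight."""
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)  # Gray = 0.299R + 0.587G + 0.114B
    brightness = float(gray.mean())                    # average gray value, 0..255
    deviation = abs(brightness - preset) / preset      # degree of brightness deviation
    return weight * deviation
```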
The face brightness dimension value can evaluate the brightness quality of the face. It can be understood that the range of the face brightness dimension value is from 0 to the face brightness weight value, and the larger the face brightness dimension value is, the better the face brightness dimension is.
Step S205 obtains the face definition dimension value, which scores the definition (sharpness) of the face in the face picture. Optionally, a blurred face receives a low face definition dimension value; that is, the larger the face definition dimension value, the clearer the face.
Specifically, a neural network is trained with face training images carrying definition labels to obtain the definition classification model; after a face picture is input, the corresponding definition score value is obtained, and the definition score value is then multiplied by the face definition weight value to obtain the face definition dimension value. Optionally, the definition score value lies between 0 and 1, with a lower value representing blur and a higher value representing sharpness, so the face definition dimension value ranges from 0 to the face definition weight value.
In addition, to improve the classification accuracy of the definition classification model, the model may be trained as follows: obtain 5,000 clear face samples labeled as clear and 5,000 blurred face samples labeled as blurred, and train the neural network on these 10,000 samples to obtain the definition classification model.
Step S206 is to calculate a face angle dimension value for evaluating the orientation of the face in the face picture. Referring to fig. 5, the face angle dimension can be specifically divided into three dimensions: the left and right deflection angle dimension Y (yaw), the left and right inclination angle dimension R (roll) and the pitch dimension P (pitch) respectively correspond to a left and right deflection angle dimension value, a left and right inclination angle dimension value and a pitch dimension value.
Specifically, the input face picture is processed by using a face angle classification model based on a neural network (the face angle classification model comprises a left-right deflection angle model, a left-right inclination angle model and a pitching angle model), so that a left-right deflection angle dimension value, a left-right inclination angle dimension value and a pitching angle dimension value corresponding to the face can be obtained.
In one embodiment, the face angle classification model includes: a left and right deflection angle model, a left and right inclination angle model and a pitching angle model;
before acquiring a left-right inclination angle, a left-right deflection angle and a pitching angle of a face in a face picture through a face angle classification model, respectively training a left-right deflection angle model, a left-right inclination angle model and a pitching angle model;
the training step of the left and right deflection angle model comprises the following steps: acquiring a plurality of face images; marking the face image by adopting a left deflection angle marking value and a right deflection angle marking value to obtain a left deflection training image and a right deflection training image, wherein the left deflection angle marking value and the right deflection angle marking value are obtained by dividing the difference value between the preset left deflection angle and the preset right deflection angle and the left and right inclination angle of the face in the face image by the preset left deflection angle and the preset right deflection angle; training a left deflection angle model and a right deflection angle model through a left deflection training image and a right deflection training image;
the method includes acquiring a plurality of face images, namely images of a plurality of faces under various light rays, wherein the face angle deviates from 90 degrees from the left to 90 degrees from the right (which can be recorded as-90 degrees), as shown in fig. 6. The face angle is a deflection angle of a face in the face image relative to the front face, and specifically, the front face may be set to 0 °. The number of the acquired face images can be 2 ten thousand.
Optionally, the left and right deflection angles are preset to be 90 degrees, the range of the left and right deflection angle marking values is 0-1, and the smaller the left and right deflection angle marking values, the larger the deviation of the face angle is.
The training step of the left and right inclination angle model comprises the following steps: acquiring a plurality of face images; marking the face image by adopting a left and right inclination angle marking value to obtain a left and right inclination training image, wherein the left and right inclination angle marking value is calculated by dividing the difference value of a preset left and right inclination angle and the left and right inclination angle of the face in the face image by the preset left and right inclination angle; training a left and right inclination angle model through a left and right inclination training image;
the obtained multiple face images are face images of multiple faces under multiple light rays (the multiple light rays may be light rays with different intensities or colors), wherein the face angle is inclined between 60 degrees from left to 60 degrees from right (which may be recorded as-60 degrees), as shown in fig. 7. The face angle is an inclination angle of a face in the face image relative to the frontal face, and specifically, the frontal face may be set to 0 °. The number of face images acquired may be 2 ten thousand.
Optionally, the left and right inclination angles are preset to be 60 degrees, the range of the left and right inclination angle marking values is 0-1, and the smaller the left and right inclination angle marking values, the larger the deviation of the face angle is.
The training step of the pitch angle model comprises the following steps: acquiring a plurality of face images; marking the face image by using a pitch angle marking value to obtain a pitch training image, wherein the pitch angle marking value is calculated by dividing the difference value between a preset pitch angle and the face pitch angle in the face image by the preset pitch angle; and training a pitch angle model through the pitch training image.
The acquired face images are images of multiple faces under various lighting conditions, with the face pitch angle ranging from a 60° elevation to a 60° depression (which may be recorded as −60°), as shown in fig. 8. The face pitch angle is the pitch of the face in the image relative to the frontal face, and the frontal face may be set to 0°. The number of acquired face images may be 20,000.
Optionally, the preset pitch angle is 60 degrees, the range of the pitch angle marking value is 0-1, and the smaller the pitch angle marking value is, the larger the deviation of the face angle is.
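The three label values can be computed the same way; the sketch below assumes the absolute face angle is used so that deviations to either side are treated symmetrically, which is not stated explicitly above.

```python
def angle_label(face_angle_deg, preset_angle_deg):
    """Label value = (preset angle - |face angle|) / preset angle, in [0, 1]."""
    return (preset_angle_deg - abs(face_angle_deg)) / preset_angle_deg

yaw_label = angle_label(25.0, 90.0)    # left-right deflection, preset 90 degrees
roll_label = angle_label(-10.0, 60.0)  # left-right inclination, preset 60 degrees
pitch_label = angle_label(5.0, 60.0)   # pitch, preset 60 degrees
```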
Each of the left-right deflection, left-right inclination and pitch angle score values output by the models lies between 0 and 1: the smaller the value, the larger the face angle deviation, and the larger the value, the smaller the deviation. The left-right deflection, left-right inclination and pitch score values are multiplied by their corresponding face angle weight values and then summed to obtain the face angle dimension value. It is understood that the face angle dimension value ranges between 0 and the sum of the corresponding face angle weight values. The face angle dimension value evaluates how frontal the face is: the larger the value, the more frontal the face in the face picture.
When setting the weights for the scoring dimensions, more important dimensions are given higher weights to strengthen their influence on the comprehensive score. In one embodiment, the face size weight value is 0.08 to 0.12, the face similarity weight value is 0.15 to 0.25, the face brightness weight value is 0.08 to 0.12, the face definition weight value is 0.12 to 0.18, the yaw weight value is 0.12 to 0.18, and the pitch weight value is 0.12 to 0.18; specifically, the face size weight value may be 0.1, the face similarity weight value 0.2, the face brightness weight value 0.1, the face definition weight value 0.15, the yaw weight value 0.15, and the pitch weight value 0.15, which reduces the selection of low-quality stickers.
Step S207 is to calculate the total dimension value of the face, specifically, sum the dimension values of the face size, the face similarity, the face brightness, the face sharpness, and the face angle of each face picture to obtain a face comprehensive score of the face picture. It is understood that a higher face composite score indicates a higher face quality.
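Putting steps S202 to S207 together, a minimal sketch of the composite score follows; the roll (left-right inclination) weight is not stated explicitly in this application and the 0.15 used below is an assumption.

```python
def size_dimension(box_size, max_box_size, weight=0.1):
    """S202: face bounding box size ratio times the face size weight."""
    return weight * (box_size / max_box_size)

def angle_dimension(yaw_score, roll_score, pitch_score,
                    yaw_w=0.15, roll_w=0.15, pitch_w=0.15):
    """S206: weighted sum of the three 0-1 angle score values (roll_w assumed)."""
    return yaw_w * yaw_score + roll_w * roll_score + pitch_w * pitch_score

def composite_score(size_dim, similarity_dim, brightness_dim, definition_dim, angle_dim):
    """S207: the comprehensive face score is the sum of the five dimension values."""
    return size_dim + similarity_dim + brightness_dim + definition_dim + angle_dim
```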
In step S30, the open/close state of the face in the face picture is determined. In one embodiment, an open-close eye classification model is adopted to detect whether the face in the face picture is an open-eye face;
before detecting whether the face in the face picture is an eye-opening face or not, the method also comprises the step of training an eye-opening and eye-closing classification model:
s301, acquiring a plurality of face images, wherein the face images comprise eye-opening face images and eye-closing face images;
s302, labeling the eye-opening face image by adopting an eye-opening label, and labeling the eye-closing face image by adopting an eye-closing label to obtain an eye-opening and eye-closing training image;
and S303, training an opening and closing eye classification model by using the opening and closing eye training image, wherein when the opening and closing eye classification model is trained, the feature difference of the face images with the opening eye labels and the closing eye labels is expanded, the feature difference between the face images with the opening eye labels is reduced, and the feature difference of the face images with the closing eye labels is reduced until the loss value of the opening and closing eye classification model is smaller than a preset value.
The open-close eye classification model is a neural network model, and can judge the open-close eye state of the human face for the input human face picture.
Before the eye open/closed state is judged for an input face picture, the open-closed eye classification model is trained with open-eye and closed-eye face images. The face images used for training are open-eye and closed-eye faces under various lighting conditions: 5,000 open-eye face images are labeled as open-eye (label set to open-eye) and 5,000 closed-eye face images are labeled as closed-eye (label set to closed-eye) to obtain the open-closed eye training images; the open-closed eye classification model is then trained with these training images.
In step S30, detecting whether the face in each face picture is an open-eye face in descending order of comprehensive face score specifically includes: sorting all the face pictures from the highest to the lowest comprehensive face score, and then judging the open/closed eye state picture by picture in that order using the open-closed eye classification model.
In step S40, during the sequential judgment of the face pictures, when a face is judged to be an open-eye face for the first time, the face in that picture is taken as the photo sticker face and judgment of the remaining face pictures stops, as in the sketch below; in step S50, if the faces in all the face pictures are closed-eye faces, the face in the face picture with the highest comprehensive face score is selected as the photo sticker by default.
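A minimal sketch of the overall selection logic of steps S30 to S50, assuming `composite_score`, `is_open_eye` and `extract_face` are callables wrapping the scoring, open-closed eye classification and face cropping steps described above:

```python
def select_sticker_face(face_pictures, composite_score, is_open_eye, extract_face):
    """Return the sticker face per steps S30-S50."""
    ranked = sorted(face_pictures, key=composite_score, reverse=True)
    for picture in ranked:                 # descending comprehensive face score
        if is_open_eye(picture):           # stop at the first open-eye face (S40)
            return extract_face(picture)
    return extract_face(ranked[0])         # all closed-eye: highest-scoring face (S50)
```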
The photo album-based photo sticker selection method can select a higher-quality picture from the photo album, extract the face in that face picture as a photo sticker, and use the photo sticker as the cover of the personal photo album.
It should be understood that, although the steps in the flowcharts of fig. 2 and fig. 3 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the execution of these steps is not strictly ordered, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 2 and fig. 3 may include multiple sub-steps or multiple stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed in sequence but may be performed in turns or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The embodiment of the present application further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the steps of the method for selecting a photo sticker based on an album in any of the above embodiments.
The embodiments of the present application also provide a computer-readable storage medium: one or more non-transitory computer-readable storage media contain computer-executable instructions that, when executed by one or more processors, cause the processors to perform the steps of the photo album-based photo sticker selection method.
The embodiments of the present application also provide a computer program product containing instructions which, when run on a computer, cause the computer to perform the photo album-based photo sticker selection method.
Any reference to memory, storage, database, or other medium used herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as such combinations are not contradictory, they should be considered within the scope of this specification.
The above embodiments only express several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A photo album-based method for selecting photo stickers is characterized by comprising the following steps:
acquiring a plurality of face pictures in an electronic photo album;
acquiring a face comprehensive score according to the face size, the face similarity, the face brightness, the face definition and the face angle in each face picture;
detecting whether the face in each face picture is an eye-opening face or not according to the sequence of the comprehensive face score from high to low;
when the eye-opening face is detected for the first time, extracting the eye-opening face in the corresponding face picture, and taking the eye-opening face as a sticker face.
2. The photo album based method of selecting photo stickers according to claim 1, further comprising:
when the human faces in all the human face pictures are detected to be the eye-closed human faces according to the sequence of the human face comprehensive scores from high to low, the human face in the human face picture with the highest human face comprehensive score is extracted to be used as the sticker human face.
3. The photo album-based photo sticker selection method according to claim 1, wherein acquiring the face comprehensive score according to the face size, the face similarity, the face brightness, the face definition and the face angle in each face picture comprises:
selecting a first face picture from the plurality of face pictures;
acquiring the size of a face frame of the first face picture, and determining the size of the largest face frame in the plurality of face pictures; dividing the size of the face frame of the first face picture by the maximum size of the face frame to obtain a face frame size ratio; multiplying the face frame size ratio by a face size weight value to obtain a face size dimension value of the first face picture;
acquiring a characteristic value of a face in the first face picture and a characteristic value of a face in each residual face picture according to a face recognition model, wherein the residual face pictures are the residual face pictures except the first face picture in the plurality of face pictures; acquiring an average value of similarity between the face in the first face picture and the faces in the residual face pictures according to the feature values of the faces in the first face picture and the feature values of the faces in the residual face pictures; multiplying the average value of the face similarity by a face similarity weight value to obtain a face similarity dimension value of the first face picture;
converting the face area in the first face picture into a gray image; acquiring an average value of gray points in a face area as a brightness value of the face; acquiring an absolute value of a difference between a brightness value of a human face and a preset brightness value; dividing the absolute value by the preset brightness value to obtain a brightness deviation degree; multiplying the brightness deviation degree by a face brightness weight value to obtain a face brightness dimension value of the first face picture;
acquiring a definition score value of the face in the first face picture through a definition classification model; multiplying the definition score value by a face definition weight value to obtain a face definition dimension value of the first face picture;
acquiring a left-right inclination angle, a left-right deflection angle and a pitching angle of the face in the first face picture through a face angle classification model; adding the product of the left and right deflection angles and the left and right deflection weight values, the product of the left and right inclination angles and the left and right inclination weight values, and the product of the pitching angles and the pitching weight values to obtain a face angle dimension value of the first face picture;
and adding the face size dimension value, the face similarity dimension value, the face brightness dimension value, the face definition dimension value and the face angle dimension value of the first face picture to obtain the face comprehensive score of the first face picture.
4. The photo album based photo sticker selection method according to claim 1, wherein an open-closed eye classification model is used to detect whether the face in each of the face pictures is an open-eye face;
before the detecting whether the face in each of the face pictures is an eye-opening face or not, the method further comprises the step of training the eye-opening and closing classification model:
acquiring a plurality of face images as sample images, wherein the face images comprise eye-opening face images and eye-closing face images;
marking the eye-opening face image by adopting an eye-opening label, and marking the eye-closing face image by adopting an eye-closing label to obtain an eye-opening and eye-closing training image;
and training the opening and closing eye classification model by using the opening and closing eye training image, wherein when the opening and closing eye classification model is trained, the feature difference of the face images with the opening eye labels and the closing eye labels is expanded, the feature difference between the face images with the opening eye labels is reduced, and the feature difference of the face images with the closing eye labels is reduced at the same time until the loss value of the opening and closing eye classification model is smaller than a preset value.
5. The photo album-based photo sticker selection method according to claim 3, wherein the face angle classification model comprises: a left and right deflection angle model, a left and right inclination angle model and a pitching angle model;
before the left-right inclination angle, the left-right deflection angle and the pitching angle of the face in the face picture are obtained through the face angle classification model, the left-right deflection angle model, the left-right inclination angle model and the pitching angle model are respectively trained;
wherein the training step of the left-right deflection angle model comprises: acquiring a plurality of face images; labeling the face images with a left-right deflection angle labeling value to obtain left-right deflection training images, wherein the left-right deflection angle labeling value is calculated by dividing the difference between a preset left-right deflection angle and the left-right deflection angle of the face in the face image by the preset left-right deflection angle; and training the left-right deflection angle model through the left-right deflection training images;
the training step of the left-right inclination angle model comprises the following steps: acquiring a plurality of face images; marking the face image by adopting a left and right inclination angle marking value to obtain a left and right inclination training image, wherein the left and right inclination angle marking value is calculated by dividing the difference value between a preset left and right inclination angle and the left and right inclination angle of the face in the face image by the preset left and right inclination angle; training the left and right inclination angle model through the left and right inclination training images;
the step of training the pitch angle model comprises: acquiring a plurality of face images; labeling the face image by using a pitch angle labeling value to obtain a pitch training image, wherein the pitch angle labeling value is calculated by dividing a difference value between a preset pitch angle and a face pitch angle in the face image by the preset pitch angle; and training the pitch angle model through the pitch training image.
6. The photo album-based photo sticker selection method according to claim 3, wherein the face size weight value is 0.1, the face similarity weight value is 0.2, the face brightness weight value is 0.1, the face definition weight value is 0.15, the left and right deflection weight value is 0.15, the left and right inclination weight value is 0.15, and the pitching weight value is 0.15.
7. The photo album-based photo sticker selection method according to claim 3, wherein the step of obtaining the size of the face frame of the face picture comprises:
preprocessing the face picture, wherein the preprocessing comprises face righting processing and face image enhancement processing;
and inputting the preprocessed face picture into a face detection model to obtain the size of a face frame.
8. The photo album based photo sticker selection method according to claim 7, further comprising training the face detection model before inputting the preprocessed face picture into the face detection model;
the loss function loss adopted for training the face detection model is as follows:
loss=L1+L2+L3+L4+β×L5
wherein L1 is the face frame coordinate offset loss, L2 is the face frame scaling loss, L3 is the face frame confidence loss, L4 is the classification loss, L5 is the ambiguity loss, and β is a preset coefficient;
wherein L5 = (1 + S(L3 + L4)) × C(Bt, Bp),
where S is a sigmoid function, so that S(L3 + L4) is the sigmoid value of the sum of the face frame confidence loss and the classification loss; Bt is the true ambiguity label, Bp is the predicted ambiguity label, C is a binary cross-entropy function, and C(Bt, Bp) is the binary cross-entropy value of the true ambiguity label and the predicted ambiguity label.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the computer program, when executed by the processor, causes the processor to perform the steps of the photo album-based photo sticker selection method according to any one of claims 1 to 8.
10. One or more non-transitory storage media storing computer-readable instructions thereon that, when executed by one or more processors, cause the one or more processors to perform the steps of the photo album-based photo sticker selection method according to any one of claims 1 to 8.
CN202110991809.5A 2021-08-27 2021-08-27 Photo album-based photo sticker selection method, electronic equipment and storage medium Active CN113435428B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110991809.5A CN113435428B (en) 2021-08-27 2021-08-27 Photo album-based photo sticker selection method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113435428A true CN113435428A (en) 2021-09-24
CN113435428B CN113435428B (en) 2021-12-31

Family

ID=77798157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110991809.5A Active CN113435428B (en) 2021-08-27 2021-08-27 Photo album-based photo sticker selection method, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113435428B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105224921A (en) * 2015-09-17 2016-01-06 桂林远望智能通信科技有限公司 A kind of facial image preferentially system and disposal route
WO2018049952A1 (en) * 2016-09-14 2018-03-22 厦门幻世网络科技有限公司 Photo acquisition method and device
CN108960087A (en) * 2018-06-20 2018-12-07 中国科学院重庆绿色智能技术研究院 A kind of quality of human face image appraisal procedure and system based on various dimensions evaluation criteria
US20210166003A1 (en) * 2018-08-22 2021-06-03 Zhejiang Dahua Technology Co., Ltd. Systems and methods for selecting a best facial image of a target human face
CN109784230A (en) * 2018-12-29 2019-05-21 中国科学院重庆绿色智能技术研究院 A kind of facial video image quality optimization method, system and equipment
CN111160284A (en) * 2019-12-31 2020-05-15 苏州纳智天地智能科技有限公司 Method, system, equipment and storage medium for evaluating quality of face photo

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI Qiuzhen et al.: "Face image quality evaluation based on convolutional neural network", Journal of Computer Applications *

Also Published As

Publication number Publication date
CN113435428B (en) 2021-12-31

Similar Documents

Publication Publication Date Title
CN109492643B (en) Certificate identification method and device based on OCR, computer equipment and storage medium
US11657602B2 (en) Font identification from imagery
US9536293B2 (en) Image assessment using deep convolutional neural networks
US9953425B2 (en) Learning image categorization using related attributes
CN111178183B (en) Face detection method and related device
US9104914B1 (en) Object detection with false positive filtering
US8638993B2 (en) Segmenting human hairs and faces
US7400761B2 (en) Contrast-based image attention analysis framework
CN111126258A (en) Image recognition method and related device
CN111080628A (en) Image tampering detection method and device, computer equipment and storage medium
CN109271930B (en) Micro-expression recognition method, device and storage medium
US20110268319A1 (en) Detecting and tracking objects in digital images
CN112101359B (en) Text formula positioning method, model training method and related device
US8094971B2 (en) Method and system for automatically determining the orientation of a digital image
CN111553438A (en) Image identification method based on convolutional neural network
CN114821778A (en) Underwater fish body posture dynamic recognition method and device
CN113378812A (en) Digital dial plate identification method based on Mask R-CNN and CRNN
CN112465709A (en) Image enhancement method, device, storage medium and equipment
WO2019223066A1 (en) Global enhancement method, device and equipment for iris image, and storage medium
CN113435428B (en) Photo album-based photo sticker selection method, electronic equipment and storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN116798041A (en) Image recognition method and device and electronic equipment
CN114283431B (en) Text detection method based on differentiable binarization
CN112699809B (en) Vaccinia category identification method, device, computer equipment and storage medium
US11893784B2 (en) Assessment of image quality for optical character recognition using machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230710

Address after: 13C-18, Caihong Building, Caihong Xindu, No. 3002, Caitian South Road, Gangsha Community, Futian Street, Futian District, Shenzhen, Guangdong 518033

Patentee after: Core Computing Integrated (Shenzhen) Technology Co.,Ltd.

Address before: 518000 1001, building G3, TCL International e city, Shuguang community, Xili street, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: Shenzhen Aishen Yingtong Information Technology Co.,Ltd.