WO2019205605A1 - Method and apparatus for locating facial feature points - Google Patents

Method and apparatus for locating facial feature points

Info

Publication number
WO2019205605A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
face
feature line
fused
feature
Prior art date
Application number
PCT/CN2018/116779
Other languages
English (en)
French (fr)
Inventor
钱晨
吴文岩
Original Assignee
北京市商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Priority to SG11201912428TA priority Critical patent/SG11201912428TA/en
Priority to MYPI2019007719A priority patent/MY201922A/en
Priority to KR1020197037564A priority patent/KR102334279B1/ko
Priority to JP2019568632A priority patent/JP7042849B2/ja
Publication of WO2019205605A1 publication Critical patent/WO2019205605A1/zh
Priority to US16/720,124 priority patent/US11314965B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/165Detection; Localisation; Normalisation using facial parts and geometric relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/251Fusion techniques of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/803Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular, to a method and apparatus for locating facial feature points.
  • Face feature point location is an important category of face-related computer vision problems.
  • The task of face feature point location is to calculate the positions of several face feature points in a face image, for example the positions of the corners of the eyes, the corners of the mouth, and the tip of the nose.
  • the problem of face feature point location can be solved by deep neural networks.
  • the loss of facial structure information becomes severe.
  • When the face in the face image is severely occluded, is a large-angle side face, or has an exaggerated expression, the accuracy of face feature point location degrades seriously.
  • the present disclosure proposes a method and apparatus for locating facial feature points.
  • a method for locating a facial feature point including:
  • the face image is merged with the face feature line image to obtain position information of the face feature point.
  • before the face image and the face feature line image are merged, the method further includes:
  • the merging the face image with the face feature line image to obtain location information of the face feature point includes:
  • the face image is merged with the optimized face feature line image to obtain position information of the face feature point.
  • the performing edge detection on the face image to obtain the facial feature line image includes:
  • the feature line image is optimized to acquire the face feature line image.
  • the feature line feature extraction on the face image to obtain a feature line image includes:
  • the convolution, the residual operation, the downsampling, and the residual operation are sequentially performed on the face image to acquire the feature line image.
  • the optimizing the feature line image to obtain the facial feature line image includes:
  • each level of optimization network includes an hourglass-type network for implementing residual operations and an information transfer layer for implementing feature line information transfer.
  • the merging the face image with the face feature line image to obtain location information of a face feature point includes:
  • the second fused image is mapped to obtain a position vector of the feature point, and the position vector is used as position information of the face feature point.
  • before the at least one level of edge image fusion is performed on the first fused image and the facial feature line image, the method further includes:
  • optimizing the first fused image to obtain an optimized first fused image, wherein the optimization process includes convolution, downsampling, and a residual operation in sequence.
  • performing the input image fusion on the face image to obtain the first fused image includes:
  • performing the at least one level of edge image fusion on the first fused image and the facial feature line image to obtain the second fused image includes:
  • the method further includes: performing a residual operation on the result of each level boundary fusion.
  • the mapping the second fused image to obtain a position vector of the feature point includes:
  • the second fused image is sequentially subjected to a residual operation and a full connection operation to obtain a position vector of the feature point.
  • a positioning device for a face feature point including:
  • An edge detection module is configured to perform edge detection on the face image to obtain a facial feature line image
  • the fusion module is configured to fuse the face image with the face feature line image to obtain location information of the face feature point.
  • the device further includes:
  • a discriminating module configured to perform validity discrimination on the facial feature line image to obtain an optimized facial feature line image
  • the fusion module is used to:
  • the face image is merged with the optimized face feature line image to obtain position information of the face feature point.
  • the edge detection module includes:
  • a feature extraction sub-module configured to perform feature line feature extraction on the face image, and acquire a feature line image
  • a first optimization submodule configured to optimize the feature line image to obtain the facial feature line image.
  • the feature extraction submodule is used to:
  • the convolution, the residual operation, the downsampling, and the residual operation are sequentially performed on the face image to acquire the feature line image.
  • the first optimization submodule is used to:
  • each level of optimization network includes an hourglass-type network for implementing residual operations and an information transfer layer for implementing feature line information transfer.
  • the fusion module includes:
  • a first fusion sub-module configured to perform the input image fusion on the face image to obtain a first fused image
  • a second fusion sub-module configured to perform at least one edge image fusion on the first fused image and the facial feature line image to obtain a second fused image
  • mapping submodule configured to map the second fused image to obtain a position vector of the feature point, and use the position vector as position information of the facial feature point.
  • the fusion module further includes:
  • a second optimization sub-module configured to perform optimization processing on the first fused image to obtain an optimized first fused image, where the optimization processing includes convolution, downsampling, and residual operations in sequence.
  • the first fusion submodule includes:
  • a first multiplying unit configured to multiply the face image and each of the predefined feature line images pixel by pixel to obtain a plurality of boundary features corresponding to each of the predefined feature line images
  • a first superimposing unit configured to superimpose the plurality of the boundary features with the face image to obtain a first fused image.
  • the second fusion submodule includes:
  • a second superimposing unit configured to superimpose the first fused image and the facial feature line image to obtain a third fused image
  • a residual operation unit configured to perform a residual operation on the third fused image to obtain a fourth fused image having the same size as the face feature line image
  • a second multiplying unit configured to multiply the first fused image and the fourth fused image pixel by pixel to obtain a fifth fused image
  • a third superimposing unit configured to superimpose the first fused image and the fifth fused image to obtain the second fused image.
  • the fusion module further includes:
  • a residual operation sub-module configured to perform a residual operation on the result of each level of boundary fusion.
  • mapping submodule is used to:
  • the second fused image is sequentially subjected to a residual operation and a full connection operation to obtain a position vector of the feature point.
  • an electronic device comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the method described above.
  • a computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the above method.
  • the method and apparatus for locating facial feature points of the various aspects of the present disclosure obtain a facial feature line image by performing edge detection on a face image, and fuse the face image with the facial feature line image to obtain position information of the facial feature points. Because the facial feature lines are combined in locating the facial feature points, the accuracy of facial feature point location can be improved, and the facial feature points can still be located accurately in complex situations such as an occluded face, a large-angle side face, or an exaggerated facial expression.
  • FIG. 1 illustrates a flowchart of a method for locating a face feature point according to an embodiment of the present disclosure
  • FIG. 2 illustrates an exemplary flowchart of a method of locating a face feature point according to an embodiment of the present disclosure
  • FIG. 3 illustrates an exemplary flowchart of a step S11 of a method for locating a face feature point according to an embodiment of the present disclosure
  • FIG. 4 illustrates an exemplary flowchart of step S12 of a method for locating a face feature point according to an embodiment of the present disclosure
  • FIG. 5 illustrates an exemplary flowchart of step S121 of a method for locating a face feature point according to an embodiment of the present disclosure
  • FIG. 6 illustrates an exemplary flowchart of step S122 of a method for locating a facial feature point according to an embodiment of the present disclosure
  • FIG. 7 illustrates a block diagram of a positioning device for a face feature point according to an embodiment of the present disclosure
  • FIG. 8 illustrates an exemplary block diagram of a positioning device for a face feature point according to an embodiment of the present disclosure
  • FIG. 9 is a block diagram of an apparatus 800 for positioning of facial feature points, according to an exemplary embodiment
  • FIG. 10 is a block diagram of an apparatus 1900 for positioning of facial feature points, according to an exemplary embodiment.
  • FIG. 1 illustrates a flow chart of a method of locating a facial feature point in accordance with an embodiment of the present disclosure. As shown in FIG. 1, the method includes step S11 and step S12.
  • In step S11, edge detection is performed on the face image to acquire a face feature line image.
  • the face image may refer to an image including a face, or the face image may refer to an image in which face feature point positioning is required.
  • the edge detection may be performed by using a Sobel operator or a Canny operator in the related art, which is not limited herein.
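  • As an illustration of the Sobel option, the following is a minimal NumPy sketch and not the disclosed implementation. The function name `sobel_edges`, the edge-replicating padding, and the toy image are assumptions made for this example only.

```python
import numpy as np

def sobel_edges(gray):
    """Return the Sobel gradient magnitude of a 2-D grayscale image."""
    gray = np.asarray(gray, dtype=np.float64)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T  # vertical-gradient kernel
    h, w = gray.shape
    # Replicate border pixels so the output has the same size as the input.
    padded = np.pad(gray, 1, mode="edge")
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    # Accumulate the 3x3 weighted neighbourhood sums with shifted views.
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            gx += kx[dy, dx] * window
            gy += ky[dy, dx] * window
    return np.hypot(gx, gy)

# Toy image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
edges = sobel_edges(image)
```

  • In this toy case the response is non-zero only in the two columns next to the vertical intensity step, which is the kind of line structure a face feature line image is meant to capture.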
  • the facial image is edge-detected by a convolutional neural network to obtain a facial feature line image.
  • In step S12, the face image is merged with the face feature line image to obtain position information of the face feature point.
  • the face feature points may include one or more of a face contour feature point, an eyebrow feature point, an eye feature point, a nose feature point, and a lip feature point.
  • the eye feature points may include eyelid line feature points; the eyelid line feature points may include eye corner feature points; the nose feature points may include nose bridge feature points; and the lip feature points may include lip line feature points.
  • the face image is merged with the face feature line image by the feature point prediction network to obtain position information of the face feature point.
  • the fusion of the face image and the face feature line image may indicate that the information in the face image is combined with the information in the face feature line image.
  • the pixels and/or features in the face image are combined with the pixels and/or features in the face feature line image in some manner.
  • the face feature line image is obtained by performing edge detection on the face image, and the face image and the face feature line image are merged to obtain the position information of the face feature points. Because the face feature lines are combined in locating the face feature points, the accuracy of face feature point location can be improved, and face feature points can still be located accurately in complex cases where the face in the face image is occluded, is a large-angle side face, or has an exaggerated expression.
  • FIG. 2 illustrates an exemplary flowchart of a method of locating a face feature point according to an embodiment of the present disclosure. As shown in FIG. 2, the method may include steps S21 to S23.
  • In step S21, edge detection is performed on the face image to acquire a face feature line image.
  • For step S21, refer to the description of step S11 above.
  • In step S22, validity discrimination is performed on the face feature line image to obtain an optimized face feature line image.
  • a convolutional neural network based on a generative adversarial model is used to perform validity discrimination on the facial feature line image to obtain an optimized facial feature line image.
  • the discriminative model in the generative adversarial model can be used to perform validity discrimination on the facial feature line image, that is, to determine whether the facial feature line image is valid; the generative model in the generative adversarial model can be used to generate the optimized facial feature line image.
  • In step S23, the face image is merged with the optimized face feature line image to obtain position information of the face feature point.
  • the detection result of the face feature line image has a great influence on the accuracy of the final face feature point location. Therefore, by optimizing the face feature line image, an optimized face feature line image is obtained, and the face image is merged with the optimized face feature line image to obtain the position information of the face feature point. The quality of the face feature line image can be greatly improved, thereby further improving the accuracy of the face feature point location.
  • FIG. 3 illustrates an exemplary flowchart of the step S11 of the method for locating a face feature point according to an embodiment of the present disclosure.
  • step S11 may include step S111 and step S112.
  • In step S111, feature line feature extraction is performed on the face image to acquire a feature line image.
  • the feature lines may include one or more of a face contour feature line, a left eyebrow feature line, a right eyebrow feature line, a nose bridge feature line, a left eye upper eyelid feature line, a left eye lower eyelid feature line, a right eye upper eyelid feature line, a right eye lower eyelid feature line, an upper edge feature line of the upper lip, a lower edge feature line of the upper lip, an upper edge feature line of the lower lip, and a lower edge feature line of the lower lip.
  • a convolutional neural network is used to extract feature line features of a face image to acquire a feature line image.
  • ResNet18 can be used to perform feature line feature extraction on a face image to acquire a feature line image.
  • performing feature line feature extraction on the face image to acquire the feature line image includes: performing convolution, a residual operation, downsampling, and a residual operation on the face image in sequence to acquire the feature line image.
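  • The order of operations (convolution, residual operation, downsampling, residual operation) can be sketched as follows. The 3x3 averaging kernel, the form of the residual operation (input plus a rectified convolution of the input), and 2x2 average pooling as the downsampling step are illustrative assumptions; a trained network such as the ResNet18 mentioned above would use learned multi-channel filters instead.

```python
import numpy as np

def conv3x3(x, kernel):
    """Same-size 3x3 filtering (cross-correlation) with zero padding."""
    h, w = x.shape
    padded = np.pad(x, 1, mode="constant")
    out = np.zeros_like(x, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def residual_op(x, kernel):
    """Assumed residual operation: input plus a rectified filtering of the input."""
    return x + np.maximum(conv3x3(x, kernel), 0.0)

def downsample(x):
    """2x2 average pooling (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def extract_feature_lines(face, kernel):
    """Convolution, residual operation, downsampling, residual operation, in order."""
    x = conv3x3(face, kernel)
    x = residual_op(x, kernel)
    x = downsample(x)
    return residual_op(x, kernel)

rng = np.random.default_rng(0)
face = rng.random((8, 8))                # toy single-channel face image
kernel = np.full((3, 3), 1.0 / 9.0)      # placeholder for learned weights
feature_line_image = extract_feature_lines(face, kernel)
```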
  • In step S112, the feature line image is optimized to acquire a face feature line image.
  • optimizing the feature line image to obtain the facial feature line image includes: passing the feature line image through at least one level of optimization network to obtain the facial feature line image, where each level of optimization network includes an hourglass-type network for implementing residual operations and an information transfer layer for implementing feature line information transfer. For example, if one level of optimization network is included, the feature line image is processed in sequence by the hourglass-type network and the information transfer layer to obtain the facial feature line image. If two levels of optimization network are included, the feature line image is processed in sequence by the first hourglass-type network, the first information transfer layer, the second hourglass-type network, and the second information transfer layer to obtain the facial feature line image. In other embodiments, three or more levels of optimization network are chained in the same way.
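  • The chaining of optimization levels can be sketched as a simple loop. The function `refine_feature_lines` and the stand-in layers below are hypothetical; they only record the order in which the hourglass-type networks and information transfer layers would be applied, and do not implement those networks.

```python
def refine_feature_lines(feature_line_image, stages):
    """Apply each (hourglass, transfer) pair in order and return the result."""
    x = feature_line_image
    for hourglass, transfer in stages:
        x = transfer(hourglass(x))
    return x

# Hypothetical stand-ins that pass the input through and log their name.
calls = []

def make_layer(name):
    def layer(x):
        calls.append(name)
        return x
    return layer

# Two levels of optimization network.
two_level = [
    (make_layer("hourglass_1"), make_layer("transfer_1")),
    (make_layer("hourglass_2"), make_layer("transfer_2")),
]
result = refine_feature_lines("feature_line_image", two_level)
```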
  • FIG. 4 illustrates an exemplary flowchart of step S12 of a method for locating a face feature point according to an embodiment of the present disclosure. As shown in FIG. 4, step S12 may include steps S121 to S123.
  • In step S121, input image fusion is performed on the face image to obtain a first fused image.
  • the first fused image may embody boundary features of each feature line in the face image.
  • In step S122, at least one level of edge image fusion is performed on the first fused image and the facial feature line image to obtain a second fused image.
  • In step S123, the second fused image is mapped to obtain a position vector of the feature points, and the position vector is used as the position information of the face feature points.
  • mapping the second fused image to obtain a position vector of the feature points includes: passing the second fused image through a residual operation and a fully connected operation in sequence to obtain the position vector of the feature points.
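  • A minimal sketch of this mapping, under stated assumptions: the residual operation is modelled as the input plus a rectified linear branch, the weights are random placeholders rather than trained parameters, and the output vector is assumed to be ordered as (x1, y1, ..., xN, yN). The count of 106 points follows the number of labeled points the description gives as an example.

```python
import numpy as np

NUM_POINTS = 106  # example count of face feature points from the description

def map_to_position_vector(fused, w_res, w_fc, b_fc):
    """Map a fused feature map to a flat vector of 2 * NUM_POINTS coordinates."""
    flat = fused.reshape(-1)
    # Residual operation (assumed form): input plus a rectified linear branch.
    hidden = flat + np.maximum(w_res @ flat, 0.0)
    # Fully connected operation: one output value per coordinate.
    return w_fc @ hidden + b_fc

rng = np.random.default_rng(0)
fused = rng.random((4, 8, 8))                     # toy second fused image (C, H, W)
d = fused.size
w_res = rng.normal(scale=0.01, size=(d, d))       # placeholder weights
w_fc = rng.normal(scale=0.01, size=(2 * NUM_POINTS, d))
b_fc = np.zeros(2 * NUM_POINTS)

position_vector = map_to_position_vector(fused, w_res, w_fc, b_fc)
points = position_vector.reshape(NUM_POINTS, 2)   # one (x, y) row per feature point
```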
  • before the at least one level of edge image fusion is performed on the first fused image and the facial feature line image, the method further includes: optimizing the first fused image to obtain an optimized first fused image.
  • the optimization process includes convolution, downsampling, and a residual operation in sequence.
  • the method further includes: performing a residual operation on the result of each level of boundary fusion.
  • FIG. 5 illustrates an exemplary flowchart of step S121 of a method for locating a face feature point according to an embodiment of the present disclosure.
  • step S121 may include step S1211 and step S1212.
  • In step S1211, the face image is multiplied pixel by pixel with each of the predefined feature line images to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images.
  • In step S1212, the plurality of boundary features are superimposed on the face image to obtain a first fused image.
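  • the first fused image $F$ can be obtained by using Equation 1:
  • $F = I \oplus (M_1 \otimes I) \oplus \cdots \oplus (M_K \otimes I)$ (Equation 1)
  • where $I$ represents the face image, $M_i$ represents the $i$-th predefined feature line image, $K$ represents the number of predefined feature line images, $\otimes$ denotes pixel-by-pixel multiplication, and $\oplus$ denotes superposition.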
  • this implementation obtains a plurality of boundary features in one-to-one correspondence with the predefined feature line images by multiplying the face image pixel by pixel with each predefined feature line image, and superimposes the plurality of boundary features on the face image to obtain the first fused image. The first fused image thus obtained attends only to the structurally rich and characteristic parts of the face image and ignores the background and the structurally poor parts, which can greatly improve the effectiveness of the first fused image as the input of the subsequent network.
  • This implementation also takes into account the original face image so that valuable feature information in the face image can be used for subsequent feature point prediction.
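  • Steps S1211 and S1212 can be sketched in NumPy as follows. This is an illustration under one assumption that the text does not settle: superposition is implemented here as concatenation along the channel axis. The function name `input_image_fusion` and the toy arrays are likewise made up for the example.

```python
import numpy as np

def input_image_fusion(face, feature_lines):
    """Fuse a face image (C, H, W) with K predefined feature line images (K, H, W)."""
    # Step S1211: pixel-by-pixel multiplication, one boundary feature per line image.
    boundary_features = [face * line[np.newaxis, :, :] for line in feature_lines]
    # Step S1212: superposition, assumed here to be channel-wise concatenation.
    return np.concatenate([face] + boundary_features, axis=0)

face = np.full((3, 4, 4), 2.0)   # toy 3-channel face image I
lines = np.zeros((2, 4, 4))      # K = 2 toy feature line images M_1, M_2
lines[0, 1, :] = 1.0             # a horizontal feature line
lines[1, :, 2] = 1.0             # a vertical feature line
first_fused = input_image_fusion(face, lines)
```

  • With this choice the original face image survives unchanged in the first channels of the result, which matches the remark that the original face image is also taken into account.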
  • the method further includes: for any training image in a training image set, labeling face feature points in the training image; interpolating the face feature points in the training image to obtain face feature line information of the training image; and training a convolutional neural network for acquiring predefined feature line images according to each training image in the training image set and the face feature line information of each training image.
  • the training image set may include a plurality of training images, and 106 face feature points may be labeled in each training image.
  • interpolation may be performed between adjacent face feature points in the training image, and the interpolated curve may be used as a face feature line in the training image.
  • in this implementation, for any training image in the training image set, the face feature points in the training image are labeled and interpolated to obtain the face feature line information of the training image, and the convolutional neural network for acquiring predefined feature line images is trained according to each training image in the training image set and the face feature line information of each training image. The face feature lines interpolated from the labeled face feature points thus supervise the training of the convolutional neural network that acquires the predefined feature line images.
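  • The interpolation of labeled feature points into a feature line can be sketched as follows. The text does not specify the interpolation scheme, so piecewise-linear interpolation is an assumption, as are the function names, the sampling density, and the binary rasterization used to turn the curve into a feature line image.

```python
import numpy as np

def interpolate_feature_line(points, samples_per_segment=10):
    """Piecewise-linear interpolation between adjacent (x, y) feature points."""
    points = np.asarray(points, dtype=np.float64)
    t = np.linspace(0.0, 1.0, samples_per_segment, endpoint=False)[:, np.newaxis]
    segments = [p + t * (q - p) for p, q in zip(points[:-1], points[1:])]
    segments.append(points[-1:])          # include the final feature point
    return np.vstack(segments)

def rasterize(line, shape):
    """Mark every pixel touched by the interpolated line in a binary image."""
    mask = np.zeros(shape)
    cols = np.clip(np.rint(line[:, 0]).astype(int), 0, shape[1] - 1)
    rows = np.clip(np.rint(line[:, 1]).astype(int), 0, shape[0] - 1)
    mask[rows, cols] = 1.0
    return mask

# Three toy labeled points forming an L-shaped feature line, as (x, y).
labeled = [(0, 0), (4, 0), (4, 4)]
line = interpolate_feature_line(labeled)
feature_line_image = rasterize(line, (5, 5))
```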
  • FIG. 6 illustrates an exemplary flowchart of step S122 of a method for locating a facial feature point according to an embodiment of the present disclosure. As shown in FIG. 6, step S122 may include steps S1221 through S1224.
  • In step S1221, the first fused image and the facial feature line image are superimposed to obtain a third fused image.
  • In step S1222, the third fused image is subjected to a residual operation to obtain a fourth fused image having the same size as the face feature line image.
  • In step S1223, the first fused image and the fourth fused image are multiplied pixel by pixel to obtain a fifth fused image.
  • In step S1224, the first fused image and the fifth fused image are superimposed to obtain a second fused image.
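  • the second fused image $H$ can be obtained by using Equation 2:
  • $H = F \oplus (F \otimes T(M \oplus F))$ (Equation 2)
  • where $F$ represents the first fused image, $M$ represents the facial feature line image, $M \oplus F$ represents the superposition of the first fused image and the facial feature line image, that is, the third fused image, and $T(M \oplus F)$ represents the residual operation performed on the third fused image, that is, the fourth fused image.
  • the conversion structure $T$ can adopt an hourglass-type network. $F \otimes T(M \oplus F)$ represents the pixel-by-pixel multiplication of the first fused image $F$ and the fourth fused image $T(M \oplus F)$, that is, the fifth fused image. $F \oplus (F \otimes T(M \oplus F))$ represents the superposition of the first fused image $F$ and the fifth fused image.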
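  • Steps S1221 to S1224 can be sketched in NumPy as follows. Two assumptions are made for illustration: superposition is implemented as channel-wise concatenation, and the conversion structure T is replaced by a hypothetical `toy_transform` (a sigmoid of the channel mean) rather than an hourglass-type network.

```python
import numpy as np

def edge_image_fusion(first_fused, feature_line_image, transform):
    """One level of edge image fusion on (C, H, W) and (K, H, W) inputs."""
    # S1221: superimpose the first fused image and the feature line image.
    third = np.concatenate([first_fused, feature_line_image], axis=0)
    # S1222: the conversion structure T yields the fourth fused image.
    fourth = transform(third)
    # S1223: pixel-by-pixel multiplication gives the fifth fused image.
    fifth = first_fused * fourth
    # S1224: superimpose the first and fifth fused images.
    return np.concatenate([first_fused, fifth], axis=0)

def toy_transform(x):
    """Hypothetical stand-in for T: sigmoid of the channel mean, shape (1, H, W)."""
    return 1.0 / (1.0 + np.exp(-x.mean(axis=0, keepdims=True)))

F = np.full((2, 4, 4), 2.0)      # toy first fused image
M = np.zeros((1, 4, 4))          # toy facial feature line image
M[0, 1, :] = 1.0
H = edge_image_fusion(F, M, toy_transform)
```

  • Because the stand-in for T outputs values between 0 and 1, the fifth fused image acts as a re-weighted copy of the first fused image in this sketch.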
  • the method further includes: training the feature point prediction network by using each training image in the training image set and the face feature line information of each training image as the input of the feature point prediction network, and using the position information of the face feature points in each training image as the output of the feature point prediction network.
  • the number of face feature points in each training image may be 106.
  • in this implementation, the feature point prediction network is trained with each training image in the training image set and the face feature line information of each training image as its input and the position information of the face feature points in each training image as its output, so that the face feature line information is fused and the face feature points in the face image supervise the training. Because the face feature line information is fused, the trained feature point prediction network can locate face feature points with higher precision.
  • FIG. 7 illustrates a block diagram of a positioning device for a face feature point according to an embodiment of the present disclosure.
  • the device includes: an edge detection module 71 configured to perform edge detection on a face image to acquire a face feature line image; and a fusion module 72 configured to fuse the face image with the face feature line image to obtain the position information of the face feature points.
  • FIG. 8 illustrates an exemplary block diagram of a positioning device for a face feature point according to an embodiment of the present disclosure. As shown in Figure 8:
  • the device further includes: a discriminating module 73 configured to perform validity discrimination on the facial feature line image to obtain an optimized facial feature line image; the fusion module 72 is configured to fuse the face image with the optimized face feature line image to obtain the position information of the face feature points.
  • the edge detection module 71 includes: a feature extraction sub-module 711, configured to perform feature line feature extraction on the face image, and acquire a feature line image; and a first optimization sub-module 712 for the feature line The image is optimized to obtain a facial feature line image.
  • the feature extraction sub-module 711 is configured to: perform operations of convolution, residual operation, downsampling, and residual operation on the face image in sequence to acquire the feature line image.
  • In a possible implementation, the first optimization sub-module 712 is configured to pass the feature line image through at least one level of optimization network to acquire the face feature line image, where each level of optimization network includes an hourglass network for implementing a residual operation and an information transfer layer for implementing feature line information transfer.
  • In a possible implementation, the fusion module 72 includes: a first fusion sub-module 721 configured to perform input image fusion on the face image to obtain a first fused image; a second fusion sub-module 722 configured to perform at least one level of edge image fusion on the first fused image and the face feature line image to obtain a second fused image; and a mapping sub-module 723 configured to map the second fused image to obtain a position vector of the feature points and to use the position vector as the position information of the face feature points.
  • In a possible implementation, the fusion module 72 further includes a second optimization sub-module 724 configured to perform optimization processing on the first fused image to obtain an optimized first fused image, where the optimization processing includes convolution, downsampling and a residual operation in sequence.
  • In a possible implementation, the first fusion sub-module 721 includes: a first multiplication unit configured to multiply the face image pixel by pixel with each predefined feature line image to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images; and a first superimposing unit configured to superimpose the plurality of boundary features on the face image to obtain the first fused image.
  • In a possible implementation, the second fusion sub-module 722 includes: a second superimposing unit configured to superimpose the first fused image and the face feature line image to obtain a third fused image; a residual operation unit configured to perform a residual operation on the third fused image to obtain a fourth fused image of the same size as the face feature line image; a second multiplication unit configured to multiply the first fused image and the fourth fused image pixel by pixel to obtain a fifth fused image; and a third superimposing unit configured to superimpose the first fused image and the fifth fused image to obtain the second fused image.
  • the fusion module 72 further includes: a residual operation sub-module 725, configured to perform a residual operation on the result of each level of boundary fusion.
  • In a possible implementation, the mapping sub-module 723 is configured to pass the second fused image through a residual operation and a fully connected operation in sequence to obtain the position vector of the feature points.
  • By performing edge detection on a face image to acquire a face feature line image, and fusing the face image with the face feature line image to obtain position information of the face feature points, this embodiment positions face feature points in combination with face feature lines. This improves positioning accuracy, so that face feature points can still be located accurately in complex cases such as when the face in the face image is occluded, the face is a profile at a large angle, or the facial expression is exaggerated.
  • FIG. 9 is a block diagram of an apparatus 800 for positioning of facial feature points, according to an exemplary embodiment.
  • device 800 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • Referring to FIG. 9, device 800 can include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
  • Processing component 802 typically controls the overall operation of device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 802 can include one or more processors 820 to execute instructions to perform all or part of the steps of the above described methods.
  • processing component 802 can include one or more modules to facilitate interaction between component 802 and other components.
  • processing component 802 can include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
  • Memory 804 is configured to store various types of data to support operation at device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phone book data, messages, pictures, videos, and the like.
  • The memory 804 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable programmable read only memory (EPROM), programmable read only memory (PROM), read only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.
  • Power component 806 provides power to various components of device 800.
  • Power component 806 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 800.
  • the multimedia component 808 includes a screen between the device 800 and the user that provides an output interface.
  • the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 810 is configured to output and/or input an audio signal.
  • the audio component 810 includes a microphone (MIC) that is configured to receive an external audio signal when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
  • the received audio signal may be further stored in memory 804 or transmitted via communication component 816.
  • the audio component 810 also includes a speaker for outputting an audio signal.
  • the I/O interface 812 provides an interface between the processing component 802 and the peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor assembly 814 includes one or more sensors for providing device 800 with a status assessment of various aspects.
  • For example, sensor component 814 can detect an open/closed state of device 800 and the relative positioning of components, such as the display and keypad of device 800. Sensor component 814 can also detect a change in position of device 800 or of one component of device 800, the presence or absence of user contact with device 800, the orientation or acceleration/deceleration of device 800, and temperature changes of device 800.
  • Sensor assembly 814 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 814 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 816 is configured to facilitate wired or wireless communication between device 800 and other devices.
  • the device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • communication component 816 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • the communication component 816 also includes a near field communication (NFC) module to facilitate short range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • In an exemplary embodiment, device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
  • In an exemplary embodiment, a non-volatile computer readable storage medium is also provided, such as a memory 804 including computer program instructions that are executable by processor 820 of apparatus 800 to perform the above method.
  • FIG. 10 is a block diagram of an apparatus 1900 for positioning of facial feature points, according to an exemplary embodiment.
  • device 1900 can be provided as a server.
  • apparatus 1900 includes a processing component 1922 that further includes one or more processors, and memory resources represented by memory 1932 for storing instructions executable by processing component 1922, such as an application.
  • An application stored in memory 1932 can include one or more modules each corresponding to a set of instructions.
  • processing component 1922 is configured to execute instructions to perform the methods described above.
  • Apparatus 1900 can also include a power supply component 1926 configured to perform power management of apparatus 1900, a wired or wireless network interface 1950 configured to connect apparatus 1900 to the network, and an input/output (I/O) interface 1958.
  • Device 1900 can operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
  • In an exemplary embodiment, a non-volatile computer readable storage medium is also provided, such as a memory 1932 including computer program instructions that are executable by processing component 1922 of apparatus 1900 to perform the above method.
  • the present disclosure can be a system, method, and/or computer program product.
  • the computer program product can comprise a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
  • the computer readable storage medium can be a tangible device that can hold and store the instructions used by the instruction execution device.
  • the computer readable storage medium can be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • More specific examples (a non-exhaustive list) of computer readable storage media include: portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanical encoding devices such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the above.
  • A computer readable storage medium as used herein is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
  • the computer readable program instructions described herein can be downloaded from a computer readable storage medium to various computing/processing devices or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers, and/or edge servers.
  • A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards them for storage in a computer readable storage medium in that computing/processing device.
  • Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object oriented programming languages such as Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • Where a remote computer is involved, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or wide area network (WAN), or can be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is customized using state information of the computer readable program instructions, and the electronic circuit can execute the computer readable program instructions to implement various aspects of the present disclosure.
  • The computer readable program instructions can be provided to a processor of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • The computer readable program instructions can also be stored in a computer readable storage medium and cause a computer, programmable data processing apparatus, and/or other device to operate in a particular manner, such that the computer readable medium storing the instructions includes an article of manufacture that includes instructions for implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • The computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device so that a series of operational steps is performed on it to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • Each block in the flowchart or block diagram can represent a module, a program segment, or a portion of an instruction that contains one or more executable instructions for implementing the specified logical functions.
  • In some alternative implementations, the functions noted in the blocks can also occur in a different order than that illustrated in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Computing Systems (AREA)
  • Geometry (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

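A method and device for positioning face feature points. The method includes: performing edge detection on a face image to acquire a face feature line image (S11); and fusing the face image with the face feature line image to obtain position information of the face feature points (S12). Positioning face feature points in combination with face feature lines improves positioning accuracy, so that face feature points can still be located accurately in complex cases such as when the face in the face image is occluded, the face is a profile at a large angle, or the facial expression is exaggerated.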

Description

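Method and Device for Positioning Face Feature Points
This application claims priority to Chinese patent application No. 201810373871.6, entitled "Method and Device for Positioning Face Feature Points" and filed with the Chinese Patent Office on April 24, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer vision technology, and in particular to a method and device for positioning face feature points.
Background
Face feature point positioning is an important class of face-related computer vision problems. Its task is to compute the positions of a number of face feature points in a face image, for example the positions of the eye corners, mouth corners and nose tip.
The problem of face feature point positioning can be solved with deep neural networks. However, as the number of layers of a deep neural network increases, the loss of face structure information becomes severe. In complex cases, such as when the face in the face image is heavily occluded, the face is a profile at a large angle, or the facial expression is exaggerated, the accuracy of face feature point positioning drops sharply.
Summary
In view of this, the present disclosure proposes a method and device for positioning face feature points.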
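According to one aspect of the present disclosure, a method for positioning face feature points is provided, including:
performing edge detection on a face image to acquire a face feature line image; and
fusing the face image with the face feature line image to obtain position information of the face feature points.
In a possible implementation, before the fusing of the face image with the face feature line image, the method further includes:
performing validity discrimination on the face feature line image to obtain an optimized face feature line image;
and the fusing of the face image with the face feature line image to obtain position information of the face feature points includes:
fusing the face image with the optimized face feature line image to obtain the position information of the face feature points.
In a possible implementation, the performing of edge detection on the face image to acquire the face feature line image includes:
performing feature line feature extraction on the face image to acquire a feature line image; and
optimizing the feature line image to acquire the face feature line image.
In a possible implementation, the performing of feature line feature extraction on the face image to acquire the feature line image includes:
performing convolution, a residual operation, downsampling and a residual operation on the face image in sequence to acquire the feature line image.
In a possible implementation, the optimizing of the feature line image to acquire the face feature line image includes:
passing the feature line image through at least one level of optimization network to acquire the face feature line image, where each level of optimization network includes an hourglass network for implementing a residual operation and an information transfer layer for implementing feature line information transfer.
In a possible implementation, the fusing of the face image with the face feature line image to obtain position information of the face feature points includes:
performing input image fusion on the face image to obtain a first fused image;
performing at least one level of edge image fusion on the first fused image and the face feature line image to obtain a second fused image; and
mapping the second fused image to obtain a position vector of the feature points, and using the position vector as the position information of the face feature points.
In a possible implementation, before the at least one level of edge image fusion is performed on the first fused image and the face feature line image, the method further includes:
performing optimization processing on the first fused image to obtain an optimized first fused image, where the optimization processing includes convolution, downsampling and a residual operation in sequence.
In a possible implementation, the performing of input image fusion on the face image to obtain the first fused image includes:
multiplying the face image pixel by pixel with each predefined feature line image to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images; and
superimposing the plurality of boundary features on the face image to obtain the first fused image.
In a possible implementation, the performing of at least one level of edge image fusion on the first fused image and the face feature line image to obtain the second fused image includes:
superimposing the first fused image and the face feature line image to obtain a third fused image;
performing a residual operation on the third fused image to obtain a fourth fused image of the same size as the face feature line image;
multiplying the first fused image and the fourth fused image pixel by pixel to obtain a fifth fused image; and
superimposing the first fused image and the fifth fused image to obtain the second fused image.
In a possible implementation, between levels of the boundary image fusion, the method further includes: performing a residual operation on the result of each level of boundary fusion.
In a possible implementation, the mapping of the second fused image to obtain the position vector of the feature points includes:
passing the second fused image through a residual operation and a fully connected operation in sequence to obtain the position vector of the feature points.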
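According to another aspect of the present disclosure, a device for positioning face feature points is provided, including:
an edge detection module configured to perform edge detection on a face image to acquire a face feature line image; and
a fusion module configured to fuse the face image with the face feature line image to obtain position information of the face feature points.
In a possible implementation, the device further includes:
a discriminating module configured to perform validity discrimination on the face feature line image to obtain an optimized face feature line image;
and the fusion module is configured to:
fuse the face image with the optimized face feature line image to obtain the position information of the face feature points.
In a possible implementation, the edge detection module includes:
a feature extraction sub-module configured to perform feature line feature extraction on the face image to acquire a feature line image; and
a first optimization sub-module configured to optimize the feature line image to acquire the face feature line image.
In a possible implementation, the feature extraction sub-module is configured to:
perform convolution, a residual operation, downsampling and a residual operation on the face image in sequence to acquire the feature line image.
In a possible implementation, the first optimization sub-module is configured to:
pass the feature line image through at least one level of optimization network to acquire the face feature line image, where each level of optimization network includes an hourglass network for implementing a residual operation and an information transfer layer for implementing feature line information transfer.
In a possible implementation, the fusion module includes:
a first fusion sub-module configured to perform input image fusion on the face image to obtain a first fused image;
a second fusion sub-module configured to perform at least one level of edge image fusion on the first fused image and the face feature line image to obtain a second fused image; and
a mapping sub-module configured to map the second fused image to obtain a position vector of the feature points and to use the position vector as the position information of the face feature points.
In a possible implementation, the fusion module further includes:
a second optimization sub-module configured to perform optimization processing on the first fused image to obtain an optimized first fused image, where the optimization processing includes convolution, downsampling and a residual operation in sequence.
In a possible implementation, the first fusion sub-module includes:
a first multiplication unit configured to multiply the face image pixel by pixel with each predefined feature line image to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images; and
a first superimposing unit configured to superimpose the plurality of boundary features on the face image to obtain the first fused image.
In a possible implementation, the second fusion sub-module includes:
a second superimposing unit configured to superimpose the first fused image and the face feature line image to obtain a third fused image;
a residual operation unit configured to perform a residual operation on the third fused image to obtain a fourth fused image of the same size as the face feature line image;
a second multiplication unit configured to multiply the first fused image and the fourth fused image pixel by pixel to obtain a fifth fused image; and
a third superimposing unit configured to superimpose the first fused image and the fifth fused image to obtain the second fused image.
In a possible implementation, the fusion module further includes:
a residual operation sub-module configured to perform a residual operation on the result of each level of boundary fusion.
In a possible implementation, the mapping sub-module is configured to:
pass the second fused image through a residual operation and a fully connected operation in sequence to obtain the position vector of the feature points.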
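According to another aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to perform the above method.
According to another aspect of the present disclosure, a computer readable storage medium is provided, on which computer program instructions are stored, where the computer program instructions implement the above method when executed by a processor.
The method and device for positioning face feature points according to the aspects of the present disclosure perform edge detection on a face image to acquire a face feature line image, and fuse the face image with the face feature line image to obtain position information of the face feature points. Positioning face feature points in combination with face feature lines in this way improves positioning accuracy, so that face feature points can still be located accurately in complex cases such as when the face in the face image is occluded, the face is a profile at a large angle, or the facial expression is exaggerated.
Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.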
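Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features and aspects of the present disclosure together with the specification, and serve to explain the principles of the present disclosure.
FIG. 1 shows a flowchart of a method for positioning face feature points according to an embodiment of the present disclosure;
FIG. 2 shows an exemplary flowchart of a method for positioning face feature points according to an embodiment of the present disclosure;
FIG. 3 shows an exemplary flowchart of step S11 of a method for positioning face feature points according to an embodiment of the present disclosure;
FIG. 4 shows an exemplary flowchart of step S12 of a method for positioning face feature points according to an embodiment of the present disclosure;
FIG. 5 shows an exemplary flowchart of step S121 of a method for positioning face feature points according to an embodiment of the present disclosure;
FIG. 6 shows an exemplary flowchart of step S122 of a method for positioning face feature points according to an embodiment of the present disclosure;
FIG. 7 shows a block diagram of a device for positioning face feature points according to an embodiment of the present disclosure;
FIG. 8 shows an exemplary block diagram of a device for positioning face feature points according to an embodiment of the present disclosure;
FIG. 9 is a block diagram of an apparatus 800 for positioning face feature points according to an exemplary embodiment;
FIG. 10 is a block diagram of an apparatus 1900 for positioning face feature points according to an exemplary embodiment.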
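Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as preferred over or better than other embodiments.
In addition, numerous specific details are given in the detailed description below in order to better explain the present disclosure. Those skilled in the art will understand that the present disclosure can be practiced without some of these details. In some instances, methods, means, elements and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present disclosure.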
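FIG. 1 shows a flowchart of a method for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 1, the method includes step S11 and step S12.
In step S11, edge detection is performed on a face image to acquire a face feature line image.
In this embodiment, a face image may refer to an image containing a face, or to an image on which face feature point positioning needs to be performed.
The embodiments of the present disclosure may perform edge detection using, for example, the Sobel operator or the Canny operator from the related art, which is not limited here.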
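As a minimal sketch of the classical option mentioned above, the following applies a Sobel operator to a grayscale image using NumPy. The function name `sobel_edges`, the edge-replicating padding and the `threshold` parameter are illustrative assumptions and are not taken from the disclosure, which prefers a learned (convolutional) detector in the implementations that follow.

```python
import numpy as np


def sobel_edges(gray, threshold=1.0):
    """Return (gradient magnitude, boolean edge map) for a 2-D grayscale array.

    Hypothetical helper for illustration; padding and threshold are assumptions.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # Standard 3x3 Sobel kernels for horizontal and vertical gradients.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    padded = np.pad(gray, 1, mode="edge")  # replicate border pixels
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Accumulate the 3x3 cross-correlation one kernel tap at a time.
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            gx += kx[dy, dx] * window
            gy += ky[dy, dx] * window
    magnitude = np.hypot(gx, gy)
    return magnitude, magnitude > threshold
```

A vertical step edge in the input yields a two-pixel-wide response at the step and no response in flat regions.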
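In a possible implementation, edge detection is performed on the face image by a convolutional neural network to acquire the face feature line image.
In step S12, the face image is fused with the face feature line image to obtain position information of the face feature points.
In this embodiment, the robust face structure information provided by the face feature line image makes it possible to locate face feature points in the face image precisely.
In this embodiment, the face feature points may include one or more of face contour feature points, eyebrow feature points, eye feature points, nose feature points, lip feature points and the like. The eye feature points may include eyelid line feature points; the eyelid line feature points may include eye corner feature points; the nose feature points may include nose bridge feature points; and the lip feature points may include lip line feature points.
In a possible implementation, the face image is fused with the face feature line image by a feature point prediction network to obtain the position information of the face feature points.
In the embodiments of the present disclosure, fusing the face image with the face feature line image may mean combining the information in the face image with the information in the face feature line image. For example, it may mean combining pixels and/or features in the face image with pixels and/or features in the face feature line image in some manner.
By performing edge detection on a face image to acquire a face feature line image, and fusing the face image with the face feature line image to obtain position information of the face feature points, this embodiment positions face feature points in combination with face feature lines. This improves positioning accuracy, so that face feature points can still be located accurately in complex cases such as when the face in the face image is occluded, the face is a profile at a large angle, or the facial expression is exaggerated.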
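FIG. 2 shows an exemplary flowchart of a method for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 2, the method may include steps S21 to S23.
In step S21, edge detection is performed on a face image to acquire a face feature line image.
For step S21, see the description of step S11 above.
In step S22, validity discrimination is performed on the face feature line image to obtain an optimized face feature line image.
In a possible implementation, a convolutional neural network based on a generative adversarial model is used to perform validity discrimination on the face feature line image to obtain the optimized face feature line image. In this implementation, the discriminative model of the generative adversarial model may be used to perform validity discrimination on the face feature line image, that is, to judge whether the face feature line image is valid; the generative model of the generative adversarial model may be used to generate the optimized face feature line image.
In step S23, the face image is fused with the optimized face feature line image to obtain position information of the face feature points.
In this embodiment, the detection result of the face feature line image has a large influence on the accuracy of the final face feature point positioning. Therefore, performing validity discrimination on the face feature line image to obtain an optimized face feature line image, and fusing the face image with the optimized face feature line image to obtain the position information of the face feature points, can greatly improve the quality of the face feature line image and thereby further improve the accuracy of face feature point positioning.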
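FIG. 3 shows an exemplary flowchart of step S11 of a method for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 3, step S11 may include step S111 and step S112.
In step S111, feature line feature extraction is performed on the face image to acquire a feature line image.
In this embodiment, the feature lines may include one or more of a face contour feature line, a left eyebrow feature line, a right eyebrow feature line, a nose bridge feature line, a left eye upper eyelid feature line, a left eye lower eyelid feature line, a right eye upper eyelid feature line, a right eye lower eyelid feature line, an upper edge feature line of the upper lip, a lower edge feature line of the upper lip, an upper edge feature line of the lower lip, a lower edge feature line of the lower lip, and the like.
In a possible implementation, a convolutional neural network is used to perform feature line feature extraction on the face image to acquire the feature line image. For example, ResNet18 may be used to perform feature line feature extraction on the face image to acquire the feature line image.
In a possible implementation, performing feature line feature extraction on the face image to acquire the feature line image includes: performing convolution, a residual operation, downsampling and a residual operation on the face image in sequence to acquire the feature line image.
In step S112, the feature line image is optimized to acquire the face feature line image.
In a possible implementation, optimizing the feature line image to acquire the face feature line image includes: passing the feature line image through at least one level of optimization network to acquire the face feature line image, where each level of optimization network includes an hourglass network for implementing a residual operation and an information transfer layer for implementing feature line information transfer. For example, if one level of optimization network is included, the feature line image is optimized by passing it through the hourglass network and the information transfer layer in sequence to acquire the face feature line image. If two levels of optimization network are included, the feature line image is optimized by passing it through a first hourglass network, a first information transfer layer, a second hourglass network and a second information transfer layer in sequence to acquire the face feature line image. In other embodiments, if three or more levels of optimization network are included, the processing proceeds in the same manner.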
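The ordering of the cascaded optimization levels described above can be sketched as a simple loop. This is only an illustration of the data flow: `refine_feature_lines` is a hypothetical name, and each stage is represented by a pair of arbitrary callables standing in for the hourglass network and the information transfer layer, whose internals the text does not specify.

```python
import numpy as np


def refine_feature_lines(feature_line_image, stages):
    """Pass a feature line image through one or more optimization levels.

    stages: sequence of (hourglass, information_transfer) callables.
    Both callables are placeholders (assumptions) for trained sub-networks.
    """
    if len(stages) < 1:
        raise ValueError("at least one optimization level is required")
    refined = feature_line_image
    for hourglass, information_transfer in stages:
        # Each level applies the hourglass network first, then the
        # information transfer layer, as in the one- and two-level examples.
        refined = information_transfer(hourglass(refined))
    return refined
```

With two levels, the call order is first hourglass, first information transfer layer, second hourglass, second information transfer layer.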
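FIG. 4 shows an exemplary flowchart of step S12 of a method for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 4, step S12 may include steps S121 to S123.
In step S121, input image fusion is performed on the face image to obtain a first fused image.
In this embodiment, the first fused image may reflect the boundary features of each feature line in the face image.
In step S122, at least one level of edge image fusion is performed on the first fused image and the face feature line image to obtain a second fused image.
In step S123, the second fused image is mapped to obtain a position vector of the feature points, and the position vector is used as the position information of the face feature points.
In a possible implementation, mapping the second fused image to obtain the position vector of the feature points includes: passing the second fused image through a residual operation and a fully connected operation in sequence to obtain the position vector of the feature points.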
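The mapping step can be sketched as a residual operation followed by a fully connected layer that outputs one coordinate pair per feature point. The sketch below is a toy under stated assumptions: the residual operation is reduced to a 1x1 channel mixing with a ReLU and a skip connection, the weights are passed in explicitly rather than trained, and the interleaved layout (x1, y1, x2, y2, ...) of the position vector is an assumption, since the text does not define it. The names `residual_block` and `map_to_positions` are hypothetical.

```python
import numpy as np


def residual_block(x, weights):
    """Toy residual operation: x + relu(1x1 channel mixing of x).

    x: (C, H, W); weights: (C, C). A stand-in for a trained residual unit.
    """
    mixed = np.einsum("oc,chw->ohw", weights, x)
    return x + np.maximum(mixed, 0.0)


def map_to_positions(second_fused, res_weights, fc_weights, fc_bias):
    """Map a fused feature map to a position vector of length 2N.

    fc_weights: (2N, C*H*W); fc_bias: (2N,).
    Returns the flat vector and an (N, 2) view of assumed (x, y) pairs.
    """
    features = residual_block(second_fused, res_weights)
    position_vector = fc_weights @ features.reshape(-1) + fc_bias
    return position_vector, position_vector.reshape(-1, 2)
```

For 106 feature points, as used for the training images later in the description, the fully connected layer would have 212 outputs under this layout.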
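In a possible implementation, before the at least one level of edge image fusion is performed on the first fused image and the face feature line image, the method further includes: performing optimization processing on the first fused image to obtain an optimized first fused image, where the optimization processing includes convolution, downsampling and a residual operation in sequence.
In a possible implementation, between levels of the boundary image fusion, the method further includes: performing a residual operation on the result of each level of boundary fusion.
FIG. 5 shows an exemplary flowchart of step S121 of a method for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 5, step S121 may include step S1211 and step S1212.
In step S1211, the face image is multiplied pixel by pixel with each predefined feature line image to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images.
In step S1212, the plurality of boundary features are superimposed on the face image to obtain the first fused image.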
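In a possible implementation, the first fused image F can be obtained using Formula 1:
$F = I \oplus (M_1 \otimes I) \oplus \cdots \oplus (M_K \otimes I)$ (Formula 1)
where $I$ denotes the face image, $M_i$ denotes the i-th predefined feature line image, and $K$ denotes the number of predefined feature line images; $M_i \otimes I$ denotes multiplying $M_i$ and $I$ pixel by pixel, and $\oplus$ denotes the superimposition operation.
This implementation multiplies the face image pixel by pixel with each predefined feature line image to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images, and superimposes the plurality of boundary features on the face image to obtain the first fused image. The resulting first fused image attends only to the structurally rich parts and the feature parts of the face image and ignores the background and the structurally poor parts, which greatly improves its effectiveness as the input to the subsequent network. This implementation also takes the original face image into account, so that valuable information in the face image can be used for the subsequent feature point prediction.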
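The input image fusion of Formula 1 can be sketched in NumPy as follows. This is a minimal illustration, not the disclosed implementation: the function name `input_image_fusion` is hypothetical, the arrays use an assumed channels-first layout, and the superimposition operation is modelled as concatenation along the channel axis, which is one plausible reading that the text does not confirm.

```python
import numpy as np


def input_image_fusion(face_image, feature_line_maps):
    """Sketch of Formula 1: fuse a face image with K feature line maps.

    face_image: (C, H, W) array, the face image I.
    feature_line_maps: (K, H, W) array, the predefined maps M_1..M_K.
    Returns an array of shape (C * (K + 1), H, W).
    """
    face_image = np.asarray(face_image, dtype=np.float64)
    feature_line_maps = np.asarray(feature_line_maps, dtype=np.float64)
    if face_image.shape[1:] != feature_line_maps.shape[1:]:
        raise ValueError("image and feature line maps must share H and W")
    # Pixel-by-pixel product of each map with the image: each
    # single-channel map is broadcast over the image channels.
    boundary_features = [m[None, :, :] * face_image for m in feature_line_maps]
    # Superimposition, modelled here as channel concatenation (assumption).
    return np.concatenate([face_image] + boundary_features, axis=0)
```

Under this reading, a 3-channel image fused with K feature line maps yields 3(K + 1) channels: the original image followed by one masked copy per feature line.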
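In a possible implementation, the method further includes: for any training image in a training image set, annotating face feature points in the training image; interpolating the face feature points in the training image to obtain face feature line information for the training image; and training a convolutional neural network for acquiring the predefined feature line images according to the training images in the training image set and the face feature line information in each training image. In this implementation, the training image set may include a plurality of training images, and 106 face feature points may be annotated in each training image. Interpolation may be performed between adjacent face feature points in a training image to obtain a curve, and the interpolated curve may be used as a face feature line in that training image. In this way, the annotated face feature points are interpolated into face feature lines, which serve as supervision for training the convolutional neural network used to acquire the predefined feature line images.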
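The conversion of annotated feature points into a feature line image can be sketched as below. The text does not specify the interpolation scheme, so this sketch assumes piecewise-linear interpolation between consecutive points and a hard binary rasterization; the function name `rasterize_feature_line` and the `samples_per_segment` parameter are hypothetical.

```python
import numpy as np


def rasterize_feature_line(points, height, width, samples_per_segment=50):
    """Draw a polyline through consecutive landmarks into a binary image.

    points: sequence of (x, y) landmark coordinates along one feature line.
    Linear interpolation is an assumption; a spline would also fit the text.
    """
    points = np.asarray(points, dtype=np.float64)
    line_image = np.zeros((height, width), dtype=np.uint8)
    t = np.linspace(0.0, 1.0, samples_per_segment)[:, None]
    for start, end in zip(points[:-1], points[1:]):
        samples = (1.0 - t) * start + t * end  # (samples, 2) points on segment
        cols = np.clip(np.rint(samples[:, 0]).astype(int), 0, width - 1)
        rows = np.clip(np.rint(samples[:, 1]).astype(int), 0, height - 1)
        line_image[rows, cols] = 1
    return line_image
```

One such image would be produced per feature line (for example the face contour or an eyelid), giving the per-line supervision targets for the feature line network.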
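FIG. 6 shows an exemplary flowchart of step S122 of a method for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 6, step S122 may include steps S1221 to S1224.
In step S1221, the first fused image and the face feature line image are superimposed to obtain a third fused image.
In step S1222, a residual operation is performed on the third fused image to obtain a fourth fused image of the same size as the face feature line image.
In step S1223, the first fused image and the fourth fused image are multiplied pixel by pixel to obtain a fifth fused image.
In step S1224, the first fused image and the fifth fused image are superimposed to obtain the second fused image.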
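In a possible implementation, the second fused image H can be obtained using Formula 2:
$H = F \oplus (F \otimes T(M \oplus F))$ (Formula 2)
where $F$ denotes the first fused image and $M$ denotes the face feature line image. $M \oplus F$ denotes superimposing the first fused image and the face feature line image, and is the third fused image. $T(M \oplus F)$ denotes performing a residual operation on the third fused image, and is the fourth fused image. In this embodiment, because the number of channels of the face feature line image $M$ is determined by the number of predefined feature lines, a transform structure $T$ is needed to make the number of channels of the face feature line image $M$ the same as that of the first fused image $F$. The transform structure $T$ may be an hourglass network. $F \otimes T(M \oplus F)$ denotes multiplying the first fused image $F$ and the fourth fused image $T(M \oplus F)$ pixel by pixel, and is the fifth fused image. $F \oplus (F \otimes T(M \oplus F))$ denotes superimposing the first fused image $F$ and the fifth fused image $F \otimes T(M \oplus F)$.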
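Steps S1221 to S1224 and Formula 2 can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the disclosed network: superimposition is again modelled as channel concatenation, the transform structure T is passed in as an arbitrary callable, and `make_toy_transform` is a hypothetical stand-in (random 1x1 channel mixing followed by a sigmoid) used only so that the example runs; in the text T may be an hourglass network.

```python
import numpy as np


def edge_image_fusion(first_fused, feature_line_image, transform):
    """Sketch of Formula 2: H = F (+) (F (x) T(M (+) F)).

    first_fused: (C, H, W) array F; feature_line_image: (K, H, W) array M.
    transform: callable T mapping (K + C, H, W) -> (C, H, W) (assumption).
    """
    # S1221: third fused image, M superimposed with F (concatenation assumed).
    third = np.concatenate([feature_line_image, first_fused], axis=0)
    # S1222: fourth fused image, T brings the channel count back to that of F.
    fourth = transform(third)
    if fourth.shape != first_fused.shape:
        raise ValueError("transform output must match the shape of F")
    # S1223: fifth fused image, pixel-by-pixel product with F.
    fifth = first_fused * fourth
    # S1224: second fused image, F superimposed with the fifth fused image.
    return np.concatenate([first_fused, fifth], axis=0)


def make_toy_transform(out_channels, in_channels, seed=0):
    """Hypothetical stand-in for T: random 1x1 channel mixing plus sigmoid."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((out_channels, in_channels))

    def transform(x):
        mixed = np.einsum("oc,chw->ohw", weights, x)
        return 1.0 / (1.0 + np.exp(-mixed))

    return transform
```

With a sigmoid at the end of T, the fourth fused image acts as a per-pixel weight between 0 and 1 on F; that gating interpretation is a design choice of this sketch, not something the text states.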
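In a possible implementation, the method further includes: training the feature point prediction network by using the training images in the training image set and the face feature line information in each training image as the input of the feature point prediction network, and using the position information of the face feature points in each training image as the output of the feature point prediction network. The number of face feature points in each training image may be 106. In this way, the face feature line information is fused in, and the face feature points in the face images are used for supervised training. Because it fuses face feature line information, the trained feature point prediction network can produce more precise positioning results for face feature points.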
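FIG. 7 shows a block diagram of a device for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 7, the device includes: an edge detection module 71 configured to perform edge detection on a face image to acquire a face feature line image; and a fusion module 72 configured to fuse the face image with the face feature line image to obtain position information of the face feature points.
FIG. 8 shows an exemplary block diagram of a device for positioning face feature points according to an embodiment of the present disclosure. As shown in FIG. 8:
In a possible implementation, the device further includes a discriminating module 73 configured to perform validity discrimination on the face feature line image to obtain an optimized face feature line image; the fusion module 72 is configured to fuse the face image with the optimized face feature line image to obtain the position information of the face feature points.
In a possible implementation, the edge detection module 71 includes: a feature extraction sub-module 711 configured to perform feature line feature extraction on the face image to acquire a feature line image; and a first optimization sub-module 712 configured to optimize the feature line image to acquire the face feature line image.
In a possible implementation, the feature extraction sub-module 711 is configured to perform convolution, a residual operation, downsampling and a residual operation on the face image in sequence to acquire the feature line image.
In a possible implementation, the first optimization sub-module 712 is configured to pass the feature line image through at least one level of optimization network to acquire the face feature line image, where each level of optimization network includes an hourglass network for implementing a residual operation and an information transfer layer for implementing feature line information transfer.
In a possible implementation, the fusion module 72 includes: a first fusion sub-module 721 configured to perform input image fusion on the face image to obtain a first fused image; a second fusion sub-module 722 configured to perform at least one level of edge image fusion on the first fused image and the face feature line image to obtain a second fused image; and a mapping sub-module 723 configured to map the second fused image to obtain a position vector of the feature points and to use the position vector as the position information of the face feature points.
In a possible implementation, the fusion module 72 further includes a second optimization sub-module 724 configured to perform optimization processing on the first fused image to obtain an optimized first fused image, where the optimization processing includes convolution, downsampling and a residual operation in sequence.
In a possible implementation, the first fusion sub-module 721 includes: a first multiplication unit configured to multiply the face image pixel by pixel with each predefined feature line image to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images; and a first superimposing unit configured to superimpose the plurality of boundary features on the face image to obtain the first fused image.
In a possible implementation, the second fusion sub-module 722 includes: a second superimposing unit configured to superimpose the first fused image and the face feature line image to obtain a third fused image; a residual operation unit configured to perform a residual operation on the third fused image to obtain a fourth fused image of the same size as the face feature line image; a second multiplication unit configured to multiply the first fused image and the fourth fused image pixel by pixel to obtain a fifth fused image; and a third superimposing unit configured to superimpose the first fused image and the fifth fused image to obtain the second fused image.
In a possible implementation, the fusion module 72 further includes a residual operation sub-module 725 configured to perform a residual operation on the result of each level of boundary fusion.
In a possible implementation, the mapping sub-module 723 is configured to pass the second fused image through a residual operation and a fully connected operation in sequence to obtain the position vector of the feature points.
By performing edge detection on a face image to acquire a face feature line image, and fusing the face image with the face feature line image to obtain position information of the face feature points, this embodiment positions face feature points in combination with face feature lines. This improves positioning accuracy, so that face feature points can still be located accurately in complex cases such as when the face in the face image is occluded, the face is a profile at a large angle, or the facial expression is exaggerated.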
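FIG. 9 is a block diagram of an apparatus 800 for positioning face feature points according to an exemplary embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 9, the apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 typically controls the overall operation of the apparatus 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions so as to perform all or part of the steps of the methods described above. In addition, the processing component 802 may include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the apparatus 800. Examples of such data include instructions for any application or method operating on the apparatus 800, contact data, phone book data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable programmable read only memory (EPROM), programmable read only memory (PROM), read only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.
The power component 806 provides power to the various components of the apparatus 800. The power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen that provides an output interface between the apparatus 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the apparatus 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) that is configured to receive external audio signals when the apparatus 800 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, or the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the apparatus 800. For example, the sensor component 814 can detect an open/closed state of the apparatus 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor component 814 can also detect a change in position of the apparatus 800 or of one component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and temperature changes of the apparatus 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the apparatus 800 and other devices. The apparatus 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
In an exemplary embodiment, a non-volatile computer readable storage medium is also provided, such as a memory 804 including computer program instructions that are executable by the processor 820 of the apparatus 800 to perform the above method.
FIG. 10 is a block diagram of an apparatus 1900 for positioning face feature points according to an exemplary embodiment. For example, the apparatus 1900 may be provided as a server. Referring to FIG. 10, the apparatus 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as applications. An application stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions so as to perform the above method.
The apparatus 1900 may also include a power component 1926 configured to perform power management of the apparatus 1900, a wired or wireless network interface 1950 configured to connect the apparatus 1900 to a network, and an input/output (I/O) interface 1958. The apparatus 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-volatile computer readable storage medium is also provided, such as a memory 1932 including computer program instructions that are executable by the processing component 1922 of the apparatus 1900 to perform the above method.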
本公开可以是系统、方法和/或计算机程序产品。计算机程序产品可以包括计算机可读存储介质,其上载有用于使处理器实现本公开的各个方面的计算机可读程序指令。
计算机可读存储介质可以是可以保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以是――但不限于――电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、静态随机存取存储器(SRAM)、便携式压缩盘只读存储器(CD-ROM)、数字多功能盘(DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。这里所使用的计算机可读存储介质不被解释为瞬时信号本身,诸如无线电波或者其他自由传播的电磁波、通过波导或其他传输媒介传播的电磁波(例如,通过光纤电缆的光脉冲)、或者通过电线传输的电信号。
这里所描述的计算机可读程序指令可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。
用于执行本公开操作的计算机程序指令可以是汇编指令、指令集架构(ISA)指令、机器指令、机器相关指令、微代码、固件指令、状态设置数据、或者以一种或多种编程语言的任意组合编写的源代码或目标代码,所述编程语言包括面向对象的编程语言—诸如Smalltalk、C++等,以及常规的过程式编程语言—诸如“C”语言或类似的编程语言。计算机可读程序指令可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络—包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。在一些实施例中,通过利用计算机可读程序指令的状态信息来个性化定制电子电路,例如可编程逻辑电路、现场可编程门阵列(FPGA)或可编程逻辑阵列(PLA),该电子电路可以执行计算机可读程序指令,从而实现本公开的各个方面。
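Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium, where they cause a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.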
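The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of an instruction, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.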

Claims (24)

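  1. A method for positioning face feature points, characterized by comprising:
    performing edge detection on a face image to obtain a face feature line image;
    fusing the face image with the face feature line image to obtain position information of face feature points.
  2. The method according to claim 1, characterized in that, before the fusing the face image with the face feature line image, the method further comprises:
    performing validity discrimination on the face feature line image to obtain an optimized face feature line image;
    the fusing the face image with the face feature line image to obtain position information of face feature points comprises:
    fusing the face image with the optimized face feature line image to obtain the position information of the face feature points.
  3. The method according to claim 1, characterized in that the performing edge detection on a face image to obtain a face feature line image comprises:
    performing feature line feature extraction on the face image to obtain a feature line image;
    optimizing the feature line image to obtain the face feature line image.
  4. The method according to claim 3, characterized in that the performing feature line feature extraction on the face image to obtain a feature line image comprises:
    sequentially performing convolution, a residual operation, downsampling, and a residual operation on the face image to obtain the feature line image.
  5. The method according to claim 3, characterized in that the optimizing the feature line image to obtain the face feature line image comprises:
    passing the feature line image through at least one stage of optimization network to obtain the face feature line image, wherein each stage of the optimization network comprises an hourglass network for implementing a residual operation and an information transfer layer for implementing feature line information transfer.
  6. The method according to claim 1, characterized in that the fusing the face image with the face feature line image to obtain position information of face feature points comprises:
    performing input image fusion on the face image to obtain a first fused image;
    performing at least one stage of edge image fusion on the first fused image and the face feature line image to obtain a second fused image;
    mapping the second fused image to obtain a position vector of the feature points, and using the position vector as the position information of the face feature points.
  7. The method according to claim 6, characterized in that, before the performing at least one stage of edge image fusion on the first fused image and the face feature line image, the method further comprises:
    performing optimization processing on the first fused image to obtain an optimized first fused image, wherein the optimization processing sequentially comprises convolution, downsampling, and a residual operation.
  8. The method according to claim 6, characterized in that the performing input image fusion on the face image to obtain a first fused image comprises:
    multiplying the face image by each predefined feature line image pixel by pixel to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images;
    superimposing the plurality of boundary features on the face image to obtain the first fused image.
  9. The method according to claim 6, characterized in that the performing at least one stage of edge image fusion on the first fused image and the face feature line image to obtain a second fused image comprises:
    superimposing the first fused image and the face feature line image to obtain a third fused image;
    performing a residual operation on the third fused image to obtain a fourth fused image having the same size as the face feature line image;
    multiplying the first fused image by the fourth fused image pixel by pixel to obtain a fifth fused image;
    superimposing the first fused image and the fifth fused image to obtain the second fused image.
  10. The method according to claim 6, characterized in that, between the stages of the edge image fusion, the method further comprises: performing a residual operation on the result of each stage of edge image fusion.
  11. The method according to claim 6, characterized in that the mapping the second fused image to obtain a position vector of the feature points comprises:
    sequentially subjecting the second fused image to a residual operation and a fully connected operation to obtain the position vector of the feature points.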
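  12. An apparatus for positioning face feature points, characterized by comprising:
    an edge detection module, configured to perform edge detection on a face image to obtain a face feature line image;
    a fusion module, configured to fuse the face image with the face feature line image to obtain position information of face feature points.
  13. The apparatus according to claim 12, characterized in that the apparatus further comprises:
    a discrimination module, configured to perform validity discrimination on the face feature line image to obtain an optimized face feature line image;
    the fusion module is configured to:
    fuse the face image with the optimized face feature line image to obtain the position information of the face feature points.
  14. The apparatus according to claim 12, characterized in that the edge detection module comprises:
    a feature extraction submodule, configured to perform feature line feature extraction on the face image to obtain a feature line image;
    a first optimization submodule, configured to optimize the feature line image to obtain the face feature line image.
  15. The apparatus according to claim 14, characterized in that the feature extraction submodule is configured to:
    sequentially perform convolution, a residual operation, downsampling, and a residual operation on the face image to obtain the feature line image.
  16. The apparatus according to claim 14, characterized in that the first optimization submodule is configured to:
    pass the feature line image through at least one stage of optimization network to obtain the face feature line image, wherein each stage of the optimization network comprises an hourglass network for implementing a residual operation and an information transfer layer for implementing feature line information transfer.
  17. The apparatus according to claim 12, characterized in that the fusion module comprises:
    a first fusion submodule, configured to perform input image fusion on the face image to obtain a first fused image;
    a second fusion submodule, configured to perform at least one stage of edge image fusion on the first fused image and the face feature line image to obtain a second fused image;
    a mapping submodule, configured to map the second fused image to obtain a position vector of the feature points, and to use the position vector as the position information of the face feature points.
  18. The apparatus according to claim 17, characterized in that the fusion module further comprises:
    a second optimization submodule, configured to perform optimization processing on the first fused image to obtain an optimized first fused image, wherein the optimization processing sequentially comprises convolution, downsampling, and a residual operation.
  19. The apparatus according to claim 17, characterized in that the first fusion submodule comprises:
    a first multiplication unit, configured to multiply the face image by each predefined feature line image pixel by pixel to obtain a plurality of boundary features in one-to-one correspondence with the predefined feature line images;
    a first superposition unit, configured to superimpose the plurality of boundary features on the face image to obtain the first fused image.
  20. The apparatus according to claim 17, characterized in that the second fusion submodule comprises:
    a second superposition unit, configured to superimpose the first fused image and the face feature line image to obtain a third fused image;
    a residual operation unit, configured to perform a residual operation on the third fused image to obtain a fourth fused image having the same size as the face feature line image;
    a second multiplication unit, configured to multiply the first fused image by the fourth fused image pixel by pixel to obtain a fifth fused image;
    a third superposition unit, configured to superimpose the first fused image and the fifth fused image to obtain the second fused image.
  21. The apparatus according to claim 17, characterized in that the fusion module further comprises:
    a residual operation submodule, configured to perform a residual operation on the result of each stage of edge image fusion.
  22. The apparatus according to claim 17, characterized in that the mapping submodule is configured to:
    sequentially subject the second fused image to a residual operation and a fully connected operation to obtain the position vector of the feature points.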
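  23. An electronic device, characterized by comprising:
    a processor;
    a memory for storing processor-executable instructions;
    wherein the processor is configured to perform the method according to any one of claims 1 to 11.
  24. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 11.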
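
The fusion steps recited in claims 8 and 9 can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the implementation disclosed in the application: "superimposing" is read here as channel-wise concatenation, the residual operation is replaced by a hypothetical untrained stand-in (`toy_residual`) that outputs a map with the same spatial size as the feature line image, and all function names and array shapes are chosen for the example only.

```python
import numpy as np

def input_image_fusion(face, line_masks):
    """Claim 8 sketch: per-mask pixel-wise product, then superposition.

    face:       (H, W, C) face image
    line_masks: (K, H, W) predefined feature line images, values in [0, 1]
    Assumption: superposition is channel-wise concatenation.
    """
    # One boundary feature per predefined feature line image.
    boundary_feats = [face * m[..., None] for m in line_masks]
    # First fused image: the face image stacked with all boundary features.
    return np.concatenate([face] + boundary_feats, axis=-1)

def toy_residual(x):
    """Hypothetical stand-in for the learned residual operation.

    Maps (H, W, C') to a single-channel (H, W, 1) gate in (0, 1).
    """
    return 1.0 / (1.0 + np.exp(-x.mean(axis=-1, keepdims=True)))

def edge_image_fusion(first, line_img, residual=toy_residual):
    """Claim 9 sketch: one stage of edge image fusion.

    first:    (H, W, C) first fused image
    line_img: (H, W, 1) face feature line image
    """
    third = np.concatenate([first, line_img], axis=-1)   # superimpose
    fourth = residual(third)                             # same size as line_img
    fifth = first * fourth                               # pixel-wise product
    return np.concatenate([first, fifth], axis=-1)       # second fused image
```

With a 4×4 three-channel face image and two predefined feature line images, `input_image_fusion` returns a 4×4×9 array, and one stage of `edge_image_fusion` on a three-channel input returns a 4×4×6 array whose first three channels are the unchanged input.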
PCT/CN2018/116779 2018-04-24 2018-11-21 Method and apparatus for positioning face feature points WO2019205605A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
SG11201912428TA SG11201912428TA (en) 2018-04-24 2018-11-21 Method and apparatus for positioning face feature points
MYPI2019007719A MY201922A (en) 2018-04-24 2018-11-21 Method and apparatus for positioning face feature points
KR1020197037564A KR102334279B1 (ko) 2018-04-24 2018-11-21 Method and apparatus for positioning face feature points
JP2019568632A JP7042849B2 (ja) 2018-04-24 2018-11-21 Method and apparatus for positioning face feature points
US16/720,124 US11314965B2 (en) 2018-04-24 2019-12-19 Method and apparatus for positioning face feature points

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810373871.6A CN108596093B (zh) 2018-04-24 2018-04-24 Method and apparatus for positioning face feature points
CN201810373871.6 2018-04-24

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/720,124 Continuation US11314965B2 (en) 2018-04-24 2019-12-19 Method and apparatus for positioning face feature points

Publications (1)

Publication Number Publication Date
WO2019205605A1 (zh) 2019-10-31

Family

ID=63614398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/116779 WO2019205605A1 (zh) 2018-04-24 2018-11-21 Method and apparatus for positioning face feature points

Country Status (7)

Country Link
US (1) US11314965B2 (zh)
JP (1) JP7042849B2 (zh)
KR (1) KR102334279B1 (zh)
CN (1) CN108596093B (zh)
MY (1) MY201922A (zh)
SG (1) SG11201912428TA (zh)
WO (1) WO2019205605A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836654A (zh) * 2021-02-07 2021-05-25 上海卓繁信息技术股份有限公司 Fusion-based expression recognition method and apparatus, and electronic device

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
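CN108596093B (zh) * 2018-04-24 2021-12-03 北京市商汤科技开发有限公司 Method and apparatus for positioning face feature points
CN109285182A (zh) * 2018-09-29 2019-01-29 北京三快在线科技有限公司 Model generation method and apparatus, electronic device, and computer-readable storage medium
CN109522910B (zh) * 2018-12-25 2020-12-11 浙江商汤科技开发有限公司 Key point detection method and apparatus, electronic device, and storage medium
CN109461188B (zh) * 2019-01-30 2019-04-26 南京邮电大学 Method for automatically locating anatomical feature points in two-dimensional X-ray cephalometric images
CN111553865B (zh) * 2020-04-30 2023-08-22 深圳市商汤科技有限公司 Image restoration method and apparatus, electronic device, and storage medium
CN115564837B (zh) * 2022-11-17 2023-04-18 歌尔股份有限公司 Visual positioning method, apparatus, and system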

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090220157A1 (en) * 2008-02-29 2009-09-03 Canon Kabushiki Kaisha Feature point location determination method and apparatus
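CN106156692A (zh) * 2015-03-25 2016-11-23 阿里巴巴集团控股有限公司 Method and apparatus for positioning face edge feature points
CN106951840A (zh) * 2017-03-09 2017-07-14 北京工业大学 Face feature point detection method
CN107832741A (zh) * 2017-11-28 2018-03-23 北京小米移动软件有限公司 Method and apparatus for positioning face feature points, and computer-readable storage medium
CN108596093A (zh) * 2018-04-24 2018-09-28 北京市商汤科技开发有限公司 Method and apparatus for positioning face feature points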

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4217664B2 (ja) * 2004-06-28 2009-02-04 キヤノン株式会社 Image processing method and image processing apparatus
US8620038B2 (en) * 2006-05-05 2013-12-31 Parham Aarabi Method, system and computer program product for automatic and semi-automatic modification of digital images of faces
JP4811259B2 (ja) 2006-12-11 2011-11-09 日産自動車株式会社 Gaze direction estimation device and gaze direction estimation method
JP5153434B2 (ja) 2008-04-22 2013-02-27 キヤノン株式会社 Information processing apparatus and information processing method
CN103679158B (zh) * 2013-12-31 2017-06-16 北京天诚盛业科技有限公司 Face authentication method and apparatus
US10198624B2 (en) * 2016-02-18 2019-02-05 Pinscreen, Inc. Segmentation-guided real-time facial performance capture
KR101785661B1 (ko) * 2016-12-06 2017-10-17 인천대학교 산학협력단 Face contour recognition method using gray value variance, and apparatus therefor
WO2018144537A1 (en) * 2017-01-31 2018-08-09 The Regents Of The University Of California Machine learning based driver assistance

Also Published As

Publication number Publication date
MY201922A (en) 2024-03-23
KR102334279B1 (ko) 2021-12-02
JP2020523694A (ja) 2020-08-06
SG11201912428TA (en) 2020-01-30
CN108596093B (zh) 2021-12-03
CN108596093A (zh) 2018-09-28
US20200125833A1 (en) 2020-04-23
JP7042849B2 (ja) 2022-03-28
KR20200010397A (ko) 2020-01-30
US11314965B2 (en) 2022-04-26

Similar Documents

Publication Publication Date Title
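CN110647834B (zh) Face and hand association detection method and apparatus, electronic device, and storage medium
WO2019205605A1 (zh) Method and apparatus for positioning face feature points
JP7262659B2 (ja) Target object matching method and apparatus, electronic device, and storage medium
TWI724736B (zh) Image processing method and apparatus, electronic device, storage medium, and computer program
CN111310616B (zh) Image processing method and apparatus, electronic device, and storage medium
CN110287874B (zh) Target tracking method and apparatus, electronic device, and storage medium
CN107692997B (zh) Heart rate detection method and apparatus
WO2020156009A1 (zh) Video restoration method and apparatus, electronic device, and storage medium
US11288531B2 (en) Image processing method and apparatus, electronic device, and storage medium
CN111243011A (zh) Key point detection method and apparatus, electronic device, and storage medium
TW202032425A (zh) Image processing method and apparatus, electronic device, and storage medium
CN111553864A (zh) Image restoration method and apparatus, electronic device, and storage medium
WO2021208666A1 (zh) Character recognition method and apparatus, electronic device, and storage medium
CN110933488A (zh) Video editing method and apparatus
CN109344703B (zh) Object detection method and apparatus, electronic device, and storage medium
CN113486830A (zh) Image processing method and apparatus, electronic device, and storage medium
CN111311588B (zh) Relocalization method and apparatus, electronic device, and storage medium
WO2023155393A1 (zh) Feature point matching method and apparatus, electronic device, storage medium, and computer program product
CN110969569A (zh) Method and apparatus for generating an audition video
CN113538310A (zh) Image processing method and apparatus, electronic device, and storage medium
CN109543544B (zh) Cross-spectral image matching method and apparatus, electronic device, and storage medium
CN111753596A (zh) Neural network training method and apparatus, electronic device, and storage medium
CN117893591A (zh) Light curtain template recognition method and apparatus, device, storage medium, and program product
CN115035441A (zh) Highlight video recognition method and apparatus, electronic device, and storage medium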

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18916183

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019568632

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20197037564

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 28.01.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18916183

Country of ref document: EP

Kind code of ref document: A1