WO2019029486A1 - Face image processing method, apparatus and electronic device - Google Patents
Face image processing method, apparatus and electronic device
- Publication number
- WO2019029486A1 (PCT/CN2018/098999)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- face
- information
- key point
- key
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/84—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
Definitions
- The present application relates to computer vision technology, and in particular to a face image processing method, apparatus, and electronic device.
- Face key points are an indispensable part of many applications such as face recognition. Accurately determining face key points not only helps to correct the relative position of the face, but also helps to enhance the semantic information of the face.
- Embodiments of the present application provide a face image processing technical solution.
- A method for processing a face image, comprising: cutting a face in an image to be processed to obtain at least one organ image block; inputting the at least one organ image block into at least one first neural network, respectively, wherein at least two different classes of organs correspond to different first neural networks; and extracting, via the at least one first neural network, key point information of the organ in each input organ image block, so as to respectively obtain key point information of at least one corresponding organ of the face.
- The method further includes: acquiring initial face key point information of the image to be processed; and integrating the initial face key point information with the key point information of the at least one corresponding organ to obtain face key point information of the image to be processed.
- Cutting the face in the image to be processed to obtain the at least one organ image block includes: acquiring initial face key point information of the image to be processed; and cutting the face in the image to be processed according to the initial face key point information to obtain the at least one organ image block.
- Acquiring the initial face key point information of the image to be processed includes: inputting the image to be processed into a second neural network; and extracting, via the second neural network, face key point information of the image to be processed to obtain the initial face key point information.
- Integrating the initial face key point information with the key point information of the at least one corresponding organ to obtain the face key point information of the image to be processed includes: replacing at least part of the key point information of the same organ among the initial face key points with the key point information of the at least one corresponding organ, to obtain the face key point information of the image to be processed.
- Integrating the initial face key point information with the key point information of the at least one corresponding organ to obtain the face key point information of the image to be processed includes: converting the positions and numbers of the key points of the at least one corresponding organ in the corresponding organ image block into positions and numbers of those key points in the image to be processed, respectively.
- The total number of key points included in the face key point information is greater than the total number of key points included in the initial face key point information; and/or, for an organ image block, the number of organ key points extracted by the first neural network and included in the face key point information is greater than the number of organ key points of the corresponding organ image block included in the initial face key point information.
- The error degree of the organ curve obtained by fitting at least one organ key point of an organ image block extracted by the first neural network is 1/5 to 1/10 of the error degree of the organ curve fitted from the organ key points of the corresponding organ image block included in the initial face key point information.
- the at least one organ image block comprises at least one of: at least one eye image block, at least one mouth image block.
- the key point information of the at least one corresponding organ includes at least one of the following: eyelid line information, lip line information.
- The eyelid line information includes: trajectory information or a fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye; and/or
- The lip line information includes: trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- Before inputting the at least one organ image block into the at least one first neural network, the method further includes: training the first neural network based on a sample data set, wherein the sample data set includes key point annotation data of organ images of human faces.
- Before training the first neural network based on the sample data set, the method further includes acquiring the key point annotation data of the organ images of the face by the following steps: determining curve control points of an organ of a human face; forming a first curve according to the curve control points; and inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the key point annotation data.
- The error degree of a second curve, fitted from the inserted points, relative to the actual organ curve of the face is 1/5 to 1/10 of the error degree of the first curve relative to that organ curve.
- The number of key points included in the initial face key point information is less than or equal to 106, and the number of key points included in the face key point information is greater than 106.
- The number of key points included in the face key point information is 186, 240, 220, or 274.
- The face key point information includes: 4-24 key points for locating the eye position and 44-48 key points included in the eyelid line information; 0-20 key points for locating the mouth position and 60-64 key points included in the lip line information; 26-58 key points in the eyebrow region; 15-27 key points in the nose region; and 33-37 key points on the facial contour.
- The method further includes performing at least one of the following according to the obtained key point information of the corresponding organ of the face: image rendering of the face, face changing processing, beautification processing, makeup processing, face recognition, face state detection, expression detection, and attribute detection.
- A method for processing a face image, comprising: acquiring an image to be processed that includes at least a partial region of a face; and extracting eyelid line information from the image to be processed based on a neural network, wherein the eyelid line information includes trajectory information or a fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye.
- The image to be processed is a monocular image or a binocular image; or the image to be processed includes a face image, and acquiring the image to be processed that includes at least a partial region of the face includes: cutting a monocular image block or a binocular image block of the face from the image, the monocular image block or the binocular image block being the image to be processed that includes at least a partial region of the face.
- Before extracting the eyelid line information from the image to be processed based on the neural network, the method further includes: training the neural network based on a sample data set, wherein the sample data set includes eye key point annotation data.
- The eye key point annotation data is obtained by: determining curve control points of the eyelid line; forming a first curve according to the curve control points; and inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the eye key point annotation data.
- The error degree of a second curve, fitted from the inserted points, relative to the eyelid line is 1/5 to 1/10 of the error degree of the first curve relative to the eyelid line.
- A method for processing a face image, comprising: acquiring an image to be processed that includes at least a partial region of a face; and extracting lip line information from the image to be processed based on a neural network, wherein the lip line information includes trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- The image to be processed is a single-lip image or a double-lip image; or the image to be processed includes a face image, and acquiring the image to be processed that includes at least a partial region of the face includes: cutting a single-lip image block or a double-lip image block of the face from the image, the single-lip image block or the double-lip image block being the image to be processed that includes at least a partial region of the face.
- Before extracting the lip line information from the image to be processed based on the neural network, the method further includes: training the neural network based on a sample data set, wherein the sample data set includes lip key point annotation data.
- The lip key point annotation data is obtained by: determining curve control points of the lip line; forming a first curve according to the curve control points; and inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the lip key point annotation data.
- The error degree of a second curve, fitted from the inserted points, relative to the lip line is 1/5 to 1/10 of the error degree of the first curve relative to the lip line.
- A face image processing apparatus, comprising: a cutting module, configured to cut a face in an image to be processed to obtain at least one organ image block; an input module, configured to input the at least one organ image block into at least one first neural network, respectively, wherein at least two different categories of organs correspond to different first neural networks; and an organ key point extraction module, configured to extract, via the at least one first neural network, key point information of the organ in each input organ image block, so as to respectively obtain key point information of at least one corresponding organ of the face.
- The apparatus further includes: a first acquisition module, configured to acquire initial face key point information of the image to be processed; and an integration module, configured to integrate the initial face key point information with the key point information of the at least one corresponding organ to obtain face key point information of the image to be processed.
- The apparatus further includes: a first training module, configured to train the first neural network based on a sample data set, wherein the sample data set includes key point annotation data of organ images of human faces.
- The apparatus further includes: a first labeling module, configured to acquire the key point annotation data of the organ images of the face; the first labeling module is configured to: determine curve control points of an organ of the face; form a first curve according to the curve control points; and insert a plurality of points into the first curve by interpolation, the information of the inserted points being the key point annotation data.
- The apparatus further includes: an application processing module, configured to perform at least one of the following according to the obtained key point information of the corresponding organ of the face: image rendering of the face, face changing processing, beautification processing, makeup processing, face recognition, face state detection, expression detection, and attribute detection.
- A face image processing apparatus, comprising: a second acquisition module, configured to acquire an image to be processed that includes at least a partial region of a face; and an eyelid line extraction module, configured to extract eyelid line information from the image to be processed based on a neural network, wherein the eyelid line information includes trajectory information or a fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye.
- The apparatus further includes: a second training module, configured to train the neural network based on a sample data set, wherein the sample data set includes eye key point annotation data.
- The apparatus further includes: a second labeling module, configured to acquire the eye key point annotation data; the second labeling module is configured to: determine curve control points of the eyelid line; form a first curve according to the curve control points; and insert a plurality of points into the first curve by interpolation, the information of the inserted points being the eye key point annotation data.
- A face image processing apparatus, comprising: a second acquisition module, configured to acquire an image to be processed that includes at least a partial region of a face; and a lip line extraction module, configured to extract lip line information from the image to be processed based on a neural network, wherein the lip line information includes trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- The apparatus further includes: a third training module, configured to train the neural network based on a sample data set, wherein the sample data set includes lip key point annotation data.
- The apparatus further includes: a third labeling module, configured to acquire the lip key point annotation data; the third labeling module is configured to: determine curve control points of the lip line; form a first curve according to the curve control points; and insert a plurality of points into the first curve by interpolation, the information of the inserted points being the lip key point annotation data.
- An electronic device, comprising: a memory for storing a computer program; and a processor for executing the computer program stored in the memory. When the computer program is executed, the following instructions are run: instructions for cutting a face in an image to be processed to obtain at least one organ image block; instructions for inputting the at least one organ image block into at least one first neural network, respectively, wherein at least two different categories of organs correspond to different first neural networks; and instructions for extracting, via the at least one first neural network, key point information of the organ in each input organ image block, so as to respectively obtain key point information of at least one corresponding organ of the face.
- The executed instructions further include: instructions for acquiring initial face key point information of the image to be processed; and instructions for integrating the initial face key point information with the key point information of the at least one corresponding organ to obtain face key point information of the image to be processed.
- The executed instructions further include: instructions for training the first neural network based on a sample data set, wherein the sample data set includes key point annotation data of organ images of human faces.
- The executed instructions further include instructions for acquiring the key point annotation data of the organ images of the face, which include: instructions for determining curve control points of an organ of the face; instructions for forming a first curve according to the curve control points; and instructions for inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the key point annotation data.
- The executed instructions further include: instructions for performing at least one of the following according to the obtained key point information of the corresponding organ of the face: image rendering of the face, face changing processing, beautification processing, makeup processing, face recognition, face state detection, expression detection, and attribute detection.
- An electronic device, comprising: a memory for storing a computer program; and a processor for executing the computer program stored in the memory. When the computer program is executed, the following instructions are run: instructions for acquiring an image to be processed that includes at least a partial region of a face; and instructions for extracting eyelid line information from the image to be processed based on a neural network, the eyelid line information including trajectory information or a fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye.
- The executed instructions further include: instructions for training the neural network based on a sample data set, wherein the sample data set includes eye key point annotation data.
- The executed instructions further include instructions for acquiring the eye key point annotation data, which include: instructions for determining curve control points of the eyelid line; instructions for forming a first curve according to the curve control points; and instructions for inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the eye key point annotation data.
- An electronic device, comprising: a memory for storing a computer program; and a processor for executing the computer program stored in the memory. When the computer program is executed, the following instructions are run: instructions for acquiring an image to be processed that includes at least a partial region of a face; and instructions for extracting lip line information from the image to be processed based on a neural network, the lip line information including trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- The executed instructions further include: instructions for training the neural network based on a sample data set, wherein the sample data set includes lip key point annotation data.
- The executed instructions further include instructions for acquiring the lip key point annotation data, which include: instructions for determining curve control points of the lip line; instructions for forming a first curve according to the curve control points; and instructions for inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the lip key point annotation data.
- A computer storage medium on which a computer program is stored, wherein when the computer program is executed by a processor, the steps in the method embodiments of the present application are performed; for example:
- Cutting a face in the image to be processed to obtain at least one organ image block; inputting the at least one organ image block into at least one first neural network, respectively, wherein at least two different classes of organs correspond to different first neural networks; and extracting, via the at least one first neural network, key point information of the organ in each input organ image block, so as to respectively obtain key point information of at least one corresponding organ of the human face.
- Acquiring an image to be processed that includes at least a partial region of the face; and extracting eyelid line information from the image to be processed based on a neural network, the eyelid line information including trajectory information or a fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye.
- Acquiring an image to be processed that includes at least a partial region of the face; and extracting lip line information from the image to be processed based on a neural network, the lip line information including trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- Based on the face image processing method, apparatus, and electronic device provided by the present application, at least one organ image block of a human face is cut from the image to be processed and provided to at least one first neural network. Because the first neural network can determine, for the input organ image block, organ key points that are accurate in position and can accurately meet organ-shape requirements, the organ key points obtained by using the first neural network in the embodiments of the present application have the property of being precisely positioned.
- FIG. 1 is a flow chart of one embodiment of a method of the present application.
- FIG. 2 is a schematic diagram of key points of the eyelid line of the present application.
- FIG. 3 is a schematic view of the key points of the upper lip line and the lower lip line of the upper lip of the present application.
- FIG. 4 is a schematic view of the key points of the upper lip line and the lower lip line of the lower lip of the present application.
- FIG. 5 is a schematic diagram of key points of eyelid lines of both eyes in the image to be processed according to the present application.
- FIG. 6 is a flow chart of an embodiment of the method of the present application.
- FIG. 7 is a schematic diagram of initial eye key points in an initial face key point of the present application.
- FIG. 8 is a flow chart of one embodiment of the method of the present application.
- FIG. 9 is a flow chart of one embodiment of the method of the present application.
- FIG. 10 is a schematic structural view of an embodiment of a device of the present application.
- FIG. 11 is a schematic structural view of an embodiment of the device of the present application.
- FIG. 12 is a schematic structural view of an embodiment of the device of the present application.
- FIG. 13 is a block diagram of an exemplary apparatus that implements an embodiment of the present application.
- Embodiments of the present application can be applied to computer systems/servers that can operate with numerous other general purpose or special purpose computing system environments or configurations.
- Examples of well-known computing systems, environments, and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, small computer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
- the computer system/server can be described in the general context of computer system executable instructions (such as program modules) being executed by a computer system.
- program modules may include routines, programs, target programs, components, logic, data structures, and the like that perform particular tasks or implement particular abstract data types.
- the computer system/server can be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices that are linked through a communication network.
- program modules may be located on a local or remote computing system storage medium including storage devices.
- FIG. 1 is a flow chart of an embodiment of a method of the present application. As shown in FIG. 1, the method of this embodiment includes: step S100, step S110, and step S120.
- Step S100: cut a face in the image to be processed to obtain at least one organ image block. This step may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a cutting module 1000 executed by the processor.
- the image to be processed in the present application may be an image such as a static picture or a photo, or a video frame in a dynamic video.
- The image to be processed may be an image including a human face. The face in the image to be processed may be a frontal face, or may be deflected by a certain angle; in addition, the clarity of the face in the image to be processed may be very good, or may be somewhat lacking.
- The present application may cut one, two, or more organ image blocks from the image to be processed, and the organ image blocks cut by the present application may include, but are not limited to, at least one of an eye image block, an eyebrow image block, and a mouth image block.
- the eye image block in the present application may be a left eye image block, a right eye image block, or a binocular image block.
- The eyebrow image block in the present application may be a left eyebrow image block, a right eyebrow image block, or a double-eyebrow image block.
- the mouth image block in the present application may be an upper lip image block, a lower lip image block, or an image block including an upper lip and a lower lip.
- The present application may perform cutting processing on the image to be processed based on the initial face key point information of the image to be processed, to obtain at least one organ image block of the face in the image to be processed. For example, the present application may determine the area in which the left eye, the right eye, or both eyes are located according to the eye key points in the initial face key point information of the image to be processed (for distinction, hereinafter called initial eye key points), and cut the image to be processed according to this area to obtain an eye image block. For example, the present application may determine the area in which the left eyebrow, the right eyebrow, or both eyebrows are located according to the eyebrow key points in the initial face key point information (for distinction, hereinafter called initial eyebrow key points), and cut according to this area to obtain an eyebrow image block. For example, the present application may determine the area in which the mouth is located according to the mouth key points in the initial face key point information (for distinction, hereinafter called initial mouth key points), and cut according to this area to obtain a mouth image block.
- The present application may enlarge or reduce a cut organ image block (if needed) so that it has a predetermined size; the predetermined size of an organ image block may be determined according to the input image block requirements of the first neural network into which it is to be input.
- the present application can cut out a left eye image block, a right eye image block, and a mouth image block from the image to be processed.
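- For illustration only, the following is a minimal Python sketch (not part of the original disclosure) of cutting an organ image block around a subset of initial face key points and resizing it to the input size of a first neural network; the index set `organ_idx`, the margin value, and the 64x64 output size are assumptions.

```python
import numpy as np
import cv2  # OpenCV, used here only for resizing

def cut_organ_block(image, keypoints, organ_idx, out_size=(64, 64), margin=0.3):
    """Cut an organ image block around the key points selected by organ_idx."""
    pts = keypoints[organ_idx]                       # (K, 2) subset for one organ
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    # Expand the tight bounding box by a relative margin so the whole organ fits.
    w, h = x1 - x0, y1 - y0
    x0 = max(int(x0 - margin * w), 0)
    y0 = max(int(y0 - margin * h), 0)
    x1 = min(int(x1 + margin * w), image.shape[1])
    y1 = min(int(y1 + margin * h), image.shape[0])
    block = cv2.resize(image[y0:y1, x0:x1], out_size)
    return block, (x0, y0, x1, y1)                   # box kept for coordinate mapping later
```

- The returned box makes it possible to map key points detected in the block back into the coordinates of the image to be processed, as described further below.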
- The initial face key point information of the image to be processed in the present application may include, but is not limited to, the number information of the initial face key points and the coordinate information of the initial face key points in the image to be processed.
- The number of initial face key points included in the initial face key point information in the present application is usually less than or equal to a certain set value; for example, the initial face key point information includes 21, 68, or 106 initial face key points.
- the present application can utilize an existing neural network (ie, a second neural network) to extract initial face key points of an image to be processed, and obtain initial face key point information.
- The foregoing second neural network may include: a face detection deep neural network for detecting the face position, and a face key point deep neural network for detecting face key points. The image to be processed may first be input into the face detection deep neural network, which outputs face position information (such as face frame information) of the image to be processed; then the image to be processed and the face position information are input into the face key point deep neural network, which determines the area to be detected in the image according to the face position information and performs face key point detection on the image of that area, so that the face key point deep neural network outputs face key point information for the image to be processed. This face key point information is the initial face key point information.
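- As an illustration of the two-stage pipeline just described (a sketch rather than the disclosed implementation; `face_detector` and `keypoint_net` are hypothetical callables returning a box tuple and an (N, 2) NumPy array, respectively):

```python
def extract_initial_keypoints(image, face_detector, keypoint_net):
    """Run face detection, then key point detection on the detected region."""
    x0, y0, x1, y1 = face_detector(image)      # face frame information
    face_region = image[y0:y1, x0:x1]
    kps = keypoint_net(face_region)            # (N, 2) array in region coordinates
    kps[:, 0] += x0                            # map back to whole-image coordinates
    kps[:, 1] += y0
    return kps                                 # the initial face key points
```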
- Step S110: input the at least one organ image block into at least one first neural network, respectively. This step may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by an input module 1010 executed by the processor.
- The present application is provided with one, two, or more first neural networks, and each first neural network is used to extract corresponding organ key points from the image block input into it.
- Different categories of organs correspond to different first neural networks. That is, if the organs in two organ image blocks are not of the same category (for example, eyes and mouth belong to different categories), the two organ image blocks should be provided to two different first neural networks; if the organs in two organ image blocks are of the same category (for example, the left eye and the right eye belong to the same category), the two organ image blocks can be provided to the same first neural network.
- the first neural network is a neural network that is pre-trained by means of supervised, semi-supervised or unsupervised methods for the key point information positioning task of an organ of a human face.
- the specific manner of training is not limited in the embodiment of the present application.
- the first neural network may be pre-trained in a supervised manner, such as pre-training the first neural network with annotated data for an organ of the face.
- The network structure of the first neural network can be flexibly designed according to the needs of the key point information positioning task, and is not limited by the embodiments of the present application.
- The first neural network may include, but is not limited to, convolution layers, nonlinear ReLU layers, pooling layers, fully connected layers, and so on; the more network layers, the deeper the network. The network structure of the first neural network may adopt, for example, a network such as AlexNet, a Deep Residual Network (ResNet), or VGGnet (Visual Geometry Group Network).
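- A minimal PyTorch sketch of such a first neural network follows (layer counts, channel sizes, and the 64x64 input are illustrative assumptions, not values disclosed by the patent):

```python
import torch.nn as nn

class OrganKeypointNet(nn.Module):
    """Convolution + ReLU + pooling stages followed by fully connected regression."""
    def __init__(self, num_keypoints):
        super().__init__()
        self.num_keypoints = num_keypoints
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 16 -> 8
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, 256), nn.ReLU(),
            nn.Linear(256, num_keypoints * 2),       # one (x, y) pair per key point
        )

    def forward(self, x):                            # x: (B, 3, 64, 64) organ blocks
        out = self.regressor(self.features(x))
        return out.view(-1, self.num_keypoints, 2)   # block-coordinate key points
```

- For example, `OrganKeypointNet(num_keypoints=22)` would match the 22 eyelid line key points numbered 0-21 in FIG. 2.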
- The present application is provided with a first neural network for eye key point information positioning for an eye image block; this first neural network may also be called an eye key point information positioning deep neural network. It may be pre-trained in a supervised, semi-supervised, or unsupervised manner for the eye key point information positioning task; in an optional example, this first neural network may be pre-trained using eye key point information and/or eyelid-line-related annotation data.
- The present application is likewise provided with a first neural network for lip line positioning for a mouth image block; this first neural network may also be called a mouth key point information positioning deep neural network. It may be pre-trained in a supervised, semi-supervised, or unsupervised manner for the lip key point information positioning task; for example, in an optional example, this first neural network may be pre-trained using lip-line-related annotation data.
- The left eye image block and the right eye image block cut out from the image to be processed are respectively input into the eye key point information positioning deep neural network.
- Alternatively, a left eye image block, or a right eye image block, or a binocular image block cut out from the image to be processed is input into the eye key point information positioning deep neural network.
- A mouth image block cut from the image to be processed is input into the mouth key point information positioning deep neural network.
- Image blocks of other organs of the face, such as a left eyebrow image block, a right eyebrow image block, or a nose image block, may likewise be cut out from the image to be processed, and their key point information may be extracted separately by neural networks pre-trained with eyebrow or nose annotation data; the details are not repeated here.
- S120: Extract, via the at least one first neural network, key point information of the organ in each input organ image block, so as to respectively obtain key point information of at least one corresponding organ of the human face.
- The step S120 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by an organ key point extraction module 1020 executed by the processor.
- The key point information extracted from the eye image block by the eye key point information positioning deep neural network includes: eye key point information and/or eyelid line information.
- The eye key point information includes key point information such as the eye corners, the eye center, and other key points.
- The eyelid line information is the trajectory information or the fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye, for example, upper eyelid line information and/or lower eyelid line information.
- A larger number of key points does not always mean better system performance: increasing the number of key points helps to improve the accuracy of describing the shape of the eye, but it also brings larger computational overhead and reduces computation speed.
- The present application expresses the trajectory information or fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye as eyelid line information. The accuracy of the eye shape described by this eyelid line information can meet the needs of various applications that have precise requirements for eye shape, and it is also useful for detecting eye states, such as a blinking state or a closed-eye state.
- The intersection point of the upper eyelid line and the lower eyelid line at the inner corner of the eye is the inner eye corner key point, and the intersection point of the upper eyelid line and the lower eyelid line at the outer corner of the eye is the outer eye corner key point.
- The inner eye corner key point can be classified as a key point of the upper eyelid line, and can also be classified as a key point of the lower eyelid line.
- The outer eye corner key point can be classified as a key point of the upper eyelid line, and can also be classified as a key point of the lower eyelid line.
- The inner eye corner key point and the outer eye corner key point can also belong to neither the upper eyelid line nor the lower eyelid line, but exist independently.
- The number of organ key points extracted by the eye key point information positioning deep neural network from the left eye image block may be the same as the number of organ key points it extracts from the right eye image block.
- In the example of FIG. 2, the key point numbered 10 can be called the inner eye corner key point, and the key point numbered 11 can be called the outer eye corner key point; the upper eyelid line information is represented by the key points numbered 11-..., and the lower eyelid line information is represented by the key points numbered 11-0-1-2-3-...
- The number of key points included in the eyelid line information extracted by the eye key point information positioning deep neural network of the present application is larger than the number of key points located at the eye position included in the initial face key point information.
- The present application can fit a curve A representing the shape of the eye, such as an upper eyelid curve or a lower eyelid curve, from the eyelid line key points extracted by the eye key point information positioning deep neural network, and can fit a curve B representing the eyelid from the key points at the eye position among the 106 initial face key points. Actual measurement shows that the error degree of curve A relative to the actual eyelid line shape is 1/5 to 1/10 of the error degree of curve B relative to the actual eyelid line shape.
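- The following sketch shows one plausible way (an assumption; the patent does not specify the fitting method or error metric) to fit such curves through key points and compare their error degrees against a densely sampled ground-truth eyelid curve:

```python
import numpy as np
from scipy.interpolate import splprep, splev

def fit_curve(points, n_samples=200):
    """Fit a parametric spline through ordered 2-D key points and sample it densely."""
    k = min(3, len(points) - 1)                  # spline degree, capped for few points
    tck, _ = splprep([points[:, 0], points[:, 1]], s=0, k=k)
    x, y = splev(np.linspace(0, 1, n_samples), tck)
    return np.stack([x, y], axis=1)

def error_degree(fitted, reference):
    """Mean distance from each fitted sample to the nearest reference sample."""
    d = np.linalg.norm(fitted[:, None, :] - reference[None, :, :], axis=-1)
    return d.min(axis=1).mean()

# curve_a = fit_curve(eyelid_keypoints)     # e.g. 11 key points from the first network
# curve_b = fit_curve(initial_eye_points)   # eye key points from the 106-point model
# error_degree(curve_a, gt) / error_degree(curve_b, gt)   # reportedly ~1/5 to 1/10
```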
- By separately extracting eyelid line information for the eye image block, the present application can improve the accuracy with which the extracted key point information describes the shape of the eye.
- The technical solution of the present application is beneficial to improving the precision and accuracy of eye key point information extraction, and to improving subsequent applications based on this information. For example, when determining a person's facial expression, the eyelid line information can be used as an important reference factor to improve the accuracy of the determination; for example, when performing image rendering, rendering information such as a sticker may be drawn on the eye area of the image based on the eyelid line using computer drawing methods, improving the accuracy with which the rendering information is drawn; for example, beautification and/or makeup processing may be performed on the image based on the eyelid line to improve the beautification and/or makeup effect.
- The key points extracted from the mouth image block by the mouth key point information positioning deep neural network usually include: key points of the lip line, for example, key points of the upper lip line and key points of the lower lip line.
- The key points of the upper lip line may include: key points of the upper lip line of the upper lip; and may also include: key points of the upper lip line of the upper lip and key points of the lower lip line of the upper lip.
- The key points of the lower lip line may include: key points of the upper lip line of the lower lip; and may also include: key points of the upper lip line of the lower lip and key points of the lower lip line of the lower lip.
- The lip line information in the present application may include, but is not limited to, trajectory information or fitted lines represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- The lip line of the upper lip is represented by 16-21 key points at the upper contour of the upper lip and 16-21 key points at the lower contour of the upper lip; an optional example is shown in FIG. 3.
- The lip line of the lower lip is represented by 16-21 key points at the upper contour of the lower lip and 16-21 key points at the lower contour of the lower lip; an optional example is shown in FIG. 4.
- The present application expresses the trajectory information or fitted lines represented by the 16-21 key points at the upper lip contour and the 16-21 key points at the lower contour as lip line information. The accuracy of the lip shape described using this lip line information can meet the needs of a variety of applications that have precise requirements for lip shape or mouth state detection, and it also facilitates the detection of mouth states such as an open-mouth state, a yawning state, or a closed-mouth state.
- The intersection points, at the two mouth corners, of the upper lip line of the upper lip, the lower lip line of the upper lip, the upper lip line of the lower lip, and the lower lip line of the lower lip are the two mouth corner key points; either mouth corner key point can be classified as belonging to the upper lip line of the upper lip, the lower lip line of the upper lip, the upper lip line of the lower lip, or the lower lip line of the lower lip.
- The two mouth corner key points can also belong neither to the upper lip line and lower lip line of the upper lip, nor to the upper lip line and lower lip line of the lower lip, but exist independently.
- The number of lip line key points extracted by the mouth key point information positioning deep neural network of the present application is greater than the number of key points located at the mouth position included in the initial face key point information.
- The present application can fit a curve C representing the shape of the upper lip from the lip line key points extracted by the mouth key point information positioning deep neural network, for example by fitting the curve of the upper lip line of the upper lip and the curve of the lower lip line of the upper lip.
- The technical solution of the present application is beneficial to improving the precision and accuracy of lip key point information extraction, and to improving the accuracy of subsequent applications based on these key points. For example, when determining a person's facial expression, the lip line can be used as an important reference factor to improve the accuracy of the determination; for example, when performing image rendering, rendering information may be drawn on the mouth of the image based on the lip line; for example, beautification and/or makeup processing may be performed on the image based on the lip line to improve the beautification and/or makeup effect.
- The key point information of the organs may also include nose key point information, eyebrow key point information, eye center key point information, and the like.
- The key point information of the organs of the face obtained by the present application can be used for image rendering of the face, face changing processing, beautification processing, makeup processing, face recognition, face state detection, expression detection, and detection of face attributes (such as male/female, age, or ethnicity).
- the present application does not limit the specific application range of the key point information of the obtained organ.
- FIG. 6 is a flow chart of an embodiment of a method of the present application. As shown in FIG. 6, the method of this embodiment includes: step S600, step S610, step S620, and step S630.
- Step S600: Cut a face in the image to be processed to obtain at least one organ image block.
- For the content of this step, refer to the description of step S100 in FIG. 1, which is not repeated here.
- the step S600 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a cutting module 1000 executed by the processor.
- Step S610: Input the at least one organ image block into the at least one first neural network, respectively.
- For the content of this step, refer to the description of step S110 in FIG. 1, which is not repeated here.
- step S610 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the input module 1010 being executed by the processor.
- Step S620: Extract, via the at least one first neural network, key point information of the organ in each input organ image block, so as to respectively obtain key point information of at least one corresponding organ of the human face.
- For the content of this step, refer to the description of step S120 in FIG. 1, which is not repeated here.
- The step S620 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by an organ key point extraction module 1020 executed by the processor.
- Step S630: Integrate the initial face key point information and the key point information of the at least one corresponding organ to obtain face key point information of the image to be processed. This step may be performed by the processor invoking a corresponding instruction stored in the memory, or by an integration module 1040 executed by the processor.
- the present application can integrate the initial face key point information and the key point information of at least one corresponding organ through various integration methods.
- the following two simple integration methods are briefly introduced:
- First optional example: a union-based integration method.
- the initial face key points of the to-be-processed image extracted by the second neural network are obtained.
- the acquired initial face key points may be 21 or 68 or 106 initial face key points.
- The key points of the eyelid lines, the key points of the lip lines, and the key points of the eyebrow lines are respectively subjected to number conversion and position information conversion.
- For example, the eyelid line key point information output by the eye key point information positioning deep neural network includes: numbers assigned according to a predetermined arrangement order of the eyelid line key points (such as numbers 0 to 21 in FIG. 2) and the coordinate information of the eyelid line key points in the eye image block. The present application can renumber these key points according to a preset face key point order, for example converting numbers 0-21 in FIG. 2 into numbers 132-153 in FIG. 5.
- The conversion of position information in the present application is usually: mapping the coordinate information of an eyelid line key point in the eye image block to the coordinate information of that key point in the image to be processed.
- For the number conversion and position information conversion of the lip line key points and the eyebrow line key points, reference may be made to the above description of the conversion of the eyelid line key points, which is not repeated here.
- The present application usually also converts the numbers of some or all of the initial face key points.
- The eyelid line key point information, lip line key point information, and eyebrow line key point information after number and position information conversion are merged with the number-converted initial face key point information to form the face key point information of the image to be processed, for example forming face key point information with 186, 240, 220, or 274 key points.
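- A minimal sketch of this union-based integration follows (an illustration, assuming the block box and resize size recorded at cutting time, as in the cutting sketch above):

```python
def integrate_union(initial_kps, organ_kps, block_box, block_size, start_number):
    """Merge organ key points into the face-level set as {number: (x, y)}."""
    x0, y0, x1, y1 = block_box
    sx = (x1 - x0) / block_size[0]          # undo the resize applied at cutting time
    sy = (y1 - y0) / block_size[1]
    merged = {i: tuple(p) for i, p in enumerate(initial_kps)}
    for i, (bx, by) in enumerate(organ_kps):
        # e.g. eyelid numbers 0-21 in the block become 132-153 in the face set
        merged[start_number + i] = (x0 + bx * sx, y0 + by * sy)
    return merged
```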
- Second optional example: an integration method based on partial replacement or full replacement.
- the initial face key points of the to-be-processed image extracted by the second neural network are obtained.
- the acquired initial face key points may be 21 or 68 or 106 initial face key points.
- The key points of the eyelid lines, the key points of the lip lines, and the key points of the eyebrow lines are respectively subjected to number and position information conversion, and the numbers of some or all of the initial face key points are converted; optionally, this is done as described in the first optional example above, and is not repeated here.
- The eyelid line key point information after number and position information conversion replaces part of the key point information at the eye position in the initial face key point information; for example, the eyelid line key point information numbered 132-153 in FIG. 5 replaces the key point information at the eye position numbered 52-57, 72, and 73 in the initial face key point information in FIG. 7. Of course, the present application can also use the converted eyelid line key point information to replace all of the key point information at the eye position in the initial face key point information; for example, the eyelid line key point information numbered 132-153 in FIG. 5 replaces the key point information numbered 52-57, 72-74, and 104 in FIG. 7.
- The present application can similarly use the lip line key point information and eyebrow line key point information after number and position information conversion to replace the key points at the lip position and the eyebrow position in the number-converted initial face key point information.
- The key point information numbered 52-57, 72-74, and 104 in FIG. 7 may also be retained for locating the eye position, for example for locating the eye area.
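- A minimal sketch of the replacement-based integration follows (the replaced number set mirrors the example above and is otherwise illustrative):

```python
def integrate_replace(face_kps, eyelid_kps, replaced_numbers):
    """face_kps and eyelid_kps are {number: (x, y)} dicts after conversion."""
    for n in replaced_numbers:              # e.g. {52, 53, 54, 55, 56, 57, 72, 73}
        face_kps.pop(n, None)               # drop, or skip this loop to retain them
    face_kps.update(eyelid_kps)             # add eyelid key points, e.g. 132-153
    return face_kps
```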
- The final face key point information extracted from the image may contain a certain number of key points greater than 106; for example, 186, 240, 220, or 274 key points may be extracted according to the needs of the business application.
- The face key point information may include:
- Eye key point information (48-72 key points), including: 4-24 key points for locating the eye position (including key points for locating the eye area and key points for locating the eye position), and 44-48 key points included in the eyelid line information (such as the four eyelid lines corresponding to both eyes);
- Mouth key point information, including: 0-20 key points for locating the mouth position, and 60-64 key points included in the lip line information (such as the lip lines corresponding to the two lips).
- The first neural network of the present application generally includes an input layer, a plurality of convolution layers for extracting features, at least one fully connected layer for determining the coordinate information of organ key points in the organ image block, and an output layer.
- A sample data set for training a first neural network typically includes a plurality of image samples, each of which contains a face image. Each image sample is labeled with face key point annotation data; for example, each image sample is labeled with the numbers of more than 106 key points and the coordinate information of those key points in the image sample.
- During training, an image sample is first selected from the sample data set and input into the first neural network. The input layer of the first neural network cuts an organ image block, such as an eye image block, an eyebrow image block, or a mouth image block, out of the image sample according to the sample's annotation data, and adjusts the size of the cut organ image block. The input layer then converts the numbers and coordinate information of the key points: for each organ key point in the annotation data, its number among the key points of the entire image sample and its coordinate information in the image sample are converted into a number among the key points of the organ image block and coordinate information in the organ image block. The cut and resized organ image block is supplied to the convolution layers for extracting features, and the image features of the organ image block extracted by the convolution layers are used by the fully connected layer to determine the coordinate information of the organ key points in the organ image block.
- The present application can use the numbers and coordinate information of the key points after the input layer conversion processing to supervise the data output by the output layer of the first neural network. The above training process is repeated, and when the error of the key point coordinate information output by the first neural network satisfies a predetermined error requirement, the first neural network is successfully trained.
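- A minimal PyTorch training sketch for a first neural network follows; the patent only requires supervision until a predetermined error requirement is met, so the MSE loss, the Adam optimizer, and the loader layout are assumptions:

```python
import torch
import torch.nn as nn

def train_first_network(net, loader, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for blocks, target_kps in loader:   # target_kps: (B, K, 2), block coordinates
            pred = net(blocks)              # (B, K, 2) regressed coordinates
            loss = loss_fn(pred, target_kps)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net
```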
- At least one organ key point's annotation data in an image sample of the present application is labeled by the following process. First, curve control points of the corresponding organ of the face are determined (e.g., control points of the upper/lower eyelid line of an eye, or of the upper/lower lip lines of the upper and lower lips of the mouth). Secondly, a curve is formed from the curve control points. Thirdly, an interpolation method (such as uniform or non-uniform interpolation) is used to insert a plurality of points into the curve: for example, if the curve is the upper or lower eyelid line of a single eye, 10-15 points (such as 11) are inserted; if the curve is the upper or lower lip line of the upper lip, 16-21 points (such as 17) are inserted; and if the curve is the upper or lower lip line of the lower lip, 16-21 points (such as 16) are inserted.
- The coordinates of the points inserted into the curve, expressed in the image sample, become the coordinate information in the corresponding organ key point annotation data, and the sequential numbers of the inserted points are converted into the numbers in that annotation data.
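- One way to realize the "form a curve from control points, then interpolate" step is a Catmull-Rom spline, sketched below. The spline type and the example control point coordinates are illustrative assumptions, as the patent does not fix the curve or interpolation method:

```python
import numpy as np

def catmull_rom(ctrl, n_points):
    """Insert n_points along a smooth curve through the given control points.
    ctrl: (M, 2) array of curve control points, M >= 2."""
    ctrl = np.asarray(ctrl, dtype=float)
    pts = np.vstack([ctrl[0], ctrl, ctrl[-1]])  # pad endpoints so the curve hits them
    out = []
    n_seg = len(ctrl) - 1
    for t in np.linspace(0.0, n_seg, n_points):
        i = min(int(t), n_seg - 1)              # which spline segment t falls in
        u = t - i
        p0, p1, p2, p3 = pts[i], pts[i + 1], pts[i + 2], pts[i + 3]
        out.append(0.5 * ((2 * p1) + (-p0 + p2) * u
                          + (2 * p0 - 5 * p1 + 4 * p2 - p3) * u ** 2
                          + (-p0 + 3 * p1 - 3 * p2 + p3) * u ** 3))
    return np.array(out)

# Hypothetical control points; 11 points for an upper eyelid line, 17 for an upper lip line.
upper_eyelid = catmull_rom([(0, 5), (6, 1), (14, 0), (22, 2), (28, 6)], 11)
upper_lip_line = catmull_rom([(0, 8), (8, 2), (16, 3), (24, 2), (32, 8)], 17)
```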
- It should be noted that the number of points inserted into a curve may be determined according to actual needs. However, the number of points inserted for one curve should ensure that the error degree of the curve fitted through the inserted points, relative to the actual organ curve of the face, is 1/5-1/10 of the error degree of the curve formed by the curve control points relative to the actual organ curve. The shape expressed by the organ key point annotation data formed for an image sample in the present application is therefore closer to the actual organ shape, which in turn facilitates training the first neural network.
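- The "error degree" is not given a formal definition in the text; one plausible reading, sketched below under that assumption, is the mean nearest-point distance from a sampled curve to the true organ curve, with the annotation check then comparing the two curves' error degrees:

```python
import numpy as np

def error_degree(fitted_pts, true_curve_pts):
    """Mean nearest-point distance from sampled curve points to the true organ
    curve (given here as a dense polyline). A plausible, assumed metric."""
    d = np.linalg.norm(fitted_pts[:, None, :] - true_curve_pts[None, :, :], axis=-1)
    return d.min(axis=1).mean()

# The annotation quality requirement then amounts to verifying something like:
#   error_degree(curve_from_inserted_points, true_curve)
#       <= error_degree(curve_from_control_points, true_curve) / 5
```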
- Figure 8 is a flow chart of one embodiment of the method of the present application. As shown in Figure 8, the method of this embodiment includes step S800 and step S810.
- S800: Acquire an image to be processed that includes at least a partial region of a face.
- The step S800 may be performed by a processor invoking corresponding instructions stored in a memory, or by a second acquisition module 1100 run by the processor.
- The image to be processed of the present application, including at least a partial region of a face, may be a left-eye image, a right-eye image, or a binocular image; it may also be an image that contains multiple different types of face organs.
- Based on the initial face key point information of the image to be processed, the present application may cut the image to obtain a monocular image block or a binocular image block, which becomes the image to be processed that is input to the neural network.
- The obtained monocular or binocular image block may also be resized, the resized block becoming the image input to the neural network.
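- A minimal sketch of this cutting-and-resizing step follows, assuming OpenCV for the image operations and an illustrative margin and output size; the same logic applies to the lip image blocks discussed later:

```python
import numpy as np
import cv2  # OpenCV is assumed here purely for cropping and resizing

def crop_eye_block(image, eye_pts, out_size=64, margin=0.25):
    """Cut an eye image block around the initial eye key points and resize it
    for the neural network. Returns the block and the crop box used."""
    eye_pts = np.asarray(eye_pts, dtype=float)
    x0, y0 = eye_pts.min(axis=0)
    x1, y1 = eye_pts.max(axis=0)
    pad_x, pad_y = margin * (x1 - x0), margin * (y1 - y0)   # loosen the tight box
    h, w = image.shape[:2]
    x0 = max(int(x0 - pad_x), 0); y0 = max(int(y0 - pad_y), 0)
    x1 = min(int(x1 + pad_x), w); y1 = min(int(y1 + pad_y), h)
    block = image[y0:y1, x0:x1]
    return cv2.resize(block, (out_size, out_size)), (x0, y0, x1, y1)
```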
- S810: Extract eyelid line information from the image to be processed based on the neural network. This step may be performed by the processor invoking corresponding instructions stored in the memory, or by an eyelid line extraction module 1110 run by the processor.
- The eyelid line information of the present application includes trajectory information or a fitted line represented by 10-15 key points at the upper eyelid or lower eyelid of a single eye.
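- As an illustration of turning 10-15 such key points into a "fitted line", the sketch below uses a simple least-squares polynomial; the fitting method and degree are assumptions, since the patent leaves the form of the trajectory or fitted line open:

```python
import numpy as np

def fit_eyelid_line(eyelid_pts, degree=3):
    """Fit y = f(x) through the eyelid key points and sample it densely."""
    pts = np.asarray(eyelid_pts, dtype=float)
    coeffs = np.polyfit(pts[:, 0], pts[:, 1], degree)       # least-squares fit
    xs = np.linspace(pts[:, 0].min(), pts[:, 0].max(), 100)
    return xs, np.polyval(coeffs, xs)  # dense samples of the fitted eyelid line
```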
- the neural network of the present application is a neural network trained based on a sample data set.
- the sample data set used to train the neural network includes: eye key point annotation data.
- The eye key point annotation data may be set as follows: first, determine curve control points of the eyelid line; secondly, form a first curve from the curve control points; and finally, insert a plurality of points into the first curve by interpolation, the information of the inserted points being the eye key point annotation data.
- The error degree of the second curve, fitted through the inserted points, relative to the true eyelid line is 1/5-1/10 of the error degree of the first curve relative to the true eyelid line.
- Figure 9 is a flow chart of one embodiment of the method of the present application. As shown in Figure 9, the method of this embodiment includes step S900 and step S910.
- S900: Acquire an image to be processed that includes at least a partial region of a face.
- The step S900 may be performed by a processor invoking corresponding instructions stored in the memory, or by a second acquisition module 1100 run by the processor.
- The image to be processed of the present application, including at least a partial region of the face, may be an upper lip image, a lower lip image, or a mouth image containing both the upper and lower lips; it may also be an image that contains multiple different types of face organs.
- Based on the initial face key point information of the image to be processed, the present application may cut the image to obtain an upper lip image block, a lower lip image block, or a double-lip image block (i.e., a mouth image block containing both the upper and lower lips), which serves as the image to be processed that is input to the neural network.
- The obtained upper lip, lower lip, or double-lip image block may also be resized, the resized block becoming the image input to the neural network.
- S910: Extract lip line information from the image to be processed based on the neural network. This step may be performed by the processor invoking corresponding instructions stored in the memory, or by a lip line extraction module 1200 run by the processor.
- The lip line information of the present application includes trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- the neural network of the present application is a neural network trained based on a sample data set.
- The sample data set used to train the neural network includes lip key point annotation data.
- The lip key point annotation data may be set as follows: first, determine curve control points of the lip line; secondly, form a first curve from the curve control points; and finally, insert a plurality of points into the first curve by interpolation, the information of the inserted points being the lip key point annotation data.
- The error degree of the second curve, fitted through the inserted points, relative to the true lip line is 1/5-1/10 of the error degree of the first curve relative to the true lip line.
- FIG. 10 is a schematic structural view of an embodiment of a device according to the present application.
- The apparatus of this embodiment includes a cutting module 1000, an input module 1010, and an organ key point extraction module 1020.
- the apparatus of this embodiment may further include: a first obtaining module 1030, an integration module 1040, a first training module 1050, a first labeling module 1060, and an application processing module 1070.
- the cutting module 1000 is configured to cut a face in an image to be processed to obtain at least one organ image block.
- The input module 1010 is configured to input the at least one organ image block into at least one first neural network, where at least two different classes of organs correspond to different first neural networks.
- The organ key point extraction module 1020 is configured to extract, via the at least one first neural network, the key point information of the organs in the respective input organ image blocks, obtaining the key point information of at least one corresponding organ of the face.
- the first obtaining module 1030 is configured to acquire initial face key point information of the image to be processed.
- the integration module 1040 is configured to integrate the initial face key point information and the key point information of the at least one corresponding organ to obtain the face key point information of the image to be processed.
- The first training module 1050 is configured to train the first neural network based on the sample data set.
- the sample data set includes: key point annotation data of the organ image of the face.
- The first labeling module 1060 is configured to acquire the key point annotation data of an organ image of a face.
- The operations that the first labeling module 1060 can perform include: determining curve control points of a face organ; forming a first curve from the curve control points; and inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the key point annotation data.
- The application processing module 1070 is configured to perform, according to the obtained key point information of the corresponding organs of the face, at least one of the following: image rendering of the face, face changing, beautification, makeup application, face recognition, face state detection, expression detection, and attribute detection.
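- As a toy example of the image-rendering application, the sketch below draws an extracted lip line onto the image, assuming OpenCV for drawing; real beautification or makeup pipelines would instead warp textures or colors along the same curve:

```python
import numpy as np
import cv2  # assumed here only for drawing

def draw_lip_line(image, lip_pts, color=(0, 0, 255), thickness=2):
    """Render the extracted lip line key points as a polyline overlay."""
    pts = np.asarray(lip_pts, dtype=np.int32).reshape(-1, 1, 2)
    cv2.polylines(image, [pts], isClosed=False, color=color, thickness=thickness)
    return image
```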
- Figure 11 is a schematic structural view of an embodiment of the apparatus of the present application.
- The apparatus of this embodiment includes a second acquisition module 1100 and an eyelid line extraction module 1110.
- the device may further include: a second training module 1120 and a second labeling module 1130.
- the second obtaining module 1100 is configured to acquire a to-be-processed image including at least a partial area of the face.
- The eyelid line extraction module 1110 is configured to extract eyelid line information from the image to be processed based on the neural network, where the eyelid line information includes trajectory information or a fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye.
- the second training module 1120 is configured to train the neural network based on the sample data set, wherein the sample data set includes: eye key point annotation data.
- The second labeling module 1130 is configured to acquire the eye key point annotation data; the operations it performs may include: determining curve control points of the eyelid line; forming a first curve from the curve control points; and inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the eye key point annotation data.
- Figure 12 is a schematic structural view of an embodiment of the apparatus of the present application.
- The apparatus of this embodiment includes a second acquisition module 1100 and a lip line extraction module 1200.
- the device may further include: a third training module 1210 and a third labeling module 1220.
- the second obtaining module 1100 is configured to acquire a to-be-processed image including at least a partial area of the face.
- The lip line extraction module 1200 is configured to extract lip line information from the image to be processed based on the neural network, where the lip line information includes trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- the third training module 1210 is configured to train the neural network based on the sample data set, wherein the sample data set includes: lip key point annotation data.
- The third labeling module 1220 is configured to acquire the lip key point annotation data; the operations it performs may include: determining curve control points of the lip line; forming a first curve from the curve control points; and inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the lip key point annotation data.
- FIG. 13 illustrates an exemplary device 1300 suitable for implementing the present application, which may be a control system/electronic system configured in a car, a mobile terminal (e.g., a smart mobile phone), a personal computer (PC, e.g., a desktop or notebook computer), a tablet computer, a server, and the like.
- Device 1300 includes one or more processors, a communication part, and the like. The one or more processors may be one or more central processing units (CPUs) 1301 and/or one or more graphics processing units (GPUs) 1313.
- The processor may perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 1302 or executable instructions loaded from the storage portion 1308 into a random access memory (RAM) 1303. The communication part 1312 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card. The processor may communicate with the read-only memory 1302 and/or the random access memory 1303 to execute the executable instructions, connect to the communication part 1312 via a bus 1304, and communicate with other target devices via the communication part 1312, thereby completing the corresponding steps of the present application.
- In one optional example, the instructions executed by the processor include: instructions for cutting a face in the image to be processed to obtain at least one organ image block; instructions for inputting the at least one organ image block into at least one first neural network, where at least two different classes of organs correspond to different first neural networks; and instructions for extracting, via the at least one first neural network, the key point information of the organs in the respective input organ image blocks, to obtain the key point information of at least one corresponding organ of the face.
- The instructions executed by the processor may optionally further include: instructions for acquiring initial face key point information of the image to be processed; instructions for integrating the initial face key point information with the key point information of the at least one corresponding organ to obtain the face key point information of the image to be processed; instructions for training the first neural network based on the sample data set, where the sample data set includes key point annotation data of organ images of faces; and instructions for acquiring the key point annotation data of an organ image of a face, which may include instructions for determining curve control points of a face organ, instructions for forming a first curve from the curve control points, and instructions for inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the key point annotation data. They may also include instructions for performing, according to the obtained key point information of the corresponding organs of the face, at least one of the following: image rendering of the face, face changing, beautification, makeup application, face recognition, face state detection, expression detection, and attribute detection.
- In another optional example, the instructions executed by the processor include: instructions for acquiring an image to be processed that includes at least a partial region of a face; and instructions for extracting eyelid line information from the image to be processed based on the neural network, where the eyelid line information includes trajectory information or a fitted line represented by 10-15 key points at the upper or lower eyelid of a single eye.
- The instructions executed by the processor may optionally further include: instructions for training the neural network based on the sample data set, where the sample data set includes eye key point annotation data.
- The instructions for acquiring the eye key point annotation data may include: instructions for determining curve control points of the eyelid line; instructions for forming a first curve from the curve control points; and instructions for inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the eye key point annotation data.
- In another optional example, the instructions executed by the processor include: instructions for acquiring an image to be processed that includes at least a partial region of a face; and instructions for extracting lip line information from the image to be processed based on the neural network, where the lip line information includes trajectory information or a fitted line represented by 16-21 key points at the upper contour and 16-21 key points at the lower contour of a single lip.
- The instructions executed by the processor may optionally further include: instructions for training the neural network based on the sample data set, where the sample data set includes lip key point annotation data.
- The instructions for acquiring the lip key point annotation data may include: instructions for determining curve control points of the lip line; instructions for forming a first curve from the curve control points; and instructions for inserting a plurality of points into the first curve by interpolation, the information of the inserted points being the lip key point annotation data.
- In addition, various programs and data required for the operation of the device may be stored in the RAM 1303.
- the CPU 1301, the ROM 1302, and the RAM 1303 are connected to each other through a bus 1304.
- When the RAM 1303 is present, the ROM 1302 is an optional module.
- The RAM 1303 stores executable instructions, or writes executable instructions into the ROM 1302 at runtime, the executable instructions causing the central processing unit 1301 to perform the steps included in the face image processing methods described above.
- An input/output (I/O) interface 1305 is also coupled to bus 1304.
- The communication part 1312 may be provided integrally, or may be configured with a plurality of sub-modules (for example, a plurality of IB network cards) each connected to the bus.
- The following components are connected to the I/O interface 1305: an input portion 1306 including a keyboard, a mouse, and the like; an output portion 1307 including a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage portion 1308 including a hard disk and the like; and a communication portion 1309 including a network interface card such as a LAN card or a modem.
- the communication section 1309 performs communication processing via a network such as the Internet.
- A drive 1310 is also connected to the I/O interface 1305 as needed. A removable medium 1311, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1310 as needed, so that a computer program read from it can be installed into the storage portion 1308 as needed.
- It should be noted that the architecture shown in FIG. 13 is only one optional implementation. In practice, the number and types of the components in FIG. 13 may be selected, reduced, increased, or replaced according to actual needs; for different functional components, implementations such as separate arrangement or integrated arrangement may also be adopted.
- For example, the GPU 1313 and the CPU 1301 may be arranged separately, or the GPU 1313 may be integrated on the CPU 1301; the communication part may be arranged separately, or may be integrated on the CPU 1301 or the GPU 1313. All of these alternative implementations fall within the protection scope of the present application.
- In particular, according to embodiments of the present application, the process described with reference to the flowcharts may be implemented as a computer software program. For example, embodiments of the present application include a computer program product comprising a computer program tangibly embodied on a machine-readable medium; the computer program contains program code for performing the steps shown in the flowcharts, and the program code may include instructions corresponding to the steps provided in the present application.
- the computer program can be downloaded and installed from the network via the communication portion 1309, and/or installed from the removable medium 1311.
- When the computer program is executed by the central processing unit (CPU) 1301, the above instructions described in the present application are executed.
- the methods and apparatus of the present application may be implemented in a number of ways.
- the methods and apparatus of the present application can be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
- the above-described sequence of steps for the method is for illustrative purposes only, and the steps of the method of the present application are not limited to the order specifically described above unless otherwise specifically stated.
- The present application may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present application. Thus, the present application also covers a recording medium storing programs for executing the methods according to the present application.
Abstract
Embodiments of the present application disclose a face image processing method, apparatus, and electronic device. The method includes: cutting a face in an image to be processed to obtain at least one organ image block; inputting the at least one organ image block into at least one first neural network, where at least two different classes of organs correspond to different first neural networks; and extracting, via the at least one first neural network, the key point information of the organs in the respective input organ image blocks, to obtain the key point information of at least one corresponding organ of the face. The organ key points obtained with the first neural network in embodiments of the present application are characterized by accurate positions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/456,738 US11227147B2 (en) | 2017-08-09 | 2019-06-28 | Face image processing methods and apparatuses, and electronic devices |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710677556.8A CN108229293A (zh) | 2017-08-09 | 2017-08-09 | 人脸图像处理方法、装置和电子设备 |
CN201710677556.8 | 2017-08-09 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/456,738 Continuation US11227147B2 (en) | 2017-08-09 | 2019-06-28 | Face image processing methods and apparatuses, and electronic devices |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019029486A1 true WO2019029486A1 (zh) | 2019-02-14 |
Family
ID=62654894
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/098999 WO2019029486A1 (zh) | 2017-08-09 | 2018-08-06 | 人脸图像处理方法、装置和电子设备 |
Country Status (3)
Country | Link |
---|---|
US (1) | US11227147B2 (zh) |
CN (3) | CN108229293A (zh) |
WO (1) | WO2019029486A1 (zh) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188613A (zh) * | 2019-04-28 | 2019-08-30 | 上海鹰瞳医疗科技有限公司 | 图像分类方法及设备 |
CN110322416A (zh) * | 2019-07-09 | 2019-10-11 | 腾讯科技(深圳)有限公司 | 图像数据处理方法、装置以及计算机可读存储介质 |
CN110580680A (zh) * | 2019-09-09 | 2019-12-17 | 武汉工程大学 | 基于组合学习的人脸超分辨率方法及装置 |
CN113221698A (zh) * | 2021-04-29 | 2021-08-06 | 北京科技大学 | 一种基于深度学习和表情识别的面部关键点定位方法 |
CN113837067A (zh) * | 2021-09-18 | 2021-12-24 | 成都数字天空科技有限公司 | 器官轮廓检测方法、装置、电子设备及可读存储介质 |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018033137A1 (zh) * | 2016-08-19 | 2018-02-22 | 北京市商汤科技开发有限公司 | 在视频图像中展示业务对象的方法、装置和电子设备 |
CN108229293A (zh) | 2017-08-09 | 2018-06-29 | 北京市商汤科技开发有限公司 | 人脸图像处理方法、装置和电子设备 |
US10684681B2 (en) | 2018-06-11 | 2020-06-16 | Fotonation Limited | Neural network image processing apparatus |
CN109165571B (zh) * | 2018-08-03 | 2020-04-24 | 北京字节跳动网络技术有限公司 | 用于插入图像的方法和装置 |
CN110826372B (zh) * | 2018-08-10 | 2024-04-09 | 浙江宇视科技有限公司 | 人脸特征点检测方法及装置 |
CN109522863B (zh) * | 2018-11-28 | 2020-11-27 | 北京达佳互联信息技术有限公司 | 耳部关键点检测方法、装置及存储介质 |
TWI699671B (zh) * | 2018-12-12 | 2020-07-21 | 國立臺灣大學 | 減低眼球追蹤運算的方法和其眼動追蹤裝置 |
CN110110695B (zh) * | 2019-05-17 | 2021-03-19 | 北京字节跳动网络技术有限公司 | 用于生成信息的方法和装置 |
CN110706203A (zh) * | 2019-09-06 | 2020-01-17 | 成都玻尔兹曼智贝科技有限公司 | 基于深度学习的头颅侧位片关键点自动侦测方法及系统 |
CN112560555A (zh) * | 2019-09-25 | 2021-03-26 | 北京中关村科金技术有限公司 | 扩充关键点的方法、装置以及存储介质 |
US11687778B2 (en) | 2020-01-06 | 2023-06-27 | The Research Foundation For The State University Of New York | Fakecatcher: detection of synthetic portrait videos using biological signals |
CN111209873A (zh) * | 2020-01-09 | 2020-05-29 | 杭州趣维科技有限公司 | 一种基于深度学习的高精度人脸关键点定位方法及系统 |
CN111444928A (zh) * | 2020-03-30 | 2020-07-24 | 北京市商汤科技开发有限公司 | 关键点检测的方法、装置、电子设备及存储介质 |
EP4128194A1 (en) | 2020-03-31 | 2023-02-08 | Snap Inc. | Augmented reality beauty product tutorials |
CN111476151B (zh) * | 2020-04-03 | 2023-02-03 | 广州市百果园信息技术有限公司 | 眼球检测方法、装置、设备及存储介质 |
US20230215063A1 (en) * | 2020-06-09 | 2023-07-06 | Oral Tech Ai Pty Ltd | Computer-implemented detection and processing of oral features |
CN112101257B (zh) * | 2020-09-21 | 2022-05-31 | 北京字节跳动网络技术有限公司 | 训练样本生成方法、图像处理方法、装置、设备和介质 |
CN112613447B (zh) * | 2020-12-29 | 2024-09-17 | 上海商汤智能科技有限公司 | 关键点检测方法及装置、电子设备和存储介质 |
CN112597944B (zh) * | 2020-12-29 | 2024-06-11 | 北京市商汤科技开发有限公司 | 关键点检测方法及装置、电子设备和存储介质 |
CN115049580A (zh) * | 2021-03-09 | 2022-09-13 | 杭州朝厚信息科技有限公司 | X线头影图像的关键点检测方法 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104268591A (zh) * | 2014-09-19 | 2015-01-07 | 海信集团有限公司 | 一种面部关键点检测方法及装置 |
CN105354565A (zh) * | 2015-12-23 | 2016-02-24 | 北京市商汤科技开发有限公司 | 基于全卷积网络人脸五官定位与判别的方法及系统 |
CN106909870A (zh) * | 2015-12-22 | 2017-06-30 | 中兴通讯股份有限公司 | 人脸图像的检索方法及装置 |
CN108229293A (zh) * | 2017-08-09 | 2018-06-29 | 北京市商汤科技开发有限公司 | 人脸图像处理方法、装置和电子设备 |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050272517A1 (en) * | 2001-06-11 | 2005-12-08 | Recognition Insight, Llc | Swing position recognition and reinforcement |
US7027054B1 (en) * | 2002-08-14 | 2006-04-11 | Avaworks, Incorporated | Do-it-yourself photo realistic talking head creation system and method |
WO2006023046A1 (en) * | 2004-06-21 | 2006-03-02 | Nevengineering, Inc. | Single image based multi-biometric system and method |
US7751599B2 (en) * | 2006-08-09 | 2010-07-06 | Arcsoft, Inc. | Method for driving virtual facial expressions by automatically detecting facial expressions of a face image |
JP2010041604A (ja) * | 2008-08-07 | 2010-02-18 | Fujitsu Ltd | ネットワーク管理方法 |
US20140043329A1 (en) * | 2011-03-21 | 2014-02-13 | Peng Wang | Method of augmented makeover with 3d face modeling and landmark alignment |
EP2915101A4 (en) * | 2012-11-02 | 2017-01-11 | Itzhak Wilf | Method and system for predicting personality traits, capabilities and suggested interactions from images of a person |
US9589357B2 (en) * | 2013-06-04 | 2017-03-07 | Intel Corporation | Avatar-based video encoding |
CN105981041A (zh) * | 2014-05-29 | 2016-09-28 | 北京旷视科技有限公司 | Facial landmark localization using coarse-to-fine cascaded neural networks |
CN104537630A (zh) * | 2015-01-22 | 2015-04-22 | 厦门美图之家科技有限公司 | Image beautification method and apparatus based on age estimation |
US9552510B2 (en) * | 2015-03-18 | 2017-01-24 | Adobe Systems Incorporated | Facial expression capture for character animation |
CN105469081B (zh) * | 2016-01-15 | 2019-03-22 | 成都品果科技有限公司 | Face key point localization method and system for beautification |
US10055880B2 (en) * | 2016-12-06 | 2018-08-21 | Activision Publishing, Inc. | Methods and systems to modify a two dimensional facial image to increase dimensional depth and generate a facial image that appears three dimensional |
2017
- 2017-08-09 CN CN201710677556.8A patent/CN108229293A/zh active Pending
- 2017-08-09 CN CN202110473985.XA patent/CN113128449A/zh active Pending
- 2017-08-09 CN CN202110475720.3A patent/CN113205040A/zh active Pending

2018
- 2018-08-06 WO PCT/CN2018/098999 patent/WO2019029486A1/zh active Application Filing

2019
- 2019-06-28 US US16/456,738 patent/US11227147B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104268591A (zh) * | 2014-09-19 | 2015-01-07 | 海信集团有限公司 | Facial key point detection method and apparatus |
CN106909870A (zh) * | 2015-12-22 | 2017-06-30 | 中兴通讯股份有限公司 | Face image retrieval method and apparatus |
CN105354565A (zh) * | 2015-12-23 | 2016-02-24 | 北京市商汤科技开发有限公司 | Method and system for facial feature localization and discrimination based on a fully convolutional network |
CN108229293A (zh) * | 2017-08-09 | 2018-06-29 | 北京市商汤科技开发有限公司 | Face image processing method and apparatus, and electronic device |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188613A (zh) * | 2019-04-28 | 2019-08-30 | 上海鹰瞳医疗科技有限公司 | Image classification method and device |
CN110322416A (zh) * | 2019-07-09 | 2019-10-11 | 腾讯科技(深圳)有限公司 | Image data processing method and apparatus, and computer-readable storage medium |
CN110322416B (zh) * | 2019-07-09 | 2022-11-18 | 腾讯科技(深圳)有限公司 | Image data processing method and apparatus, and computer-readable storage medium |
CN110580680A (zh) * | 2019-09-09 | 2019-12-17 | 武汉工程大学 | Face super-resolution method and apparatus based on combined learning |
CN110580680B (zh) * | 2019-09-09 | 2022-07-05 | 武汉工程大学 | Face super-resolution method and apparatus based on combined learning |
CN113221698A (zh) * | 2021-04-29 | 2021-08-06 | 北京科技大学 | Facial key point localization method based on deep learning and expression recognition |
CN113221698B (zh) * | 2021-04-29 | 2023-08-15 | 北京科技大学 | Facial key point localization method based on deep learning and expression recognition |
CN113837067A (zh) * | 2021-09-18 | 2021-12-24 | 成都数字天空科技有限公司 | Organ contour detection method and apparatus, electronic device, and readable storage medium |
CN113837067B (zh) * | 2021-09-18 | 2023-06-02 | 成都数字天空科技有限公司 | Organ contour detection method and apparatus, electronic device, and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
US11227147B2 (en) | 2022-01-18 |
CN108229293A (zh) | 2018-06-29 |
CN113128449A (zh) | 2021-07-16 |
US20190325200A1 (en) | 2019-10-24 |
CN113205040A (zh) | 2021-08-03 |
Similar Documents
Publication | Title |
---|---|
WO2019029486A1 (zh) | Face image processing method and apparatus, and electronic device |
US20200218883A1 (en) | Face pose analysis method, electronic device, and storage medium |
US10776970B2 (en) | Method and apparatus for processing video image and computer readable medium |
US11295474B2 (en) | Gaze point determination method and apparatus, electronic device, and computer storage medium |
US11481869B2 (en) | Cross-domain image translation |
US20200320347A1 (en) | System and method for domain adaptation using synthetic data |
CN108229296B (zh) | Face skin attribute recognition method and apparatus, electronic device, and storage medium |
WO2018121777A1 (zh) | Face detection method and apparatus, and electronic device |
CN108073910B (zh) | Method and apparatus for generating face features |
CN108734078B (zh) | Image processing method and apparatus, electronic device, storage medium, and program |
CN110148084B (zh) | Method, apparatus, device, and storage medium for reconstructing a 3D model from 2D images |
CN108229301B (zh) | Eyelid line detection method and apparatus, and electronic device |
CN108776983A (zh) | Face reconstruction method and apparatus based on a reconstruction network, and device, medium, and product |
US9129152B2 (en) | Exemplar-based feature weighting |
WO2021127916A1 (zh) | Facial emotion recognition method, smart device, and computer-readable storage medium |
WO2019127102A1 (zh) | Information processing method and apparatus, cloud processing device, and computer program product |
CN110796089A (zh) | Method and device for training a face swapping model |
CN108491812B (zh) | Method and apparatus for generating a face recognition model |
CN111950486A (zh) | Cloud computing-based teaching video processing method |
CN113688737B (zh) | Face image processing method and apparatus, electronic device, storage medium, and program |
CN116704066A (zh) | Training method and apparatus for an image generation model, terminal, and storage medium |
EP4394690A1 (en) | Image processing method and apparatus, computer device, computer-readable storage medium, and computer program product |
CN108229477B (zh) | Visual relevance recognition method, apparatus, device, and storage medium for images |
US10628674B1 (en) | Image capture with context data overlay |
Sable et al. | Doc-handler: Document scanner, manipulator, and translator based on image and natural language processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18843571; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 18/06/2020) |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 18843571; Country of ref document: EP; Kind code of ref document: A1 |