WO2019119396A1 - Facial expression recognition method and device - Google Patents

Facial expression recognition method and device

Info

Publication number
WO2019119396A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
facial expression
image
facial
expression
Prior art date
Application number
PCT/CN2017/117921
Other languages
English (en)
French (fr)
Inventor
吴世豪
胡希平
程俊
张星明
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Priority to PCT/CN2017/117921 priority Critical patent/WO2019119396A1/zh
Publication of WO2019119396A1 publication Critical patent/WO2019119396A1/zh

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition

Definitions

  • The invention belongs to the technical field of facial expression recognition, and in particular relates to a facial expression recognition method and device.
  • Facial expression recognition refers to analyzing the expression state of a human face in a given image, thereby determining the psychological emotion of the recognized subject, for example, natural, happy, angry, or surprised. Facial expression recognition is an important field that contributes to the development of many areas such as mood analysis, personality analysis, and depression detection, so solving facial expression recognition is of great value.
  • Existing facial expression recognition extracts expression features that are not robust and are easily disturbed by noise such as identity information, resulting in low recognition accuracy.
  • Moreover, current expression recognition algorithms can only recognize the target expression in isolation; recognized expressions are not associated with one another, so the different expressions of the same user cannot be identified.
  • A first aspect of the present invention provides a facial expression recognition method, comprising: acquiring an image to be processed; extracting a face image from the image to be processed; performing expression classification on the face image based on deep learning to obtain a facial expression label of the face image, wherein the facial expression label indicates the expression of the face; performing face verification on the face image to obtain a face verification result, wherein the face verification result indicates information of the user to whom the face belongs; and displaying the facial expression label and the face verification result.
  • a second aspect of the present invention provides a facial expression recognition apparatus, the facial expression recognition apparatus comprising:
  • An image acquisition module configured to acquire an image to be processed
  • a face extraction module configured to extract a face image from the image to be processed
  • an expression classification module configured to perform expression classification on the facial image based on deep learning and obtain a facial expression label of the facial image, wherein the facial expression label indicates the expression of the face;
  • a face verification module configured to perform face verification on the face image, and obtain a face verification result, where the face verification result indicates information of a user to which the face belongs;
  • a display module configured to display the facial expression tag and the face verification result.
  • A third aspect of the present invention provides a facial expression recognition apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the facial expression recognition method described in the first aspect.
  • A fourth aspect of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the facial expression recognition method described in the first aspect.
  • In the solution of the present invention, an image to be processed is acquired, a face image is extracted from it, expression classification is performed on the face image based on deep learning to obtain the facial expression label of the face image, and face verification is performed on the face image to obtain a face verification result, so that the user to whom the face belongs can be identified.
  • By recognizing the expressions in face images based on deep learning, the solution of the invention improves the accuracy of expression recognition; by performing face verification on the face images, the expressions in different face images can be associated to determine whether they are different expressions of the same user.
  • FIG. 1 is a schematic flowchart of an implementation process of a facial expression recognition method according to Embodiment 1 of the present invention
  • FIG. 2a is an exemplary diagram of different expressions of the same user
  • FIG. 2b is an exemplary diagram of expressions of different users
  • FIG. 3 is a schematic flowchart of an implementation process of a facial expression recognition method according to Embodiment 2 of the present invention.
  • FIG. 4 is a schematic flowchart of an implementation process of a facial expression recognition method according to Embodiment 3 of the present invention.
  • FIG. 5 is a schematic diagram of a facial expression recognition apparatus according to Embodiment 4 of the present invention.
  • FIG. 6 is a schematic diagram of a facial expression recognition apparatus according to Embodiment 5 of the present invention.
  • Depending on the context, the term "if" may be interpreted as "when", "once", "in response to determining", or "in response to detecting".
  • Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
  • The sequence numbers of the steps in the embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present invention.
  • FIG. 1 is a schematic flowchart of an implementation process of a facial expression recognition method according to Embodiment 1 of the present invention. As shown in the figure, the facial expression recognition method may include the following steps:
  • Step S101 Acquire an image to be processed.
  • The image to be processed may be an image input directly to the facial expression recognition device, an image obtained from a video input to the facial expression recognition device, or an image acquired by a camera directly connected to the facial expression recognition device.
  • When a video or camera is used, each frame is typically extracted from the video for processing.
  • The number of images to be processed may be one or more, which is not limited here.
  • For example, the images to be processed may be two video frames one second apart; that is, the number of images to be processed is two.
  • Step S102 extracting a face image from the image to be processed.
  • The position of the face in the image to be processed can be determined with the Dlib machine learning library, after which the face is extracted from the image to be processed.
  • Dlib is a machine learning library written in C++ that contains many common machine learning algorithms. If the image to be processed contains multiple faces, extracting them may yield several face images of different sizes; expression classification and face verification are then performed separately on each of the obtained face images to recognize the expression of each face and obtain the information of the user each face belongs to, so that it can be judged whether the multiple faces belong to the same user.
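  • For illustration only, the following sketch shows what this extraction step could look like with Dlib's frontal face detector; the helper function name and the OpenCV-based cropping are assumptions of this sketch, not part of the original disclosure.

```python
import cv2
import dlib

# Dlib's built-in HOG-based frontal face detector.
detector = dlib.get_frontal_face_detector()

def extract_face_images(image_path):
    """Return one cropped face image per face found in the input image."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # The second argument upsamples the image once so smaller faces are found.
    rects = detector(gray, 1)
    faces = []
    for r in rects:
        # Clamp each detected rectangle to the image bounds before cropping.
        top, bottom = max(r.top(), 0), min(r.bottom(), img.shape[0])
        left, right = max(r.left(), 0), min(r.right(), img.shape[1])
        faces.append(img[top:bottom, left:right])
    return faces
```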
  • Step S103 Perform facial expression classification on the facial image based on deep learning, and obtain a facial expression label of the facial image.
  • the facial expression tag indicates an expression of the face.
  • In the embodiment of the present invention, a convolutional neural network (CNN) may be used to perform expression classification on the face image, that is, to recognize the expression of the face in the face image.
  • Step S104 performing face verification on the face image to obtain a face verification result.
  • the face verification result indicates information of a user to which the face belongs.
  • The face image may be face-verified by a face verification model (for example, a DeepID face verification model); specifically, the face verification model may be used to determine which user in a facial expression database the face in the face image belongs to.
  • The facial expression database may store information of a plurality of users and the facial expressions of each of those users.
  • The information of a user may be identification information that characterizes the user's identity and can distinguish different users; for example, each user in the facial expression database may be assigned a serial number.
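  • Purely as an illustration of this bookkeeping (the structure below is an assumption of this sketch, not a format given in the original text), the facial expression database can be thought of as a mapping from user serial numbers to recorded expression labels:

```python
# Illustrative in-memory stand-in for the facial expression database:
# each serial number identifies one user and maps to the expression
# labels recorded for that user's face images.
expression_db = {
    "p1": ["natural", "happy"],
    "p2": ["angry"],
}

def add_expression(db, user_id, label):
    """Record a newly recognized expression label for a user."""
    db.setdefault(user_id, []).append(label)
```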
  • steps S103 and S104 may be performed simultaneously.
  • Step S105 displaying the facial expression tag and the face verification result.
  • Specifically, the face image, the facial expression label, and the face verification result may all be displayed, with the facial expression label and the face verification result shown at a specified position relative to the face image (for example, above, below, to the left of, or to the right of it), so that the user knows which face image the facial expression label and face verification result correspond to; the facial expression database may then be updated to add the facial expression label of the user to whom the face belongs.
  • Displaying the facial expression label makes it easy for the user to view the expression of the face in the face image, and displaying the face verification result makes it easy to see which user the face in the face image belongs to.
  • Displaying the facial expression label and the face verification result together makes it easy for the user to see which user the expression in the face image belongs to.
  • When multiple face images are extracted in step S102, the facial expression label and face verification result of each face image may be displayed at the specified position of that face image.
  • FIG. 2a shows an example of different expressions of the same user; p1 in FIG. 2a is the user's serial number.
  • FIG. 2b shows an example of expressions of different users. Different serial numbers may represent different users, so the user can tell whether the recognized expressions belong to the same user by viewing the serial numbers in the face images, such as p2 and p3 in FIG. 2b.
  • By recognizing the expressions in face images based on deep learning, the embodiment of the present invention improves the accuracy of expression recognition; by performing face verification on the face images, the expressions in different face images can be associated to determine whether they are different expressions of the same user.
  • FIG. 3 is a schematic flowchart of an implementation process of a facial expression recognition method according to Embodiment 2 of the present invention.
  • the method for identifying a facial expression may include the following steps:
  • Step S301 acquiring an image to be processed.
  • This step is the same as step S101; for details, refer to the related description of step S101, which is not repeated here.
  • Step S302 extracting a face image from the image to be processed.
  • This step is the same as step S102; for details, refer to the related description of step S102, which is not repeated here.
  • Step S303: adjust the size of the face image to a first preset size.
  • The face image may be scaled to the first preset size; specifically, it may be scaled to a size of M*M (for example, 48*48), where M is an integer greater than zero.
  • Step S304: segment an image of a second preset size from each of N preset positions in the adjusted face image.
  • N is an integer greater than zero.
  • After the size of the face image is scaled, the scaled face image is segmented: an image of the second preset size may be cropped at each of N preset positions in the face image, that is, N images of the second preset size are cropped from the face image.
  • For example, an image of size 42*42 is cropped at each of the upper-left corner, lower-left corner, upper-right corner, lower-right corner, and center of the face image, that is, five 42*42 images are cropped from the face image.
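  • A minimal sketch of this resize-and-five-crop preprocessing is given below, assuming OpenCV/NumPy images; the function name and the crop-offset bookkeeping are illustrative assumptions.

```python
import cv2

def five_crop(face, m=48, s=42):
    """Resize a face image to m*m and crop s*s patches at the four
    corners and the center (steps S303-S304)."""
    face = cv2.resize(face, (m, m))
    offsets = [
        (0, 0), (m - s, 0),           # upper-left, upper-right
        (0, m - s), (m - s, m - s),   # lower-left, lower-right
        ((m - s) // 2, (m - s) // 2)  # center
    ]
    return [face[y:y + s, x:x + s] for (x, y) in offsets]
```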
  • Step S305 the segmented N images are input into a convolutional neural network CNN expression classification model for prediction, and the facial expression tags of the face image are obtained.
  • the facial expression tag indicates an expression of the face.
  • The images at the N preset positions obtained in step S304 may be input into a trained CNN expression classification model for prediction; the prediction probability of each expression is thereby obtained for each of the N images, and the expression whose mean prediction probability across the images is the largest is taken as the expression of the face image.
  • The CNN expression classification model may be trained as follows: acquire an expression classification data set and preprocess all of its images (filter out the face images and adjust their size to the first preset size) to obtain face images of the first preset size; randomly segment each face image into K images of the second preset size (where K is an integer greater than zero, for example, eight) and randomly flip the K images segmented from each face image for training, which helps improve the spatial adaptability of the model; obtain the expression classification results, compute the loss, and optimize the model with a stochastic gradient descent optimizer, where the learning rate may be 0.001 and the total number of iterations may be 50000, with a test every 1000 iterations.
  • The test data set is preprocessed in the same way to obtain face images of the first preset size; an image of the second preset size is then cropped at each preset position in the face image for testing, and finally the classification probabilities at the preset positions are averaged to obtain the expression classification result and compute the accuracy, keeping the model with high accuracy.
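  • The following sketch shows one way such a training loop could be set up with the stated hyperparameters (random crop, random flip, SGD with learning rate 0.001, 50000 iterations, a test every 1000 iterations); it is an assumption-laden illustration in PyTorch, and the CNN architecture itself is not specified in the text.

```python
import torch
import torch.nn as nn
import torchvision.transforms as T

# Augmentation matching the described training: 48*48 faces are randomly
# cropped to 42*42 and randomly flipped, so the crops vary between epochs.
train_transform = T.Compose([
    T.Resize((48, 48)),
    T.RandomCrop(42),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

def train(model, loader, iterations=50000, test_every=1000):
    """SGD training loop with the hyperparameters given in the text."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
    criterion = nn.CrossEntropyLoss()
    step = 0
    while step < iterations:
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            step += 1
            if step % test_every == 0:
                # Evaluate with the averaged multi-crop probabilities and
                # keep the checkpoint with the highest accuracy (omitted).
                pass
            if step >= iterations:
                break
```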
  • Optionally, inputting the N segmented images into the convolutional neural network (CNN) expression classification model for prediction to obtain the facial expression label of the face image comprises:
  • inputting the N segmented images into the CNN expression classification model for prediction to obtain the prediction probabilities of a plurality of facial expressions in each of the N images;
  • calculating, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images; and
  • taking the facial expression with the largest mean prediction probability among the plurality of facial expressions as the facial expression label of the face image.
  • The plurality of facial expressions include, but are not limited to, natural, happy, surprised, sad, scared, and angry.
  • As an example, an image of size 42*42 is cropped at each of the upper-left corner, lower-left corner, upper-right corner, lower-right corner, and center of the face image; these may be defined as the first, second, third, fourth, and fifth images and input into the CNN expression classification model for prediction.
  • The predicted probabilities of natural, happy, surprised, sad, scared, and angry in the first image are 0.6, 0.1, 0.1, 0.1, 0.1, and 0, respectively;
  • the predicted probabilities of the above six expressions in the second image are 0.5, 0.2, 0.1, 0.1, 0, and 0.1, respectively;
  • the predicted probabilities of the above six expressions in the third image are 0.6, 0.1, 0.1, 0.1, 0.1, and 0, respectively;
  • the predicted probabilities of the above six expressions in the fourth image are 0.5, 0.2, 0.1, 0.1, 0.1, and 0, respectively;
  • the predicted probabilities of the above six expressions in the fifth image are 0.7, 0, 0.1, 0.1, 0, and 0.1, respectively.
  • The means of the prediction probabilities of the six expressions across the five images are therefore 0.58 (natural), 0.12 (happy), 0.1 (surprised), 0.1 (sad), 0.06 (scared), and 0.04 (angry), so the facial expression in the face image can be determined to be natural.
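  • A compact sketch of this averaging rule follows; `predict` stands in for a forward pass of the trained CNN and is an assumption of the sketch.

```python
import numpy as np

EXPRESSIONS = ("natural", "happy", "surprised", "sad", "scared", "angry")

def classify_expression(predict, crops):
    """Average the per-crop class probabilities over the N crops and
    return the expression with the largest mean probability."""
    probs = np.stack([predict(c) for c in crops])   # shape (N, 6)
    mean_probs = probs.mean(axis=0)                 # e.g. [0.58, 0.12, ...]
    return EXPRESSIONS[int(np.argmax(mean_probs))]  # -> "natural"
```

  • Applied to the five probability rows of the worked example above, the mean vector is [0.58, 0.12, 0.1, 0.1, 0.06, 0.04] and the returned label is "natural".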
  • Step S306 performing face verification on the face image to obtain a face verification result.
  • the face verification result indicates information of a user to which the face belongs.
  • This step is the same as step S104; for details, refer to the related description of step S104, which is not repeated here.
  • Step S307 displaying the facial expression tag and the face verification result.
  • This step is the same as step S105; for details, refer to the related description of step S105, which is not repeated here.
  • Building on Embodiment 1, this embodiment of the present invention adds CNN-based expression classification of the face image, thereby recognizing the expression in the face image and improving the accuracy of expression recognition.
  • Referring to FIG. 4, which is a schematic flowchart of a facial expression recognition method according to Embodiment 3 of the present invention, the facial expression recognition method may include the following steps:
  • Step S401 acquiring an image to be processed.
  • This step is the same as step S101; for details, refer to the related description of step S101, which is not repeated here.
  • Step S402 extracting a face image from the image to be processed.
  • This step is the same as step S102; for details, refer to the related description of step S102, which is not repeated here.
  • Step S403 performing facial expression classification on the facial image based on deep learning, and acquiring a facial expression label of the facial image.
  • the facial expression tag indicates an expression of the face.
  • This step is the same as step S103; for details, refer to the related description of step S103, which is not repeated here.
  • Step S404 adjusting the size of the face image to a third preset size.
  • In the embodiment of the present invention, the size of the face image may be scaled to a third preset size; specifically, it may be scaled to a size of L1*L2 (for example, 39*31), where L1 and L2 are integers greater than zero.
  • Step S405 the adjusted face image is divided into a plurality of images.
  • After the size of the face image is scaled, the scaled face image may be divided arbitrarily into multiple images.
  • The sizes of the multiple images may be the same or different, and neither the sizes nor the number of the images is limited here.
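  • Since the text leaves the number and sizes of the sub-images open, the sketch below simply resizes the face and cuts a 2x2 grid; the grid choice and the width/height reading of 39*31 are assumptions of this illustration.

```python
import cv2

def verification_patches(face, width=31, height=39, grid=(2, 2)):
    """Scale a face image to the third preset size (e.g. 39*31) and cut
    a grid of sub-images for the face verification model."""
    face = cv2.resize(face, (width, height))  # cv2.resize takes (width, height)
    rows, cols = grid
    ph, pw = height // rows, width // cols
    return [face[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
            for r in range(rows) for c in range(cols)]
```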
  • Step S406: input the multiple images into a face verification model to obtain, for each user in a facial expression database, the classification probability of the face.
  • the facial expression database may refer to a database storing information of a large number of users and a facial expression tag of each user.
  • Step S407 If the maximum value of the classification probability is greater than a preset threshold, determine that the user to which the face belongs is a user corresponding to the maximum value of the classification probability.
  • In the embodiment of the present invention, the divided images are each input into a face verification model (for example, a DeepID face verification model) to obtain, for each user in the facial expression database, the classification probability of the face in the face image, that is, the probability that the face in the face image belongs to that user.
  • For example, the facial expression database stores the facial expression labels of 1000 users, numbered p1, p2, p3, ..., p1000. If the classification probabilities of the face in FIG. 1 for these users are 0.8, 0, 0, 0.2, 0, ..., 0, and the preset threshold is 0.6, it can be determined that the face belongs to user p1, that is, the face image in FIG. 1 is a face image of user p1; the expression label of the face and the user's serial number p1 can then be displayed above the face image in FIG. 1.
  • Optionally, the embodiment of the present invention further includes: if the maximum of the classification probabilities is less than or equal to the preset threshold, determining that the user to whom the face belongs does not exist in the facial expression database; and adding the facial expression label and the information of the user to whom the face belongs to the facial expression database.
  • For example, the facial expression database contains 1000 users, but the face image extracted from the image to be processed does not belong to any of those 1000 users; in this case, the serial number of the user corresponding to the face image can be set to p1001, and the correspondence between the facial expression label of the face image and the information of the user is added to the facial expression database.
  • The facial expression database may thus be updated by adding the facial expression label and the information of the user to whom the face belongs. Meanwhile, to improve the accuracy of the face verification model and allow it to subsequently verify face images of that user, the last Soft-max layer of the face verification model can be updated and retrained.
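  • The decision logic of step S407 and the optional enrollment branch can be sketched as follows; the threshold value 0.6 comes from the example above, while the function name and the list-based user registry are assumptions.

```python
import numpy as np

def verify_face(class_probs, user_ids, threshold=0.6):
    """Map verification-model probabilities to a user serial number.

    class_probs holds one classification probability per known user.
    """
    best = int(np.argmax(class_probs))
    if class_probs[best] > threshold:
        return user_ids[best]  # known user, e.g. "p1"
    # No known user is likely enough: enroll the face under the next
    # serial number (e.g. p1001) so later expressions can be associated.
    new_id = "p%d" % (len(user_ids) + 1)
    user_ids.append(new_id)
    return new_id
```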
  • Step S408 displaying the facial expression tag and the information of the user to which the face belongs.
  • This step is the same as step S105; for details, refer to the related description of step S105, which is not repeated here.
  • By recognizing the expressions in face images based on deep learning, the embodiment of the present invention improves the accuracy of expression recognition; by performing face verification on the face images, the expressions in different face images can be associated to determine whether they are different expressions of the same user.
  • FIG. 5 is a schematic diagram of a facial expression recognition apparatus according to Embodiment 4 of the present invention. For convenience of description, only parts related to the embodiment of the present invention are shown.
  • the facial expression recognition device includes:
  • An image obtaining module 51 configured to acquire an image to be processed
  • a face extraction module 52 configured to extract a face image from the image to be processed
  • the expression classification module 53 is configured to perform expression classification on the facial image based on deep learning and obtain a facial expression label of the facial image, wherein the facial expression label indicates the expression of the face;
  • the face verification module 54 is configured to perform face verification on the face image to obtain a face verification result, where the face verification result indicates information of a user to which the face belongs;
  • the display module 55 is configured to display the facial expression tag and the face verification result.
  • the expression classification module 53 includes:
  • a first adjusting unit configured to adjust a size of the face image to a first preset size
  • a first dividing unit configured to respectively segment an image of a second preset size from the N preset positions in the adjusted face image, where N is an integer greater than zero;
  • a prediction unit configured to input the segmented N images into a convolutional neural network CNN expression classification model for prediction, and obtain a facial expression label of the face image;
  • the prediction unit includes:
  • a prediction subunit configured to input the segmented N images into the CNN expression classification model for prediction, and obtain prediction probabilities of the plurality of facial expressions in each of the N images;
  • a calculating subunit configured to calculate, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images;
  • the determining subunit is configured to use a facial expression with the largest mean value of the predicted probabilities in the plurality of facial expressions as the facial expression label of the facial image.
  • the face verification module 54 includes:
  • a second adjusting unit configured to adjust a size of the face image to a third preset size
  • a second dividing unit configured to divide the adjusted face image into a plurality of images
  • an image input unit configured to input the multiple images into a face verification model and obtain, for each user in a facial expression database, the classification probability of the face;
  • a first determining unit configured to determine, if the maximum value of the classification probability is greater than a preset threshold, a user to which the face belongs is a user corresponding to a maximum value of the classification probabilities;
  • a second determining unit configured to determine, if the maximum of the classification probabilities is less than or equal to a preset threshold, that the user to whom the face belongs does not exist in the facial expression database; and
  • an adding unit configured to add the facial expression label and the information of the user to which the face belongs to the facial expression database.
  • The facial expression recognition device provided by the embodiment of the present invention can be applied in the foregoing method Embodiments 1, 2, and 3; for details, refer to the descriptions of those embodiments, which are not repeated here.
  • FIG. 6 is a schematic diagram of a facial expression recognition apparatus according to Embodiment 5 of the present invention.
  • the facial expression recognition apparatus 6 of this embodiment includes a processor 60, a memory 61, and a computer program 62 stored in the memory 61 and operable on the processor 60.
  • When the processor 60 executes the computer program 62, the steps in the embodiments of the facial expression recognition methods described above are implemented, for example, steps S101 to S105 shown in FIG. 1.
  • the processor 60 when executing the computer program 62, implements the functions of the modules/units in the various apparatus embodiments described above, such as the functions of the modules 51-55 shown in FIG.
  • the computer program 62 can be partitioned into one or more modules/units that are stored in the memory 61 and executed by the processor 60 to complete this invention.
  • the one or more modules/units may be a series of computer program instruction segments capable of performing a particular function, the instruction segments being used to describe the execution of the computer program 62 in the facial expression recognition device 6.
  • the computer program 62 can be divided into an image acquisition module, a face extraction module, an expression classification module, a face verification module, and a display module, and the specific functions of each module are as follows:
  • An image acquisition module configured to acquire an image to be processed
  • a face extraction module configured to extract a face image from the image to be processed
  • an expression classification module configured to perform expression classification on the facial image based on deep learning and obtain a facial expression label of the facial image, wherein the facial expression label indicates the expression of the face;
  • a face verification module configured to perform face verification on the face image, and obtain a face verification result, where the face verification result indicates information of a user to which the face belongs;
  • a display module configured to display the facial expression tag and the face verification result.
  • the expression classification module includes:
  • a first adjusting unit configured to adjust a size of the face image to a first preset size
  • a first dividing unit configured to respectively segment an image of a second preset size from the N preset positions in the adjusted face image, where N is an integer greater than zero;
  • a prediction unit configured to input the segmented N images into a convolutional neural network CNN expression classification model for prediction, and obtain a facial expression label of the face image;
  • the prediction unit includes:
  • a prediction subunit configured to input the segmented N images into the CNN expression classification model for prediction, and obtain prediction probabilities of the plurality of facial expressions in each of the N images;
  • a calculating subunit configured to calculate, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images;
  • the determining subunit is configured to use a facial expression with the largest mean value of the predicted probabilities in the plurality of facial expressions as the facial expression label of the facial image.
  • the face verification module includes:
  • a second adjusting unit configured to adjust a size of the face image to a third preset size
  • a second dividing unit configured to divide the adjusted face image into a plurality of images
  • an image input unit configured to input the multiple images into a face verification model and obtain, for each user in a facial expression database, the classification probability of the face;
  • a first determining unit configured to determine, if the maximum value of the classification probability is greater than a preset threshold, a user to which the face belongs is a user corresponding to a maximum value of the classification probabilities;
  • a second determining unit configured to determine, if the maximum of the classification probabilities is less than or equal to a preset threshold, that the user to whom the face belongs does not exist in the facial expression database; and
  • an adding unit configured to add the facial expression label and the information of the user to which the face belongs to the facial expression database.
  • The facial expression recognition device 6 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server.
  • The facial expression recognition device may include, but is not limited to, the processor 60 and the memory 61. Those skilled in the art will understand that FIG. 6 is merely an example of the facial expression recognition device 6 and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components; for example, the facial expression recognition device may also include input and output devices, network access devices, buses, and the like.
  • The processor 60 may be a central processing unit (CPU), or the processor may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general purpose processor may be a microprocessor or the processor or any conventional processor or the like.
  • the memory 61 may be an internal storage unit of the facial expression recognition device 6, such as a hard disk or a memory of the facial expression recognition device 6.
  • The memory 61 may also be an external storage device of the facial expression recognition device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the facial expression recognition device 6.
  • the memory 61 may also include both an internal storage unit of the facial expression recognition device 6 and an external storage device.
  • the memory 61 is used to store the computer program and other programs and data required by the facial expression recognition device.
  • the memory 61 can also be used to temporarily store data that has been output or is about to be output.
  • The division into the functional units and modules described above is merely illustrative; in practical applications, the above functions may be assigned to different functional units or modules as needed, that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.
  • Each functional unit and module in the embodiments may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
  • the specific names of the respective functional units and modules are only for the purpose of facilitating mutual differentiation, and are not intended to limit the scope of protection of the present application.
  • For the specific working processes of the units and modules in the foregoing system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • The division of the modules or units is only a logical function division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • If the integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the foregoing embodiments of the present invention may also be completed by a computer program instructing related hardware.
  • The computer program may be stored in a computer-readable storage medium, and when executed by a processor, implements the steps of the method embodiments described above.
  • the computer program comprises computer program code, which may be in the form of source code, object code form, executable file or some intermediate form.
  • The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

Abstract

A facial expression recognition method and device, comprising: acquiring an image to be processed (S101); extracting a face image from the image to be processed (S102); performing expression classification on the face image based on deep learning to obtain a facial expression label of the face image (S103), wherein the facial expression label indicates the expression of the face; performing face verification on the face image to obtain a face verification result (S104), wherein the face verification result indicates information of the user to whom the face belongs; and displaying the facial expression label and the face verification result (S105). The method can solve the problems in the prior art that the accuracy of expression recognition is low and that expressions lack association, so that different expressions of the same user cannot be identified.

Description

Facial expression recognition method and device
Technical Field
The present invention belongs to the technical field of facial expression recognition, and in particular relates to a facial expression recognition method and device.
Background Art
Facial expression recognition refers to analyzing the expression state of a human face in a given image, thereby determining the psychological emotion of the recognized subject, for example, natural, happy, angry, or surprised. Facial expression recognition is an important field that contributes to the development of many areas such as mood analysis, personality analysis, and depression detection, so solving facial expression recognition is of great value. However, existing facial expression recognition extracts expression features that are not robust and are easily disturbed by noise such as identity information, resulting in low recognition accuracy. Moreover, current expression recognition algorithms usually can only recognize the target expression; recognized expressions are not associated with one another, so the different expressions of the same user cannot be identified.
It is therefore necessary to propose a new technical solution to solve the above technical problems.
Technical Problem
In the prior art, the accuracy of expression recognition is low, and expressions lack association, so the different expressions of the same user cannot be identified.
Technical Solution
A first aspect of the present invention provides a facial expression recognition method, the facial expression recognition method comprising:
acquiring an image to be processed;
extracting a face image from the image to be processed;
performing expression classification on the face image based on deep learning to obtain a facial expression label of the face image, wherein the facial expression label indicates the expression of the face;
performing face verification on the face image to obtain a face verification result, wherein the face verification result indicates information of the user to whom the face belongs; and
displaying the facial expression label and the face verification result.
A second aspect of the present invention provides a facial expression recognition device, the facial expression recognition device comprising:
an image acquisition module configured to acquire an image to be processed;
a face extraction module configured to extract a face image from the image to be processed;
an expression classification module configured to perform expression classification on the face image based on deep learning and obtain a facial expression label of the face image, wherein the facial expression label indicates the expression of the face;
a face verification module configured to perform face verification on the face image and obtain a face verification result, wherein the face verification result indicates information of the user to whom the face belongs; and
a display module configured to display the facial expression label and the face verification result.
A third aspect of the present invention provides a facial expression recognition device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the facial expression recognition method of the first aspect.
A fourth aspect of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the facial expression recognition method of the first aspect.
Beneficial Effects
In the solution of the present invention, an image to be processed is acquired, a face image is extracted from it, expression classification is performed on the face image based on deep learning to obtain the facial expression label of the face image, and face verification is performed on the face image to obtain a face verification result, so that the user to whom the face belongs can be identified. By recognizing the expressions in face images based on deep learning, the solution of the invention improves the accuracy of expression recognition; by performing face verification on the face images, the expressions in different face images can be associated to determine whether they are different expressions of the same user.
Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a facial expression recognition method according to Embodiment 1 of the present invention;
FIG. 2a is an example of different expressions of the same user; FIG. 2b is an example of expressions of different users;
FIG. 3 is a schematic flowchart of a facial expression recognition method according to Embodiment 2 of the present invention;
FIG. 4 is a schematic flowchart of a facial expression recognition method according to Embodiment 3 of the present invention;
FIG. 5 is a schematic diagram of a facial expression recognition device according to Embodiment 4 of the present invention;
FIG. 6 is a schematic diagram of a facial expression recognition device according to Embodiment 5 of the present invention.
Embodiments of the Invention
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present invention. However, it should be clear to those skilled in the art that the present invention may also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terminology used in this specification of the invention is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should further be understood that the term "and/or" used in this specification and the appended claims refers to, and includes, any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
It should be understood that the sequence numbers of the steps in the embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
To illustrate the technical solutions of the present invention, specific embodiments are described below.
Referring to FIG. 1, which is a schematic flowchart of a facial expression recognition method according to Embodiment 1 of the present invention, the facial expression recognition method may include the following steps:
Step S101: acquire an image to be processed.
In the embodiment of the present invention, the image to be processed may be an image input directly to the facial expression recognition device, an image obtained from a video input to the facial expression recognition device, or an image acquired by a camera directly connected to the facial expression recognition device. When a video or camera is used, each frame is typically extracted from the video for processing. It should be noted that the number of images to be processed may be one or more, which is not limited here. For example, the images to be processed may be two video frames one second apart, that is, the number of images to be processed is two.
Step S102: extract a face image from the image to be processed.
In the embodiment of the present invention, the position of the face in the image to be processed may be determined with the Dlib machine learning library, after which the face is extracted from the image to be processed. Dlib is a machine learning library written in C++ that contains many common machine learning algorithms. If the image to be processed contains multiple faces, extracting them may yield several face images of different sizes; expression classification and face verification are then performed separately on each of the obtained face images to recognize the expression of each face and obtain the information of the user each face belongs to, so that it can be judged whether the multiple faces belong to the same user.
Step S103: perform expression classification on the face image based on deep learning to obtain a facial expression label of the face image.
The facial expression label indicates the expression of the face.
In the embodiment of the present invention, a convolutional neural network (CNN) may be used to perform expression classification on the face image, that is, to recognize the expression of the face in the face image.
Step S104: perform face verification on the face image to obtain a face verification result.
The face verification result indicates information of the user to whom the face belongs.
In the embodiment of the present invention, face verification may be performed on the face image with a face verification model (for example, a DeepID face verification model); specifically, the face verification model may be used to determine which user in a facial expression database the face in the face image belongs to. The facial expression database may store information of a plurality of users and the facial expressions of each of those users; the information of a user may be identification information that characterizes the user's identity and can distinguish different users, for example, a serial number assigned to each user in the facial expression database.
Optionally, after step S102 is performed, steps S103 and S104 may be performed simultaneously.
Step S105: display the facial expression label and the face verification result.
Specifically, the face image, the facial expression label, and the face verification result may all be displayed, with the facial expression label and the face verification result shown at a specified position relative to the face image (for example, above, below, to the left of, or to the right of it), so that the user knows which face image the facial expression label and face verification result correspond to; the facial expression database is then updated to add the facial expression label of the user to whom the face belongs.
In the embodiment of the present invention, displaying the facial expression label makes it easy for the user to view the expression of the face in the face image, displaying the face verification result makes it easy to see which user the face belongs to, and displaying both together makes it easy to see which user the expression in the face image belongs to. When multiple face images are extracted in step S102, the facial expression label and face verification result of each face image may be displayed at the specified position of that face image. FIG. 2a shows an example of different expressions of the same user, where p1 in FIG. 2a is the user's serial number. FIG. 2b shows an example of expressions of different users; different serial numbers may represent different users, so the user can tell whether the recognized expressions belong to the same user by viewing the serial numbers in the face images, such as p2 and p3 in FIG. 2b.
By recognizing the expressions in face images based on deep learning, the embodiment of the present invention improves the accuracy of expression recognition; by performing face verification on the face images, the expressions in different face images can be associated to determine whether they are different expressions of the same user.
Referring to FIG. 3, which is a schematic flowchart of a facial expression recognition method according to Embodiment 2 of the present invention, the facial expression recognition method may include the following steps:
Step S301: acquire an image to be processed.
This step is the same as step S101; for details, refer to the related description of step S101, which is not repeated here.
Step S302: extract a face image from the image to be processed.
This step is the same as step S102; for details, refer to the related description of step S102, which is not repeated here.
Step S303: adjust the size of the face image to a first preset size.
In the embodiment of the present invention, the face image may be scaled to the first preset size; specifically, it may be scaled to a size of M*M (for example, 48*48), where M is an integer greater than zero.
Step S304: segment an image of a second preset size from each of N preset positions in the adjusted face image.
N is an integer greater than zero.
In the embodiment of the present invention, after the size of the face image is scaled, the scaled face image is segmented: an image of the second preset size may be cropped at each of N preset positions in the face image, that is, N images of the second preset size are cropped from the face image. For example, an image of size 42*42 is cropped at each of the upper-left corner, lower-left corner, upper-right corner, lower-right corner, and center of the face image, that is, five 42*42 images are cropped from the face image.
Step S305: input the N segmented images into a convolutional neural network (CNN) expression classification model for prediction to obtain the facial expression label of the face image.
The facial expression label indicates the expression of the face.
In the implementation of the present invention, the images at the N preset positions obtained in step S304 may be input into a trained CNN expression classification model for prediction; the prediction probability of each expression is thereby obtained for each of the N images, and the expression whose mean prediction probability is the largest is taken as the expression of the face image.
The CNN expression classification model may be trained as follows: acquire an expression classification data set and preprocess all of its images (filter out the face images and adjust their size to the first preset size) to obtain face images of the first preset size; randomly segment each face image into K images of the second preset size (where K is an integer greater than zero, for example, eight) and randomly flip the K images segmented from each face image for training, which helps improve the spatial adaptability of the model; obtain the expression classification results, compute the loss, and optimize the model with a stochastic gradient descent optimizer, where the learning rate may be 0.001 and the total number of iterations may be 50000, with a test every 1000 iterations; the test data set is preprocessed in the same way to obtain face images of the first preset size, an image of the second preset size is then cropped at each preset position in the face image for testing, and finally the classification probabilities at the preset positions are averaged to obtain the expression classification result and compute the accuracy, keeping the model with high accuracy.
Optionally, inputting the N segmented images into the convolutional neural network (CNN) expression classification model for prediction to obtain the facial expression label of the face image includes:
inputting the N segmented images into the CNN expression classification model for prediction to obtain the prediction probabilities of a plurality of facial expressions in each of the N images;
calculating, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images; and
taking the facial expression with the largest mean prediction probability among the plurality of facial expressions as the facial expression label of the face image.
The plurality of facial expressions include, but are not limited to, natural, happy, surprised, sad, scared, and angry.
As an example, an image of size 42*42 is cropped at each of the upper-left corner, lower-left corner, upper-right corner, lower-right corner, and center of the face image; these may be defined as the first, second, third, fourth, and fifth images and input into the CNN expression classification model for prediction. The predicted probabilities of natural, happy, surprised, sad, scared, and angry in the first image are 0.6, 0.1, 0.1, 0.1, 0.1, and 0, respectively; in the second image, 0.5, 0.2, 0.1, 0.1, 0, and 0.1; in the third image, 0.6, 0.1, 0.1, 0.1, 0.1, and 0; in the fourth image, 0.5, 0.2, 0.1, 0.1, 0.1, and 0; and in the fifth image, 0.7, 0, 0.1, 0.1, 0, and 0.1. From the prediction probabilities of the six expressions in each image, their mean prediction probabilities across the five images are 0.58 (natural), 0.12 (happy), 0.1 (surprised), 0.1 (sad), 0.06 (scared), and 0.04 (angry), so the facial expression in the face image can be determined to be natural.
Step S306: perform face verification on the face image to obtain a face verification result.
The face verification result indicates information of the user to whom the face belongs.
This step is the same as step S104; for details, refer to the related description of step S104, which is not repeated here.
Step S307: display the facial expression label and the face verification result.
This step is the same as step S105; for details, refer to the related description of step S105, which is not repeated here.
Building on Embodiment 1, this embodiment of the present invention adds CNN-based expression classification of the face image, thereby recognizing the expression in the face image and improving the accuracy of expression recognition.
Referring to FIG. 4, which is a schematic flowchart of a facial expression recognition method according to Embodiment 3 of the present invention, the facial expression recognition method may include the following steps:
Step S401: acquire an image to be processed.
This step is the same as step S101; for details, refer to the related description of step S101, which is not repeated here.
Step S402: extract a face image from the image to be processed.
This step is the same as step S102; for details, refer to the related description of step S102, which is not repeated here.
Step S403: perform expression classification on the face image based on deep learning to obtain a facial expression label of the face image.
The facial expression label indicates the expression of the face.
This step is the same as step S103; for details, refer to the related description of step S103, which is not repeated here.
Step S404: adjust the size of the face image to a third preset size.
In the embodiment of the present invention, the face image may be scaled to the third preset size; specifically, it may be scaled to a size of L1*L2 (for example, 39*31), where L1 and L2 are integers greater than zero.
Step S405: divide the adjusted face image into multiple images.
In the embodiment of the present invention, after the size of the face image is scaled, the scaled face image may be divided arbitrarily into multiple images. The sizes of the multiple images may be the same or different, and neither the sizes nor the number of the images is limited here.
Step S406: input the multiple images into a face verification model to obtain, for each user in a facial expression database, the classification probability of the face.
The facial expression database may be a database storing information of a large number of users and the facial expression labels of each user.
Step S407: if the maximum of the classification probabilities is greater than a preset threshold, determine that the user to whom the face belongs is the user corresponding to that maximum.
In the embodiment of the present invention, the divided images are each input into a face verification model (for example, a DeepID face verification model) to obtain, for each user in the facial expression database, the classification probability of the face in the face image, that is, the probability that the face in the face image belongs to that user. For example, the facial expression database stores the facial expression labels of 1000 users, numbered p1, p2, p3, ..., p1000. If the classification probabilities of the face in FIG. 1 for these users are 0.8, 0, 0, 0.2, 0, ..., 0, and the preset threshold is 0.6, it can be determined that the face belongs to user p1, that is, the face image in FIG. 1 is a face image of user p1; the expression label of the face and the user's serial number p1 can then be displayed above the face image in FIG. 1.
Optionally, the embodiment of the present invention further includes:
if the maximum of the classification probabilities is less than or equal to the preset threshold, determining that the user to whom the face belongs does not exist in the facial expression database; and
adding the facial expression label and the information of the user to whom the face belongs to the facial expression database.
For example, the facial expression database contains 1000 users, but the face image extracted from the image to be processed does not belong to any of those 1000 users; in this case, the serial number of the user corresponding to the face image can be set to p1001, and the correspondence between the facial expression label of the face image and the information of the user is added to the facial expression database.
In the embodiment of the present invention, the facial expression database may thus be updated by adding the facial expression label and the information of the user to whom the face belongs. Meanwhile, to improve the accuracy of the face verification model and allow it to subsequently verify face images of that user, the last Soft-max layer of the face verification model can be updated and retrained.
Step S408: display the facial expression label and the information of the user to whom the face belongs.
This step is the same as step S105; for details, refer to the related description of step S105, which is not repeated here.
By recognizing the expressions in face images based on deep learning, the embodiment of the present invention improves the accuracy of expression recognition; by performing face verification on the face images, the expressions in different face images can be associated to determine whether they are different expressions of the same user.
Referring to FIG. 5, which is a schematic diagram of a facial expression recognition device according to Embodiment 4 of the present invention; for ease of description, only the parts related to the embodiment of the present invention are shown.
The facial expression recognition device includes:
an image acquisition module 51 configured to acquire an image to be processed;
a face extraction module 52 configured to extract a face image from the image to be processed;
an expression classification module 53 configured to perform expression classification on the face image based on deep learning and obtain a facial expression label of the face image, wherein the facial expression label indicates the expression of the face;
a face verification module 54 configured to perform face verification on the face image and obtain a face verification result, wherein the face verification result indicates information of the user to whom the face belongs; and
a display module 55 configured to display the facial expression label and the face verification result.
Optionally, the expression classification module 53 includes:
a first adjusting unit configured to adjust the size of the face image to a first preset size;
a first segmentation unit configured to segment an image of a second preset size from each of N preset positions in the adjusted face image, where N is an integer greater than zero; and
a prediction unit configured to input the N segmented images into a convolutional neural network (CNN) expression classification model for prediction and obtain the facial expression label of the face image;
the prediction unit includes:
a prediction subunit configured to input the N segmented images into the CNN expression classification model for prediction and obtain the prediction probabilities of a plurality of facial expressions in each of the N images;
a calculating subunit configured to calculate, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images; and
a determining subunit configured to take the facial expression with the largest mean prediction probability among the plurality of facial expressions as the facial expression label of the face image.
Optionally, the face verification module 54 includes:
a second adjusting unit configured to adjust the size of the face image to a third preset size;
a second segmentation unit configured to divide the adjusted face image into multiple images;
an image input unit configured to input the multiple images into a face verification model and obtain, for each user in a facial expression database, the classification probability of the face;
a first determining unit configured to determine, if the maximum of the classification probabilities is greater than a preset threshold, that the user to whom the face belongs is the user corresponding to that maximum;
a second determining unit configured to determine, if the maximum of the classification probabilities is less than or equal to the preset threshold, that the user to whom the face belongs does not exist in the facial expression database; and
an adding unit configured to add the facial expression label and the information of the user to whom the face belongs to the facial expression database.
The facial expression recognition device provided by the embodiment of the present invention can be applied in the foregoing method Embodiments 1, 2, and 3; for details, refer to the descriptions of those embodiments, which are not repeated here.
FIG. 6 is a schematic diagram of a facial expression recognition device according to Embodiment 5 of the present invention. As shown in FIG. 6, the facial expression recognition device 6 of this embodiment includes a processor 60, a memory 61, and a computer program 62 stored in the memory 61 and executable on the processor 60. When executing the computer program 62, the processor 60 implements the steps in the facial expression recognition method embodiments above, for example, steps S101 to S105 shown in FIG. 1; alternatively, when executing the computer program 62, the processor 60 implements the functions of the modules/units in the device embodiments above, for example, the functions of modules 51 to 55 shown in FIG. 5.
Exemplarily, the computer program 62 may be divided into one or more modules/units, which are stored in the memory 61 and executed by the processor 60 to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing particular functions, and the instruction segments are used to describe the execution of the computer program 62 in the facial expression recognition device 6. For example, the computer program 62 may be divided into an image acquisition module, a face extraction module, an expression classification module, a face verification module, and a display module, whose specific functions are as follows:
an image acquisition module configured to acquire an image to be processed;
a face extraction module configured to extract a face image from the image to be processed;
an expression classification module configured to perform expression classification on the face image based on deep learning and obtain a facial expression label of the face image, wherein the facial expression label indicates the expression of the face;
a face verification module configured to perform face verification on the face image and obtain a face verification result, wherein the face verification result indicates information of the user to whom the face belongs; and
a display module configured to display the facial expression label and the face verification result.
Optionally, the expression classification module includes:
a first adjusting unit configured to adjust the size of the face image to a first preset size;
a first segmentation unit configured to segment an image of a second preset size from each of N preset positions in the adjusted face image, where N is an integer greater than zero; and
a prediction unit configured to input the N segmented images into a convolutional neural network (CNN) expression classification model for prediction and obtain the facial expression label of the face image;
the prediction unit includes:
a prediction subunit configured to input the N segmented images into the CNN expression classification model for prediction and obtain the prediction probabilities of a plurality of facial expressions in each of the N images;
a calculating subunit configured to calculate, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images; and
a determining subunit configured to take the facial expression with the largest mean prediction probability among the plurality of facial expressions as the facial expression label of the face image.
Optionally, the face verification module includes:
a second adjusting unit configured to adjust the size of the face image to a third preset size;
a second segmentation unit configured to divide the adjusted face image into multiple images;
an image input unit configured to input the multiple images into a face verification model and obtain, for each user in a facial expression database, the classification probability of the face;
a first determining unit configured to determine, if the maximum of the classification probabilities is greater than a preset threshold, that the user to whom the face belongs is the user corresponding to that maximum;
a second determining unit configured to determine, if the maximum of the classification probabilities is less than or equal to the preset threshold, that the user to whom the face belongs does not exist in the facial expression database; and
an adding unit configured to add the facial expression label and the information of the user to whom the face belongs to the facial expression database.
The facial expression recognition device 6 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The facial expression recognition device may include, but is not limited to, the processor 60 and the memory 61. Those skilled in the art will understand that FIG. 6 is merely an example of the facial expression recognition device 6 and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components; for example, the facial expression recognition device may also include input and output devices, network access devices, buses, and the like.
The processor 60 may be a central processing unit (CPU), or the processor may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 61 may be an internal storage unit of the facial expression recognition device 6, such as a hard disk or memory of the facial expression recognition device 6. The memory 61 may also be an external storage device of the facial expression recognition device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the facial expression recognition device 6. Further, the memory 61 may include both an internal storage unit and an external storage device of the facial expression recognition device 6. The memory 61 is used to store the computer program and the other programs and data required by the facial expression recognition device, and may also be used to temporarily store data that has been output or is about to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules described above is merely illustrative; in practical applications, the above functions may be assigned to different functional units or modules as needed, that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for ease of mutual distinction and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division of the modules or units is only a logical function division, and in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the above embodiments of the present invention may also be completed by a computer program instructing related hardware; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, etc. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims (10)

  1. A facial expression recognition method, characterized in that the facial expression recognition method comprises:
    acquiring an image to be processed;
    extracting a face image from the image to be processed;
    performing expression classification on the face image based on deep learning to obtain a facial expression label of the face image, wherein the facial expression label indicates the expression of the face;
    performing face verification on the face image to obtain a face verification result, wherein the face verification result indicates information of the user to whom the face belongs; and
    displaying the facial expression label and the face verification result.
  2. The facial expression recognition method according to claim 1, characterized in that performing expression classification on the face image based on deep learning to obtain the facial expression label of the face image comprises:
    adjusting the size of the face image to a first preset size;
    segmenting an image of a second preset size from each of N preset positions in the adjusted face image, where N is an integer greater than zero; and
    inputting the N segmented images into a convolutional neural network (CNN) expression classification model for prediction to obtain the facial expression label of the face image.
  3. The facial expression recognition method according to claim 2, characterized in that inputting the N segmented images into the convolutional neural network (CNN) expression classification model for prediction to obtain the facial expression label of the face image comprises:
    inputting the N segmented images into the CNN expression classification model for prediction to obtain the prediction probabilities of a plurality of facial expressions in each of the N images;
    calculating, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images; and
    taking the facial expression with the largest mean prediction probability among the plurality of facial expressions as the facial expression label of the face image.
  4. The facial expression recognition method according to any one of claims 1 to 3, characterized in that performing face verification on the face image to obtain the face verification result comprises:
    adjusting the size of the face image to a third preset size;
    dividing the adjusted face image into multiple images;
    inputting the multiple images into a face verification model to obtain, for each user in a facial expression database, the classification probability of the face; and
    if the maximum of the classification probabilities is greater than a preset threshold, determining that the user to whom the face belongs is the user corresponding to that maximum.
  5. The facial expression recognition method according to claim 4, characterized in that the facial expression recognition method further comprises:
    if the maximum of the classification probabilities is less than or equal to the preset threshold, determining that the user to whom the face belongs does not exist in the facial expression database; and
    adding the facial expression label and the information of the user to whom the face belongs to the facial expression database.
  6. A facial expression recognition device, characterized in that the facial expression recognition device comprises:
    an image acquisition module configured to acquire an image to be processed;
    a face extraction module configured to extract a face image from the image to be processed;
    an expression classification module configured to perform expression classification on the face image based on deep learning and obtain a facial expression label of the face image, wherein the facial expression label indicates the expression of the face;
    a face verification module configured to perform face verification on the face image and obtain a face verification result, wherein the face verification result indicates information of the user to whom the face belongs; and
    a display module configured to display the facial expression label and the face verification result.
  7. The facial expression recognition device according to claim 6, characterized in that the expression classification module comprises:
    a first adjusting unit configured to adjust the size of the face image to a first preset size;
    a first segmentation unit configured to segment an image of a second preset size from each of N preset positions in the adjusted face image, where N is an integer greater than zero; and
    a prediction unit configured to input the N segmented images into a convolutional neural network (CNN) expression classification model for prediction and obtain the facial expression label of the face image;
    the prediction unit comprises:
    a prediction subunit configured to input the N segmented images into the CNN expression classification model for prediction and obtain the prediction probabilities of a plurality of facial expressions in each of the N images;
    a calculating subunit configured to calculate, from the prediction probabilities of the plurality of facial expressions in each of the N images, the mean prediction probability of each of the facial expressions across the N images; and
    a determining subunit configured to take the facial expression with the largest mean prediction probability among the plurality of facial expressions as the facial expression label of the face image.
  8. The facial expression recognition device according to claim 6 or 7, characterized in that the face verification module comprises:
    a second adjusting unit configured to adjust the size of the face image to a third preset size;
    a second segmentation unit configured to divide the adjusted face image into multiple images;
    an image input unit configured to input the multiple images into a face verification model and obtain, for each user in a facial expression database, the classification probability of the face;
    a first determining unit configured to determine, if the maximum of the classification probabilities is greater than a preset threshold, that the user to whom the face belongs is the user corresponding to that maximum;
    a second determining unit configured to determine, if the maximum of the classification probabilities is less than or equal to the preset threshold, that the user to whom the face belongs does not exist in the facial expression database; and
    an adding unit configured to add the facial expression label and the information of the user to whom the face belongs to the facial expression database.
  9. A facial expression recognition device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the facial expression recognition method according to any one of claims 1 to 5.
  10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the facial expression recognition method according to any one of claims 1 to 5.
PCT/CN2017/117921 2017-12-22 2017-12-22 Facial expression recognition method and device WO2019119396A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/117921 WO2019119396A1 (zh) 2017-12-22 2017-12-22 Facial expression recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/117921 WO2019119396A1 (zh) 2017-12-22 2017-12-22 Facial expression recognition method and device

Publications (1)

Publication Number Publication Date
WO2019119396A1 true WO2019119396A1 (zh) 2019-06-27

Family

ID=66993025

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/117921 WO2019119396A1 (zh) 2017-12-22 2017-12-22 人脸表情识别方法及装置

Country Status (1)

Country Link
WO (1) WO2019119396A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274447A (zh) * 2020-01-13 2020-06-12 深圳壹账通智能科技有限公司 基于视频的目标表情生成方法、装置、介质、电子设备
CN111476741A (zh) * 2020-04-28 2020-07-31 北京金山云网络技术有限公司 图像的去噪方法、装置、电子设备和计算机可读介质
CN112712097A (zh) * 2019-10-25 2021-04-27 杭州海康威视数字技术股份有限公司 一种基于开放平台的图像识别方法、装置及用户端
CN112749292A (zh) * 2019-10-31 2021-05-04 深圳云天励飞技术有限公司 用户标签生成方法及装置、计算机装置和存储介质
CN113239833A (zh) * 2021-05-20 2021-08-10 厦门大学 一种基于双分支干扰分离网络的人脸表情识别方法
CN114036334A (zh) * 2021-10-09 2022-02-11 武汉烽火信息集成技术有限公司 基于区块链的人脸检索方法、设备和计算机可读存储介质
US11854248B2 (en) 2020-03-19 2023-12-26 Boe Technology Group Co., Ltd. Image classification method, apparatus and training method, apparatus thereof, device and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090003709A1 (en) * 2007-06-29 2009-01-01 Canon Kabushiki Kaisha Image processing apparatus and method, and storage medium
CN103793718A (zh) * 2013-12-11 2014-05-14 台州学院 一种基于深度学习的人脸表情识别方法
CN104091160A (zh) * 2014-07-14 2014-10-08 成都万维图新信息技术有限公司 一种人脸检测方法
CN104573617A (zh) * 2013-10-28 2015-04-29 季春宏 一种摄像控制方法
EP2993616A1 (en) * 2014-09-05 2016-03-09 Huawei Technologies Co., Ltd. Method and apparatus for generating facial feature verification model
CN105654033A (zh) * 2015-12-21 2016-06-08 小米科技有限责任公司 人脸图像验证方法和装置

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090003709A1 (en) * 2007-06-29 2009-01-01 Canon Kabushiki Kaisha Image processing apparatus and method, and storage medium
CN104573617A (zh) * 2013-10-28 2015-04-29 季春宏 一种摄像控制方法
CN103793718A (zh) * 2013-12-11 2014-05-14 台州学院 一种基于深度学习的人脸表情识别方法
CN104091160A (zh) * 2014-07-14 2014-10-08 成都万维图新信息技术有限公司 一种人脸检测方法
EP2993616A1 (en) * 2014-09-05 2016-03-09 Huawei Technologies Co., Ltd. Method and apparatus for generating facial feature verification model
CN105654033A (zh) * 2015-12-21 2016-06-08 小米科技有限责任公司 人脸图像验证方法和装置

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712097A (zh) * 2019-10-25 2021-04-27 杭州海康威视数字技术股份有限公司 一种基于开放平台的图像识别方法、装置及用户端
CN112712097B (zh) * 2019-10-25 2024-01-05 杭州海康威视数字技术股份有限公司 一种基于开放平台的图像识别方法、装置及用户端
CN112749292A (zh) * 2019-10-31 2021-05-04 深圳云天励飞技术有限公司 用户标签生成方法及装置、计算机装置和存储介质
CN111274447A (zh) * 2020-01-13 2020-06-12 深圳壹账通智能科技有限公司 基于视频的目标表情生成方法、装置、介质、电子设备
US11854248B2 (en) 2020-03-19 2023-12-26 Boe Technology Group Co., Ltd. Image classification method, apparatus and training method, apparatus thereof, device and medium
CN111476741A (zh) * 2020-04-28 2020-07-31 北京金山云网络技术有限公司 图像的去噪方法、装置、电子设备和计算机可读介质
CN111476741B (zh) * 2020-04-28 2024-02-02 北京金山云网络技术有限公司 图像的去噪方法、装置、电子设备和计算机可读介质
CN113239833A (zh) * 2021-05-20 2021-08-10 厦门大学 一种基于双分支干扰分离网络的人脸表情识别方法
CN113239833B (zh) * 2021-05-20 2023-08-29 厦门大学 一种基于双分支干扰分离网络的人脸表情识别方法
CN114036334A (zh) * 2021-10-09 2022-02-11 武汉烽火信息集成技术有限公司 基于区块链的人脸检索方法、设备和计算机可读存储介质
CN114036334B (zh) * 2021-10-09 2024-03-19 武汉烽火信息集成技术有限公司 基于区块链的人脸检索方法、设备和计算机可读存储介质

Similar Documents

Publication Publication Date Title
WO2019119396A1 (zh) 人脸表情识别方法及装置
CN107958230B (zh) 人脸表情识别方法及装置
WO2019109526A1 (zh) 人脸图像的年龄识别方法、装置及存储介质
US20190392587A1 (en) System for predicting articulated object feature location
WO2022001623A1 (zh) 基于人工智能的图像处理方法、装置、设备及存储介质
WO2021139324A1 (zh) 图像识别方法、装置、计算机可读存储介质及电子设备
WO2021051545A1 (zh) 基于行为识别模型的摔倒动作判定方法、装置、计算机设备及存储介质
WO2019033571A1 (zh) 面部特征点检测方法、装置及存储介质
CN110503076B (zh) 基于人工智能的视频分类方法、装置、设备和介质
WO2020098257A1 (zh) 一种图像分类方法、装置及计算机可读存储介质
WO2020164278A1 (zh) 一种图像处理方法、装置、电子设备和可读存储介质
WO2022247005A1 (zh) 图像中目标物识别方法、装置、电子设备及存储介质
CN110738102A (zh) 一种人脸识别方法及系统
CN104915673A (zh) 一种基于视觉词袋模型的目标分类方法和系统
CN112487886A (zh) 一种有遮挡的人脸识别方法、装置、存储介质及终端
WO2022105179A1 (zh) 生物特征图像识别方法、装置、电子设备及可读存储介质
WO2023050651A1 (zh) 图像语义分割方法、装置、设备及存储介质
WO2020244151A1 (zh) 图像处理方法、装置、终端及存储介质
JP2022542199A (ja) キーポイントの検出方法、装置、電子機器および記憶媒体
CN112633221A (zh) 一种人脸方向的检测方法及相关装置
CN112395979A (zh) 基于图像的健康状态识别方法、装置、设备及存储介质
CN114902299A (zh) 图像中关联对象的检测方法、装置、设备和存储介质
CN112541394A (zh) 黑眼圈及鼻炎识别方法、系统及计算机介质
CN110941978B (zh) 一种未识别身份人员的人脸聚类方法、装置及存储介质
CN110414431B (zh) 基于弹性上下文关系损失函数的人脸识别方法及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17935521

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 12.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 17935521

Country of ref document: EP

Kind code of ref document: A1