WO2021143667A1 - Facial expression analysis method and system, and facial expression-based satisfaction analysis method and system - Google Patents


Info

Publication number
WO2021143667A1
Authority
WO
WIPO (PCT)
Prior art keywords
facial expression
expression
face
spectrogram
frame
Prior art date
Application number
PCT/CN2021/071233
Other languages
French (fr)
Chinese (zh)
Inventor
郭明坤
Original Assignee
北京灵汐科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京灵汐科技有限公司
Publication of WO2021143667A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174: Facial expression recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content

Definitions

  • This application relates to the technical field of data analysis, for example, to a method and system for analyzing facial expressions and a method and system for analyzing facial expression satisfaction.
  • In related technologies, facial expression analysis is mostly performed by training a neural network model on training data and corresponding annotation information, then feeding the object to be predicted into the trained model to obtain the analysis result.
  • Facial expressions fluctuate, however: a face does not stay happy, calm, or angry every second, which makes the analysis results so obtained inaccurate.
  • This application provides a facial expression analysis method and system and a facial expression satisfaction analysis method and system. They use the complete video information of the user's expression and fully consider the fluctuation of the expression, so the user's real emotion can be determined and the user's satisfaction can be determined accurately.
  • This application provides a method for analyzing facial expressions, including: acquiring a facial expression video clip to be analyzed, and acquiring the picture stream in the video clip; determining the facial expression spectrogram corresponding to the picture stream according to the facial expression index of each frame in the picture stream; determining a reference line corresponding to the face in a natural state according to the facial expression spectrogram, and determining the natural emotional region of the face in the natural state based on the reference line; and, with the natural emotional region as a reference, dividing the facial expression spectrogram into multiple emotional regions corresponding to different expressions.
  • This application also provides a facial expression satisfaction analysis method which, after applying the aforementioned facial expression analysis method, further includes: within each time period of the facial expression video clip to be analyzed, analyzing the multiple emotional regions corresponding to different expressions in the facial expression spectrogram to determine the user's satisfaction.
  • This application also provides a facial expression analysis system, which adopts the aforementioned facial expression analysis method and includes:
  • a picture acquisition module, configured to acquire the facial expression video clip to be analyzed and acquire the picture stream in the video clip;
  • an expression spectrum module, configured to determine the facial expression spectrogram corresponding to the picture stream according to the facial expression index of each frame of the picture stream;
  • an expression reference module, configured to determine a reference line corresponding to the face in a natural state according to the facial expression spectrogram, and to determine the natural emotional region of the face in the natural state based on the reference line;
  • an expression partition module, configured to divide the facial expression spectrogram into multiple emotional regions corresponding to different expressions, with the natural emotional region as a reference.
  • This application also provides a facial expression satisfaction analysis system, which adopts the aforementioned facial expression analysis system and includes:
  • a picture acquisition module, configured to acquire the facial expression video clip to be analyzed and acquire the picture stream in the video clip;
  • an expression spectrum module, configured to determine the facial expression spectrogram corresponding to the picture stream according to the facial expression index of each frame of the picture stream;
  • an expression reference module, configured to determine a reference line corresponding to the face in a natural state according to the facial expression spectrogram, and to determine the natural emotional region of the face in the natural state based on the reference line;
  • an expression partition module, configured to divide the facial expression spectrogram into multiple emotional regions corresponding to different expressions, with the natural emotional region as a reference;
  • a satisfaction calculation module, configured to analyze, within each time period of the facial expression video clip to be analyzed, the multiple emotional regions corresponding to different expressions in the facial expression spectrogram to determine the user's satisfaction.
  • The present application also provides an electronic device including a memory and a processor. The memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the facial expression analysis method and the facial expression satisfaction analysis method.
  • This application also provides a computer-readable storage medium storing a computer program that is executed by a processor to implement the facial expression analysis method and the facial expression satisfaction analysis method.
  • FIG. 1 is a schematic flowchart of a method for analyzing facial expressions according to an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of a method for analyzing facial expression satisfaction according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a human facial expression spectrogram corresponding to a picture stream according to an embodiment of the disclosure
  • FIG. 4 is a schematic diagram of a reference interval for determining a reference line according to an embodiment of the disclosure
  • FIG. 5 is a schematic diagram of multiple emotional regions according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of the frequency and percentage of the three emotions (positive, natural, and negative) in each time period according to an embodiment of the present disclosure.
  • If directional indications (such as up, down, left, right, front, back) are involved in the embodiments of the present disclosure, they are only used to explain the relative positional relationship and movement of components in a specific posture (as shown in the figures); when that posture changes, the directional indication changes accordingly.
  • In addition, in the description of the present disclosure, the terms used are for illustrative purposes only and are not intended to limit the scope of the present disclosure. The terms "including" and/or "comprising" are used to specify the presence of the stated elements, steps, operations, and/or components, but do not exclude the presence or addition of one or more other elements, steps, operations, and/or components.
  • The terms "first", "second", etc. may be used to describe various elements; they do not denote an order or limit these elements, and are only used to distinguish one element from another. "Plurality" means two or more.
  • The drawings are used for illustration purposes only, to depict the embodiments of the present disclosure.
  • As shown in FIG. 1, a facial expression analysis method according to an embodiment of the present disclosure includes:
  • S1: Obtain a facial expression video clip to be analyzed, and obtain the picture stream in the video clip.
  • When acquiring the picture stream, frames may be extracted frame by frame (that is, every frame is extracted), at fixed intervals (for example, one frame per second), or as key frames (that is, I-frames extracted according to changes in the picture) to obtain the picture stream in the facial expression video clip.
  • In this embodiment, when acquiring the picture stream, the facial expression video clip may be divided into multiple video sub-segments; at least one frame is extracted from each sub-segment, randomly or at fixed positions, and the picture stream is determined from the extracted frames.
  • By collecting the complete video information containing the user's expression, this embodiment fully considers the fluctuation of the facial expression and avoids the one-sidedness and inaccuracy of determining the user's emotion from a single frame.
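  • As an illustration of the fixed-interval sampling strategy described above, the following is a minimal Python sketch using OpenCV; the patent does not prescribe an implementation, and the function name and parameters are assumptions for illustration.

```python
import cv2  # OpenCV, used here only for video decoding

def sample_frames(video_path, seconds_per_frame=1.0):
    """Extract one frame per `seconds_per_frame` seconds from a video clip
    (the fixed-interval strategy described above)."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if FPS is unreported
    step = max(1, int(round(fps * seconds_per_frame)))
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break  # end of the clip
        if idx % step == 0:
            frames.append(frame)  # keep this frame for the picture stream
        idx += 1
    cap.release()
    return frames
```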
  • S2: Determine the facial expression spectrogram corresponding to the picture stream according to the facial expression index of each frame in the picture stream.
  • For example, the facial expression index ranges from 0 to 100. The higher the index, the more positive the expression (more inclined toward emotions such as happiness and surprise); the lower the index, the more negative the expression (more inclined toward emotions such as anger and fear); an index near the middle indicates that the expression is in a natural state.
  • For example, according to the timestamp of each frame in the picture stream, the facial expression spectrogram is generated from each frame's facial expression index and the corresponding time information. The generated spectrogram intuitively displays the fluctuation of the user's expression over time.
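  • The spectrogram is effectively a time series of per-frame expression indices. A minimal sketch of assembling it, assuming a scoring function of the kind described in S21-S23 below (the names here are illustrative, not from the original):

```python
def build_spectrogram(frames, timestamps, expression_index):
    """Pair each frame's facial expression index (0-100) with its timestamp.

    `expression_index` is assumed to be a callable mapping one frame to a
    0-100 score; its construction is described in steps S21-S23.
    """
    return [(t, expression_index(f)) for t, f in zip(timestamps, frames)]
```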
  • In an optional implementation, before analyzing the facial expression index of each frame in the picture stream, the method further includes: performing face detection on each frame of the picture stream to obtain the face image in each frame, and then analyzing the facial expression index of each frame to obtain the facial expression spectrogram.
  • When performing face detection, the face detection algorithm may be, for example, a Multi-Task Convolutional Neural Network (MTCNN), a Single Shot MultiBox Detector (SSD), or a target detection algorithm such as You Only Look Once v3 (YOLOv3); it is not limited to those listed and can be chosen according to need. By analyzing each frame, one facial expression index is output.
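  • As one possible realization of the detection step, a sketch using the MTCNN implementation from the third-party facenet-pytorch package (any of the detectors named above would serve equally well):

```python
from PIL import Image
from facenet_pytorch import MTCNN  # pip install facenet-pytorch

detector = MTCNN()  # multi-task cascaded CNN face detector

def crop_face(frame):
    """Detect a face in one frame (a PIL image) and return the cropped
    face image, or None if no face is found."""
    boxes, _ = detector.detect(frame)  # boxes: Nx4 [x1, y1, x2, y2] or None
    if boxes is None:
        return None
    x1, y1, x2, y2 = (int(v) for v in boxes[0])  # most prominent face first
    return frame.crop((x1, y1, x2, y2))

# Example: face = crop_face(Image.open("frame_0001.jpg"))
```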
  • In an optional implementation, S2 includes:
  • S21: Divide the face image into multiple regions, each containing multiple key feature points used to determine the facial expression index.
  • Optionally, facial feature point recognition is performed on the face image to obtain multiple feature points, and the key feature points used to determine the facial expression index are identified from them; according to these key feature points, the face image is divided into multiple regions, each containing multiple key feature points. Directly using all feature points to determine the facial expression index would increase the amount of computation; identifying key feature points reduces computation while preserving the accuracy of the index. For example, a face image may have 106 feature points, from which the key feature points used to determine the facial expression index are identified, such as key feature points of the mouth, eyes, and eyebrows. The feature points may be recognized with a trained neural network model, though the method is not limited to this and the recognition method can be selected and adjusted as appropriate.
  • In an optional implementation, key feature point recognition may also be performed directly on the face image to obtain the multiple key feature points used to determine the facial expression index.
  • For example, where the key feature points are those of the mouth, eyes, and eyebrows, a trained neural network model can process the face image: the face image is input into the trained model, which outputs the mouth, eye, and eyebrow key feature points in the multi-frame face images.
  • Alternatively, the face image can be divided into multiple regions according to the locations of reference facial organs, and the key feature points of each region can be extracted to obtain the key feature points contained in the multiple regions. For example, with the mouth, eyes, and eyebrows as reference facial organs, three regions of the face image are obtained, and the images of the three regions are input into a mouth key-feature-point detection model, an eye key-feature-point detection model, and an eyebrow key-feature-point detection model, respectively, to obtain the key feature points contained in the multiple regions.
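  • A sketch of grouping landmarks into mouth, eye, and eyebrow regions; the index ranges below are hypothetical, since the numbering depends on the particular 106-point landmark model used:

```python
import numpy as np

# Hypothetical index ranges into a 106-point landmark array; the real
# ranges depend on the landmark model's numbering convention.
REGIONS = {
    "mouth": range(84, 104),
    "eyes": range(52, 72),
    "eyebrows": range(33, 51),
}

def split_regions(landmarks):
    """landmarks: a (106, 2) array of (x, y) points for one face image.
    Returns only the key feature points of each region, discarding the
    rest to reduce the later angle computations."""
    pts = np.asarray(landmarks)
    return {name: pts[list(idx)] for name, idx in REGIONS.items()}
```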
  • S22: For each frame, determine the key feature points contained in the multiple regions, determine the expression scores corresponding to the multiple regions, and determine the frame's facial expression index according to those expression scores.
  • Optionally, for each region, at least one included angle between lines connecting key feature points in the region is determined, and the expression score of the region is determined from the at least one included angle; a weight is determined for each region; and the facial expression index of each frame is determined from the expression scores and weights of the multiple regions.
  • For example, each region corresponds to a weight, the weights may differ between regions, and the weights of all regions sum to 1. Each region contains at least one included angle, each included angle corresponds to an expression score (for example, on a 100-point scale), and the scores are weighted by angle and region to obtain the facial expression index. Since a face region contains many feature points, computing the angle between every pair of feature points would increase the amount of computation; computing angles only between lines of key feature points reduces it.
  • There can be many lines between the key feature points of a region, and target lines can be selected for the angle calculation, such as the angle between lines of adjacent key feature points, or the angle between the lines joining the key feature points at the two ends and a middle key feature point. In this way, the amount of computation is reduced and processing efficiency improved while the accuracy of the facial expression index is preserved.
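  • A sketch of the scoring arithmetic this passage describes: the included angle at a middle key point between the lines to two other key points, and a weighted sum over regions whose weights total 1. The weight values and the angle-to-score mapping are illustrative assumptions, not values from the original:

```python
import numpy as np

def included_angle(p_mid, p_a, p_b):
    """Angle in degrees at p_mid between the lines p_mid->p_a and p_mid->p_b."""
    v1 = np.asarray(p_a, dtype=float) - np.asarray(p_mid, dtype=float)
    v2 = np.asarray(p_b, dtype=float) - np.asarray(p_mid, dtype=float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Illustrative region weights; as stated above, they must sum to 1.
WEIGHTS = {"mouth": 0.5, "eyes": 0.3, "eyebrows": 0.2}

def frame_expression_index(region_scores):
    """region_scores: per-region expression scores on a 100-point scale,
    e.g. derived from the included angles of that region's key points."""
    return sum(WEIGHTS[r] * s for r, s in region_scores.items())
```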
  • Alternatively, the contour information of the facial organs contained in the multiple regions may be determined from the key feature points contained in those regions, and the expression score of each region may be determined from that contour information respectively.
  • S23: Obtain the facial expression spectrogram corresponding to the picture stream according to the facial expression indexes of all frames.
  • After the weighted calculation, the facial expression index corresponding to each frame is obtained, and the facial expression spectrogram corresponding to the picture stream is then obtained, as shown in FIG. 3.
  • S3: Determine a reference line corresponding to the face in a natural state according to the facial expression spectrogram, and determine the natural emotional region of the face in the natural state based on the reference line.
  • In an optional implementation, S3 includes:
  • S31: Determine the first interval in which the facial expression index appears most frequently in the facial expression spectrogram. Optionally, an interval width may be preset, and the first interval is determined according to that width.
  • S32: Determine the reference line (baseline) corresponding to the face in a natural state according to the first interval. The first interval, where the facial expression index appears most frequently, most truly reflects the current user's expression state in the natural state, so the baseline determined from it can accurately reflect that state.
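  • A sketch of the frequency scan: slide a fixed-width interval along the index axis, count the frames falling inside, and take the center of the densest interval as the baseline. The width of 20 and the 30-60 clamp follow the examples given elsewhere in this description:

```python
def find_baseline(indices, width=20, lo_clamp=30, hi_clamp=60):
    """indices: per-frame facial expression indices (0-100).
    Scans from bottom to top for the interval containing the most indices."""
    best_start, best_count = 0, -1
    for start in range(0, 101 - width):
        count = sum(start <= i < start + width for i in indices)
        if count > best_count:
            best_start, best_count = start, count
    baseline = best_start + width / 2  # horizontal center of the interval
    # Clamp per S32 to avoid the all-positive / all-negative special cases.
    return min(max(baseline, lo_clamp), hi_clamp)
```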
  • Optionally, S32 includes:
  • if the facial expression index corresponding to the horizontal center line of the first interval is greater than a first threshold and less than a second threshold, determining the horizontal center line as the reference line corresponding to the face in a natural state;
  • if the facial expression index corresponding to the horizontal center line is less than or equal to the first threshold, determining the horizontal line corresponding to the first threshold as the reference line;
  • if the facial expression index corresponding to the horizontal center line is greater than or equal to the second threshold, determining the horizontal line corresponding to the second threshold as the reference line.
  • For example, to avoid the special case of all-positive or all-negative emotion throughout, the baseline can be constrained to lie between 30 and 60: if the measured baseline is above 60 it is set to 60, and if below 30 it is set to 30. The first and second thresholds of the baseline can be adjusted adaptively and are not limited to these values.
  • For example, a region extending 15 above and 15 below the baseline (total width 30) can represent the expression in the natural state, that is, the natural emotional region of the face in the natural state. The width of the natural emotional region can be adjusted adaptively and is not limited to these values.
  • As shown in FIG. 5, in the facial expression spectrogram, the area above the natural emotional region can be determined as the positive emotional region, and the area below it as the negative emotional region. Through this reference region, the user's emotions can be stratified, avoiding the special case of all-positive or all-negative emotion throughout.
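  • With the baseline and the example band of 15 above and below it, each frame's index can be assigned to one of the three emotional regions; a minimal sketch:

```python
def emotion_region(index, baseline, half_width=15):
    """Map one facial expression index to an emotional region, using the
    natural band of baseline +/- half_width."""
    if index > baseline + half_width:
        return "positive"
    if index < baseline - half_width:
        return "negative"
    return "natural"
```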
  • This embodiment uses the complete video information of the user's facial expressions, fully considers the fluctuation of expressions and the differences between individual users' natural states, and obtains the baseline corresponding to the natural state from frequency analysis, so the user's true emotions can be determined. The user's emotions are stratified through the reference region, and the time periods of the user's video clip are weighted; by jointly considering the user's emotion type and the time weights, the user's satisfaction can be determined more accurately.
  • As shown in FIG. 2, a facial expression satisfaction analysis method, after applying the aforementioned facial expression analysis method, further includes: S5: within each time period of the facial expression video clip to be analyzed, analyze the multiple emotional regions corresponding to different expressions in the facial expression spectrogram to determine the user's satisfaction.
  • Optionally, S5 includes:
  • S51: Divide the facial expression video clip to be analyzed into multiple time periods, and calculate the proportions of different expressions in the multiple time periods according to the facial expression spectrogram.
  • S52: Determine the weights corresponding to the multiple time periods.
  • S53: Determine the satisfaction result according to the proportions of different expressions in the multiple time periods and the weights corresponding to those time periods.
  • Because emotions at different times contribute differently to overall satisfaction, the time weights need to differ. For example, the weights can be set so that emotions in the first 20% of the time carry 10% of the weight, emotions in the last 10% carry 60%, and the middle part carries the remaining 30%. The time weights can be adjusted as appropriate.
  • Optionally, the proportion of each emotional region in the facial expression spectrogram is counted separately for each time period. Within each time period, the weight of each emotional region is determined from its proportion in the spectrogram; the weights of the time periods and the weights of the emotional regions within each period are then combined in a weighted calculation to obtain the user's satisfaction coefficient. For example, the satisfaction coefficient can be obtained by subtracting the weighted value of negative emotion from the weighted value of positive emotion.
  • By weighting the time periods of the user's video clip and combining the proportion of each emotional region in each period, the user's emotion type and the time weights are considered jointly, and the user's satisfaction can be determined more accurately.
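  • A sketch of the weighted calculation just described, reusing the emotion_region sketch above: the clip is split into early, middle, and late periods with the example 10%/30%/60% time weights, and the coefficient is the weighted positive proportion minus the weighted negative proportion. The period boundaries (first 20%, last 10%) follow the example:

```python
def satisfaction_coefficient(spectrogram, baseline):
    """spectrogram: list of (timestamp, index) pairs in time order."""
    n = len(spectrogram)
    # (start fraction, end fraction, time weight), per the example above.
    periods = [(0.0, 0.2, 0.1), (0.2, 0.9, 0.3), (0.9, 1.0, 0.6)]
    coeff = 0.0
    for start, end, w in periods:
        segment = spectrogram[int(start * n):int(end * n)]
        if not segment:
            continue  # clip too short to populate this period
        labels = [emotion_region(i, baseline) for _, i in segment]
        pos = labels.count("positive") / len(labels)
        neg = labels.count("negative") / len(labels)
        coeff += w * (pos - neg)
    return coeff
```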
  • In the example of FIG. 6, the user's positive emotions gradually increase over time while the negative emotions gradually decrease.
  • Optionally, the satisfaction result can be determined from the satisfaction coefficient: when the coefficient is greater than or equal to a satisfaction threshold, the result is "satisfied"; when it is less than or equal to a dissatisfaction threshold, the result is "dissatisfied"; when it lies between the two, the result is "fair". For example, a coefficient of at least 0.1 can be treated as satisfied, at most -0.1 as dissatisfied, and between -0.1 and 0.1 as fair. Both thresholds can be adjusted adaptively.
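  • Mapping the coefficient to a verdict with the example thresholds of 0.1 and -0.1:

```python
def satisfaction_result(coeff, satisfied=0.1, dissatisfied=-0.1):
    """Map a satisfaction coefficient to a verdict using the example thresholds."""
    if coeff >= satisfied:
        return "satisfied"
    if coeff <= dissatisfied:
        return "dissatisfied"
    return "fair"
```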
  • As shown, the facial expression analysis system according to an embodiment includes a picture acquisition module, an expression spectrum module, an expression reference module, and an expression partition module.
  • The picture acquisition module is configured to acquire the facial expression video clip to be analyzed and to acquire the picture stream in the video clip.
  • When acquiring the picture stream, frames may be extracted frame by frame (that is, every frame is extracted), at fixed intervals (for example, one frame per second), or as key frames (that is, I-frames extracted according to changes in the picture) to obtain the picture stream in the facial expression video clip.
  • Alternatively, the video clip can be divided into multiple video sub-segments; at least one frame is extracted from each sub-segment, randomly or at fixed positions, and the picture stream is determined from the extracted frames. By using the complete video, the fluctuation of the facial expression is fully considered, avoiding the one-sidedness and inaccuracy of determining the user's emotion from a single frame.
  • The expression spectrum module is configured to determine the facial expression spectrogram corresponding to the picture stream according to the facial expression index of each frame of the picture stream.
  • For example, the facial expression index ranges from 0 to 100: the higher the index, the more positive the expression (more inclined toward emotions such as happiness and surprise); the lower the index, the more negative the expression (more inclined toward emotions such as anger and fear); an index near the middle indicates a natural state.
  • The facial expression spectrogram is generated from each frame's facial expression index and the corresponding time information, and intuitively displays the fluctuation of the user's expression over time.
  • In an optional implementation, the system further includes a face detection module configured to perform face detection on each frame of the picture stream to obtain the face image in each frame; the facial expression index of each frame is then analyzed to obtain the facial expression spectrogram.
  • The face detection algorithm may be, for example, MTCNN, SSD, or YOLOv3, but is not limited to those listed and can be selected as required. By analyzing each frame, one facial expression index is output.
  • Optionally, the expression spectrum module may include a face region division module, an expression index determination module, and an expression spectrogram determination module.
  • The face region division module is configured to divide the face image into multiple regions, each containing multiple key feature points used to determine the facial expression index.
  • Optionally, the face region division module performs facial feature point recognition on the face image to obtain multiple feature points, identifies from them the key feature points used to determine the facial expression index, and divides the face image into multiple regions accordingly, each containing multiple key feature points. Directly using all feature points to determine the index would increase the amount of computation; key feature point recognition reduces it while preserving accuracy. For example, a face image may have 106 feature points, from which the key feature points used to determine the index are identified, such as those of the mouth, eyes, and eyebrows. The key feature points can be recognized with a trained neural network model, though the recognition method is not limited to this and can be selected and adjusted as appropriate.
  • In an optional implementation, the face region division module may also perform key feature point recognition directly on the face image to obtain the multiple key feature points used to determine the facial expression index.
  • For example, where the key feature points are those of the mouth, eyes, and eyebrows, a trained neural network model can process the face image: the face image is input into the trained model, which outputs the mouth, eye, and eyebrow key feature points in the multi-frame face images.
  • Alternatively, the face region division module can divide the face image into multiple regions according to the locations of reference facial organs and extract the key feature points of each region to obtain the key feature points contained in the multiple regions. For example, with the mouth, eyes, and eyebrows as reference facial organs, three regions of the face image are obtained, and the images of the three regions are input into a mouth key-feature-point detection model, an eye key-feature-point detection model, and an eyebrow key-feature-point detection model, respectively, to obtain the key feature points contained in the multiple regions.
  • The expression index determination module is configured to determine, for each frame, the key feature points contained in the multiple regions, determine the expression scores corresponding to the regions, and determine the frame's facial expression index from those scores.
  • Optionally, the expression index determination module determines at least one included angle between lines of key feature points in each region and derives each region's expression score from it; determines a weight for each region; and determines the frame's facial expression index from the regions' expression scores and weights. For example, each region corresponds to a weight, the weights may differ between regions and sum to 1, each region contains at least one included angle, each included angle corresponds to an expression score (for example, on a 100-point scale), and the angles and regions are weighted to obtain the facial expression index.
  • Since a face region contains many feature points, computing the angle between every pair of feature points would increase the amount of computation, so angles are computed only between lines of key feature points. There can be many lines between a region's key feature points, and target lines can be selected for the angle calculation, such as the angle between lines of adjacent key feature points, or between the lines joining the end key feature points to a middle key feature point. This reduces computation and improves processing efficiency while preserving the accuracy of the index.
  • Alternatively, the expression index determination module may determine the contour information of the facial organs in the multiple regions from the key feature points they contain, and determine each region's expression score from that contour information respectively.
  • The expression spectrogram determination module is configured to obtain the facial expression spectrogram corresponding to the picture stream according to the facial expression indexes of all frames. After the weighted calculation of the facial expression index, the module obtains the index corresponding to each frame and then the spectrogram corresponding to the picture stream, as shown in FIG. 3.
  • The expression reference module is configured to determine a reference line corresponding to the face in a natural state according to the facial expression spectrogram, and to determine the natural emotional region of the face in the natural state based on the reference line.
  • Optionally, the expression reference module includes a frequency interval determination module, a reference line determination module, and a natural emotional region determination module.
  • The frequency interval determination module is configured to determine the first interval in which the facial expression index appears most frequently in the facial expression spectrogram. Optionally, an interval width may be preset, and the first interval is determined according to that width.
  • The reference line determination module is configured to determine the reference line corresponding to the face in a natural state according to the first interval. The first interval, where the index appears most frequently, most truly reflects the current user's expression state in the natural state, so the baseline determined from it can accurately reflect that state.
  • Optionally, the reference line determination module operates as follows:
  • if the facial expression index corresponding to the horizontal center line of the first interval is greater than a first threshold and less than a second threshold, the horizontal center line is determined as the reference line corresponding to the face in a natural state;
  • if the facial expression index corresponding to the horizontal center line is less than or equal to the first threshold, the horizontal line corresponding to the first threshold is determined as the reference line;
  • if the facial expression index corresponding to the horizontal center line is greater than or equal to the second threshold, the horizontal line corresponding to the second threshold is determined as the reference line.
  • As shown in FIG. 4, the interval width can be set, for example, to 20. When determining the baseline, this interval is scanned from bottom to top over the facial expression spectrogram to find the interval in which index values appear most frequently; the baseline is the horizontal center of that interval. A schematic of the baseline is shown in FIG. 3.
  • For example, to avoid the special case of all-positive or all-negative emotion throughout, the baseline can be constrained to lie between 30 and 60: if the measured baseline is above 60 it is set to 60, and if below 30 it is set to 30. The first and second thresholds of the baseline can be adjusted adaptively and are not limited to these values.
  • The natural emotional region determination module is configured to take the reference line as the center and use the second interval, within a set width above and below the reference line, as the natural emotional region of the face in the natural state. For example, a region extending 15 above and 15 below (total width 30) can represent the expression in the natural state. The width of the natural emotional region can be adjusted adaptively and is not limited to these values.
  • The expression partition module is configured to divide the facial expression spectrogram into multiple emotional regions corresponding to different expressions, with the natural emotional region as a reference. As shown in FIG. 5, the area above the natural emotional region can be determined as the positive emotional region and the area below it as the negative emotional region. Through this reference region, the user's emotions can be stratified, avoiding the special case of all-positive or all-negative emotion throughout.
  • The facial expression satisfaction analysis system adopts the aforementioned facial expression analysis system, with the difference that it further includes a satisfaction calculation module.
  • The satisfaction calculation module is configured to analyze, within each time period of the facial expression video clip to be analyzed, the multiple emotional regions corresponding to different expressions in the facial expression spectrogram to determine the user's satisfaction.
  • Optionally, the satisfaction calculation module includes:
  • a time-period expression proportion calculation module, configured to divide the facial expression video clip to be analyzed into multiple time periods and calculate the proportions of different expressions in each period according to the facial expression spectrogram;
  • a time-period weight determination module, configured to determine the weights corresponding to the multiple time periods;
  • a satisfaction result calculation module, configured to determine the satisfaction result from the proportions of different expressions in the multiple periods and the corresponding period weights.
  • For example, the weights can be set so that emotions in the first 20% of the time carry 10% of the weight, those in the last 10% carry 60%, and the middle part carries the remaining 30%; the time weights can be adjusted as appropriate.
  • Optionally, the proportion of each emotional region in the facial expression spectrogram is counted separately for each time period. Within each period, the weight of each emotional region is determined from its proportion; the period weights and region weights are then combined in a weighted calculation to obtain the user's satisfaction coefficient. For example, the coefficient can be obtained by subtracting the weighted value of negative emotion from the weighted value of positive emotion.
  • In the example of FIG. 6, the user's positive emotions gradually increase over time while the negative emotions gradually decrease.
  • Optionally, the satisfaction result can be determined from the satisfaction coefficient: when the coefficient is greater than or equal to a satisfaction threshold, the result is "satisfied"; when it is less than or equal to a dissatisfaction threshold, the result is "dissatisfied"; when it lies between the two, the result is "fair". For example, a coefficient of at least 0.1 can be treated as satisfied, at most -0.1 as dissatisfied, and between -0.1 and 0.1 as fair. Both thresholds can be adjusted adaptively.
  • In summary, the facial expression analysis method and system and the facial expression satisfaction analysis method and system provided in the present disclosure use the complete video information of the user's expressions, fully consider the fluctuation of facial expressions and the differences between users' natural states, and obtain the baseline corresponding to the natural state from frequency analysis, so the user's true emotions can be determined. The reference interval corresponding to the natural state is set according to the user's reference line, avoiding the case of all-positive or all-negative expressions throughout. The user's emotions are stratified through the reference region, the time periods of the user's video clip are weighted, and the emotion type and time weights are considered jointly, so the user's satisfaction can be determined more accurately.
  • The present disclosure also relates to an electronic device, such as a server or a terminal.
  • The electronic device includes: at least one processor; a memory communicatively connected with the at least one processor; and a communication component communicatively connected with the memory, which receives and sends data under the control of the processor. The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to implement the facial expression analysis method and the facial expression satisfaction analysis method in the foregoing embodiments.
  • the memory as a non-volatile computer-readable storage medium, can be configured to store non-volatile software programs, non-volatile computer-executable programs, and modules.
  • the processor executes multiple functional applications and data processing of the device by running non-volatile software programs, instructions, and modules stored in the memory, that is, realizing the aforementioned facial expression analysis method and facial expression satisfaction analysis method.
  • the memory may include a program storage area and a data storage area, where the program storage area can store an operating system and an application program required by at least one function; the data storage area can store a list of options and the like.
  • the memory may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the memory may optionally include a memory remotely arranged with respect to the processor, and these remote memories may be connected to an external device through a network. Examples of the aforementioned networks include the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • One or more modules are stored in the memory, and when executed by one or more processors, the facial expression analysis method and the facial expression satisfaction analysis method in any of the foregoing embodiments are executed.
  • The above products can execute the facial expression analysis method and the facial expression satisfaction analysis method provided by the embodiments of this application, and have the functional modules and effects corresponding to those methods.
  • The present disclosure also relates to a computer-readable storage medium storing a computer-readable program for a computer to execute part or all of the above facial expression analysis method and facial expression satisfaction analysis method.
  • That is, the program is stored in a storage medium and includes multiple instructions to enable a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage media include: USB flash disks, mobile hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

A facial expression analysis method and system. The method comprises: obtaining a facial expression video clip to be analyzed, and obtaining a picture stream in the facial expression video clip; determining, according to a facial expression index of each frame of picture in the picture stream, a facial expression spectrogram corresponding to the picture stream; determining, according to the facial expression spectrogram, a reference line corresponding to the face in a natural state, and determining, on the basis of the reference line, a natural emotion area of the face in the natural state; and by taking the natural emotion area as a reference, dividing the facial expression spectrogram into a plurality of emotion areas corresponding to different expressions. Also provided are a facial expression-based satisfaction analysis method and system.

Description

Facial expression analysis method and system and facial expression satisfaction analysis method and system

This application claims priority to Chinese patent application No. 202010033040.1, filed with the Chinese Patent Office on January 13, 2020, the entire content of which is incorporated herein by reference.
技术领域Technical field
本申请涉及数据分析技术领域,例如涉及一种人脸表情分析方法和系统及人脸表情满意度分析方法和系统。This application relates to the technical field of data analysis, for example, to a method and system for analyzing facial expressions and a method and system for analyzing facial expression satisfaction.
背景技术Background technique
相关技术中,在进行人脸表情分析时,大多只是根据训练数据以及对应的标注信息训练神经网络模型,把待预测的对象输入训练好的神经网络模型,得到人脸表情分析结果。而人脸表情是波动的,即脸部不会每一秒都一直持续保持着高兴、平静或者愤怒,这使得得到的人脸表情分析结果并不准确。In related technologies, when performing facial expression analysis, most of them only train a neural network model based on training data and corresponding annotation information, and input the object to be predicted into the trained neural network model to obtain the facial expression analysis result. The facial expressions are fluctuating, that is, the face does not keep happy, calm, or angry every second, which makes the obtained facial expression analysis results inaccurate.
发明内容Summary of the invention
本申请提供一种人脸表情分析方法和系统及人脸表情满意度分析方法和系统,利用了用户表情的完整视频信息,充分考虑了表情的波动,能够确定用户真实的情绪,并能够准确地确定用户的满意度。This application provides a facial expression analysis method and system and a facial expression satisfaction analysis method and system. The complete video information of the user’s expression is used, and the fluctuation of the expression is fully considered. The real emotion of the user can be determined and accurately Determine user satisfaction.
本申请提供了一种人脸表情分析方法,包括:This application provides a method for analyzing facial expressions, including:
获取待分析的人脸表情视频片段,并获取所述人脸表情视频片段中的图片流;Acquiring a face expression video clip to be analyzed, and acquiring a picture stream in the face expression video clip;
根据所述图片流中每帧画面的人脸表情指数,确定图片流所对应的人脸表情频谱图;Determine the face expression spectrogram corresponding to the picture stream according to the facial expression index of each frame of the picture in the picture stream;
根据所述人脸表情频谱图,确定人脸在自然状态下所对应的基准线,并基于所述基准线确定人脸在自然状态下的自然情绪区域;Determine the reference line corresponding to the human face in the natural state according to the face expression spectrogram, and determine the natural emotional area of the human face in the natural state based on the reference line;
以所述自然情绪区域为基准,将人脸表情频谱图划分为对应不同表情的多个情绪区域。Using the natural emotional region as a reference, the human facial expression spectrogram is divided into multiple emotional regions corresponding to different expressions.
本申请还提供了一种人脸表情满意度分析方法,在采用前述的一种人脸表情分析方法之后,还包括:在待分析的人脸表情视频片段中的每个时间段内,对人脸表情频谱图中对应不同表情的多个情绪区域进行分析计算,确定用户的满意度。This application also provides a facial expression satisfaction analysis method. After adopting the aforementioned facial expression analysis method, it also includes: in each time period in the facial expression video clip to be analyzed, The facial expression spectrogram corresponding to multiple emotional regions of different expressions is analyzed and calculated to determine the user's satisfaction.
本申请还提供了一种人脸表情分析系统,采用前述所述的一种人脸表情分析方法,包括:This application also provides a facial expression analysis system, which adopts the aforementioned facial expression analysis method, including:
图片采集模块,设置为获取待分析的人脸表情的视频片段,并获取所述人脸表情视频片段中的图片流;The picture acquisition module is configured to acquire a video segment of the facial expression to be analyzed, and acquire the picture stream in the facial expression video segment;
表情频谱模块,设置为根据图片流中每帧画面的人脸表情指数,确定图片流所对应的人脸表情频谱图;The expression spectrum module is set to determine the facial expression spectrum map corresponding to the picture stream according to the facial expression index of each frame of the picture stream;
表情基准模块,设置为根据所述人脸表情频谱图,确定人脸在自然状态下所对应的基准线,并基于所述基准线确定人脸在自然状态下的自然情绪区域;An expression reference module, configured to determine a reference line corresponding to the human face in a natural state according to the facial expression spectrogram, and determine the natural emotional area of the human face in the natural state based on the reference line;
表情分区模块,设置为以所述自然情绪区域为基准,将人脸表情频谱图划分为对应不同表情的多个情绪区域。The expression partition module is set to divide the facial expression spectrogram into multiple emotional regions corresponding to different expressions based on the natural emotional region.
本申请还提供了一种人脸表情满意度分析系统,采用前述的一种人脸表情分析系统,包括:This application also provides a facial expression satisfaction analysis system, which adopts the aforementioned facial expression analysis system, including:
图片采集模块,设置为获取待分析的人脸表情的视频片段,并获取所述人脸表情视频片段中的图片流;The picture acquisition module is configured to acquire a video segment of the facial expression to be analyzed, and acquire the picture stream in the facial expression video segment;
表情频谱模块,设置为根据所述图片流中每帧画面的人脸表情指数,确定所述图片流所对应的人脸表情频谱图;An expression spectrum module, configured to determine the facial expression spectrum map corresponding to the picture stream according to the facial expression index of each frame of the picture stream;
表情基准模块,设置为根据所述人脸表情频谱图,确定人脸在自然状态下所对应的基准线,并基于所述基准线确定所述人脸在所述自然状态下的自然情绪区域;An expression reference module, configured to determine a reference line corresponding to a human face in a natural state according to the facial expression spectrogram, and determine the natural emotional region of the human face in the natural state based on the reference line;
表情分区模块,设置为以所述自然情绪区域为基准,将所述人脸表情频谱图划分为对应不同表情的多个情绪区域;An expression partition module, configured to divide the facial expression spectrogram into a plurality of emotional regions corresponding to different expressions based on the natural emotional region;
满意度计算模块,设置为在待分析的人脸表情视频片段中的每个时间段内,对人脸表情频谱图中对应不同表情的多个情绪区域进行分析计算,确定用户的满意度。The satisfaction calculation module is configured to analyze and calculate multiple emotional regions corresponding to different expressions in the facial expression spectrogram within each time period in the facial expression video clip to be analyzed to determine the user's satisfaction.
本申请还提供了一种电子设备,包括存储器和处理器,所述存储器设置为存储一条或多条计算机指令,其中,所述一条或多条计算机指令被处理器执行以实现所述的人脸表情分析方法和人脸表情满意度分析方法。The present application also provides an electronic device, including a memory and a processor, the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to realize the face Expression analysis method and facial expression satisfaction analysis method.
本申请还提供了一种计算机可读存储介质,存储有计算机程序,所述计算机程序被处理器执行以实现所述的人脸表情分析方法和人脸表情满意度分析方法。This application also provides a computer-readable storage medium that stores a computer program that is executed by a processor to realize the facial expression analysis method and the facial expression satisfaction analysis method.
附图说明Description of the drawings
图1为本公开一实施例所述的一种人脸表情分析方法的流程示意图;FIG. 1 is a schematic flowchart of a method for analyzing facial expressions according to an embodiment of the present disclosure;
图2为本公开一实施例所述的一种人脸表情满意度分析方法的流程示意图;FIG. 2 is a schematic flowchart of a method for analyzing facial expression satisfaction according to an embodiment of the present disclosure;
图3为本公开一实施例所述的图片流所对应的人脸表情频谱图的示意图;3 is a schematic diagram of a human facial expression spectrogram corresponding to a picture stream according to an embodiment of the disclosure;
图4为本公开一实施例所述的用于确定基准线的基准区间的示意图;4 is a schematic diagram of a reference interval for determining a reference line according to an embodiment of the disclosure;
图5为本公开一实施例所述的多个情绪区域的示意图;FIG. 5 is a schematic diagram of multiple emotional regions according to an embodiment of the present disclosure;
图6为本公开一实施例所述的每个时间段的积极、自然和消极三种情绪频次和百分比的示意图。Fig. 6 is a schematic diagram of the frequency and percentage of three emotions, positive, natural, and negative in each time period according to an embodiment of the present disclosure.
具体实施方式Detailed ways
下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行描述,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。The technical solutions in the embodiments of the present disclosure will be described below in conjunction with the accompanying drawings in the embodiments of the present disclosure. The described embodiments are only a part of the embodiments of the present disclosure, rather than all the embodiments.
若本公开实施例中有涉及方向性指示(诸如上、下、左、右、前、后……),则该方向性指示仅用于解释在一个特定姿态(如附图所示)下多个部件之间的相对位置关系和运动情况等,在该特定姿态发生改变时,则该方向性指示也相应地随之改变。If there are directional indications (such as up, down, left, right, front, back...) in the embodiments of the present disclosure, the directional indications are only used to explain the multiple directions in a specific posture (as shown in the figure). The relative positional relationship and movement of the components, etc., when the specific posture changes, the directional indication also changes accordingly.
另外,在本公开的描述中,所用术语仅用于说明目的,并非旨在限制本公开的范围。术语“包括”和/或“包含”用于指定所述元件、步骤、操作和/或组件的存在,但并不排除存在或添加一个或多个其他元件、步骤、操作和/或组件的情况。术语“第一”、“第二”等可能用于描述多种元件,不代表顺序,且不对这些元件起限定作用。此外,在本公开的描述中,“多个”的含义是两个及两个以上。这些术语仅用于区分一个元素和另一个元素。附图仅出于说明的目的用来描绘本公开所述实施例。In addition, in the description of the present disclosure, the terms used are for illustrative purposes only, and are not intended to limit the scope of the present disclosure. The terms "including" and/or "including" are used to specify the existence of the described elements, steps, operations and/or components, but do not exclude the presence or addition of one or more other elements, steps, operations and/or components . The terms "first", "second", etc. may be used to describe various elements, do not represent an order, and do not limit these elements. In addition, in the description of the present disclosure, "plurality" means two or more. These terms are only used to distinguish one element from another. The drawings are used for illustration purposes only to depict the embodiments of the present disclosure.
本公开实施例所述的一种人脸表情分析方法,如图1所示,包括:A face expression analysis method according to an embodiment of the present disclosure, as shown in FIG. 1, includes:
S1,获取待分析的人脸表情视频片段,并获取人脸表情视频片段中的图片流。S1: Obtain a face expression video clip to be analyzed, and obtain a picture stream in the face expression video clip.
一种可选的实施方式中,在获取图片流时,可以通过逐帧(即每一帧都抽取出来)、固定间隔抽帧(例如每一秒抽一帧)或抽取关键帧(即按照画面的变化抽取i帧)的方式获取人脸表情视频片段中的图片流。本实施例中,在获取 视频片段中的图片流时,可以将人脸表情视频片段进行分割,得到多个视频子片段,随机或固定抽取每个视频子片段中的至少一帧,根据抽取的多个画面帧,确定人脸表情视频片段中的图片流。本实施例通过采集包含用户表情的完整视频信息,充分考虑了人脸表情的波动,避免单独通过一帧图像来确定用户情绪的片面性及不准确性。In an alternative embodiment, when acquiring the picture stream, you can extract frames frame by frame (that is, extract each frame), frame at fixed intervals (for example, one frame per second), or extract key frames (that is, according to the picture The image stream in the facial expression video segment is obtained by extracting i-frames). In this embodiment, when acquiring the picture stream in the video segment, the facial expression video segment can be divided to obtain multiple video sub-segments, and at least one frame of each video sub-segment is randomly or fixedly extracted, and according to the extracted Multiple picture frames determine the picture stream in the facial expression video clip. In this embodiment, by collecting complete video information including the user's facial expression, the fluctuation of the facial expression is fully considered, and the one-sidedness and inaccuracy of determining the user's emotion are avoided through a single frame of image.
S2,根据图片流中每帧画面的人脸表情指数,确定图片流所对应的人脸表情频谱图。S2: Determine a face expression spectrogram corresponding to the picture stream according to the facial expression index of each frame of the picture stream.
举例来说,人脸表情指数范围为0~100,指数越高表示表情越偏积极,即更倾向于高兴和惊喜等情绪。指数越低表示表情越偏消极,即更倾向于愤怒和害怕等情绪。指数靠近中间表示表情处于自然状态。例如,按照图片流中每帧画面的时间戳信息,根据每帧画面的人脸表情指数以及对应的时间信息,生成人脸表情频谱图。通过生成的人脸表情频谱图能直观的显示用户表情的波动过程。For example, the facial expression index ranges from 0 to 100. The higher the index, the more positive the expression, that is, the more inclined to emotions such as happiness and surprise. The lower the index, the more negative the expression, that is, the more inclined to emotions such as anger and fear. The index near the middle indicates that the expression is in a natural state. For example, according to the time stamp information of each frame of the picture in the picture stream, a face expression spectrogram is generated according to the facial expression index of each frame and the corresponding time information. The generated facial expression spectrogram can intuitively display the fluctuation process of the user's expression.
In an optional implementation manner, before the facial expression index of each frame in the picture stream is analyzed, the method further includes: performing face detection on each frame in the picture stream to obtain the face image in each frame, and then analyzing the facial expression index of each frame to obtain the facial expression spectrogram.
When performing face detection, the face detection algorithm may be, for example, a Multi-Task Convolutional Neural Network (MTCNN), a Single Shot MultiBox Detector (SSD), or a target detection algorithm such as You Only Look Once V3 (YOLOv3). The algorithms are not limited to those listed and may be selected as required. By analyzing each frame, one facial expression index is output.
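The disclosure leaves the detector open, so purely as a lightweight stand-in for MTCNN, SSD or YOLOv3, the sketch below crops the face image from a frame with OpenCV's bundled Haar-cascade detector; the helper name and the largest-box heuristic are assumptions:

```python
import cv2

# OpenCV ships a pre-trained frontal-face Haar cascade; any of the detectors
# named in the disclosure (MTCNN, SSD, YOLOv3) could be substituted here.
_face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(frame):
    """Return the largest detected face region of a frame, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = _face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda box: box[2] * box[3])  # largest box
    return frame[y:y + h, x:x + w]
```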
In an optional implementation manner, S2 includes the following sub-steps.
S21: Divide the face image into multiple regions, each of which contains multiple key feature points used to determine the facial expression index.
Optionally, facial feature point recognition is performed on the face image to obtain multiple feature points of the face image, and the key feature points used to determine the facial expression index are identified from these feature points; the face image is then divided into multiple regions according to the key feature points, each region containing multiple key feature points used to determine the facial expression index. Using all feature points directly to determine the facial expression index would increase the amount of computation; by identifying key feature points, the amount of computation is reduced while the accuracy of the facial expression index is preserved. For example, a face image may carry 106 feature points, from which the key feature points used to determine the facial expression index are identified, such as key feature points of the mouth, eyes and eyebrows. The multiple feature points may be recognized with a trained neural network model. The method is not limited to the above; the feature point recognition method may be selected and adjusted adaptively.
In an optional implementation manner, key feature point recognition may also be performed directly on the face image to obtain the multiple key feature points used to determine the facial expression index. For example, where the key feature points are those of the mouth, eyes and eyebrows, a trained neural network model may be used to process the face image: the face image is input into the trained neural network model to obtain the mouth, eye and eyebrow key feature points in multiple frames of face images.
Optionally, the face image may also be divided into multiple regions according to the regions of reference facial organs, and key feature points may be extracted from the face image of each region to obtain the key feature points contained in each region. For example, where the reference facial organs include the mouth, eyes and eyebrows, three regions of the face image are obtained, and the images of the three regions may be input into a mouth key feature point detection model, an eye key feature point detection model and an eyebrow key feature point detection model, respectively, to obtain the key feature points contained in each region.
S22: For each frame, determine the key feature points contained in the multiple regions, determine the expression scores corresponding to the multiple regions, and determine the facial expression index of the frame according to the expression scores corresponding to the multiple regions.
Optionally, at least one included angle between lines connecting the key feature points in each region is determined, and the expression score corresponding to each region is determined according to the at least one included angle; the weight corresponding to each region is determined; and the facial expression index of each frame is determined according to the expression scores and weights corresponding to the multiple regions. For example, each region corresponds to a weight, the weights of different regions may differ, and the weights of all regions sum to 1; each region contains at least one included angle, each included angle corresponds to an expression score (for example, on a 100-point scale), and the facial expression index is obtained by a weighted calculation over the angles and regions. Since there are many feature points in the face region, computing the angle between every pair of connecting lines would increase the amount of computation; after the key feature points used for the weighted calculation of the facial expression index are selected, the angles can be computed directly on these key feature points, reducing the amount of computation. There can be many lines between the key feature points of a region, and target lines may be selected for the angle computation, such as the angle between the lines of adjacent key feature points, or the angle between the lines from the key feature points at the two ends to a middle key feature point. In this way, the amount of computation is reduced and the processing efficiency is improved while the accuracy of the facial expression index is preserved.
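A minimal sketch of this weighted calculation follows; the region names, the per-region weights and the mapping from an included angle to a 0-100 score are assumed for illustration, since the disclosure leaves these choices to the implementer:

```python
import numpy as np

def included_angle(p0, p1, p2):
    """Angle (degrees) at p1 between the lines p1->p0 and p1->p2."""
    v1, v2 = np.asarray(p0) - np.asarray(p1), np.asarray(p2) - np.asarray(p1)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Illustrative only: the disclosure requires the weights to sum to 1 but does
# not fix their values or the angle-to-score mapping.
REGION_WEIGHTS = {"mouth": 0.5, "eyes": 0.3, "eyebrows": 0.2}

def expression_index(region_angles, angle_to_score):
    """Weighted sum of per-region scores, each region averaged over its angles."""
    index = 0.0
    for region, angles in region_angles.items():
        score = sum(angle_to_score(a) for a in angles) / len(angles)
        index += REGION_WEIGHTS[region] * score
    return index  # a facial expression index on the 0-100 scale
```

Because the weights sum to 1, the resulting index stays on the same 0-100 scale as the per-region scores.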
In an optional implementation manner, the contour information of the facial organs contained in the multiple regions may also be determined according to the key feature points contained in the multiple regions, and the expression scores corresponding to the multiple regions may be determined separately according to this contour information.
S23: Obtain the facial expression spectrogram corresponding to the picture stream according to the facial expression indices of all frames.
After the above weighted calculation, the facial expression index corresponding to each frame is obtained, and the facial expression spectrogram corresponding to the picture stream is then obtained, as shown in FIG. 3.
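Since the spectrogram of the disclosure is simply the facial expression index plotted against time, a minimal sketch of its assembly (the function names are assumptions) is:

```python
def build_spectrogram(indexed_frames, index_of_frame):
    """Return the spectrogram as a list of (timestamp, facial expression index)
    pairs, ordered by the timestamp of each frame in the picture stream."""
    series = [(ts, index_of_frame(frame)) for ts, frame in indexed_frames]
    series.sort(key=lambda point: point[0])  # order by time information
    return series
```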
S3: Determine the baseline corresponding to the face in its natural state according to the facial expression spectrogram, and determine the natural emotion region of the face in the natural state based on the baseline.
Since each person's expression in the natural state is different (some people naturally have a sullen face, while others naturally appear to be smiling), the baseline of each person's natural state differs. By finding the baseline, the user's true emotion can be determined more accurately against the user's own baseline, which effectively improves the accuracy of facial expression recognition.
In an optional implementation manner, S3 includes the following sub-steps.
S31: In the facial expression spectrogram, determine the first interval in which the facial expression index appears most frequently.
A person is in a natural state most of the time while receiving a service, so the horizontal line through the most frequent points can serve as the baseline. However, the values of the individual points cannot be exactly the same, so a reference interval (that is, the first interval), namely the interval with the highest frequency, needs to be found. For example, an interval width may be preset, and the first interval in which the facial expression index appears most frequently in the facial expression spectrogram is determined according to that interval width.
S32: Determine the baseline corresponding to the face in the natural state according to the first interval.
The first interval, in which the facial expression index appears most frequently in the facial expression spectrogram, reflects the current user's expression state in the natural state fairly truthfully, so the baseline determined from it can accurately reflect the current user's expression in the natural state.
Optionally, S32 includes:
determining the horizontal center line of the first interval;
if the facial expression index corresponding to the horizontal center line is greater than a first threshold and less than a second threshold, determining the horizontal center line as the baseline corresponding to the face in the natural state;
if the facial expression index corresponding to the horizontal center line is less than or equal to the first threshold, determining the horizontal line corresponding to the first threshold as the baseline corresponding to the face in the natural state; and
if the facial expression index corresponding to the horizontal center line is greater than or equal to the second threshold, determining the horizontal line corresponding to the second threshold as the baseline corresponding to the face in the natural state.
As shown in FIG. 4, the interval width may be set to 20, for example. In the process of determining the baseline, this interval is used to scan the facial expression spectrogram from bottom to top to find the interval with the highest frequency; the baseline is the horizontal center of that interval. A schematic diagram of the determined baseline is shown in FIG. 3.
As shown in FIG. 3, to avoid the degenerate case in which the whole clip is read as positive or negative emotion, the baseline may, for example, be constrained to lie between 30 and 60: if the measured baseline is above 60 it is set to 60, and if it is below 30 it is set to 30. The set values of the first threshold and the second threshold of the baseline can be adjusted adaptively and are not limited to the above values.
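Putting S31 and S32 together, a sketch of the baseline search might look as follows, using the interval width of 20 and the 30-60 clamp from the example; the step size of the bottom-up scan is an assumption:

```python
import numpy as np

def find_baseline(indices, width=20, lo=30, hi=60):
    """Scan the spectrogram bottom-up with a window of the given width, take
    the horizontal center of the most frequent interval, and clamp it to
    [lo, hi]. The width of 20 and the 30-60 clamp follow the example above."""
    indices = np.asarray(indices, dtype=float)
    best_start, best_count = 0.0, -1
    for start in np.arange(0.0, 100.0 - width + 1.0):   # bottom-to-top scan
        count = int(np.sum((indices >= start) & (indices < start + width)))
        if count > best_count:
            best_start, best_count = start, count
    center = best_start + width / 2.0                   # horizontal center line
    return min(max(center, lo), hi)                     # thresholded baseline
```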
S33: Taking the baseline as the center, take the second interval within a set width above and below the baseline as the natural emotion region of the face in the natural state.
As shown in FIG. 4, after the baseline is obtained, the region extending 15 above and 15 below the baseline, with a total width of 30, may, for example, represent the facial expression being in the natural state, that is, the natural emotion region of the face in the natural state. The width of the natural emotion region can be adjusted adaptively and is not limited to the above values.
S4: Using the natural emotion region as a reference, divide the facial expression spectrogram into multiple emotion regions corresponding to different expressions.
As shown in FIG. 5, in the facial expression spectrogram, the region above the natural emotion region can be determined as the positive emotion region, and the region below it as the negative emotion region. The reference region, that is, the natural emotion region, makes it possible to stratify the user's emotions and avoids reading the whole clip as positive or negative emotion.
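A sketch of this partitioning, using the half-width of 15 from the example above (the labels and the function name are illustrative), is:

```python
def classify_emotions(spectrogram, baseline, half_width=15):
    """Label each (timestamp, index) point of the spectrogram as 'positive',
    'natural' or 'negative' relative to the natural emotion region, which is
    the band of the given half-width around the baseline (15 in the example)."""
    labeled = []
    for ts, idx in spectrogram:
        if idx > baseline + half_width:
            label = "positive"      # above the natural emotion region
        elif idx < baseline - half_width:
            label = "negative"      # below the natural emotion region
        else:
            label = "natural"       # inside the natural emotion region
        labeled.append((ts, idx, label))
    return labeled
```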
This embodiment uses the complete video information of the user's expressions, fully accounts for the fluctuation of expressions and for the differences between individual users' natural states, and obtains the baseline corresponding to the natural state based on frequency analysis, so that the user's true emotion can be determined. The reference interval corresponding to the natural state is set according to the user's baseline, avoiding the case in which the whole clip is read as positive or negative expression. The user's emotions are stratified through the reference region, and weights are assigned to the time periods of the user's video clip; by jointly considering the user's emotion types and the time weights, the user's satisfaction can be determined more accurately.
A facial expression satisfaction analysis method according to an embodiment of the present disclosure, as shown in FIG. 2, further includes, after the aforementioned facial expression analysis method: S5, within each time period of the facial expression video clip to be analyzed, analyzing and computing the multiple emotion regions corresponding to different expressions in the facial expression spectrogram to determine the user's satisfaction.
In an optional implementation manner, S5 includes the following sub-steps.
S51: Divide the facial expression video clip to be analyzed into multiple time periods, and calculate the proportions of different expressions in each time period according to the facial expression spectrogram.
S52: Determine the weights corresponding to the multiple time periods.
S53: Determine the satisfaction result according to the proportions of different expressions in the multiple time periods and the weights corresponding to the multiple time periods.
Since "arriving in tears and leaving with a smile" is certainly more satisfying than "arriving with a smile and leaving in tears", different emotions need to carry different weights over time. For example, the weights may be set as follows: emotions in the first 20% of the time carry 10% of the weight, emotions in the last 10% of the time carry 60% of the weight, and the middle part carries the remaining 30%. The time weights can be adjusted as appropriate. After the multiple emotion regions are obtained by the method described in step S4, the proportion of each emotion region in the facial expression spectrogram is counted separately for each time period. Within each time period, the weight of each emotion region is determined by its proportion in the facial expression spectrogram, and the user's satisfaction coefficient is obtained by a weighted calculation over the weight of each time period and the weight of each emotion region within it. The user satisfaction coefficient can be obtained by subtracting the weighted value of the negative emotions from the weighted value of the positive emotions. Assigning weights to the time periods of the user's video clip, combined with the proportion of each emotion region within each time period, jointly considers the user's emotion types and the time weights, so that the user's satisfaction can be determined more accurately.
As shown in FIG. 6, in this embodiment it can be seen that the user's positive emotions gradually increase and the negative emotions gradually decrease. Following the example shown in FIG. 6, the user's satisfaction coefficient can be calculated as (31%*10% + 28%*30% + 37%*60%) - (31%*10% + 22%*30% + 11%*60%) = 0.174.
Optionally, the satisfaction result can be determined according to the satisfaction coefficient. For example, when the satisfaction coefficient is greater than or equal to a satisfaction threshold, the satisfaction result is "satisfied"; when it is less than or equal to a dissatisfaction threshold, the result is "unsatisfied"; when it is greater than the dissatisfaction threshold and less than the satisfaction threshold, the result is "neutral". For example, a coefficient greater than or equal to 0.1 may be taken as satisfied, less than or equal to -0.1 as unsatisfied, and between -0.1 and 0.1 as neutral. Both the satisfaction threshold and the dissatisfaction threshold can be adjusted adaptively.
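The following sketch reproduces the FIG. 6 calculation; the time weights, the emotion proportions and the 0.1/-0.1 thresholds are the example values from the text, not fixed parameters of the disclosure:

```python
# Time-period weights from the example: first 20% of the time, the middle
# 70%, and the last 10%, weighted 10%/30%/60% respectively.
TIME_WEIGHTS = [0.10, 0.30, 0.60]

def satisfaction(positive, negative, weights=TIME_WEIGHTS,
                 satisfied=0.1, unsatisfied=-0.1):
    """Weighted positive share minus weighted negative share, then thresholds."""
    coeff = sum(w * p for w, p in zip(weights, positive)) \
          - sum(w * n for w, n in zip(weights, negative))
    if coeff >= satisfied:
        return coeff, "satisfied"
    if coeff <= unsatisfied:
        return coeff, "unsatisfied"
    return coeff, "neutral"

# Reproduces the worked example: positive shares (0.31, 0.28, 0.37) and
# negative shares (0.31, 0.22, 0.11) give a coefficient of 0.174 -> "satisfied".
print(satisfaction([0.31, 0.28, 0.37], [0.31, 0.22, 0.11]))
```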
A facial expression analysis system according to an embodiment of the present disclosure includes a picture acquisition module, an expression spectrum module, an expression reference module and an expression partitioning module.
The picture acquisition module is configured to obtain the facial expression video clip to be analyzed, and to obtain the picture stream from the facial expression video clip.
In an optional implementation manner, the picture stream may be obtained frame by frame (that is, every frame is extracted), by sampling frames at a fixed interval (for example, one frame per second), or by extracting key frames (that is, extracting I-frames according to changes in the picture). In this embodiment, when obtaining the picture stream, the video clip may be divided into multiple video sub-segments, at least one frame is extracted from each video sub-segment, either randomly or at a fixed position, and the picture stream of the facial expression video clip is determined from the extracted frames. By collecting the complete video information containing the user's expressions, this embodiment fully accounts for the fluctuation of facial expressions and avoids the one-sidedness and inaccuracy of determining the user's emotion from a single frame.
The expression spectrum module is configured to determine the facial expression spectrogram corresponding to the picture stream according to the facial expression index of each frame in the picture stream.
For example, the facial expression index ranges from 0 to 100. A higher index indicates a more positive expression, that is, a tendency toward emotions such as happiness and surprise; a lower index indicates a more negative expression, that is, a tendency toward emotions such as anger and fear; an index near the middle indicates that the expression is in a natural state. For example, the facial expression spectrogram may be generated from the facial expression index of each frame and the corresponding time information, ordered by the timestamp of each frame in the picture stream. The generated facial expression spectrogram intuitively displays how the user's expression fluctuates over time.
In an optional implementation manner, the system further includes a face detection module configured to, before the facial expression index of each frame in the picture stream is analyzed, perform face detection on each frame in the picture stream to obtain the face image in each frame, after which the facial expression index of each frame is analyzed to obtain the facial expression spectrogram.
When performing face detection, the face detection algorithm may be, for example, MTCNN, SSD or YOLOv3, but is not limited to those listed and may be selected as required. By analyzing each frame, one facial expression index is output.
In an optional implementation manner, the expression spectrum module may include a face region division module, an expression index determination module and an expression spectrogram determination module.
The face region division module is configured to divide the face image into multiple regions, each of which contains multiple key feature points used to determine the facial expression index.
In an optional implementation manner, the face region division module is configured to: perform facial feature point recognition on the face image to obtain multiple feature points of the face image, and identify from these feature points the key feature points used to determine the facial expression index; and divide the face image into multiple regions according to the key feature points, each region containing multiple key feature points used to determine the facial expression index. Using all feature points directly to determine the facial expression index would increase the amount of computation; by identifying key feature points, the amount of computation is reduced while the accuracy of the facial expression index is preserved. For example, a face image may carry 106 feature points, from which the key feature points used to determine the facial expression index are identified, such as key feature points of the mouth, eyes and eyebrows. The key feature points may be recognized with a trained neural network model. The method is not limited to the above; the feature point recognition method may be selected and adjusted adaptively.
In an optional implementation manner, the face region division module may also perform key feature point recognition directly on the face image to obtain the multiple key feature points used to determine the facial expression index. For example, where the key feature points are those of the mouth, eyes and eyebrows, a trained neural network model may be used to process the face image: the face image is input into the trained neural network model to obtain the mouth, eye and eyebrow key feature points in multiple frames of face images.
Optionally, the face region division module may also divide the face image into multiple regions according to the regions of reference facial organs, and extract key feature points from the face image of each region to obtain the key feature points contained in each region. For example, where the reference facial organs include the mouth, eyes and eyebrows, three regions of the face image are obtained, and the images of the three regions may be input into a mouth key feature point detection model, an eye key feature point detection model and an eyebrow key feature point detection model, respectively, to obtain the key feature points contained in each region.
The expression index determination module is configured to determine, for each frame, the key feature points contained in the multiple regions, determine the expression scores corresponding to the multiple regions, and determine the facial expression index of the frame according to the expression scores corresponding to the multiple regions.
In an optional implementation manner, the expression index determination module is configured to: determine at least one included angle between lines connecting the key feature points in each region, and determine the expression score corresponding to each region according to the at least one included angle; determine the weight corresponding to each region; and determine the facial expression index of each frame according to the expression scores and weights corresponding to the multiple regions. For example, each region corresponds to a weight, each region has a different weight, and the weights of all regions sum to 1; each region contains at least one included angle, each included angle corresponds to an expression score (for example, on a 100-point scale), and the facial expression index is obtained by a weighted calculation over the angles and regions. Since there are many feature points in the face region, computing the angle between every pair of connecting lines would increase the amount of computation; after the key feature points used for the weighted calculation of the facial expression index are selected, the angles can be computed directly on these key feature points, reducing the amount of computation. There can be many lines between the key feature points of a region, and target lines may be selected for the angle computation, such as the angle between the lines of adjacent key feature points, or the angle between the lines from the key feature points at the two ends to a middle key feature point. In this way, the amount of computation is reduced and the processing efficiency is improved while the accuracy of the facial expression index is preserved.
In an optional implementation manner, the expression index determination module may also determine the contour information of the facial organs contained in the multiple regions according to the key feature points contained in the multiple regions, and determine the expression scores corresponding to the multiple regions separately according to this contour information.
The expression spectrogram determination module is configured to obtain the facial expression spectrogram corresponding to the picture stream according to the facial expression indices of all frames. After the weighted calculation of the facial expression index, the expression spectrogram determination module obtains the facial expression index corresponding to each frame, and then the facial expression spectrogram corresponding to the picture stream, as shown in FIG. 3.
The expression reference module is configured to determine the baseline corresponding to the face in its natural state according to the facial expression spectrogram, and to determine the natural emotion region of the face in the natural state based on the baseline.
Since each person's expression in the natural state is different (some people naturally have a sullen face, while others naturally appear to be smiling), the baseline of each person's natural state differs. By finding the baseline, the user's true emotion can be determined more accurately against the user's own baseline, which effectively improves the accuracy of facial expression recognition.
In an optional implementation manner, the expression reference module includes a frequency interval determination module, a baseline determination module and a natural emotion region determination module.
The frequency interval determination module is configured to determine, in the facial expression spectrogram, the first interval in which the facial expression index appears most frequently.
A person is in a natural state most of the time while receiving a service, so the horizontal line through the most frequent points can serve as the baseline. However, the values of the individual points cannot be exactly the same, so a reference interval (that is, the first interval), namely the interval with the highest frequency, needs to be found. For example, an interval width may be preset, and the first interval in which the facial expression index appears most frequently in the facial expression spectrogram is determined according to that interval width.
The baseline determination module is configured to determine the baseline corresponding to the face in the natural state according to the first interval.
The first interval, in which the facial expression index appears most frequently in the facial expression spectrogram, reflects the current user's expression state in the natural state fairly truthfully, so the baseline determined from it can accurately reflect the current user's expression in the natural state.
In an optional implementation manner, the baseline determination module is configured to:
determine the horizontal center line of the first interval;
if the facial expression index corresponding to the horizontal center line is greater than a first threshold and less than a second threshold, determine the horizontal center line as the baseline corresponding to the face in the natural state;
if the facial expression index corresponding to the horizontal center line is less than or equal to the first threshold, determine the horizontal line corresponding to the first threshold as the baseline corresponding to the face in the natural state; and
if the facial expression index corresponding to the horizontal center line is greater than or equal to the second threshold, determine the horizontal line corresponding to the second threshold as the baseline corresponding to the face in the natural state.
As shown in FIG. 4, the interval width may be set to 20, for example. In the process of determining the baseline, this interval is used to scan the facial expression spectrogram from bottom to top to find the interval with the highest frequency; the baseline is the horizontal center of that interval. A schematic diagram of the baseline is shown in FIG. 3.
As shown in FIG. 3, to avoid the degenerate case in which the whole clip is read as positive or negative emotion, the baseline may, for example, be constrained to lie between 30 and 60: if the measured baseline is above 60 it is set to 60, and if it is below 30 it is set to 30. The set values of the first threshold and the second threshold of the baseline can be adjusted adaptively and are not limited to the above values.
The natural emotion region determination module is configured to take the baseline as the center and take the second interval within a set width above and below the baseline as the natural emotion region of the face in the natural state.
As shown in FIG. 4, after the baseline is obtained, the region extending 15 above and 15 below the baseline, with a total width of 30, may, for example, represent the expression being in the natural state, that is, the natural emotion region of the face in the natural state. The width of the natural emotion region can be adjusted adaptively and is not limited to the above values.
The expression partitioning module is configured to divide the facial expression spectrogram into multiple emotion regions corresponding to different expressions, using the natural emotion region as a reference. As shown in FIG. 5, in the facial expression spectrogram, the region above the natural emotion region can be determined as the positive emotion region, and the region below it as the negative emotion region. The reference region, that is, the natural emotion region, makes it possible to stratify the user's emotions and avoids reading the whole clip as positive or negative emotion.
A facial expression satisfaction analysis system according to an embodiment of the present disclosure adopts the aforementioned facial expression analysis system, with the difference that it further includes a satisfaction calculation module. The satisfaction calculation module is configured to, within each time period of the facial expression video clip to be analyzed, analyze and compute the multiple emotion regions corresponding to different expressions in the facial expression spectrogram to determine the user's satisfaction.
In an optional implementation manner, the satisfaction calculation module includes the following modules.
The time-period expression proportion calculation module is configured to divide the facial expression video clip to be analyzed into multiple time periods and to calculate, according to the facial expression spectrogram, the proportions of different expressions in each time period.
The time-period weight determination module is configured to determine the weights corresponding to the multiple time periods.
The satisfaction result calculation module is configured to determine the satisfaction result according to the proportions of different expressions in the multiple time periods and the weights corresponding to the multiple time periods.
Since "arriving in tears and leaving with a smile" is certainly more satisfying than "arriving with a smile and leaving in tears", different emotions need to carry different weights over time. For example, the weights may be set as follows: emotions in the first 20% of the time carry 10% of the weight, emotions in the last 10% of the time carry 60% of the weight, and the middle part carries the remaining 30%. The time weights can be adjusted as appropriate. After the emotion regions are obtained by the aforementioned expression partitioning module, the proportion of each emotion region in the facial expression spectrogram is counted separately for each time period. Within each time period, the weight of each emotion region is determined by its proportion in the facial expression spectrogram, and the user's satisfaction is obtained by a weighted calculation over the weight of each time period and the weight of each emotion region within it. The user satisfaction can be obtained by subtracting the weighted value of the negative emotions from the weighted value of the positive emotions. Assigning weights to the time periods of the user's video clip, combined with the proportion of each emotion region within each time period, jointly considers the user's emotion types and the time weights, so that the user's satisfaction can be determined more accurately.
As shown in FIG. 6, in this embodiment it can be seen that the user's positive emotions gradually increase and the negative emotions gradually decrease. Following the example shown in FIG. 6, the user's satisfaction coefficient can be calculated as (31%*10% + 28%*30% + 37%*60%) - (31%*10% + 22%*30% + 11%*60%) = 0.174.
Optionally, the satisfaction result can be determined according to the satisfaction coefficient. For example, when the satisfaction coefficient is greater than or equal to a satisfaction threshold, the satisfaction result is "satisfied"; when it is less than or equal to a dissatisfaction threshold, the result is "unsatisfied"; when it is greater than the dissatisfaction threshold and less than the satisfaction threshold, the result is "neutral". For example, a coefficient higher than 0.1 may be taken as satisfied, lower than -0.1 as unsatisfied, and between -0.1 and 0.1 as neutral. Both the satisfaction threshold and the dissatisfaction threshold can be adjusted adaptively.
The facial expression analysis method and system and the facial expression satisfaction analysis method and system provided by the present disclosure use the complete video information of the user's expressions, fully account for the fluctuation of facial expressions and for the differences between individual users' natural states, and obtain the baseline corresponding to the natural state based on frequency analysis, so that the user's true emotion can be determined. The reference interval corresponding to the natural state is set according to the user's baseline, avoiding the case in which the whole clip is read as positive or negative expression. The user's emotions are stratified through the reference region, and weights are assigned to the time periods of the user's video clip; by jointly considering the user's emotion types and the time weights, the user's satisfaction can be determined more accurately.
The present disclosure also relates to an electronic device, including a server, a terminal and the like. The electronic device includes: at least one processor; a memory communicatively connected to the at least one processor; and a communication component communicatively connected to the memory, the communication component receiving and sending data under the control of the processor. The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to implement the facial expression analysis method and the facial expression satisfaction analysis method of the above embodiments.
In an optional implementation manner, the memory, as a non-volatile computer-readable storage medium, can be configured to store non-volatile software programs, non-volatile computer-executable programs and modules. The processor executes the various functional applications and data processing of the device by running the non-volatile software programs, instructions and modules stored in the memory, thereby implementing the above facial expression analysis method and facial expression satisfaction analysis method.
The memory may include a program storage area and a data storage area, where the program storage area can store an operating system and an application program required by at least one function, and the data storage area can store a list of options and the like. In addition, the memory may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device or another non-volatile solid-state storage device. In some embodiments, the memory may optionally include memories arranged remotely with respect to the processor, and these remote memories may be connected to an external device through a network. Examples of such networks include the Internet, corporate intranets, local area networks, mobile communication networks and combinations thereof.
One or more modules are stored in the memory and, when executed by the one or more processors, perform the facial expression analysis method and the facial expression satisfaction analysis method of any of the above embodiments.
The above products can execute the facial expression analysis method and the facial expression satisfaction analysis method provided by the embodiments of the present application, and have the functional modules and effects corresponding to the executed methods. For technical details not described in this embodiment, refer to the facial expression analysis method and the facial expression satisfaction analysis method provided by the embodiments of the present application.
The present disclosure also relates to a computer-readable storage medium storing a computer-readable program, the computer-readable program being used by a computer to execute some or all of the above facial expression analysis method and facial expression satisfaction analysis method.
All or some of the steps in the methods of the above embodiments can be completed by instructing the relevant hardware through a program. The program is stored in a storage medium and includes multiple instructions to enable a device (which may be a single-chip microcomputer, a chip or the like) or a processor to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include media that can store program code, such as a Universal Serial Bus flash disk (USB flash disk), a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
Numerous details are set forth in the description provided herein. The embodiments of the present disclosure can, however, be implemented without these details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Although some embodiments described herein include some features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the present disclosure and to form different embodiments. For example, in the claims, any one of the claimed embodiments can be used in any combination.

Claims (16)

1. A facial expression analysis method, comprising:
    obtaining a facial expression video clip to be analyzed, and obtaining a picture stream from the facial expression video clip;
    determining a facial expression spectrogram corresponding to the picture stream according to a facial expression index of each frame in the picture stream;
    determining a baseline corresponding to a face in a natural state according to the facial expression spectrogram, and determining a natural emotion region of the face in the natural state based on the baseline; and
    dividing the facial expression spectrogram into a plurality of emotion regions corresponding to different expressions, using the natural emotion region as a reference.
2. The method according to claim 1, wherein the picture stream in the facial expression video clip is obtained frame by frame, by sampling frames at a fixed interval, or by extracting key frames.
3. The method according to claim 2, wherein obtaining the picture stream in the facial expression video clip comprises:
    dividing the facial expression video clip into a plurality of video sub-segments, and extracting at least one frame from each video sub-segment, either randomly or at a fixed position; and
    determining the picture stream of the facial expression video clip according to the plurality of extracted frames.
4. The method according to claim 1, further comprising: performing face detection on each frame in the picture stream to obtain a face image in each frame.
5. The method according to claim 4, wherein determining the facial expression spectrogram corresponding to the picture stream according to the facial expression index of each frame in the picture stream comprises:
    dividing the face image into a plurality of regions, each region containing a plurality of key feature points used to determine the facial expression index;
    for each frame, determining the key feature points contained in the plurality of regions, determining expression scores corresponding to the plurality of regions, and determining the facial expression index of the frame according to the expression scores corresponding to the plurality of regions; and
    obtaining the facial expression spectrogram corresponding to the picture stream according to the facial expression indices of all frames.
6. The method according to claim 5, wherein dividing the face image into a plurality of regions comprises:
    performing facial feature point recognition on the face image to obtain a plurality of feature points of the face image;
    identifying, from the plurality of feature points, a plurality of key feature points used to determine the facial expression index; and
    dividing the face image into a plurality of regions according to the plurality of key feature points.
7. The method according to claim 5, wherein, for each frame, determining the key feature points contained in the plurality of regions, determining the expression scores corresponding to the plurality of regions, and determining the facial expression index of the frame according to the expression scores corresponding to the plurality of regions comprises:
    determining at least one included angle between lines connecting the key feature points in each region, and determining the expression score corresponding to each region according to the at least one included angle;
    determining a weight corresponding to each region; and
    determining the facial expression index of each frame according to the expression scores corresponding to the plurality of regions and the weights corresponding to the plurality of regions.
8. The method according to claim 1, wherein determining the baseline corresponding to the face in the natural state according to the facial expression spectrogram, and determining the natural emotion region of the face in the natural state based on the baseline, comprises:
    determining, in the facial expression spectrogram, a first interval in which the facial expression index appears most frequently;
    determining the baseline corresponding to the face in the natural state according to the first interval; and
    taking, with the baseline as a center, a second interval within a set width above and below the baseline as the natural emotion region of the face in the natural state.
9. The method according to claim 8, wherein determining the baseline corresponding to the face in the natural state according to the first interval comprises:
    determining a horizontal center line of the first interval;
    in a case where the facial expression index corresponding to the horizontal center line is greater than a first threshold and less than a second threshold, determining the horizontal center line as the baseline corresponding to the face in the natural state;
    in a case where the facial expression index corresponding to the horizontal center line is less than or equal to the first threshold, determining the horizontal line corresponding to the first threshold as the baseline corresponding to the face in the natural state; and
    in a case where the facial expression index corresponding to the horizontal center line is greater than or equal to the second threshold, determining the horizontal line corresponding to the second threshold as the baseline corresponding to the face in the natural state.
10. The method according to claim 1, wherein dividing the facial expression spectrogram into a plurality of emotion regions corresponding to different expressions using the natural emotion region as a reference comprises: determining, in the facial expression spectrogram, a region above the natural emotion region as a positive emotion region, and a region below the natural emotion region as a negative emotion region.
11. A facial expression satisfaction analysis method, comprising, after performing the facial expression analysis method according to any one of claims 1-10: within each time period of a facial expression video clip to be analyzed, analyzing and computing a plurality of emotion regions corresponding to different expressions in the facial expression spectrogram to determine a user's satisfaction.
  12. 如权利要求11所述的方法,其中,所述在待分析的人脸表情视频片段中的每个时间段内,对所述人脸表情频谱图中对应不同表情的多个情绪区域进行分析计算,确定用户的满意度,包括:The method of claim 11, wherein, in each time period in the facial expression video clip to be analyzed, multiple emotional regions corresponding to different expressions in the facial expression spectrogram are analyzed and calculated , To determine user satisfaction, including:
    将所述待分析的人脸表情视频片段划分为多个时间段,根据所述人脸表情频谱图,分别计算所述多个时间段内不同表情所占的比例;Divide the face expression video segment to be analyzed into multiple time periods, and calculate the proportions of different expressions in the multiple time periods according to the face expression spectrogram;
    确定多个时间段对应的权重;Determine the weights corresponding to multiple time periods;
    根据多个时间段内不同表情所占的比例以及所述多个时间段对应的权重,确定满意度结果。The satisfaction result is determined according to the proportions of different expressions in the multiple time periods and the weights corresponding to the multiple time periods.
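Claim 12 mirrors the per-frame weighting one level up: per-period expression proportions are combined with per-period weights. A sketch under the assumption that satisfaction is driven by the proportion of positive-emotion frames; the claims do not fix the exact formula:

```python
def satisfaction(period_labels, period_weights, positive="positive"):
    """Weighted satisfaction over time periods.

    period_labels:  one list of per-frame emotion labels per time period
    period_weights: one weight per period, assumed to sum to 1
    """
    total = 0.0
    for labels, weight in zip(period_labels, period_weights):
        if not labels:  # defensively skip an empty period
            continue
        positive_ratio = sum(1 for label in labels if label == positive) / len(labels)
        total += weight * positive_ratio
    return total
```

Weighting later periods more heavily is one plausible choice (the expression at the end of an interaction often says more about the outcome), but the claim leaves the weights open.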
  13. A facial expression analysis system, comprising:
    a picture acquisition module, configured to acquire a video clip of a facial expression to be analyzed and acquire a picture stream in the facial expression video clip;
    an expression spectrum module, configured to determine, according to a facial expression index of each frame in the picture stream, a facial expression spectrogram corresponding to the picture stream;
    an expression reference module, configured to determine, according to the facial expression spectrogram, a reference line corresponding to a human face in a natural state, and to determine, based on the reference line, a natural emotional region of the human face in the natural state; and
    an expression partition module, configured to divide, with the natural emotional region as a reference, the facial expression spectrogram into multiple emotional regions corresponding to different expressions.
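Wired together, the modules of claim 13 form a straight pipeline. A sketch that composes the helpers above; decoding frames into per-frame indices (the picture acquisition and expression spectrum steps) is assumed to have happened upstream:

```python
def analyze_expression_clip(frame_indices):
    """Pipeline sketch: spectrogram values in; baseline, band, and labels out.

    frame_indices plays the role of the facial expression spectrogram;
    natural_baseline_band() and classify_frame() are the sketches above,
    acting as the expression reference and expression partition modules.
    """
    baseline, band = natural_baseline_band(frame_indices)       # expression reference
    labels = [classify_frame(i, band) for i in frame_indices]   # expression partition
    return baseline, band, labels
```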
  14. A facial expression satisfaction analysis system based on the facial expression analysis system according to claim 13, comprising:
    a picture acquisition module, configured to acquire a video clip of a facial expression to be analyzed and acquire a picture stream in the facial expression video clip;
    an expression spectrum module, configured to determine, according to a facial expression index of each frame in the picture stream, a facial expression spectrogram corresponding to the picture stream;
    an expression reference module, configured to determine, according to the facial expression spectrogram, a reference line corresponding to a human face in a natural state, and to determine, based on the reference line, a natural emotional region of the human face in the natural state;
    an expression partition module, configured to divide, with the natural emotional region as a reference, the facial expression spectrogram into multiple emotional regions corresponding to different expressions; and
    a satisfaction calculation module, configured to analyze and calculate, in each time period in the facial expression video clip to be analyzed, the multiple emotional regions corresponding to different expressions in the facial expression spectrogram, so as to determine the user's satisfaction.
  15. An electronic device, comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions, when executed by the processor, implement the method according to any one of claims 1-12.
  16. A computer-readable storage medium, storing a computer program which, when executed by a processor, implements the method according to any one of claims 1-12.
PCT/CN2021/071233 2020-01-13 2021-01-12 Facial expression analysis method and system, and facial expression-based satisfaction analysis method and system WO2021143667A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010033040.1A CN113111690B (en) 2020-01-13 2020-01-13 Facial expression analysis method and system and satisfaction analysis method and system
CN202010033040.1 2020-01-13

Publications (1)

Publication Number Publication Date
WO2021143667A1 (en)

Family

ID=76708830

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/071233 WO2021143667A1 (en) 2020-01-13 2021-01-12 Facial expression analysis method and system, and facial expression-based satisfaction analysis method and system

Country Status (2)

Country Link
CN (1) CN113111690B (en)
WO (1) WO2021143667A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114743252B (en) * 2022-06-10 2022-09-16 中汽研汽车检验中心(天津)有限公司 Feature point screening method, device and storage medium for head model
CN117131099A (en) * 2022-12-14 2023-11-28 广州数化智甄科技有限公司 Emotion data analysis method and device in product evaluation and product evaluation method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190384967A1 (en) * 2018-06-19 2019-12-19 Beijing Kuangshi Technology Co., Ltd. Facial expression detection method, device and system, facial expression driving method, device and system, and storage medium
CN109447001A (en) * 2018-10-31 2019-03-08 深圳市安视宝科技有限公司 A kind of dynamic Emotion identification method
CN109886110A (en) * 2019-01-17 2019-06-14 深圳壹账通智能科技有限公司 Micro- expression methods of marking, device, computer equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850247A (en) * 2021-12-01 2021-12-28 环球数科集团有限公司 Tourism video emotion analysis system fused with text information
CN113850247B (en) * 2021-12-01 2022-02-08 环球数科集团有限公司 Tourism video emotion analysis system fused with text information

Also Published As

Publication number Publication date
CN113111690B (en) 2024-01-30
CN113111690A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
WO2021143667A1 (en) Facial expression analysis method and system, and facial expression-based satisfaction analysis method and system
US10909356B2 (en) Facial tracking method and apparatus, storage medium, and electronic device
CN109522815B (en) Concentration degree evaluation method and device and electronic equipment
CN105005777B (en) Audio and video recommendation method and system based on human face
US9852327B2 (en) Head-pose invariant recognition of facial attributes
US9076030B2 (en) Liveness detection
US9104907B2 (en) Head-pose invariant recognition of facial expressions
JP5287333B2 (en) Age estimation device
US20140143183A1 (en) Hierarchical model for human activity recognition
WO2020042542A1 (en) Method and apparatus for acquiring eye movement control calibration data
CN108875452A (en) Face identification method, device, system and computer-readable medium
CN106056064A (en) Face recognition method and face recognition device
KR20190020779A (en) Ingestion Value Processing System and Ingestion Value Processing Device
CN108198130B (en) Image processing method, image processing device, storage medium and electronic equipment
CN113179421B (en) Video cover selection method and device, computer equipment and storage medium
CN110232331B (en) Online face clustering method and system
CN111860091A (en) Face image evaluation method and system, server and computer readable storage medium
CN110866139A (en) Cosmetic treatment method, device and equipment
Augereau et al. Estimation of English skill with a mobile eye tracker
CN113436735A (en) Body weight index prediction method, device and storage medium based on face structure measurement
CN111860057A (en) Face image blurring and living body detection method and device, storage medium and equipment
KR101145672B1 (en) A smile analysis system for smile self-training
US20210158565A1 (en) Pose selection and animation of characters using video data and training techniques
RU2768797C1 (en) Method and system for determining synthetically modified face images on video
KR20210019182A (en) Device and method for generating job image having face to which age transformation is applied

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21741141
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 21741141
    Country of ref document: EP
    Kind code of ref document: A1