WO2020244074A1 - Expression interaction method and apparatus, computer device, and readable storage medium - Google Patents

Expression interaction method and apparatus, computer device, and readable storage medium

Info

Publication number
WO2020244074A1
WO2020244074A1 PCT/CN2019/103370 CN2019103370W
Authority
WO
WIPO (PCT)
Prior art keywords
expression
feature vector
preset
recognized
facial
Prior art date
Application number
PCT/CN2019/103370
Other languages
English (en)
Chinese (zh)
Inventor
郭玲玲
黄帅
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020244074A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • G06V40/175Static expression

Definitions

  • This application relates to the field of electronic communication technology, and in particular to an expression interaction method, device, computer equipment, and non-volatile readable storage medium.
  • Facial expressions are a basic way for humans to express emotions and an effective means in nonverbal communication.
  • Existing electronic devices are generally equipped with virtual robots to realize human-machine interaction, but virtual robots generally support only voice interaction; they cannot distinguish the user's facial expression and therefore cannot realize human-computer interaction based on it.
  • An embodiment of the present application provides an expression interaction method, the method includes:
  • determining whether a face image is detected; if a face image is detected, locating a key feature area of the face image, and extracting, from the key feature area, an expression feature that characterizes the facial expression to be recognized;
  • comparing the extracted expression feature with the expression feature of each expression in a preset expression library to obtain the similarity probability between the facial expression to be recognized and each expression in the preset expression library, and taking the expression with the greatest similarity probability in the preset expression library as the facial expression to be recognized;
  • the step of performing face detection includes:
  • In one embodiment, the step of comparing the extracted expression feature with the expression feature of each expression in the preset expression library to obtain the similarity probability of the facial expression to be recognized and each expression in the preset expression library includes: acquiring the shape feature vector of the facial expression to be recognized, calculating the distance value between it and the shape feature vector of each expression in the preset expression library, and determining the similarity probability between the facial expression to be recognized and each expression in the preset expression library according to the calculated distance value.
  • In another embodiment, the same step includes: acquiring the texture feature vector of the facial expression to be recognized, calculating the distance value between it and the texture feature vector of each expression in the preset expression library, and determining the similarity probability between the facial expression to be recognized and each expression in the preset expression library according to the calculated distance value.
  • the distance value is calculated by the following formula: d_M(y, x_j) = √[(y − x_j)^T · M · (y − x_j)], where:
  • y is the shape feature vector/texture feature vector of the facial expression to be recognized;
  • x_j is the shape feature vector/texture feature vector of the j-th expression in the preset expression library;
  • M is the preset target metric matrix;
  • j is an integer greater than or equal to 1;
  • d_M(y, x_j) is the distance value between the shape feature vector/texture feature vector of the facial expression to be recognized and the shape feature vector/texture feature vector of the j-th expression in the preset expression library;
  • (y − x_j) is the difference between the shape feature vector/texture feature vector of the facial expression to be recognized and the shape feature vector/texture feature vector of the j-th expression in the preset expression library;
  • (y − x_j)^T is the transpose of that difference.
  • the similarity probability is calculated by the following formula:
  • p = {1 + exp[D − b]}^(−1), where p is the similarity probability, D is the distance value, and b is the preset offset.
  • the feedback information includes voice information or expression information captured after the user watches the interactive content output by the terminal device.
  • when the feedback information is the expression information captured after the user watches the interactive content output by the terminal device, the step of continuously controlling the content output of the terminal device according to the feedback information includes: judging whether the expression change between the expression recognized before watching and the expression recognized after watching the interactive content meets a preset adjustment rule; if the preset adjustment rule is met, the interactive content output by the terminal device is adjusted; if it is not met, the interactive content output by the terminal device is not adjusted.
  • An embodiment of the present application provides an expression interaction device, the device includes:
  • the detection module is configured to receive an interaction request instruction, and pop up a detection box to perform face detection according to the interaction request instruction;
  • the judgment module is used to judge whether a face image is detected
  • An extraction module used for locating a key feature area of the face image when a face image is detected, and extracting expression features that characterize the facial expression to be recognized from the key feature area;
  • the comparison module is used to compare the extracted expression features with the expression features of each expression in the preset expression library to obtain the similarity probability between the facial expression to be recognized and each expression in the preset expression library, and to use the expression with the greatest similarity probability in the preset expression library as the facial expression to be recognized;
  • the output module is used to control the terminal device to output corresponding interactive content according to the recognition result of the facial expression to be recognized;
  • the control module is configured to obtain feedback information after the interactive content is output, and continuously control the content output of the terminal device according to the feedback information.
  • An embodiment of the present application provides a computer device that includes a processor and a memory, and a number of computer-readable instructions are stored on the memory.
  • the processor is used to execute the computer-readable instructions stored in the memory to implement the steps of the aforementioned expression interaction method.
  • An embodiment of the present application provides a non-volatile readable storage medium having computer readable instructions stored thereon, and when the computer readable instructions are executed by a processor, the steps of the expression interaction method as described above are realized.
  • The above-mentioned expression interaction method, device, computer equipment and non-volatile readable storage medium can recognize the user's expression and control the computer device to output corresponding interactive content according to the result of expression recognition, realizing functions such as alleviating the user's tension and anxiety and soothing the user's mood. At the same time, they can further analyze the user's expression after the interactive content is played and continuously control the interactive content output by the computer device according to the analysis result, making interaction with the computer device more vivid and interesting and improving the user experience.
  • FIG. 1 is a flowchart of the steps of an expression interaction method in an embodiment of this application.
  • Fig. 2 is a functional block diagram of an expression interaction device in an embodiment of the application.
  • Figure 3 is a schematic diagram of a computer device in an embodiment of the application.
  • the expression interaction method of the present application is applied to one or more computer devices.
  • the computer device is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
  • the computer device may be a computing device such as a desktop computer, a notebook computer, a tablet computer, a server, and a mobile phone.
  • the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
  • Fig. 1 is a flowchart of the steps of a preferred embodiment of the expression interaction method of the present application. According to different needs, the order of the steps in the flowchart can be changed, and some steps can be omitted.
  • the expression interaction method specifically includes the following steps.
  • Step S11 Receive an interaction request instruction, and pop up a detection box to perform face detection according to the interaction request instruction.
  • When receiving an interaction request instruction from a user, the computer device pops up a detection frame according to the interaction request instruction and performs face detection through the detection frame.
  • The user can input an interaction request instruction through a touch screen, through keys, or through voice.
  • face detection can be realized by establishing and training a convolutional neural network model.
  • The face detection can be implemented in the following manner: a face sample database is first constructed and a convolutional neural network model for face detection is established, where the face sample database contains face information of multiple people, the face information of each person can cover multiple angles, and each angle can have multiple pictures; the face images in the face sample database are input to the convolutional neural network model, which is trained starting from its default parameters; according to the intermediate training results, the initial weights, learning rate, number of iterations and other parameters are continuously adjusted until the optimal network parameters of the convolutional neural network model are obtained, and the model with the optimal network parameters is used as the final recognition model. After training is completed, the finally obtained convolutional neural network model can be used for face detection; a minimal sketch of such a classifier is given below.
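  • The following is a minimal, hypothetical sketch of the kind of convolutional neural network classifier described above (face versus non-face on fixed-size grayscale crops). The architecture, input size, optimizer and hyperparameters are illustrative assumptions, not details taken from this application.

```python
# Minimal sketch of a CNN face/non-face classifier (illustrative assumptions only).
import torch
import torch.nn as nn

class FaceDetectorCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 48x48 -> 24x24
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 24x24 -> 12x12
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 12x12 -> 6x6
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 6 * 6, 128), nn.ReLU(),
            nn.Linear(128, 2),  # two classes: face / non-face
        )

    def forward(self, x):
        return self.classifier(self.features(x))

def train_step(model, images, labels, optimizer, criterion):
    """One training step on a batch of 1x48x48 grayscale crops."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

model = FaceDetectorCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # learning rate tuned iteratively
criterion = nn.CrossEntropyLoss()
```

  • In practice the initial weights, learning rate and number of iterations would be adjusted over repeated training runs, as the description above indicates.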
  • Step S12 Determine whether a face image is detected.
  • Step S13 If a face image is detected, a key feature area of the face image is located, and an expression feature representing the facial expression to be recognized is extracted from the key feature area.
  • If no face image is detected within a preset time, a prompt message may be output.
  • If a face image is detected within the preset time, the key feature area of the face image is located and the expression features that characterize the facial expression to be recognized are extracted from the key feature area. Because feature extraction and calculation are not performed on all areas of the face image, the amount of calculation can be reduced and the speed of facial expression recognition improved.
  • the key feature regions of the face image may include eyes, nose, mouth, eyebrows, etc.
  • The key feature areas of the face image, such as the eyes, nose, mouth, and eyebrows, can be located by integral projection. Since the eyes are the more prominent facial features, the eyes can be located first, and the other facial organs, such as the eyebrows, mouth, and nose, can then be located more accurately based on their relative positional distribution. For example, the key feature areas are located from the peaks or troughs produced under different integral projection directions.
  • the integral projection is divided into vertical projection and horizontal projection.
  • Let f(x, y) denote the gray value of the image at pixel (x, y). The horizontal integral projection M_h(y) and the vertical integral projection M_v(x) over the image regions [y1, y2] and [x1, x2] are expressed as:
  • M_h(y) = Σ_{x = x1}^{x2} f(x, y),  M_v(x) = Σ_{y = y1}^{y2} f(x, y).
  • That is, the horizontal integral projection accumulates the gray values of all pixels in each row, and the vertical integral projection accumulates the gray values of all pixels in each column, before the projection curves are plotted.
  • the eyebrows and eyes are the relatively black areas in the face image, which correspond to the first two minimum points on the horizontal integral projection curve.
  • the first minimum point corresponds to the position of the eyebrows on the vertical axis, denoted as y_brow;
  • the second minimum point corresponds to the position of the eyes on the vertical axis, denoted as y_eye;
  • the third minimum point corresponds to the position of the nose on the vertical axis, denoted as y_nose;
  • the fourth minimum point corresponds to the position of the mouth on the vertical axis, denoted as y_mouth.
  • The position of the mouth and nose on the horizontal axis is (x_left-eye + x_right-eye)/2, and the eye area, lips area, eyebrow area, and nose area can be determined according to the coordinates of these key features and preset rules.
  • For example, the eye area includes the area centered on the left-eye coordinates and extending 15 pixels to the left, 15 pixels to the right, 10 pixels up, and 10 pixels down, together with the corresponding area centered on the right-eye coordinates; a rough sketch of this integral-projection localization is given below.
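  • The following is a minimal, hypothetical sketch of the integral-projection localization described above. The use of scipy for local-minimum search, the minimum-search order, and the window sizes are illustrative assumptions.

```python
# Minimal sketch of integral-projection landmark localization (illustrative assumptions).
import numpy as np
from scipy.signal import argrelextrema  # local-minimum search; an illustrative choice

def integral_projections(gray):
    """gray: 2-D grayscale array. Returns M_h(y) (row sums) and M_v(x) (column sums)."""
    gray = gray.astype(np.float64)
    m_h = gray.sum(axis=1)  # horizontal integral projection: accumulate each row
    m_v = gray.sum(axis=0)  # vertical integral projection: accumulate each column
    return m_h, m_v

def locate_feature_rows(gray, order=5):
    """Return candidate vertical positions of eyebrows, eyes, nose and mouth,
    assumed to be the first four local minima of the horizontal projection."""
    m_h, _ = integral_projections(gray)
    minima = argrelextrema(m_h, np.less, order=order)[0]
    labels = ["y_brow", "y_eye", "y_nose", "y_mouth"]
    return dict(zip(labels, minima[:4].tolist()))

def eye_region(gray, eye_x, eye_y, half_w=15, half_h=10):
    """Rectangular eye area: 15 px left/right and 10 px up/down around the eye centre."""
    y0, y1 = max(eye_y - half_h, 0), eye_y + half_h + 1
    x0, x1 = max(eye_x - half_w, 0), eye_x + half_w + 1
    return gray[y0:y1, x0:x1]
```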
  • Human facial expressions may include, for example: facial movements when happy, in which the corners of the mouth are raised, the cheeks are lifted up, the eyelids contract, and "crow's feet" form at the outer corners of the eyes.
  • Facial features during contempt: one corner of the mouth is raised, making a sneer or a smug smile, and so on.
  • the facial expression features that characterize the facial expression can be extracted from the key feature region.
  • For example, a method based on differential energy maps (DEI) or a centralized binary pattern (CGBP) method can be used to extract the expression features that represent the facial expression from the key feature regions; a simplified stand-in sketch of texture-feature extraction is given below.
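  • As a stand-in for the DEI/CGBP methods named above (which are not specified in detail here), the following hypothetical sketch computes a simple local-binary-pattern style histogram over a key feature region; such a histogram can play the role of a texture feature vector.

```python
# Minimal sketch of a local-binary-pattern style texture descriptor (a stand-in,
# not the DEI/CGBP methods named in the description). Window handling is illustrative.
import numpy as np

def lbp_histogram(region):
    """region: 2-D grayscale array of a key feature area (e.g. the mouth box).
    Returns a normalized 256-bin histogram usable as a texture feature vector."""
    region = region.astype(np.int32)
    center = region[1:-1, 1:-1]
    codes = np.zeros_like(center)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = region.shape
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = region[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / max(hist.sum(), 1)  # normalized texture feature vector
```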
  • Step S14 Compare the extracted expression features with the expression features of each expression in the preset expression library to obtain the similarity probability between the facial expression to be recognized and each expression in the preset expression library, and use the expression with the greatest similarity probability in the preset expression library as the facial expression to be recognized.
  • The preset expression library may include a variety of basic expressions, such as happy, surprised, sad, angry, disgusted, and fearful, and a variety of composite expressions, such as sad-and-fearful, sad-and-surprised, angry-and-fearful, and so on.
  • the expression feature of the expression to be recognized may be a shape feature vector or a texture feature vector.
  • When the expression features in the preset expression library are shape feature vectors, the shape feature vector of the expression to be recognized is acquired for comparison;
  • when the expression features in the preset expression library are texture feature vectors, the texture feature vector of the expression to be recognized is acquired for comparison.
  • The similarity probability between the extracted expression feature (shape feature vector or texture feature vector) and each expression in the preset expression library can be determined by the following method: acquiring the distance value between the feature vector (shape feature vector or texture feature vector) of the expression to be recognized and the feature vector of each expression in the preset expression library; then determining, according to the distance value, the similarity probability between the facial expression to be recognized and each expression in the preset expression library.
  • For example, the shape feature vector of the facial expression to be recognized is acquired, the distance value between it and the shape feature vector of each expression in the preset expression library is calculated, and the similarity probability between the facial expression to be recognized and each expression in the preset expression library is determined according to the calculated distance value.
  • Alternatively, the texture feature vector of the facial expression to be recognized is acquired, the distance value between it and the texture feature vector of each expression in the preset expression library is calculated, and the similarity probability between the facial expression to be recognized and each expression in the preset expression library is determined according to the calculated distance value.
  • the distance value may be a generalized Mahalanobis distance.
  • the distance value between the feature vector of the expression to be recognized and the feature vector of each expression in the preset expression library can be calculated by the following formula:
  • d_M(y, x_j) = √[(y − x_j)^T · M · (y − x_j)], where:
  • y is the shape feature vector (or texture feature vector) of the facial expression to be recognized;
  • x_j is the shape feature vector (or texture feature vector) of the j-th expression in the preset expression library;
  • M is the preset target metric matrix;
  • j is an integer greater than or equal to 1;
  • d_M(y, x_j) is the distance value between the shape feature vector (or texture feature vector) of the facial expression to be recognized and that of the j-th expression in the preset expression library;
  • (y − x_j) is the difference between these two feature vectors, and (y − x_j)^T is the transpose of that difference.
  • The similarity probability can be calculated by the following formula: p = {1 + exp[D − b]}^(−1), where p is the similarity probability, D is the distance value, and b is the preset offset.
  • Finally, the expression in the preset expression library with the greatest similarity probability may be taken as the recognition result of the facial expression to be recognized; a numerical sketch of this step is given below.
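  • The following is a minimal, hypothetical sketch of the distance and similarity-probability computation described above. The metric matrix M, the offset b, the square root in the distance, and the sign convention p = 1/(1 + exp(D − b)) are reconstructions and assumptions rather than details confirmed by the text.

```python
# Minimal sketch: generalized Mahalanobis distance to each library expression,
# converted to a similarity probability, then argmax over the library.
import numpy as np

def mahalanobis_distance(y, x_j, M):
    """Assumed form d_M(y, x_j) = sqrt((y - x_j)^T M (y - x_j))."""
    diff = y - x_j
    return float(np.sqrt(diff @ M @ diff))

def similarity_probability(distance, b=0.0):
    """Assumed form p = {1 + exp[D - b]}^(-1): smaller distance -> higher probability."""
    return 1.0 / (1.0 + np.exp(distance - b))

def recognize(y, library, M, b=0.0):
    """library: dict mapping expression name -> feature vector. Returns the
    expression with the greatest similarity probability plus all probabilities."""
    probs = {
        name: similarity_probability(mahalanobis_distance(y, x_j, M), b)
        for name, x_j in library.items()
    }
    best = max(probs, key=probs.get)
    return best, probs

# Usage with toy 3-dimensional feature vectors and M = identity (Euclidean special case).
library = {"happy": np.array([1.0, 0.2, 0.1]), "sad": np.array([0.1, 0.9, 0.8])}
M = np.eye(3)
print(recognize(np.array([0.9, 0.3, 0.2]), library, M, b=1.0))
```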
  • Step S15 Control the computer device to output corresponding interactive content according to the recognition result of the facial expression to be recognized.
  • A plurality of mapping relationship tables between expressions and the interactive content output by the computer device may be established in advance, and the computer device can then be controlled, based on the expression recognition result, by looking up the mapping relationship table (see the sketch after the examples below).
  • the interactive content may be that the computer device provides corresponding actions, voices, pictures, texts, videos, etc. according to the facial expression recognition results to interact with the user, so as to relieve the user's tension and anxiety and delight the user's mood.
  • For example, when the facial expression to be recognized is determined to be a nervous expression, the computer device can be controlled to output soothing music to relieve the user's nervousness, or to output suggestions on ways to relieve tension (for example: try to breathe slowly and deeply to relieve tension) for the user's reference.
  • When the facial expression to be recognized is determined to be a sad expression, the computer device can be controlled to output articles, music, or videos that alleviate sadness, or to output suggestions on how to relieve sadness for the user's reference.
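  • A minimal, hypothetical sketch of such a mapping table follows, using the nervous/sad examples above; the content entries and data shapes are placeholders.

```python
# Minimal sketch of an expression -> interactive content lookup table (placeholders only).
from dataclasses import dataclass
from typing import List

@dataclass
class InteractiveContent:
    kind: str      # "music", "text", "video", "action", ...
    payload: str   # file path, suggestion text, etc.

EXPRESSION_CONTENT_MAP = {
    "nervous": [
        InteractiveContent("music", "soothing_track_01.mp3"),
        InteractiveContent("text", "Try to breathe slowly and deeply to relieve tension."),
    ],
    "sad": [
        InteractiveContent("music", "uplifting_track_03.mp3"),
        InteractiveContent("text", "Here is an article that may help ease sadness."),
    ],
}

def select_content(expression: str) -> List[InteractiveContent]:
    """Return the interactive content mapped to the recognized expression."""
    return EXPRESSION_CONTENT_MAP.get(expression, [])
```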
  • Step S16 Obtain feedback information after the interactive content is output, and continuously control the content output of the computer device according to the feedback information.
  • The feedback information may include voice information or expression information captured after the user watches the interactive content output by the computer device.
  • For example, when the user's current expression is recognized as nervous, the computer device is controlled to output soothing music to relieve the user's nervousness; depending on the feedback information, the control terminal may play the soothing music that was played at the previous moment again. As another example, when the soothing music has finished playing and the detected user expression is still nervous, the terminal may instead play a different piece of soothing music, or stop the music and output suggestions on how to relieve tension.
  • When the feedback information is voice information, the interactive content output by the computer device can be adjusted directly according to the requirements expressed in the voice information.
  • When the feedback information is the user's expression after watching the interactive content output by the computer device, it can also be judged whether the expression change between the expression recognized before watching and the expression recognized after watching the interactive content meets the preset adjustment rule. If it meets the preset adjustment rule, the interactive content output by the computer device is adjusted; if it does not, the interactive content output by the computer device is not adjusted.
  • For example, if the preset adjustment rule is an expression change from happy to sad, and the expression change between the expression recognized before watching and the expression recognized after watching the interactive content is instead from sad to happy, the rule is not met and no adjustment is made; a small sketch of this check follows.
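  • The following is a minimal, hypothetical sketch of checking whether an observed expression change matches a preset adjustment rule; the rule set is an illustrative placeholder.

```python
# Minimal sketch: preset adjustment rules as (before, after) expression pairs
# that should trigger an adjustment of the interactive content.
ADJUSTMENT_RULES = {("happy", "sad")}  # illustrative rule set

def should_adjust(expression_before: str, expression_after: str) -> bool:
    """Return True if the expression change meets a preset adjustment rule."""
    return (expression_before, expression_after) in ADJUSTMENT_RULES

# ("sad", "happy") does not match the rule above, so the content is left unchanged.
print(should_adjust("sad", "happy"))   # False -> do not adjust
print(should_adjust("happy", "sad"))   # True  -> adjust the interactive content
```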
  • The above-mentioned expression interaction method can recognize the user's expression and control the computer device to output corresponding interactive content according to the result of expression recognition, realizing functions such as alleviating the user's tension and anxiety and soothing the user's mood. At the same time, it can further analyze the user's expression after the interactive content is played and continuously control the interactive content output by the computer device according to the analysis result, making interaction with the computer device more vivid and interesting and improving the user experience.
  • FIG. 2 is a diagram of the functional modules of a preferred embodiment of the expression interaction device of this application.
  • the expression interaction device 10 may include a detection module 101, a judgment module 102, an extraction module 103, a comparison module 104, an output module 105, and a control module 106.
  • the detection module 101 is configured to receive an interaction request instruction, and pop up a detection box to perform face detection according to the interaction request instruction.
  • When an interaction request instruction from a user is received, the detection module 101 pops up a detection frame according to the interaction request instruction and performs face detection through the detection frame.
  • The user can input an interaction request instruction through a touch screen, through a button, or through voice.
  • the detection module 101 may implement face detection by establishing and training a convolutional neural network model in advance.
  • the detection module 101 can implement face detection in the following manner: a face sample database can be constructed first and a convolutional neural network model for face detection can be established.
  • The face sample database contains the face information of multiple persons; the face information of each person can cover multiple angles, and the face information of each angle can have multiple pictures.
  • The face images in the face sample database are input to the convolutional neural network model, which is trained starting from its default parameters; the initial weights, learning rate, number of iterations and other parameters are continuously adjusted according to the intermediate training results until the optimal network parameters are obtained.
  • The convolutional neural network model with the optimal network parameters is then used as the final recognition model.
  • the detection module 101 can use the finally obtained convolutional neural network model to perform face detection.
  • the judgment module 102 is used to judge whether a face image is detected.
  • the judgment module 102 may judge whether a face image is detected according to the output of the convolutional neural network model.
  • the extraction module 103 is used for locating the key feature area of the face image when the face image is detected, and extracting the expression features that characterize the facial expression to be recognized from the key feature area.
  • When the judgment module 102 judges that no face image is detected within a preset time, it may output a prompt message.
  • When a face image is detected, the extraction module 103 locates the key feature area of the face image and extracts, from the key feature area, the expression features that characterize the facial expression to be recognized.
  • Since feature extraction and calculation are not performed on all areas of the face image, the amount of calculation can be reduced and the speed of facial expression recognition improved.
  • the key feature regions of the face image may include eyes, nose, mouth, eyebrows, etc.
  • The extraction module 103 can locate the key feature regions of the face image, such as the eyes, nose, mouth, and eyebrows, by means of integral projection. Since the eyes are the more prominent facial features, the eyes can be located first, and the other facial organs, such as the eyebrows, mouth, and nose, can then be located more accurately based on their relative positional distribution. For example, the key feature areas are located from the peaks or troughs produced under different integral projection directions. The integral projection is divided into vertical projection and horizontal projection.
  • Let f(x, y) denote the gray value of the image at pixel (x, y). The horizontal integral projection M_h(y) and the vertical integral projection M_v(x) over the image regions [y1, y2] and [x1, x2] are expressed as:
  • M_h(y) = Σ_{x = x1}^{x2} f(x, y),  M_v(x) = Σ_{y = y1}^{y2} f(x, y).
  • That is, the horizontal integral projection accumulates the gray values of all pixels in each row, and the vertical integral projection accumulates the gray values of all pixels in each column, before the projection curves are plotted.
  • the eyebrows and eyes are the relatively black areas in the face image, which correspond to the first two minimum points on the horizontal integral projection curve.
  • the first minimum point corresponds to the position of the eyebrows on the vertical axis, denoted as y_brow;
  • the second minimum point corresponds to the position of the eyes on the vertical axis, denoted as y_eye;
  • the third minimum point corresponds to the position of the nose on the vertical axis, denoted as y_nose;
  • the fourth minimum point corresponds to the position of the mouth on the vertical axis, denoted as y_mouth.
  • The position of the mouth and nose on the horizontal axis is (x_left-eye + x_right-eye)/2, and the eye area, lips area, eyebrow area, and nose area can be determined according to the coordinates of these key features and preset rules.
  • For example, the eye area includes the area centered on the left-eye coordinates and extending 15 pixels to the left, 15 pixels to the right, 10 pixels up, and 10 pixels down, together with the corresponding area centered on the right-eye coordinates.
  • Human facial expressions may include, for example: facial movements when happy, in which the corners of the mouth are raised, the cheeks are lifted up, the eyelids contract, and "crow's feet" form at the outer corners of the eyes.
  • Facial features during contempt: one corner of the mouth is raised, making a sneer or a smug smile, and so on.
  • the facial expression features that characterize the facial expression can be extracted from the key feature region.
  • For example, a method based on differential energy maps (DEI) or a centralized binary pattern (CGBP) method can be used to extract the expression features that represent the facial expression from the key feature regions.
  • The comparison module 104 is configured to compare the extracted expression features with the expression features of each expression in the preset expression library to obtain the similarity probability between the facial expression to be recognized and each expression in the preset expression library, and to use the expression with the greatest similarity probability in the preset expression library as the facial expression to be recognized.
  • The preset expression library may include a variety of basic expressions, such as happy, surprised, sad, angry, disgusted, and fearful, and a variety of composite expressions, such as sad-and-fearful, sad-and-surprised, angry-and-fearful, and so on.
  • the expression feature of the expression to be recognized may be a shape feature vector or a texture feature vector.
  • When the expression features in the preset expression library are shape feature vectors, the shape feature vector of the expression to be recognized is acquired for comparison;
  • when the expression features in the preset expression library are texture feature vectors, the texture feature vector of the expression to be recognized is obtained for comparison.
  • The comparison module 104 may determine the similarity probability between the extracted expression feature (shape feature vector or texture feature vector) and each expression in the preset expression library in the following manner: acquire the distance value between the feature vector (shape feature vector or texture feature vector) of the expression to be recognized and the feature vector of each expression in the preset expression library, and then determine, according to the distance value, the similarity probability between the facial expression to be recognized and each expression in the preset expression library.
  • For example, the comparison module 104 obtains the shape feature vector of the facial expression to be recognized, calculates the distance value between it and the shape feature vector of each expression in the preset expression library, and determines the similarity probability between the facial expression to be recognized and each expression in the preset expression library according to the calculated distance value.
  • Alternatively, the comparison module 104 obtains the texture feature vector of the facial expression to be recognized, calculates the distance value between it and the texture feature vector of each expression in the preset expression library, and determines the similarity probability between the facial expression to be recognized and each expression in the preset expression library according to the calculated distance value.
  • the distance value may be a generalized Mahalanobis distance.
  • the comparison module 104 may calculate the distance value between the feature vector of the expression to be recognized and the feature vector of each expression in the preset expression library by using the following formula:
  • d_M(y, x_j) = √[(y − x_j)^T · M · (y − x_j)], where:
  • y is the shape feature vector (or texture feature vector) of the facial expression to be recognized;
  • x_j is the shape feature vector (or texture feature vector) of the j-th expression in the preset expression library;
  • M is the preset target metric matrix;
  • j is an integer greater than or equal to 1;
  • d_M(y, x_j) is the distance value between the shape feature vector (or texture feature vector) of the facial expression to be recognized and that of the j-th expression in the preset expression library;
  • (y − x_j) is the difference between these two feature vectors, and (y − x_j)^T is the transpose of that difference.
  • The similarity probability can be calculated by the following formula: p = {1 + exp[D − b]}^(−1), where p is the similarity probability, D is the distance value, and b is the preset offset.
  • The comparison module 104 may then take the expression in the preset expression library with the greatest similarity probability as the recognition result of the facial expression to be recognized.
  • the output module 105 is configured to control the computer device to output corresponding interactive content according to the recognition result of the facial expression to be recognized.
  • A plurality of mapping relationship tables between expressions and the interactive content output by the computer device may be established in advance, and the computer device can then be controlled, based on the expression recognition result, by looking up the mapping relationship table.
  • the interactive content may be that the computer device provides corresponding actions, voices, pictures, texts, videos, etc. according to the facial expression recognition results to interact with the user, so as to relieve the user's tension and anxiety and delight the user's mood.
  • When it is determined that the facial expression to be recognized is a nervous expression, the output module 105 can control the computer device to output soothing music to relieve the user's nervousness, or control the computer device to output suggestions on ways to relieve tension (for example: try to breathe slowly and deeply to relieve tension) for the user's reference; when it is determined that the facial expression to be recognized is a sad expression, the output module 105 can control the computer device to output articles, music, and videos that relieve sadness, or control the computer device to output suggestions on how to relieve sadness for the user's reference.
  • the control module 106 is configured to obtain feedback information after the interactive content is output, and continuously control the content output of the computer device according to the feedback information.
  • The feedback information may include voice information or expression information captured after the user watches the interactive content output by the computer device.
  • For example, when the user's current expression is recognized as nervous, the control module 106 controls the computer device to output soothing music to relieve the user's nervousness; depending on the feedback information, the control module 106 may control the terminal to play the soothing music that was played at the previous moment again.
  • As another example, when the soothing music has finished playing and the detected user expression is still nervous, the control module 106 can control the terminal to play a different piece of soothing music, or not to play soothing music and instead control the computer device to output suggestions on how to relieve tension to the user.
  • When the feedback information is voice information, the control module 106 can directly adjust the interactive content output by the computer device according to the requirements expressed in the voice information.
  • The control module 106 may also determine whether the expression change between the expression before viewing and the expression after viewing the interactive content meets the preset adjustment rule; if it does, the control module 106 adjusts the interactive content output by the computer device, and if it does not, the interactive content output by the computer device is not adjusted.
  • For example, if the preset adjustment rule is an expression change from happy to sad, and the expression change between the expression recognized before watching and the expression recognized after watching the interactive content is instead from sad to happy, the rule is not met and no adjustment is made.
  • The above-mentioned expression interaction device can recognize the user's expression and control the computer device to output corresponding interactive content according to the result of expression recognition, realizing functions such as alleviating the user's tension and anxiety and soothing the user's mood. At the same time, it can further analyze the user's expression after the interactive content is played and continuously control the interactive content output by the computer device according to the analysis result, making interaction with the computer device more vivid and interesting and improving the user experience.
  • FIG. 3 is a schematic diagram of a preferred embodiment of the computer equipment of this application.
  • the computer device 1 includes a memory 20, a processor 30, and computer-readable instructions 40 stored in the memory 20 and running on the processor 30, such as an expression interaction program.
  • When the processor 30 executes the computer-readable instructions 40, the steps in the embodiment of the above-mentioned expression interaction method are implemented, for example, steps S11 to S16 shown in FIG. 1.
  • Alternatively, when the processor 30 executes the computer-readable instructions 40, the functions of the modules in the above-mentioned expression interaction apparatus embodiment are implemented, for example, modules 101 to 106 in FIG. 2.
  • The computer-readable instructions 40 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 20 and executed by the processor 30 to complete this application.
  • The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 40 in the computer device 1.
  • the computer-readable instruction 40 may be divided into the detection module 101, the judgment module 102, the extraction module 103, the comparison module 104, the output module 105, and the control module 106 in FIG. 2. Refer to the second embodiment for the specific functions of each module.
  • the computer device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, a mobile phone, a tablet computer, and a cloud server.
  • The schematic diagram is only an example of the computer device 1 and does not constitute a limitation on the computer device 1; it may include more or fewer components than those shown in the figure, a combination of certain components, or different components. For example, the computer device 1 may also include input and output devices, network access devices, buses, and so on.
  • The processor 30 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • the general-purpose processor may be a microprocessor, or the processor 30 may also be any conventional processor, etc.
  • The processor 30 is the control center of the computer device 1 and uses various interfaces and lines to connect the various parts of the entire computer device 1.
  • The memory 20 may be used to store the computer-readable instructions 40 and/or modules/units; the processor 30 runs or executes the computer-readable instructions and/or modules/units stored in the memory 20 and calls the data stored in the memory 20 to realize the various functions of the computer device 1.
  • the memory 20 may mainly include a program storage area and a data storage area.
  • The program storage area may store an operating system and an application program required by at least one function (such as a sound playback function or an image playback function), among others; the data storage area stores data created in accordance with the use of the computer device 1 (such as audio data) and the like.
  • The memory 20 may include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • non-volatile memory such as a hard disk, a memory, a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), At least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device.
  • If the integrated modules/units of the computer device 1 are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the above-mentioned embodiments and methods of this application can also be completed by instructing the relevant hardware through computer-readable instructions.
  • The computer-readable instructions can be stored in a non-volatile readable storage medium, and when the computer-readable instructions are executed by the processor, the steps of the foregoing method embodiments can be implemented.
  • the computer-readable instruction includes computer-readable instruction code, and the computer-readable instruction code may be in the form of source code, object code, executable file, or some intermediate form.
  • The computer-readable medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), and the like.
  • the functional units in the various embodiments of the present application may be integrated in the same processing unit, or each unit may exist alone physically, or two or more units may be integrated in the same unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an expression interaction method comprising: receiving an interaction request instruction, and popping up a detection frame according to the interaction request instruction to perform face detection; locating a key feature area of a face image, and extracting, from the key feature area, an expression feature characterizing a facial expression to be recognized; comparing the extracted expression feature with the expression feature of each expression in a preset expression library, and taking the expression with the greatest similarity probability in the preset expression library as the facial expression to be recognized; controlling, according to the expression recognition result, a terminal device to output corresponding interactive content; and acquiring feedback information after the interactive content has been output, and continuously controlling the output of the interactive content according to the feedback information. The present invention further relates to an expression interaction apparatus, a computer device, and a non-volatile readable storage medium. The present invention relates to the field of facial recognition technology and enables more vivid and interesting interaction with a terminal device, thereby improving the user experience.
PCT/CN2019/103370 2019-06-05 2019-08-29 Procédé et appareil d'interaction d'expressions, dispositif informatique, et support d'informations lisible WO2020244074A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910487847.XA CN110363079A (zh) 2019-06-05 2019-06-05 表情交互方法、装置、计算机装置及计算机可读存储介质
CN201910487847.X 2019-06-05

Publications (1)

Publication Number Publication Date
WO2020244074A1 true WO2020244074A1 (fr) 2020-12-10

Family

ID=68215622

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103370 WO2020244074A1 (fr) 2019-06-05 2019-08-29 Procédé et appareil d'interaction d'expressions, dispositif informatique, et support d'informations lisible

Country Status (2)

Country Link
CN (1) CN110363079A (fr)
WO (1) WO2020244074A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269145A (zh) * 2021-06-22 2021-08-17 中国平安人寿保险股份有限公司 表情识别模型的训练方法、装置、设备及存储介质
CN113723299A (zh) * 2021-08-31 2021-11-30 上海明略人工智能(集团)有限公司 会议质量评分方法、系统和计算机可读存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110764618A (zh) * 2019-10-25 2020-02-07 郑子龙 一种仿生交互系统、方法及相应的生成系统和方法
CN111507149B (zh) * 2020-01-03 2023-10-27 京东方艺云(杭州)科技有限公司 基于表情识别的交互方法、装置和设备
CN111638784B (zh) * 2020-05-26 2023-07-18 浙江商汤科技开发有限公司 人脸表情互动方法、互动装置以及计算机存储介质
CN112381019B (zh) * 2020-11-19 2021-11-09 平安科技(深圳)有限公司 复合表情识别方法、装置、终端设备及存储介质
CN112530543B (zh) * 2021-01-27 2021-11-02 张强 药品管理系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446753A (zh) * 2015-08-06 2017-02-22 南京普爱医疗设备股份有限公司 一种消极表情识别鼓励系统
KR20190008036A (ko) * 2017-07-14 2019-01-23 한국생산기술연구원 안드로이드 로봇의 얼굴 표정 생성 시스템 및 방법
CN109819100A (zh) * 2018-12-13 2019-05-28 平安科技(深圳)有限公司 手机控制方法、装置、计算机装置及计算机可读存储介质

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446753A (zh) * 2015-08-06 2017-02-22 南京普爱医疗设备股份有限公司 一种消极表情识别鼓励系统
KR20190008036A (ko) * 2017-07-14 2019-01-23 한국생산기술연구원 안드로이드 로봇의 얼굴 표정 생성 시스템 및 방법
CN109819100A (zh) * 2018-12-13 2019-05-28 平安科技(深圳)有限公司 手机控制方法、装置、计算机装置及计算机可读存储介质

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269145A (zh) * 2021-06-22 2021-08-17 中国平安人寿保险股份有限公司 表情识别模型的训练方法、装置、设备及存储介质
CN113269145B (zh) * 2021-06-22 2023-07-25 中国平安人寿保险股份有限公司 表情识别模型的训练方法、装置、设备及存储介质
CN113723299A (zh) * 2021-08-31 2021-11-30 上海明略人工智能(集团)有限公司 会议质量评分方法、系统和计算机可读存储介质

Also Published As

Publication number Publication date
CN110363079A (zh) 2019-10-22

Similar Documents

Publication Publication Date Title
WO2020244074A1 (fr) Procédé et appareil d'interaction d'expressions, dispositif informatique, et support d'informations lisible
CN109461167B (zh) 图像处理模型的训练方法、抠图方法、装置、介质及终端
JP7110502B2 (ja) 深度を利用した映像背景減算法
WO2021078157A1 (fr) Procédé et appareil de traitement d'image, dispositif électronique et support de stockage
US11703949B2 (en) Directional assistance for centering a face in a camera field of view
US10599914B2 (en) Method and apparatus for human face image processing
WO2020078119A1 (fr) Procédé, dispositif et système de simulation d'utilisateur portant des vêtements et des accessoires
WO2021083125A1 (fr) Procédé de commande d'appel et produit associé
WO2020244160A1 (fr) Procédé et appareil de commande d'équipement terminal, dispositif informatique, et support de stockage lisible
WO2019114464A1 (fr) Procédé et dispositif de réalité augmentée
KR102045575B1 (ko) 스마트 미러 디스플레이 장치
US11409794B2 (en) Image deformation control method and device and hardware device
CN110531853B (zh) 一种基于人眼注视点检测的电子书阅读器控制方法及系统
WO2020151156A1 (fr) Procédé et système d'évaluation de risque d'avant vente, appareil informatique et support de stockage lisible
US20200410269A1 (en) Liveness detection method and apparatus, and storage medium
WO2021169736A1 (fr) Procédé et dispositif de traitement de beauté
US11216648B2 (en) Method and device for facial image recognition
WO2021208767A1 (fr) Procédé et appareil de correction de contour facial, et dispositif et support d'enregistrement
WO2023197648A1 (fr) Procédé et appareil de traitement de capture d'écran, dispositif électronique et support lisible par ordinateur
CN108021905A (zh) 图片处理方法、装置、终端设备及存储介质
CN109819100A (zh) 手机控制方法、装置、计算机装置及计算机可读存储介质
CN114779922A (zh) 教学设备的控制方法、控制设备、教学系统和存储介质
WO2024160105A1 (fr) Procédé et appareil d'interaction, et dispositif électronique et support d'enregistrement
CN112149599B (zh) 表情追踪方法、装置、存储介质和电子设备
WO2024055957A1 (fr) Procédé et appareil d'ajustement de paramètres photographiques, dispositif électronique et support de stockage lisible

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19931688

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19931688

Country of ref document: EP

Kind code of ref document: A1