CN111860451A - Game interaction method based on facial expression recognition - Google Patents
Game interaction method based on facial expression recognition
- Publication number
- CN111860451A (application number CN202010766945.XA)
- Authority
- CN
- China
- Prior art keywords
- game
- expression
- player
- facial expression
- expressions
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 230000008921 facial expression Effects 0.000 title claims abstract description 49
- 230000003993 interaction Effects 0.000 title claims abstract description 29
- 238000000034 method Methods 0.000 title claims abstract description 27
- 230000014509 gene expression Effects 0.000 claims abstract description 84
- 238000012549 training Methods 0.000 claims abstract description 17
- 238000007781 pre-processing Methods 0.000 claims abstract description 13
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 6
- 230000001815 facial effect Effects 0.000 claims abstract description 5
- 238000006243 chemical reaction Methods 0.000 claims abstract description 4
- 230000003068 static effect Effects 0.000 claims abstract description 4
- 230000000007 visual effect Effects 0.000 claims abstract description 4
- 238000010606 normalization Methods 0.000 claims description 12
- 238000012937 correction Methods 0.000 claims description 9
- 238000001514 detection method Methods 0.000 claims description 9
- 230000011218 segmentation Effects 0.000 claims description 9
- 238000012795 verification Methods 0.000 claims description 9
- 230000033001 locomotion Effects 0.000 claims description 6
- 238000005286 illumination Methods 0.000 claims description 4
- 238000012545 processing Methods 0.000 claims description 4
- 238000000605 extraction Methods 0.000 claims description 3
- 210000000056 organ Anatomy 0.000 claims description 3
- 230000009466 transformation Effects 0.000 claims description 3
- 230000009471 action Effects 0.000 claims description 2
- 238000010195 expression analysis Methods 0.000 abstract description 2
- 238000005516 engineering process Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 230000004913 activation Effects 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 210000003128 head Anatomy 0.000 description 2
- 230000002452 interceptive effect Effects 0.000 description 2
- 238000011176 pooling Methods 0.000 description 2
- 230000003044 adaptive effect Effects 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000008909 emotion recognition Effects 0.000 description 1
- 238000010191 image analysis Methods 0.000 description 1
- 230000009191 jumping Effects 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 230000001960 triggered effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/20—Input arrangements for video game devices
- A63F13/21—Input arrangements for video game devices characterised by their sensors, purposes or types
- A63F13/213—Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/40—Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment
- A63F13/42—Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment by mapping the input signals into game commands, e.g. mapping the displacement of a stylus on a touch screen to the steering angle of a virtual vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2148—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/10—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
- A63F2300/1087—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals comprising photodetecting means, e.g. a camera
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/60—Methods for processing data by generating or executing the game program
- A63F2300/6045—Methods for processing data by generating or executing the game program for mapping control signals received from the input arrangement into game commands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/467—Encoded features or binary features, e.g. local binary patterns [LBP]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention relates to a game interaction method based on facial expression recognition, which comprises the following steps: (1) extracting visual features by using a convolutional neural network to learn static images of different expression classes, and determining the relationship between facial expression transitions in an image sequence and the basic facial expressions, to obtain a training model; (2) collecting video of the player to be detected and capturing images from it frame by frame; (3) preprocessing the video images to generate preprocessed images; (4) analyzing the facial expression, matching it against the expression features in the training model, and determining the player's current expression; (5) controlling the game character through the current facial expression. The invention captures only the player's face with a camera, performs expression analysis and recognition on the computer, and converts the result into game control instructions, thereby controlling the in-game character; it can also directly carry out, or assist the player in carrying out, dialogue with other characters, expanding the traditional game interaction modes.
Description
Technical Field
The invention relates to the technical field of facial expression image analysis and recognition, in particular to a game interaction method based on facial expression recognition.
Background
Facial expressions can be regarded as a universal language that crosses national, ethnic and gender boundaries. Facial expression recognition is widely applied in robotics, medical care, driver-fatigue detection and human-computer interaction systems. As early as the 20th century, Ekman and Friesen defined six basic expressions through cross-cultural research: anger, fear, disgust, happiness, sadness and surprise, to which the expression of contempt was added later. Their pioneering work and intuitive definitions keep this model popular in automatic facial expression analysis (AFEA). According to the feature representation, the inputs processed by a facial expression recognition system can be divided into two types: pictures and videos. Thanks to the development of deep learning and the emergence of the more challenging FER2013 dataset, more and more researchers are applying deep-learning techniques to facial expression recognition.
In recent years, with advances in computer technology, the digital entertainment industry represented by computer games has developed rapidly. As a special kind of application software, a computer game realizes interactive operation between the user and the game by providing a series of menu options and operation instructions. The traditional human-computer interaction modes for games are the mouse, keyboard, joystick and dedicated game devices. However, as game genres and content have developed, these modes can no longer meet the demand for stronger human-computer interaction, and applying facial expression recognition technology to games is an inevitable trend.
Disclosure of Invention
The invention aims to provide a game interaction method based on facial expression recognition, in which the player's facial expressions are used to control the character in the game.
In order to achieve the purpose, the invention adopts the following technical scheme:
a game interaction method based on facial expression recognition comprises the following steps:
(1) extracting visual features by using a convolutional neural network to learn static images of different expression classes, and determining the relationship between facial expression transitions in an image sequence and the basic facial expressions, to obtain a training model, wherein the basic facial expressions comprise anger, fear, disgust, joy, sadness, surprise, contempt and negation (head shaking);
(2) collecting video of the player to be detected through cameras and capturing images from it frame by frame, wherein the cameras comprise a high-definition camera and an infrared camera, kept 50-80 cm from the face and placed at an included angle of 140-180 degrees;
(3) preprocessing the video images to generate preprocessed images, which includes locating and extracting the facial organ features, texture regions and other predefined feature points, and locating the player's face region through these feature points; the preprocessing specifically comprises: preprocessing the video images of the player to be detected, extracting key frames, normalizing the acquired key frames, and performing face detection and feature extraction;
(4) analyzing the facial expression, matching it against the expression features in the training model, and determining the player's current expression;
(5) controlling the game role through the current facial expression;
(51) displaying characters and images of the game scenario on a screen through the game window according to the game scenario;
(52) in the interaction between a player and an NPC, comparing the system prompt expression with the expression made by the player, and triggering a preset scenario through expression error correction verification;
(53) when a branch is to be selected, the system prompts the player on the game window to make one of the basic expressions;
(54) when the player makes one of the basic expressions and it passes the expression error-correction verification, the system triggers the game scenario branch corresponding to that basic expression and outputs the corresponding branch scenario to the game window;
(55) in specific plots, different expressions control the various movement modes of the in-game character, with the basic expressions controlling the character's actions; the basic expressions comprise joy, anger, sadness and surprise, where joy moves the character forward, anger makes it jump, sadness makes it squat and surprise makes it slide, and the duration of the corresponding movement mode can be controlled by how long the player holds the expression.
Further, the normalization processing specifically includes:
(A) performing illumination normalization on the image by using threshold segmentation histogram equalization, and eliminating gray level difference and noise of edge pixels of the segmentation part by feathering;
(B) training a human-eye region detector as an adaptive boosting (Adaboost) cascade detector, finding the coordinates of the eye center as the center of the horizontal rotation of an affine transformation, and finally obtaining the warped face image to realize pose normalization;
(C) the face alignment is realized by aligning the coordinates of the center points of the two eyes among different images, so that the normalization of the scale is realized;
(D) cutting out the local face region with the ERT feature-point segmentation algorithm, completing the preliminary preprocessing of the image.
In this scheme, the feature extraction uses the uniform-pattern LBP to extract the texture features of the face, then detects and marks the target, builds a cascade table from the training result of a Haar-feature classifier, and passes the picture to be detected together with the cascade table to the target detection algorithm to obtain the set of detected faces.
The expression error-correction verification comprises the following steps: the player-expression frames captured within several seconds after the system prompt are recognized, the expression accounting for the largest proportion is computed, and that expression is taken as the recognition result; the result is then shown to the user, and the system checks whether the user makes a negation expression. If so, the detection was wrong, the system prompt of steps (52) and (54) is repeated, detection is performed again, and the previously mis-recognized expression is excluded when matching the expressions in the captured frames.
According to the above technical scheme, the game interaction method based on facial expression recognition uses the player's facial expressions to control the game character: the player's facial expression information supplements the traditional keyboard-and-mouse interaction modes and so enriches human-computer interaction. Only the player's face is captured by the camera; expression analysis and recognition are performed on the computer, and the result is converted into game control instructions, so the in-game character is controlled and the player's dialogue with other characters can be carried out directly or with assistance, expanding the traditional game interaction modes. Since games have strict real-time requirements, the video detection method must be real-time and robust, and such a control method must also be easy to implement and operate so that it is convenient for users. The invention lets game users interact in a new, natural and intelligent way, making games more interactive and immersive. With the development of computer vision technology, natural vision-based human-computer interaction has become possible, and since cameras are now a common computer configuration, this technology has broad application prospects.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a basic flow diagram of the model training of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
as shown in fig. 1, the method for controlling a game character based on facial expressions of this embodiment specifically includes the following steps:
step 1: the method comprises the steps of extracting visual features by learning a large number of static images with different types of expressions through a Convolutional Neural Network (CNN), determining the relationship between the conversion of facial expressions and inspiration, fear, disgust, happiness, sadness, surprise, slight vision and negation (shaking head) in an image sequence, and obtaining a training model, so that the system can accurately recognize the facial expressions;
step 2: video information of a player to be detected is collected through a camera, and image interception is carried out on the video information according to frames. The two cameras, namely the high-definition camera and the infrared camera, are used together, the distance of 50-80cm is kept between the cameras and the face, and the cameras are arranged at an included angle of 140-180 degrees, so that a good recognition effect is generated in adverse environments such as backlight, poor light and the like;
step 3: preprocessing the video image to generate a preprocessed image; to facilitate emotion recognition, the facial organ features, texture regions and other predefined feature points need to be located and extracted, and the player's face region is located through these feature points;
step 4: analyzing the facial expression, matching it against the expression features in the training model, and determining the player's current expression;
When a choice is to be made, the system prompts the player to make an expression; a camera view appears in the lower-left corner of the screen, the face is framed, and real-time expression detection is performed. When the tester makes an angry expression that the program recognizes, the system selects the first option, which corresponds to anger, and jumps to the corresponding game scenario; when the tester makes a happy expression that the program recognizes, the second option, which corresponds to happiness, is selected and the game jumps to the corresponding scenario.
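A rough Python sketch of this branch-selection flow is given below; the option table, scenario_branches and jump_to_scenario are hypothetical placeholders, since the description above only fixes the expression-to-option idea.

```python
# Map a recognized expression to a scenario option and jump to that branch.
BRANCH_OPTIONS = {
    'anger': 0,  # first option, as in the example above
    'joy': 1,    # second option
}

def select_branch(recognized_expression, scenario_branches, jump_to_scenario):
    """Return True if the expression selected a branch, False to keep prompting."""
    index = BRANCH_OPTIONS.get(recognized_expression)
    if index is None or index >= len(scenario_branches):
        return False  # no branch mapped to this expression
    jump_to_scenario(scenario_branches[index])
    return True
```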
step 5: using the obtained facial expression information to control the game character, comprising:
step 5.1: displaying characters and images of a game scenario on a screen according to a preset game scenario through a game window;
step 5.2: in the interaction between the player and an NPC (non-player character), the system prompts the player to make a confused expression; when the player makes the confused expression and it is recognized by the system, a predetermined scenario can be triggered after the expression error-correction verification;
step 5.3: when a branch is to be selected in the game, the system prompts the player on the game window to make one of the angry, fear, disgust, joy, sadness, surprise or contempt expressions;
step 5.4: when the player makes one of these expressions and it passes the expression error-correction verification, the system triggers the game plot branch corresponding to anger, fear, disgust, joy, sadness, surprise or contempt according to the expression made, and outputs the corresponding branch plot to the game window;
step 5.5: in specific plots, different expressions control the various movement modes of the in-game character: a happy expression moves the character forward, anger makes it jump, sadness makes it squat and surprise makes it slide, and the duration of the corresponding movement mode is controlled by how long the player holds the expression.
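One possible reading of step 5.5 in code, as a minimal sketch only: each recognized expression maps to a character action, and issuing the action once per captured frame makes its duration track how long the player holds the expression. The character-control method names are hypothetical placeholders.

```python
# Hypothetical expression-to-action table for step 5.5.
EXPRESSION_ACTIONS = {
    'joy': 'move_forward',
    'anger': 'jump',
    'sadness': 'crouch',
    'surprise': 'slide',
}

def drive_character(character, expression_stream):
    """expression_stream yields the expression recognized in each captured frame (or None)."""
    for expression in expression_stream:
        action = EXPRESSION_ACTIONS.get(expression)
        if action is not None:
            # repeating the action once per frame makes its duration follow
            # the duration of the player's expression
            getattr(character, action)()
```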
In the above method, the model training specifically includes:
the convolutional neural network uses a residual structure and a depth separable convolutional structure consisting of a depth convolution and a point-by-point convolution, the main purpose of these layers being to separate the spatial cross-correlation from the channel cross-correlation. The method comprises the steps of passing two layers of 8x8 convolutional layers for an input image, wherein convolution kernels are 3x3 and have a step size of 1x1, sequentially passing through residual convolutional layers of 16x16,16x16,32x32,32x32,64x64,64x64,128x128 and 128x128, wherein each residual convolutional layer consists of two separable convolutional layers with convolution kernels of 3x3, one residual block with convolution kernels of 1x1 and having a step size of 2x2, and one maximum pooling layer with convolution kernels of 3x3 and having a step size of 2x2, and all the convolutional layers use a linear rectification function (relu) as an activation function of the convolutional layers. Finally, a Softmax function is used as an activation function of the full link layer through a global average pooling layer and the full link layer. Wherein the classification residual module modifies a desired mapping between two subsequent layers in order to learn the difference of the original features and the desired features. Thus, the desired feature h (x) is modified to solve the easier learning problem f (x) such that: h (x) ═ f (x) + x. The basic flow of model training is shown in fig. 2.
The image preprocessing specifically comprises:
The video images of the player to be detected are preprocessed and key frames are extracted; the acquired key frames are then normalized, and face detection and feature extraction are performed. During normalization, to overcome the influence of complex real-world illumination on the recognition effect, illumination normalization is performed on the image using threshold-segmented histogram equalization, and gray-level differences and noise at the edge pixels of the segmented region are removed by feathering. An eye-region detector is then trained as an adaptive boosting (Adaboost) cascade detector; the coordinates of the eye centers are found and used as the center of the horizontal rotation of an affine transformation, finally yielding a warped face image and achieving pose normalization. Face alignment is then achieved by aligning the coordinates of the two eye centers across different images, which also realizes scale normalization. Finally, to avoid interference from a complex environment, an ERT feature-point segmentation algorithm is used to crop out the local face region, completing the preliminary preprocessing of the image. In the feature extraction stage, to recognize expressions correctly, an improved uniform-pattern LBP is selected to extract the texture features of the face; the target is then detected and marked, a cascade table is built from the training result of a Haar-feature classifier, and the picture to be detected is passed together with the cascade table to the target detection algorithm to obtain the set of detected faces.
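A rough OpenCV/scikit-image sketch of this preprocessing chain, with several assumptions: the Haar cascade files bundled with OpenCV stand in for the Adaboost-trained detectors, a simple eye-level rotation replaces the full affine warp, the ERT landmark cropping step is omitted, and the LBP parameters (8 neighbours, radius 1) and the 48x48 output size are not given above.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye.xml')

def preprocess_frame(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # illumination normalization via histogram equalization
    gray = cv2.equalizeHist(gray)
    # face detection with the (Adaboost-trained) Haar cascade
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    face = gray[y:y + h, x:x + w]
    # pose normalization: rotate so that the two eye centers are level
    eyes = eye_cascade.detectMultiScale(face)
    if len(eyes) >= 2:
        (ex1, ey1, ew1, eh1), (ex2, ey2, ew2, eh2) = sorted(eyes[:2], key=lambda e: e[0])
        left = (ex1 + ew1 / 2.0, ey1 + eh1 / 2.0)
        right = (ex2 + ew2 / 2.0, ey2 + eh2 / 2.0)
        angle = np.degrees(np.arctan2(right[1] - left[1], right[0] - left[0]))
        center = ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)
        rot = cv2.getRotationMatrix2D(center, angle, 1.0)
        face = cv2.warpAffine(face, rot, (face.shape[1], face.shape[0]))
    # scale normalization
    face = cv2.resize(face, (48, 48))
    # uniform-pattern LBP texture features
    lbp = local_binary_pattern(face, P=8, R=1, method='uniform')
    return face, lbp
```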
The expression error-correction verification mechanism comprises:
The player-expression frames captured frame by frame within 2 s after the system prompt are recognized, the expression accounting for the largest proportion is computed, and that expression is taken as the recognition result. The recognition result is then shown to the user and the system checks whether the user makes a negation expression (head shake); if so, the detection was wrong, the system prompt of steps 5.2 and 5.4 is repeated, detection is performed again, and the previously mis-recognized expression is excluded when matching the expressions in the captured frames.
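A small sketch of this error-correction check: frames captured for roughly 2 s after the prompt are classified, the most frequent label wins, and a recognized negation (head shake) forces a re-prompt. The grab_frames, classify_expression, confirm_with_player and prompt_again callables are hypothetical placeholders supplied by the surrounding system.

```python
from collections import Counter

def recognize_with_correction(grab_frames, classify_expression, confirm_with_player,
                              prompt_again, window_s=2.0):
    """Majority vote over ~2 s of frames, with the negation-based retry described above."""
    while True:
        labels = [classify_expression(f) for f in grab_frames(window_s)]
        labels = [label for label in labels if label is not None]
        if not labels:
            prompt_again()
            continue
        # expression accounting for the largest share of the captured frames
        result = Counter(labels).most_common(1)[0][0]
        # show the result; a negation (head shake) response means the detection was wrong
        if confirm_with_player(result) == 'negation':
            prompt_again()
            continue
        return result
```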
The above-mentioned embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements of the technical solution of the present invention by those skilled in the art should fall within the protection scope defined by the claims of the present invention without departing from the spirit of the present invention.
Claims (10)
1. A game interaction method based on facial expression recognition is characterized by comprising the following steps:
(1) extracting visual features by learning static images with different types of expressions by using a convolutional neural network, and determining the relationship between the conversion of facial expressions in an image sequence and facial basic expressions to obtain a training model;
(2) collecting video information of a player to be detected, and intercepting the video information according to frames;
(3) preprocessing a video image to generate a preprocessed image;
(4) analyzing the facial expression, matching it against the expression features in the training model, and determining the player's current expression;
(5) and controlling the game role through the current facial expression.
2. The game interaction method based on facial expression recognition of claim 1, wherein: in the step (1), the basic facial expressions comprise anger, fear, disgust, joy, sadness, surprise, contempt and negation.
3. The game interaction method based on facial expression recognition of claim 1, wherein: in the step (2), video information of a player to be detected is acquired through a camera, wherein the camera comprises a high-definition camera and an infrared camera; the camera and the face keep a distance of 50-80cm and are placed at an included angle of 140-180 degrees.
4. The game interaction method based on facial expression recognition of claim 1, wherein: in the step (3), the video image is preprocessed to generate a preprocessed image, wherein the preprocessed image comprises the steps of positioning and extracting the organ characteristics and the texture area of the face and other predefined characteristic points, and the preprocessed image is positioned to the face area of the player through the characteristic points.
5. The game interaction method based on facial expression recognition of claim 1, wherein: in the step (5), the game role is controlled by the current facial expression, and the method comprises the following steps:
(51) displaying characters and images of the game scenario on a screen through the game window according to the game scenario;
(52) in the interaction between a player and an NPC, comparing the system prompt expression with the expression made by the player, and triggering a preset scenario through expression error correction verification;
(53) when a branch is to be selected, the system prompts the player on the game window to make one of the basic expressions;
(54) when the player makes one of the basic expressions and it passes the expression error-correction verification, the system triggers the game scenario branch corresponding to that basic expression and outputs the corresponding branch scenario to the game window;
(55) in a specific plot, different expressions are used for controlling various movement modes of the characters in the game, and the basic expressions are used for controlling the actions of the characters.
6. The game interaction method based on facial expression recognition of claim 5, wherein: in the step (55), the basic expressions comprise joy, anger, sadness and surprise, wherein the joy controls the character to move forwards, the anger controls the character to jump, the sadness controls the character to squat and the surprise controls the sliding, and the duration of the movement mode corresponding to the role in the game can be controlled according to the duration of the expression made by the player.
7. The game interaction method based on facial expression recognition of claim 1, wherein: in the step (3), the preprocessing the video image specifically includes:
preprocessing a video image of a player to be detected, extracting a key frame, then normalizing the acquired video key frame, and detecting a human face and extracting characteristics.
8. The game interaction method based on facial expression recognition, according to claim 7, is characterized in that: the normalization processing specifically includes:
(A) performing illumination normalization on the image by using threshold segmentation histogram equalization, and eliminating gray level difference and noise of edge pixels of the segmentation part by feathering;
(B) training a human eye region through a self-adaptive enhancement algorithm cascade detector, finding out coordinates of a central point of human eyes as a central position of horizontal rotation of affine transformation, and finally obtaining a distorted human face image to realize posture normalization;
(C) the face alignment is realized by aligning the coordinates of the center points of the two eyes among different images, so that the normalization of the scale is realized;
(D) cutting out the local face region with the ERT feature-point segmentation algorithm, completing the preliminary preprocessing of the image.
9. The game interaction method based on facial expression recognition of claim 7, wherein: the feature extraction extracts the texture features of the face using the uniform-pattern LBP, then detects and marks the target, builds a cascade table from the training result of a Haar-feature classifier, and passes the picture to be detected together with the cascade table to the target detection algorithm to obtain the set of detected faces.
10. The game interaction method based on facial expression recognition, as claimed in claim 5, wherein the facial expression error correction verification comprises:
when the player expression pictures intercepted by frames within seconds after the system is prompted are identified, the expression with the largest proportion is calculated, and the expression is used as an identification result;
and then displaying the expression recognition result to the user and detecting whether the user makes a negation expression; if so, the detection was wrong, the system prompt of steps (52) and (54) is repeated, detection is performed again, and the previously mis-recognized expression is excluded when the expressions corresponding to the pictures are matched.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010766945.XA CN111860451A (en) | 2020-08-03 | 2020-08-03 | Game interaction method based on facial expression recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010766945.XA CN111860451A (en) | 2020-08-03 | 2020-08-03 | Game interaction method based on facial expression recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111860451A true CN111860451A (en) | 2020-10-30 |
Family
ID=72952852
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010766945.XA Pending CN111860451A (en) | 2020-08-03 | 2020-08-03 | Game interaction method based on facial expression recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111860451A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112347941A (en) * | 2020-11-09 | 2021-02-09 | 南京紫金体育产业股份有限公司 | Motion video collection intelligent generation and distribution method based on 5G MEC |
CN112684889A (en) * | 2020-12-29 | 2021-04-20 | 上海掌门科技有限公司 | User interaction method and device |
CN113908553A (en) * | 2021-11-22 | 2022-01-11 | 广州简悦信息科技有限公司 | Game character expression generation method and device, electronic equipment and storage medium |
CN115068940A (en) * | 2021-03-10 | 2022-09-20 | 腾讯科技(深圳)有限公司 | Control method of virtual object in virtual scene, computer device and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101393599A (en) * | 2007-09-19 | 2009-03-25 | 中国科学院自动化研究所 | Game role control method based on human face expression |
CN101944163A (en) * | 2010-09-25 | 2011-01-12 | 德信互动科技(北京)有限公司 | Method for realizing expression synchronization of game character through capturing face expression |
CN104123562A (en) * | 2014-07-10 | 2014-10-29 | 华东师范大学 | Human body face expression identification method and device based on binocular vision |
CN104123545A (en) * | 2014-07-24 | 2014-10-29 | 江苏大学 | Real-time expression feature extraction and identification method |
CN105308625A (en) * | 2013-06-28 | 2016-02-03 | 高通股份有限公司 | Deformable expression detector |
CN105303200A (en) * | 2014-09-22 | 2016-02-03 | 电子科技大学 | Human face identification method for handheld device |
CN106325501A (en) * | 2016-08-10 | 2017-01-11 | 合肥泰壤信息科技有限公司 | Game control method and system based on facial expression recognition technology |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
CN108108677A (en) * | 2017-12-12 | 2018-06-01 | 重庆邮电大学 | One kind is based on improved CNN facial expression recognizing methods |
CN109344693A (en) * | 2018-08-13 | 2019-02-15 | 华南理工大学 | A kind of face multizone fusion expression recognition method based on deep learning |
CN109766759A (en) * | 2018-12-12 | 2019-05-17 | 成都云天励飞技术有限公司 | Emotion identification method and Related product |
- 2020-08-03: Application CN202010766945.XA filed (CN); published as CN111860451A, status: Pending
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101393599A (en) * | 2007-09-19 | 2009-03-25 | 中国科学院自动化研究所 | Game role control method based on human face expression |
CN101944163A (en) * | 2010-09-25 | 2011-01-12 | 德信互动科技(北京)有限公司 | Method for realizing expression synchronization of game character through capturing face expression |
CN105308625A (en) * | 2013-06-28 | 2016-02-03 | 高通股份有限公司 | Deformable expression detector |
CN104123562A (en) * | 2014-07-10 | 2014-10-29 | 华东师范大学 | Human body face expression identification method and device based on binocular vision |
CN104123545A (en) * | 2014-07-24 | 2014-10-29 | 江苏大学 | Real-time expression feature extraction and identification method |
CN105303200A (en) * | 2014-09-22 | 2016-02-03 | 电子科技大学 | Human face identification method for handheld device |
CN106325501A (en) * | 2016-08-10 | 2017-01-11 | 合肥泰壤信息科技有限公司 | Game control method and system based on facial expression recognition technology |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
CN108108677A (en) * | 2017-12-12 | 2018-06-01 | 重庆邮电大学 | One kind is based on improved CNN facial expression recognizing methods |
CN109344693A (en) * | 2018-08-13 | 2019-02-15 | 华南理工大学 | A kind of face multizone fusion expression recognition method based on deep learning |
CN109766759A (en) * | 2018-12-12 | 2019-05-17 | 成都云天励飞技术有限公司 | Emotion identification method and Related product |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112347941A (en) * | 2020-11-09 | 2021-02-09 | 南京紫金体育产业股份有限公司 | Motion video collection intelligent generation and distribution method based on 5G MEC |
CN112684889A (en) * | 2020-12-29 | 2021-04-20 | 上海掌门科技有限公司 | User interaction method and device |
CN115068940A (en) * | 2021-03-10 | 2022-09-20 | 腾讯科技(深圳)有限公司 | Control method of virtual object in virtual scene, computer device and storage medium |
CN113908553A (en) * | 2021-11-22 | 2022-01-11 | 广州简悦信息科技有限公司 | Game character expression generation method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Xu | A real-time hand gesture recognition and human-computer interaction system | |
CN111860451A (en) | Game interaction method based on facial expression recognition | |
CN108334814B (en) | Gesture recognition method of AR system | |
CN101393599B (en) | Game role control method based on human face expression | |
CN109635752B (en) | Method for positioning key points of human face, method for processing human face image and related device | |
CN110561399B (en) | Auxiliary shooting device for dyskinesia condition analysis, control method and device | |
dos Santos Anjo et al. | A real-time system to recognize static gestures of Brazilian sign language (libras) alphabet using Kinect. | |
CN109325408A (en) | A kind of gesture judging method and storage medium | |
CN101110102A (en) | Game scene and role control method based on fists of player | |
CN113158914B (en) | Intelligent evaluation method for dance action posture, rhythm and expression | |
CN107145226A (en) | Eye control man-machine interactive system and method | |
CN112381045A (en) | Lightweight human body posture recognition method for mobile terminal equipment of Internet of things | |
CN107329564B (en) | Man-machine finger guessing method based on gesture intelligent perception and man-machine cooperation mechanism | |
Dardas | Real-time hand gesture detection and recognition for human computer interaction | |
Tiwari et al. | Sign language recognition through kinect based depth images and neural network | |
CN114428553A (en) | Interaction method, system, device and computer readable storage medium | |
WO2021203368A1 (en) | Image processing method and apparatus, electronic device and storage medium | |
Singh | Recognizing hand gestures for human computer interaction | |
Baak et al. | Stabilizing motion tracking using retrieved motion priors | |
Chaudhary et al. | An ANN based Approach to Calculate Robotic fingers positions | |
Manresa-Yee et al. | Towards hands-free interfaces based on real-time robust facial gesture recognition | |
CN114779925A (en) | Sight line interaction method and device based on single target | |
Ajallooeian et al. | Fast hand gesture recognition based on saliency maps: An application to interactive robotic marionette playing | |
Dalka et al. | Lip movement and gesture recognition for a multimodal human-computer interface | |
Shin et al. | Welfare interface implementation using multiple facial features tracking for the disabled people |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |