CN111881807A - VR conference control system and method based on face modeling and expression tracking - Google Patents

VR conference control system and method based on face modeling and expression tracking

Info

Publication number
CN111881807A
Authority
CN
China
Prior art keywords
face
module
conference
information
expression tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010717430.0A
Other languages
Chinese (zh)
Inventor
何宝华
沈睦生
严明月
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Cadre Science And Technology Industry Co ltd
Original Assignee
Shenzhen Cadre Science And Technology Industry Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Cadre Science And Technology Industry Co ltd
Priority to CN202010717430.0A
Publication of CN111881807A
Legal status: Pending

Classifications

    • G06V40/161 (Recognition of human faces in image or video data): Detection; Localisation; Normalisation
    • G06V40/168 (Recognition of human faces in image or video data): Feature extraction; Face representation
    • G06V40/174 (Recognition of human faces in image or video data): Facial expression recognition
    • G06T17/00 (Image data processing): Three dimensional [3D] modelling, e.g. data description of 3D objects
    • H04N7/15 (Pictorial communication, e.g. television; two-way working): Conference systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a VR conference control system and method based on face modeling and expression tracking. The VR conference control system comprises a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module and an interaction module. The conference scene selection module is used for entering a conference scene according to the user's selection; the face detection module is used for identifying the face information of a wearer; the facial expression tracking module is used for acquiring a plurality of feature points from the face information at preset intervals and mapping them to the 3D face model in real time; and the interaction module is used for outputting the conference scene and the 3D face models. The invention can accurately update each feature point in the face information to the 3D face model in real time, which means that the expression changes of each participant in the conference can be observed in the virtual scene, so that information beyond sound is conveyed more vividly and clearly.

Description

VR conference control system and method based on face modeling and expression tracking
Technical Field
The invention belongs to the technical field of virtual scenes, and particularly relates to a VR conference control system and method based on face modeling and expression tracking.
Background
At present, existing VR systems mainly apply virtual reality technology to render the scenes of existing situational conference systems in 3D, combining video conferencing technology with 3D technology to realize a network video conference under the comprehensive combination of a voice system, a video system and a 3D virtual picture, so as to shorten the distance between people and realize cross-regional connection and interaction. Interactive exhibitions anywhere in the world can be simulated through the screen. However, person-to-person communication in such systems relies mainly on the transmission of sound, so the sense of realism is poor.
Therefore, the prior art remains to be improved.
Disclosure of Invention
The invention mainly aims to provide a VR conference control system and method based on face modeling and expression tracking, so as to solve the technical problems mentioned in the Background section.
The VR conference control system based on face modeling and expression tracking of the invention comprises a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module and an interaction module. The conference scene selection module is used for entering a conference scene according to the user's selection; the face detection module is used for identifying the face information of a wearer; the feature point extraction module is used for acquiring the face information and extracting the relative position information of a plurality of feature points; the 3D face model generation module is used for determining a 3D face model according to the relative position information; the facial expression tracking module is used for acquiring a plurality of feature points from the face information at preset intervals and mapping them to the 3D face model in real time; and the interaction module is used for outputting the conference scene and the 3D face model corresponding to each wearer participating in the conference scene.
Preferably, the interaction module comprises a VR head-mounted display device.
Preferably, the VR head-mounted display device is internally provided with a display module, a camera and a storage module.
Preferably, the storage module stores at least two conference scenes and at least two 3D face models computed with the ASM (Active Shape Model) algorithm.
The invention also provides a VR conference control method based on face modeling and expression tracking, which comprises the following steps:
step S10, after the wearer puts on the interaction module, receiving the wearer's selection and entering a conference scene;
step S20, capturing, by a camera in the interaction module, a real-time image inside the device, and identifying the wearer's face information from the real-time image;
step S30, identifying at least three feature points from the face information and calculating relative position information;
step S40, determining a 3D face model according to the relative position information;
and step S50, acquiring a plurality of feature points from the face information at preset intervals, and mapping them to the 3D face model in real time.
Preferably, the method further comprises the steps of:
step S60, the interaction module outputs a conference scene and a 3D face model corresponding to each wearer participating in the conference scene.
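To make the timing behaviour of steps S50 and S60 concrete, the following is a minimal Python sketch of the periodic tracking loop, written under stated assumptions: the patent specifies only the behaviour ("every preset time period"), so the helper callables (capture_frame, extract_feature_points, render) and the update method are hypothetical placeholders, not names from the patent.

```python
# Hypothetical sketch of the periodic expression-tracking loop (steps S50 and S60).
# All callables and the `update` method are placeholders; the patent names no API.
import time

def track_expressions(face_model, capture_frame, extract_feature_points,
                      render, period_s=0.1):
    """Every `period_s` seconds, re-acquire the wearer's feature points,
    map them onto the 3D face model, and output the updated model."""
    while True:  # runs for the lifetime of the conference session
        frame = capture_frame()                 # real-time image from the headset camera
        points = extract_feature_points(frame)  # S50: fresh feature points
        face_model.update(points)               # map them onto the 3D face model
        render(face_model)                      # S60: output into the VR conference scene
        time.sleep(period_s)                    # "every preset time period"
```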
Preferably, the face information comprises the face region in the real-time image, and the feature points include at least the nose, mouth, left eye, right eye and ears.
Preferably, the relative position information includes a first relative distance and a second relative distance, and the step S30 specifically includes:
step S31, identifying the nose, the left eye and the right eye from the face information;
step S32, establishing a coordinate system by taking the nose as a coordinate center, and acquiring left eye coordinate information and right eye coordinate information;
step S33, calculating a first relative distance and a second relative distance from the nose coordinate information, the left-eye coordinate information and the right-eye coordinate information, wherein the first relative distance represents the distance between the left eye and the nose, and the second relative distance represents the distance between the left eye and the right eye.
Preferably, step S40 specifically includes:
step S41, selecting a plurality of first correlation models from the storage module according to the first relative distance;
and step S42, determining a 3D face model from the first correlation models according to the second relative distance.
The VR conference control system and method based on face modeling and expression tracking of the invention can accurately update each feature point in the face information to the 3D face model in real time, which means that the expression changes of each participant in the conference can be observed in the virtual scene, so that information beyond sound is conveyed more vividly and clearly.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments described in the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a VR conference control method based on face modeling and expression tracking according to the present invention;
fig. 2 is a detailed flowchart of step S30 in the VR conference control method based on face modeling and expression tracking;
fig. 3 is a detailed flowchart of step S40 in the VR conference control method based on face modeling and expression tracking;
fig. 4 is a block diagram of a VR conference control system based on face modeling and expression tracking according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It is noted that relative terms such as "first," "second," and the like may be used to describe various components, but these terms do not limit the components; they are only used to distinguish one component from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present invention. The term "and/or" includes any and all combinations of one or more of the associated listed items.
As shown in fig. 4, fig. 4 is a block diagram of a VR conference control system based on face modeling and expression tracking according to the present invention. The VR conference control system based on face modeling and expression tracking of the invention comprises a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module and an interaction module. The conference scene selection module is used for entering a conference scene according to the user's selection; the face detection module is used for identifying the face information of a wearer; the feature point extraction module is used for acquiring the face information and extracting the relative position information of a plurality of feature points; the 3D face model generation module is used for determining a 3D face model according to the relative position information; the facial expression tracking module is used for acquiring a plurality of feature points from the face information at preset intervals and mapping them to the 3D face model in real time; and the interaction module is used for outputting the conference scene and the 3D face model corresponding to each wearer participating in the conference scene. The system and method can accurately update each feature point in the face information to the 3D face model in real time, which means that the expression changes of each participant in the conference can be observed in the virtual scene, so that information beyond sound is conveyed more vividly and clearly.
The interaction module comprises a VR head-mounted display device, so that a wearer who needs to participate in the conference can put it on conveniently. A display module, a camera and a storage module are built into the VR head-mounted display device. A handheld device is connected to the VR head-mounted display device; sensors and a keypad on the handheld device capture the user's hand commands. The display module is used for playing the VR video.
Preferably, the storage module stores at least two conference scenes and at least two 3D face models computed with the ASM algorithm. The conference scene can be selected according to the number of participants; the conference scenes include a virtual conference scene, and the display module is used for outputting the virtual conference scene and the 3D face model corresponding to each wearer participating in it. The ASM (Active Shape Model) algorithm is a feature point extraction method based on a statistical learning model: a set of training samples is selected, the shape of each object is described by a set of feature points, the sample shapes are registered, and the registered shape vectors are then statistically modelled by principal component analysis, yielding a statistical description of the object's shape. The resulting shape model can be used to search a new image for objects similar to the model. The specific steps are as follows. ASM training is performed on the key regions of the face: the feature points of the mouth, nose, eye corners and face boundary are labelled and their coordinates recorded. Because the labelled face samples differ in size, absolute position and angle, statistically modelling the sample set directly would be irregular; to build a model of the training set, an alignment operation is therefore performed through rotation, translation, enlargement and reduction of the images, eliminating the differences between them and establishing a geometric model of the face. After the shape model has been established, the target image is matched against the model by iteration, keeping the coordinate set of the target image similar to the training set, and the iteration is repeated until convergence.
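As a concrete illustration of the labelling, alignment and principal component analysis steps just described, the following Python sketch builds an ASM-style shape model from a set of labelled landmark shapes. It is a minimal sketch, not the patent's implementation; for brevity it omits the reflection-sign correction of full Procrustes alignment and the local grey-level profile search used during iterative matching.

```python
# Minimal ASM-style shape model: Procrustes-align the training shapes, then
# run PCA on the aligned shape vectors. Illustrative only; names are not from the patent.
import numpy as np

def align_shape(shape, reference):
    """Align one (N, 2) landmark shape to a reference shape by removing
    translation, scale and rotation (reflection correction omitted)."""
    s = shape - shape.mean(axis=0)              # remove translation
    r = reference - reference.mean(axis=0)
    s = s / np.linalg.norm(s)                   # remove scale
    r = r / np.linalg.norm(r)
    u, _, vt = np.linalg.svd(s.T @ r)           # optimal rotation (Kabsch)
    return s @ (u @ vt)

def build_shape_model(shapes, variance_kept=0.95):
    """Register all training shapes to the first one and model the
    registered shape vectors with principal component analysis."""
    aligned = np.array([align_shape(s, shapes[0]).ravel() for s in shapes])
    mean = aligned.mean(axis=0)
    cov = np.cov(aligned, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]           # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.searchsorted(np.cumsum(eigvals) / eigvals.sum(),
                            variance_kept)) + 1
    return mean, eigvecs[:, :k]                 # any face: mean + P @ b
```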
As shown in fig. 1, the present invention further provides a VR conference control method based on face modeling and expression tracking, including the following steps:
step S10, after the wearer puts on the interaction module, receiving the wearer's selection and entering a conference scene;
step S20, capturing, by a camera in the interaction module, a real-time image inside the device, and identifying the wearer's face information from the real-time image;
step S30, identifying at least three feature points from the face information and calculating relative position information;
step S40, determining a 3D face model according to the relative position information;
and step S50, acquiring a plurality of feature points from the face information at preset intervals, and mapping them to the 3D face model in real time.
Preferably, after step S50, the method further comprises the following step:
step S60, the interaction module outputs the 3D face model into the VR video in real time.
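Steps S20 and S30 can be illustrated with off-the-shelf computer vision tooling. The sketch below uses OpenCV's stock Haar cascades as a stand-in for the face detection module; this is an assumed toolchain, since the patent does not name one, and in the described system the ASM fit sketched above would refine these coarse detections into labelled feature points. Inside a headset the camera sees the face at very close range, so cascades trained for that geometry would be needed in practice.

```python
# Illustrative stand-in for steps S20 and S30 using OpenCV's bundled Haar
# cascades (an assumed toolchain; the patent does not name one).
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def detect_face_and_eyes(frame):
    """Return the face rectangle and eye centres found in one camera frame,
    or None if no face is visible."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]                       # assume one wearer per headset camera
    roi = gray[y:y + h, x:x + w]
    eyes = eye_cascade.detectMultiScale(roi, scaleFactor=1.1, minNeighbors=5)
    centres = [(x + ex + ew // 2, y + ey + eh // 2)
               for (ex, ey, ew, eh) in eyes]    # back to full-frame coordinates
    return (x, y, w, h), centres
```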
Preferably, the face information comprises the face region in the real-time image, and the feature points include at least the nose, mouth, left eye, right eye and ears.
As shown in fig. 2, preferably, the relative position information includes a first relative distance and a second relative distance, and step S30 specifically includes:
step S31, identifying the nose, the left eye and the right eye from the face information, wherein the face information comprises the face region in the real-time image;
step S32, establishing a coordinate system by taking the nose as a coordinate center, and acquiring left eye coordinate information and right eye coordinate information;
step S33, calculating a first relative distance and a second relative distance from the nose coordinate information, the left-eye coordinate information and the right-eye coordinate information, wherein the first relative distance represents the distance between the left eye and the nose, and the second relative distance represents the distance between the left eye and the right eye.
Based on steps S31, S32 and S33, this preferred embodiment calculates the relative position information from three feature points, the nose, the left eye and the right eye, so that a 3D face model with a high degree of match to the wearer can be determined.
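A minimal sketch of the computation in steps S31 to S33, assuming the nose and eye coordinates have already been located (for example by the detector sketched above); the function name is illustrative.

```python
# Sketch of steps S31 to S33: a nose-centred coordinate system and the two
# relative distances used to pick a model. Landmark inputs are (x, y) pairs.
import math

def relative_distances(nose, left_eye, right_eye):
    """Return (first, second): left eye to nose, and left eye to right eye."""
    lx, ly = left_eye[0] - nose[0], left_eye[1] - nose[1]    # nose is the origin
    rx, ry = right_eye[0] - nose[0], right_eye[1] - nose[1]
    first = math.hypot(lx, ly)              # distance between left eye and nose
    second = math.hypot(lx - rx, ly - ry)   # distance between left and right eye
    return first, second
```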
as shown in fig. 3, preferably, step S40 specifically includes:
step S41, selecting a plurality of first correlation models from the storage module according to the first relative distance;
and step S42, determining a 3D face model from the first correlation models according to the second relative distance.
The storage module stores a plurality of 3D face models in advance; for example, the 3D face model for a child differs from that for an adult.
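Steps S41 and S42 then amount to a two-stage nearest-neighbour lookup over the stored models. The sketch below assumes each stored model is tagged with the two reference distances it was built from; this storage format is an assumption, as the patent does not describe how the models are indexed.

```python
# Hypothetical sketch of steps S41 and S42. `models` is assumed to be a list of
# (first_distance, second_distance, model_data) tuples; the patent does not
# specify how stored models are indexed.
def select_face_model(models, first, second, tolerance=0.1):
    """S41: keep models whose first distance is within `tolerance` (relative)
    of the measured one; S42: of those, return the model whose second
    distance is closest to the measured second distance."""
    candidates = [m for m in models
                  if abs(m[0] - first) <= tolerance * first]
    if not candidates:
        candidates = models                 # fall back to the full library
    return min(candidates, key=lambda m: abs(m[1] - second))[2]
```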
The VR conference control system and method based on face modeling and expression tracking of the invention can accurately update each feature point in the face information to the 3D face model in real time, which means that the expression changes of each participant in the conference can be observed in the virtual scene, so that information beyond sound is conveyed more vividly and clearly. The participants' sense of real presence is satisfied to the greatest extent: each participant can not only see the others but also feel as if they were right beside them. This vivid effect is achieved through wearable equipment, including VR head-mounted displays, force-feedback data gloves and other human-computer interaction devices.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. A VR conference control system based on face modeling and expression tracking, characterized by comprising a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module and an interaction module, wherein the conference scene selection module is used for entering a conference scene according to the user's selection; the face detection module is used for identifying the face information of a wearer; the feature point extraction module is used for acquiring the face information and extracting the relative position information of a plurality of feature points; the 3D face model generation module is used for determining a 3D face model according to the relative position information; the facial expression tracking module is used for acquiring a plurality of feature points from the face information at preset intervals and mapping them to the 3D face model in real time; and the interaction module is used for outputting the conference scene and the 3D face model corresponding to each wearer participating in the conference scene.
2. The VR conference control system based on face modeling and expression tracking of claim 1, wherein the interaction module includes a VR head-mounted display.
3. The VR conference control system based on face modeling and expression tracking of claim 2, wherein the VR head-mounted display device has a display module, a camera and a storage module built in.
4. The VR conference control system based on face modeling and expression tracking of claim 3, wherein the storage module stores at least two conference scenes and at least two 3D face models computed using the ASM algorithm.
5. A VR conference control method based on face modeling and expression tracking is characterized by comprising the following steps:
step S10, after the wearer puts on the interaction module, receiving the wearer's selection and entering a conference scene;
step S20, capturing, by a camera in the interaction module, a real-time image inside the device, and identifying the wearer's face information from the real-time image;
step S30, identifying at least three feature points from the face information and calculating relative position information;
step S40, determining a 3D face model according to the relative position information;
and step S50, acquiring a plurality of feature points from the face information at preset intervals, and mapping them to the 3D face model in real time.
6. The VR conference control method based on face modeling and expression tracking of claim 5, further comprising the following step:
step S60, the interaction module outputs a conference scene and a 3D face model corresponding to each wearer participating in the conference scene.
7. The VR conference control method based on face modeling and expression tracking of claim 5, wherein the face information comprises the face region in the real-time image, and the feature points include at least the nose, mouth, left eye, right eye and ears.
8. The VR conference control method based on face modeling and expression tracking of claim 5, wherein the relative position information includes a first relative distance and a second relative distance, and the step S30 specifically includes:
step S31, identifying the nose, the left eye and the right eye from the face information;
step S32, establishing a coordinate system by taking the nose as a coordinate center, and acquiring left eye coordinate information and right eye coordinate information;
step S33, calculating a first relative distance and a second relative distance from the nose coordinate information, the left-eye coordinate information and the right-eye coordinate information, wherein the first relative distance represents the distance between the left eye and the nose, and the second relative distance represents the distance between the left eye and the right eye.
9. The VR conference control method based on face modeling and expression tracking as claimed in claim 8, wherein the step S40 specifically includes:
step S41, selecting a plurality of first correlation models from the storage module according to the first relative distance;
and step S42, determining a 3D face model from the first correlation models according to the second relative distance.
CN202010717430.0A 2020-07-23 2020-07-23 VR conference control system and method based on face modeling and expression tracking Pending CN111881807A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010717430.0A CN111881807A (en) 2020-07-23 2020-07-23 VR conference control system and method based on face modeling and expression tracking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010717430.0A CN111881807A (en) 2020-07-23 2020-07-23 VR conference control system and method based on face modeling and expression tracking

Publications (1)

Publication Number Publication Date
CN111881807A 2020-11-03

Family

ID=73154875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010717430.0A Pending CN111881807A (en) 2020-07-23 2020-07-23 VR conference control system and method based on face modeling and expression tracking

Country Status (1)

Country Link
CN (1) CN111881807A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114339134A (en) * 2022-03-15 2022-04-12 中瑞云软件(深圳)有限公司 Remote online conference system based on Internet and VR technology
CN114615455A (en) * 2022-01-24 2022-06-10 北京师范大学 Teleconference processing method, teleconference processing device, teleconference system, and storage medium

Similar Documents

Publication Publication Date Title
US11887234B2 (en) Avatar display device, avatar generating device, and program
US20130063560A1 (en) Combined stereo camera and stereo display interaction
US20040104935A1 (en) Virtual reality immersion system
CN110544301A (en) Three-dimensional human body action reconstruction system, method and action training system
KR101711736B1 (en) Feature extraction method for motion recognition in image and motion recognition method using skeleton information
JP2004537082A (en) Real-time virtual viewpoint in virtual reality environment
CN111710036A (en) Method, device and equipment for constructing three-dimensional face model and storage medium
JPWO2017094543A1 (en) Information processing apparatus, information processing system, information processing apparatus control method, and parameter setting method
JP5833526B2 (en) Video communication system and video communication method
CN102801994A (en) Physical image information fusion device and method
WO2004012141A2 (en) Virtual reality immersion system
CN111881807A (en) VR conference control system and method based on face modeling and expression tracking
Kawai et al. A support system for visually impaired persons to understand three-dimensional visual information using acoustic interface
CN117711066A (en) Three-dimensional human body posture estimation method, device, equipment and medium
CN113965550B (en) Intelligent interactive remote auxiliary video system
JP5759439B2 (en) Video communication system and video communication method
Siegl et al. An augmented reality human–computer interface for object localization in a cognitive vision system
WO2017147826A1 (en) Image processing method for use in smart device, and device
CN112416124A (en) Dance posture feedback method and device
JP5898036B2 (en) Video communication system and video communication method
Magee et al. Towards a multi-camera mouse-replacement interface
CN113342167B (en) Space interaction AR realization method and system based on multi-person visual angle positioning
Zhang et al. Behavior Recognition On Multiple View Dimension
CN116958353B (en) Holographic projection method based on dynamic capture and related device
JP2014086774A (en) Video communication system and video communication method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination