CN111881807A - VR conference control system and method based on face modeling and expression tracking - Google Patents
- Publication number
- CN111881807A (application CN202010717430.0A)
- Authority
- CN
- China
- Prior art keywords
- face
- module
- conference
- information
- expression tracking
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/168—Feature extraction; Face representation
- G06V40/174—Facial expression recognition
- H—ELECTRICITY
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/15—Conference systems
Abstract
The invention discloses a VR conference control system and method based on face modeling and expression tracking. The system comprises a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module, and an interaction module. The conference scene selection module enters a conference scene according to the user's selection; the face detection module identifies the wearer's face information; the facial expression tracking module acquires a plurality of feature points from the face information at preset intervals and maps them to a 3D face model in real time; and the interaction module outputs the conference scene and the 3D face models. The invention updates each feature point of the face information onto the 3D face model accurately and in real time, so that the expression changes of every participant in the conference can be observed in the virtual scene and information beyond voice is conveyed more vividly and clearly.
Description
Technical Field
The invention belongs to the technical field of virtual scenes, and particularly relates to a VR conference control system and method based on face modeling and expression tracking.
Background
Existing VR systems mainly apply virtual reality technology to render a situational conference in 3D, combining video conferencing and 3D techniques to realize a network video conference that integrates a voice system, a video system, and a 3D virtual picture, thereby shortening the distance between people and enabling cross-regional connection and interaction; exhibitions anywhere in the world can be simulated on screen. However, person-to-person communication in such systems still relies mainly on the transmission of sound, so the sense of reality is poor.
Therefore, the prior art is to be improved.
Disclosure of Invention
The invention mainly aims to provide a VR conference control system and method based on face modeling and expression tracking, so as to solve the technical problems mentioned in the Background section.
The VR conference control system based on face modeling and expression tracking comprises a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module, and an interaction module. The conference scene selection module enters a conference scene according to the user's selection; the face detection module identifies the wearer's face information; the feature point extraction module acquires the face information and extracts relative position information of a plurality of feature points; the 3D face model generation module determines a 3D face model according to the relative position information; the facial expression tracking module acquires a plurality of feature points from the face information at preset intervals and maps them to the 3D face model in real time; and the interaction module outputs the conference scene together with the 3D face model corresponding to each wearer participating in that scene.
Preferably, the interaction module comprises a VR head-mounted display device.
Preferably, the VR head-mounted display device has a built-in display module, camera, and storage module.
Preferably, the storage module stores at least two conference scenes and at least two 3D face models computed with the ASM (Active Shape Model) algorithm.
The invention also provides a VR conference control method based on face modeling and expression tracking, which comprises the following steps:
step S10, after the wearer puts on the interaction module, receiving the wearer's selection and entering a conference scene;
step S20, a camera in the interaction module acquires a real-time picture inside the device, and the wearer's face information is identified from the real-time picture;
step S30, identifying at least three feature points from the face information and calculating relative position information;
step S40, determining a 3D face model according to the relative position information;
and step S50, acquiring a plurality of feature points from the face information at preset intervals, and mapping the feature points to the 3D face model in real time.
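The flow of steps S10 to S50 can be sketched as a simple capture, extract, and map loop. The sketch below is illustrative only; `FakeCamera`, `FaceModel`, `run_tracking`, and the landmark values are hypothetical stand-ins invented for the example, not components defined by the invention.

```python
import time

# Illustrative sketch of steps S10-S50: capture a frame, extract feature
# points, and map them onto the 3D face model at a preset interval.
# FakeCamera and FaceModel are hypothetical stand-ins for this example.

class FakeCamera:
    def capture_frame(self):
        # Stand-in for the real-time picture captured inside the headset (step S20).
        return {"nose": (0, 0), "left_eye": (-30, 40), "right_eye": (30, 40)}

class FaceModel:
    def __init__(self):
        self.landmarks = {}

    def apply(self, points):
        # Step S50: map the latest feature points onto the model in real time.
        self.landmarks = dict(points)

def run_tracking(camera, model, interval_s=0.0, iterations=3):
    for _ in range(iterations):
        points = camera.capture_frame()   # steps S20/S30: picture -> feature points
        model.apply(points)               # step S50: real-time mapping
        time.sleep(interval_s)            # the "preset time period"
    return model

model = run_tracking(FakeCamera(), FaceModel())
print(model.landmarks["nose"])  # (0, 0)
```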
Preferably, the method further comprises the steps of:
step S60, the interaction module outputs a conference scene and a 3D face model corresponding to each wearer participating in the conference scene.
Preferably, the face information comprises the face region of the real-time picture, and the feature points include at least the nose, mouth, left eye, right eye, and ears.
Preferably, the relative position information includes a first relative distance and a second relative distance, and the step S30 specifically includes:
step S31, identifying the nose, the left eye and the right eye from the face information;
step S32, establishing a coordinate system by taking the nose as a coordinate center, and acquiring left eye coordinate information and right eye coordinate information;
step S33, calculating a first relative distance and a second relative distance from the nose, left-eye, and right-eye coordinate information, wherein the first relative distance is the distance between the left eye and the nose, and the second relative distance is the distance between the left eye and the right eye.
Preferably, step S40 specifically includes:
step S41, selecting a plurality of first correlation models from the storage module according to the first relative distance;
and step S42, determining a 3D face model from the first correlation models according to the second relative distance.
The VR conference control system and method based on face modeling and expression tracking can update each feature point of the face information onto the 3D face model accurately and in real time, so that the expression changes of every participant in the conference can be observed in the virtual scene, and information beyond voice is conveyed more vividly and clearly.
Drawings
To illustrate the embodiments of the present application or the prior-art solutions more clearly, the drawings needed in their description are briefly introduced below. Obviously, the following drawings show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a VR conference control method based on face modeling and expression tracking according to the present invention;
fig. 2 is a detailed flowchart of step S30 in the VR conference control method based on face modeling and expression tracking;
fig. 3 is a detailed flowchart of step S40 in the VR conference control method based on face modeling and expression tracking;
fig. 4 is a block diagram of a VR conference control system based on face modeling and expression tracking according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It is noted that relative terms such as "first" and "second" may be used to describe various components, but these terms do not limit the components; they are used only to distinguish one component from another. For example, a first element could be termed a second element, and similarly, a second element could be termed a first element, without departing from the scope of the present invention. The term "and/or" covers any combination of one or more of the associated listed items.
As shown in fig. 4, fig. 4 is a block diagram of the VR conference control system based on face modeling and expression tracking according to the present invention. The system comprises a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module, and an interaction module. The conference scene selection module enters a conference scene according to the user's selection; the face detection module identifies the wearer's face information; the feature point extraction module acquires the face information and extracts relative position information of a plurality of feature points; the 3D face model generation module determines a 3D face model according to the relative position information; the facial expression tracking module acquires a plurality of feature points from the face information at preset intervals and maps them to the 3D face model in real time; and the interaction module outputs the conference scene together with the 3D face model corresponding to each wearer participating in that scene. In this way, each feature point of the face information is updated onto the 3D face model accurately and in real time: the expression changes of every participant can be observed in the virtual scene, and information beyond voice is conveyed more vividly and clearly.
The interaction module comprises a VR head-mounted display device, so that a wearer who needs to participate in the conference can put it on conveniently. A display module, a camera, and a storage module are built into the VR head-mounted display; a handheld device is connected to it, and a sensor and a keyboard on the handheld device capture the user's hand instructions. The display module plays the VR video.
Preferably, the storage module stores at least two conference scenes and at least two 3D face models computed with the ASM algorithm. The conference scene can be selected according to the number of participants; it includes a virtual conference scene, and the display module outputs the virtual conference scene and the 3D face model corresponding to each wearer participating in it. The ASM algorithm is a feature point extraction method based on a statistical learning model: a set of training samples is selected, each object's shape is described by a group of feature points, the sample shapes are registered, and principal component analysis is then applied to the registered shape vectors to obtain a statistical description of the object's shape. The resulting shape model can be used to search a new image for objects similar to the model. The specific steps are as follows. For ASM training, key regions of the face are annotated: the feature points of the mouth, nose, eye corners, and face boundary are labeled, and their coordinates are recorded. Because the labeled face samples differ in size, absolute position, and angle, modeling the sample set directly would be irregular; to build a model of the training set, the images are aligned by rotation, translation, scaling up, and scaling down, eliminating the differences between them, and a geometric model of the face is thereby established. After the shape model is built, the target image is matched against the model iteratively, keeping the coordinate set of the target image close to the training set; the iteration is repeated until convergence.
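The register-then-model idea of ASM training can be illustrated in a few lines. This is a hedged sketch: registration here removes translation and scale only (full ASM alignment also removes rotation), and the tiny 4-point shapes are synthetic examples invented for this illustration, not real face data.

```python
import numpy as np

# Hedged sketch of the ASM training idea: register labelled landmark shapes
# (translation and scale only here, for brevity) and model the registered
# shape vectors with PCA via SVD.

def align(shape):
    """Normalise a (k, 2) landmark array: centre at the origin, unit scale."""
    centred = shape - shape.mean(axis=0)
    return centred / np.linalg.norm(centred)

base = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0], [1.0, -1.0]])
# The same shape at different positions and sizes, as in a raw labelled sample set.
shapes = [base + 5.0, base * 3.0, base * 2.0 - 1.0]

X = np.stack([align(s).ravel() for s in shapes])   # registered shape vectors
mean_shape = X.mean(axis=0)
# PCA: the singular values of the centred data give the modes of shape variation.
_, svals, modes = np.linalg.svd(X - mean_shape, full_matrices=False)

# After registration these shapes are identical, so all variation vanishes.
print(svals.max() < 1e-9)  # True
```

Because the three toy samples differ only by translation and scale, registration maps them onto one shape and the PCA modes carry zero variance; with real annotated faces the leading modes would capture genuine shape variation.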
As shown in fig. 1, the present invention further provides a VR conference control method based on face modeling and expression tracking, including the following steps:
step S10, after the wearer puts on the interaction module, receiving the wearer's selection and entering a conference scene;
step S20, a camera in the interaction module acquires a real-time picture inside the device, and the wearer's face information is identified from the real-time picture;
step S30, identifying at least three feature points from the face information and calculating relative position information;
step S40, determining a 3D face model according to the relative position information;
and step S50, acquiring a plurality of feature points from the face information at preset intervals, and mapping the feature points to the 3D face model in real time.
Preferably, the method includes, after step S50, the steps of:
and step S60, the interaction module outputs the 3D face model into the VR video in real time.
Preferably, the face information comprises the face region of the real-time picture, and the feature points include at least the nose, mouth, left eye, right eye, and ears.
As shown in fig. 2, preferably, the relative position information includes a first relative distance and a second relative distance, and step S30 specifically includes:
step S31, identifying the nose, the left eye, and the right eye from the face information, where the face information comprises the face region of the real-time picture;
step S32, establishing a coordinate system by taking the nose as a coordinate center, and acquiring left eye coordinate information and right eye coordinate information;
step S33, calculating a first relative distance and a second relative distance from the nose, left-eye, and right-eye coordinate information, wherein the first relative distance is the distance between the left eye and the nose, and the second relative distance is the distance between the left eye and the right eye.
This preferred embodiment calculates the relative position information from the three feature points (nose, left eye, and right eye) in steps S31 to S33, so as to determine the 3D face model that best matches the wearer.
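Steps S31 to S33 reduce to two Euclidean distances in a nose-centred coordinate system. A minimal sketch follows; the pixel coordinates and the function name are illustrative assumptions, not values from the invention.

```python
import math

def relative_distances(left_eye, right_eye):
    """Landmarks are (x, y) pairs in a coordinate system centred on the nose (step S32)."""
    # Step S33: first relative distance = left eye to the nose (the origin).
    first = math.hypot(left_eye[0], left_eye[1])
    # Step S33: second relative distance = left eye to right eye.
    second = math.hypot(left_eye[0] - right_eye[0], left_eye[1] - right_eye[1])
    return first, second

# Example: eyes 60 px apart, 40 px above the nose, symmetric about it.
d1, d2 = relative_distances(left_eye=(-30, 40), right_eye=(30, 40))
print(d1, d2)  # 50.0 60.0
```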
as shown in fig. 3, preferably, step S40 specifically includes:
step S41, selecting a plurality of first correlation models from the storage module according to the first relative distance;
and step S42, determining a 3D face model from the first correlation models according to the second relative distance.
The storage module stores a plurality of 3D face models in advance; for example, the 3D face model of a child differs from that of an adult.
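Steps S41 and S42 can be read as a two-stage lookup: pre-filter the stored models by the first relative distance, then choose the closest match on the second. The model entries and the tolerance below are hypothetical values invented for this sketch, not data from the invention.

```python
# Hypothetical pre-stored model library: (name, first_distance, second_distance).
MODELS = [
    ("child_a", 38.0, 45.0),
    ("adult_a", 50.0, 58.0),
    ("adult_b", 52.0, 66.0),
]

def select_model(first, second, tol=5.0):
    # Step S41: the first relative distance selects the "first correlation models".
    candidates = [m for m in MODELS if abs(m[1] - first) <= tol]
    # Step S42: the second relative distance picks the final 3D face model.
    return min(candidates, key=lambda m: abs(m[2] - second))[0]

print(select_model(50.0, 60.0))  # adult_a
```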
The VR conference control system and method based on face modeling and expression tracking can update each feature point of the face information onto the 3D face model accurately and in real time, so that the expression changes of every participant in the conference can be observed in the virtual scene, and information beyond voice is conveyed more vividly and clearly. The participants' sense of real presence is satisfied to the greatest extent: a participant can not only see the other parties but genuinely feel that they are at his or her side. This lifelike effect is realized through wearable equipment, including the VR head-mounted display, force-feedback data gloves, and other human-computer interaction devices.
The above description is only a preferred embodiment of the present invention and is not intended to limit its scope; all equivalent structural or process modifications made using the contents of this specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, fall within the scope of the present invention.
Claims (9)
1. A VR conference control system based on face modeling and expression tracking, characterized by comprising a conference scene selection module, a face detection module, a feature point extraction module, a 3D face model generation module, a facial expression tracking module, and an interaction module, wherein the conference scene selection module is used for entering a conference scene according to the user's selection, the face detection module is used for identifying face information of a wearer, the feature point extraction module is used for acquiring the face information and extracting relative position information of a plurality of feature points, the 3D face model generation module is used for determining a 3D face model according to the relative position information, the facial expression tracking module is used for acquiring a plurality of feature points from the face information at preset intervals and mapping them to the 3D face model in real time, and the interaction module is used for outputting the conference scene and the 3D face model corresponding to each wearer participating in the conference scene.
2. The VR conference control system based on face modeling and expression tracking of claim 1, wherein the interaction module includes a VR head-mounted display.
3. The VR conference control system based on face modeling and expression tracking of claim 2, wherein the VR head display device has a display module, a camera, and a storage module built in.
4. The VR conference control system of claim 3 that is based on face modeling and expression tracking, wherein the storage module stores at least two conference scenes and at least two 3D face models computed using ASM algorithms.
5. A VR conference control method based on face modeling and expression tracking is characterized by comprising the following steps:
step S10, after the wearer puts on the interaction module, receiving the wearer's selection and entering a conference scene;
step S20, a camera in the interaction module acquires a real-time picture inside the device, and the wearer's face information is identified from the real-time picture;
step S30, identifying at least three feature points from the face information and calculating relative position information;
step S40, determining a 3D face model according to the relative position information;
and step S50, acquiring a plurality of feature points from the face information at preset intervals, and mapping the feature points to the 3D face model in real time.
6. The VR conference control method of claim 5 based on face modeling and expression tracking, further comprising:
step S60, the interaction module outputs a conference scene and a 3D face model corresponding to each wearer participating in the conference scene.
7. The VR conference control method based on face modeling and expression tracking of claim 5, wherein the face information comprises the face region of the real-time picture, and the feature points include at least the nose, mouth, left eye, right eye, and ears.
8. The VR conference control method based on face modeling and expression tracking of claim 5, wherein the relative position information includes a first relative distance and a second relative distance, and the step S30 specifically includes:
step S31, identifying the nose, the left eye and the right eye from the face information;
step S32, establishing a coordinate system by taking the nose as a coordinate center, and acquiring left eye coordinate information and right eye coordinate information;
step S33, calculating a first relative distance and a second relative distance from the nose, left-eye, and right-eye coordinate information, wherein the first relative distance is the distance between the left eye and the nose, and the second relative distance is the distance between the left eye and the right eye.
9. The VR conference control method based on face modeling and expression tracking as claimed in claim 8, wherein the step S40 specifically includes:
step S41, selecting a plurality of first correlation models from the storage module according to the first relative distance;
and step S42, determining a 3D face model from the first correlation models according to the second relative distance.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202010717430.0A | 2020-07-23 | 2020-07-23 | VR conference control system and method based on face modeling and expression tracking |
Publications (1)

| Publication Number | Publication Date |
| --- | --- |
| CN111881807A (en) | 2020-11-03 |
Family
ID=73154875
Family Applications (1)

| Application Number | Publication | Status | Priority Date | Filing Date |
| --- | --- | --- | --- | --- |
| CN202010717430.0A | CN111881807A (en) | Pending | 2020-07-23 | 2020-07-23 |
Country Status (1)

| Country | Link |
| --- | --- |
| CN | CN111881807A (en) |
Cited By (2)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN114339134A | 2022-03-15 | 2022-04-12 | 中瑞云软件(深圳)有限公司 | Remote online conference system based on Internet and VR technology |
| CN114615455A | 2022-01-24 | 2022-06-10 | 北京师范大学 | Teleconference processing method, teleconference processing device, teleconference system, and storage medium |
Legal Events

| Date | Code | Title | Description |
| --- | --- | --- | --- |
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |