GB2600683A - A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device - Google Patents

A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Info

Publication number
GB2600683A
Authority
GB
United Kingdom
Prior art keywords
face
augmented reality
communication device
user
video call
Prior art date
Legal status
Withdrawn
Application number
GB2017094.0A
Other versions
GB202017094D0 (en)
Inventor
Gautam Utkarsh
Saney Kavita
Current Assignee
Mercedes Benz Group AG
Original Assignee
Daimler AG
Priority date
Filing date
Publication date
Application filed by Daimler AG
Priority to GB2017094.0A
Publication of GB202017094D0
Publication of GB2600683A
Legal status: Withdrawn

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 - Manipulating 3D models or images for computer graphics
    • G06T 19/006 - Mixed reality
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/14 - Systems for two-way working
    • H04N 7/141 - Systems for two-way working between two video terminals, e.g. videophone
    • H04N 7/147 - Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174 - Facial expression recognition
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/14 - Systems for two-way working
    • H04N 7/141 - Systems for two-way working between two video terminals, e.g. videophone
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20212 - Image combination
    • G06T 2207/20221 - Image fusion; Image merging
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30196 - Human being; Person
    • G06T 2207/30201 - Face
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/14 - Systems for two-way working
    • H04N 7/15 - Conference systems
    • H04N 7/157 - Conference systems defining a virtual conference space and using avatars or agents

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Making an augmented reality (AR) video call in a motor vehicle comprising: capturing an image of a user’s face while the user is not wearing an AR display; capturing a current image of the user’s face while the user is wearing an AR display; estimating the part of the face that is covered by the AR display based on the captured image; fusing the current image with the estimated part of the face; transmitting the fused image. The AR display may be a head mounted display (HMD), headset or glasses. Captured images may be three dimensional, obtained using a 3D camera such as a stereo camera or camera with a depth sensor. Facial expressions, movements or gestures of the covered face portion may be detected and recreated using machine learning. Artificial intelligence may interpret, in real time, expressions of the upper, covered face regions, such as eyes and eyebrows, based on movements of lower face areas, such as lips, mouth or cheeks. Recreated expressions may be rendered into the captured 3D images.

Description

A METHOD FOR PERFORMING AN AUGMENTED REALITY VIDEO CALL IN A MOTOR VEHICLE BY A COMMUNICATION DEVICE, AS WELL AS A CORRESPONDING COMMUNICATION DEVICE
FIELD OF THE INVENTION
[0001] The invention relates to the field of automobiles. More specifically, the invention relates to a method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device.
BACKGROUND INFORMATION
[0002] The face of a user, as it appears to other contacts during an augmented reality video call or virtual reality video call according to the state of the art, is either a three-dimensional avatar or, in the case that actual video of the user is transmitted, a face partially covered by the augmented reality/virtual reality headset.
[0003] From the online article "Facebook animates photo-realistic avatars to mimic VR users' faces" it is known to embed sensors in the virtual reality headset in order to capture the face inside the virtual reality headset. Furthermore, it is known to implement photorealistic avatars.
[0004] According to the state of the art there is therefore a dependency on the virtual reality/augmented reality headset.
SUMMARY OF THE INVENTION
[0005] It is an object of the invention to provide a method as well as a communication device by which a more realistic augmented reality video call may be realized.
[0006] This object is solved by a method as well as a communication device according to the independent claims. Advantageous embodiments are presented in the dependent claims.
[0007] One aspect of the invention relates to a method for performing an augmented reality video call in a motor vehicle by a communication device. An image of a face of a user of the communication device is captured by a capturing device before the augmented reality video call and without the user wearing an augmented reality display device. A current image of the face of the user is captured whilst the user is wearing the augmented reality display device, wherein the display device is covering the face at least partially. The at least partially covered part of the face is estimated by an electronic computing device of the communication device depending on the image captured before the augmented reality video call. The current image is fused with the estimated part into an augmented image by the electronic computing device. The augmented image is transmitted to a further communication device for performing the augmented reality video call.
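By way of a purely illustrative, non-limiting example, the following Python sketch shows one way the estimating, fusing and transmitting steps could be wired together once the two images have been captured. The function names, the boolean headset mask and the placeholder estimation strategy are assumptions made for this sketch and are not prescribed by the method.

```python
# Purely illustrative sketch (not part of the claims): the claimed steps
# expressed as functions. All names and the boolean headset mask are assumptions.
import numpy as np


def estimate_covered_part(reference_face: np.ndarray,
                          headset_mask: np.ndarray) -> np.ndarray:
    """Estimate the pixels hidden by the AR display device.

    Placeholder: take the corresponding pixels from the reference image that
    was captured before the call; a learned model could be used instead.
    """
    estimate = np.zeros_like(reference_face)
    estimate[headset_mask] = reference_face[headset_mask]
    return estimate


def fuse(current_frame: np.ndarray,
         estimated_part: np.ndarray,
         headset_mask: np.ndarray) -> np.ndarray:
    """Fuse the live frame with the estimated covered part into an augmented image."""
    augmented = current_frame.copy()
    augmented[headset_mask] = estimated_part[headset_mask]
    return augmented


def process_call_frame(reference_face, current_frame, headset_mask, transmit):
    """One frame of the call: estimate, fuse and hand the result to the sender."""
    estimated = estimate_covered_part(reference_face, headset_mask)
    augmented = fuse(current_frame, estimated, headset_mask)
    transmit(augmented)  # transmission to the further communication device
    return augmented


if __name__ == "__main__":
    h, w = 480, 640
    reference = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)  # face without headset
    current = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)    # face with headset
    mask = np.zeros((h, w), dtype=bool)
    mask[100:220, 180:460] = True                                      # assumed headset region
    process_call_frame(reference, current, mask, transmit=lambda img: None)
```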
[0008] The method provides a solution for presenting the person's actual face in the augmented reality video call, even when the user is wearing the headset. The communication device uses an external camera, which removes any dependency on the augmented reality headset and ensures that multiple users with different headsets can see each other in a more realistic way.
[0009] In an embodiment the augmented reality display device is provided as an augmented reality headset. The augmented reality headset may be provided as augmented reality glasses.
[0010] In an embodiment the capturing device is provided as a three-dimensional camera for capturing three-dimensional images.
[0011] In another embodiment a facial gesture of the at least partially covered face is detected by the electronic computing device, and the partially covered part is additionally estimated depending on the detected facial gesture.
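By way of illustration only, the following sketch derives simple gesture cues from the uncovered lower half of the face. The landmark layout and indices are hypothetical; in practice they would come from whatever face-landmark detector the communication device uses.

```python
# Illustrative sketch (assumed landmark indices and detector): derive simple
# facial-gesture features from the uncovered lower half of the face.
import numpy as np


def lower_face_gesture_features(landmarks: np.ndarray) -> dict:
    """Compute coarse gesture cues (mouth opening, corner raise, cheek spread)
    from an (N, 2) array of lower-face landmark coordinates in pixels.

    Assumed layout: 0/1 = left/right mouth corner, 2/3 = upper/lower lip centre,
    4/5 = left/right cheek points. Real detectors use different indexings.
    """
    left_corner, right_corner = landmarks[0], landmarks[1]
    upper_lip, lower_lip = landmarks[2], landmarks[3]
    left_cheek, right_cheek = landmarks[4], landmarks[5]

    mouth_width = np.linalg.norm(right_corner - left_corner)
    mouth_open = np.linalg.norm(lower_lip - upper_lip) / (mouth_width + 1e-6)
    # Smiles typically raise the mouth corners (smaller image y) relative to the lip centre.
    corner_raise = ((upper_lip[1] - left_corner[1]) +
                    (upper_lip[1] - right_corner[1])) / (2 * mouth_width + 1e-6)
    cheek_spread = np.linalg.norm(right_cheek - left_cheek) / (mouth_width + 1e-6)

    return {"mouth_open": float(mouth_open),
            "corner_raise": float(corner_raise),
            "cheek_spread": float(cheek_spread)}


if __name__ == "__main__":
    demo = np.array([[200, 300], [280, 300], [240, 290], [240, 320],
                     [170, 260], [310, 260]], dtype=float)
    print(lower_face_gesture_features(demo))
```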
[0012] In another embodiment the electronic computing device uses a machine learning technique for performing the estimation.
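As one purely illustrative possibility for such a machine learning technique, the sketch below trains a tiny encoder-decoder to fill in a masked face region from the visible pixels, using calibration frames of the bare face with synthetically masked bands. The architecture, tensor sizes and the synthetic-mask training trick are assumptions, not the patented model.

```python
# Illustrative sketch of one possible machine learning technique for the
# estimation (assumed architecture and sizes): learn to reconstruct a masked
# face region from the visible pixels of calibration frames.
import torch
import torch.nn as nn


class MaskedFaceInpainter(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),     # RGB + mask channel in
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),  # RGB estimate out
        )

    def forward(self, rgb, mask):
        x = torch.cat([rgb * (1 - mask), mask], dim=1)     # hide the masked pixels
        return self.net(x)


def training_step(model, opt, rgb_batch):
    """One step: mask a random horizontal band (a stand-in for the headset),
    ask the model to reconstruct it, and penalise the error inside the mask."""
    b, _, h, w = rgb_batch.shape
    mask = torch.zeros(b, 1, h, w)
    top = torch.randint(0, h // 2, (1,)).item()
    mask[:, :, top:top + h // 4, :] = 1.0
    pred = model(rgb_batch, mask)
    loss = ((pred - rgb_batch) ** 2 * mask).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    model = MaskedFaceInpainter()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    fake_faces = torch.rand(8, 3, 64, 64)                  # synthetic calibration crops
    for step in range(5):
        print(training_step(model, opt, fake_faces))
```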
[0013] Another aspect of the invention relates to a communication device for performing an augmented reality video call in a motor vehicle, the communication device comprising at least one capturing device and one electronic computing device, wherein the communication device is configured to perform a method according to the preceding aspect. In particular the method is performed by the communication device.
[0014] Advantageous forms of the configuration of the method are to be regarded as advantageous forms of configuration of the communication device. The communication device therefore comprises means for performing the method.
[0015] Further advantages, features, and details of the invention derive from the following description of a preferred embodiment as well as from the drawing. The features and feature combinations previously mentioned in the description as well as the features and feature combinations mentioned in the following description of the figure and/or shown in the figure alone can be employed not only in the respectively indicated combination but also in any other combination or taken alone without leaving the scope of the invention.
BRIEF DESCRIPTION OF THE DRAWING
[0016] The novel features and characteristics of the disclosure are set forth in the independent claims. The accompanying drawing, which is incorporated in and constitutes part of this disclosure, illustrates an exemplary embodiment and together with the description, serves to explain the disclosed principles. In the figure, the same reference signs are used throughout the figure to refer to identical features and components. Some embodiments of the system and/or methods in accordance with embodiments of the present subject matter are now described below, by way of example only, and with reference to the accompanying figure.
[0017] Fig. 1 shows a schematic perspective view of an embodiment of the communication device.
[0018] In the figure same elements or elements having the same function are indicated by the same reference signs.
DETAILED DESCRIPTION
[0019] In the present document, the word "exemplary" is used to mean "serving as an example, instance, or illustration." Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
[0020] While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawing and will be described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the scope of the disclosure.
[0021] The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, so that a setup, device or method that comprises a list of components or steps does not include only those components or steps but may include other components or steps not expressly listed or inherent to such setup, device or method. In other words, one or more elements in a system or apparatus preceded by "comprises" or "comprise" do not, without more constraints, preclude the existence of other or additional elements in the system or method.
[0022] In the following detailed description of the embodiment of the disclosure, reference is made to the accompanying drawing that forms part hereof, and in which is shown by way of illustration a specific embodiment in which the disclosure may be practiced. This embodiment is described in sufficient detail to enable those skilled in the art to practice the disclosure, and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the present disclosure. The following description is, therefore, not to be taken in a limiting sense.
[0023] Fig. 1 shows a schematic view of an embodiment of a communication device 10. The communication device 10 is configured to perform an augmented reality video call 12 in a motor vehicle 14. In the example, the interior of the motor vehicle 14 is shown. The communication device 10 comprises at least one capturing device 16 and one electronic computing device 18. Furthermore, the communication device 10 comprises an augmented reality display device 20, which is shown as a headset or any equivalent thereof. In particular, the augmented reality display device 20 is provided as augmented reality glasses. In another embodiment, the augmented reality video call 12 may also be regarded as a virtual reality video call.
[0024] According to an embodiment, for performing the augmented reality video call 12 in the motor vehicle 14 by the communication device 10, an image of a face 22 of a user 24 of the communication device 10, without the user 24 wearing the augmented reality display device 20, is captured by the capturing device 16, in particular before the augmented reality video call 12. A current image of the face 22 of the user 24 is captured whilst the user 24 is wearing the augmented reality display device 20, wherein the augmented reality display device 20 is covering the face 22 at least partially. The at least partially covered part of the face 22 is estimated depending on the captured image by the electronic computing device 18. The current image is fused with the estimated part into an augmented image 26 by the electronic computing device 18, and the augmented image 26 is transmitted to a further communication device 28 for performing the augmented reality video call 12.
[0025] The capturing device 16 may be provided as a three-dimensional camera for capturing three-dimensional (3D) images. According to an embodiment a gesture of the covered face 22 is detected by the electronic computing device 18 and the partially covered part is additionally estimated depending on the captured facial gesture.
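For illustration, the sketch below shows one common way the output of such a three-dimensional camera could be turned into a 3D representation of the face: back-projecting a depth map into a point cloud with the pinhole camera model. The intrinsic parameters used here are made up for the example.

```python
# Illustrative sketch (assumed pinhole intrinsics): back-project a depth map
# from a 3D camera (stereo or depth sensor) into a point cloud of the face.
import numpy as np


def depth_to_point_cloud(depth_m: np.ndarray, fx: float, fy: float,
                         cx: float, cy: float) -> np.ndarray:
    """Convert an (H, W) depth map in metres into an (H*W, 3) point cloud
    using the standard pinhole camera model."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


if __name__ == "__main__":
    # Hypothetical 640x480 sensor with made-up intrinsics.
    depth = np.full((480, 640), 0.6)   # face roughly 0.6 m from the headrest camera
    cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
    print(cloud.shape)                 # (307200, 3)
```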
[0026] The embodiment shown in Fig. 1 provides a method for transmitting the current face 22 of the user 24 in the augmented reality video call 12, even when the user 24 is wearing the augmented reality display device 20. According to another embodiment, the electronic computing device 18 uses a machine learning technique for performing the estimation. Whereas three-dimensional avatars may be used to depict the facial expression, the presented embodiment uses machine learning to recreate the right facial expressions using the movements of the other parts of the face 22 which are not covered by the augmented reality display device 20.
[0027] According to an embodiment a three-dimensional camera, for example a stereo camera or a camera with a depth sensor, is embedded in the back of the headrest of the front seat of the motor vehicle 14.
[0028] In a first step, which may be called the calibration process, multiple three-dimensional images of the face 22 of the user 24, without the user 24 wearing the augmented reality display device 20, are captured by the capturing device 16. This first step is performed before the augmented reality video call 12.
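A minimal sketch of such a calibration capture loop is given below, assuming an ordinary OpenCV-accessible camera. The camera index, frame count and file layout are assumptions, and a real system would capture three-dimensional frames rather than the plain 2D images saved here.

```python
# Illustrative calibration sketch (camera index, frame count and file layout are
# assumptions): capture several reference images of the bare face before the call.
import os
import cv2


def capture_calibration_frames(num_frames: int = 20,
                               camera_index: int = 0,
                               out_dir: str = "calibration_frames") -> int:
    """Grab a handful of frames of the user's face without the headset."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(camera_index)
    saved = 0
    try:
        while saved < num_frames:
            ok, frame = cap.read()
            if not ok:
                break
            cv2.imwrite(os.path.join(out_dir, f"face_{saved:03d}.png"), frame)
            saved += 1
    finally:
        cap.release()
    return saved


if __name__ == "__main__":
    print(f"captured {capture_calibration_frames()} calibration frames")
```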
[0029] In the second step, during the augmented reality video call 12 and while the user 24 is wearing the augmented reality display device 20, an algorithm of the electronic computing device 18 aims to remove the headset from the displayed video of the user 24 and to present only the natural face 22 of the user 24 during the augmented reality video call 12.
[0030] The algorithm therefore overlays the three-dimensional images of the face 22 of the user 24 without the headset, taken in the calibration stage, over the current face 22 of the user 24 in the current video with the headset, and renders a final video with a natural image of the face 22 of the user 24 with the headset removed.
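The following sketch illustrates one simple way such an overlay could be composited, assuming the calibration image is already aligned to the live frame and a headset mask is available from some detector. The feathered alpha blend (using OpenCV's Gaussian blur on the mask) is an assumption chosen to hide the seam and is not mandated by the method.

```python
# Illustrative overlay sketch (assumed mask source and pre-aligned images):
# blend the calibration face into the live frame over the headset region.
import numpy as np
import cv2


def overlay_calibration_face(live_frame: np.ndarray,
                             calibration_face: np.ndarray,
                             headset_mask: np.ndarray,
                             feather_px: int = 15) -> np.ndarray:
    """Replace the masked (headset) region of the live frame with the
    corresponding region of the aligned calibration image."""
    # Soft alpha matte in [0, 1]: 1 inside the headset region, fading to 0 outside.
    alpha = headset_mask.astype(np.float32)
    k = 2 * feather_px + 1
    alpha = cv2.GaussianBlur(alpha, (k, k), 0)
    alpha = alpha[..., None]                       # broadcast over colour channels
    blended = alpha * calibration_face.astype(np.float32) + \
              (1.0 - alpha) * live_frame.astype(np.float32)
    return blended.astype(np.uint8)


if __name__ == "__main__":
    h, w = 480, 640
    live = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
    calib = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[100:220, 180:460] = 1                     # assumed headset bounding region
    out = overlay_calibration_face(live, calib, mask)
    print(out.shape, out.dtype)
```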
[0031] According to an embodiment, in order to give natural expressions to the digitally rendered face 22, the communication device 10 utilizes artificial intelligence and machine learning based methods to interpret, in real time, the facial expressions of the upper part of the face 22, for example the eyes and eyebrows, using the movements of the lower parts of the face 22, for example the lips, mouth or cheeks.
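One illustrative way to realize such an interpretation in real time is sketched below: a small regressor maps lower-face features to upper-face expression parameters, with exponential smoothing across frames to avoid jitter. The parameter names, feature layout and untrained demo weights are assumptions for the example only.

```python
# Illustrative real-time inference sketch (assumed feature layout, parameter
# names and demo weights): map lower-face movements to upper-face expression
# parameters frame by frame, with simple temporal smoothing.
import torch
import torch.nn as nn

UPPER_FACE_PARAMS = ["eye_open_l", "eye_open_r", "brow_raise_l",
                     "brow_raise_r", "brow_furrow", "squint"]


class UpperFaceFromLowerFace(nn.Module):
    """Small regressor from lower-face features (lips, mouth, cheeks) to
    upper-face expression parameters (eyes, eyebrows)."""
    def __init__(self, n_in: int = 8, n_out: int = len(UPPER_FACE_PARAMS)):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_in, 32), nn.ReLU(),
                                 nn.Linear(32, n_out), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)


class SmoothedPredictor:
    """Exponential moving average over per-frame predictions to avoid jitter."""
    def __init__(self, model: nn.Module, alpha: float = 0.3):
        self.model, self.alpha, self.state = model.eval(), alpha, None

    @torch.no_grad()
    def __call__(self, lower_face_features: torch.Tensor) -> dict:
        pred = self.model(lower_face_features.unsqueeze(0)).squeeze(0)
        self.state = pred if self.state is None else \
            self.alpha * pred + (1 - self.alpha) * self.state
        return dict(zip(UPPER_FACE_PARAMS, self.state.tolist()))


if __name__ == "__main__":
    predictor = SmoothedPredictor(UpperFaceFromLowerFace())  # untrained demo weights
    for _ in range(5):                                       # pretend frames
        print(predictor(torch.randn(8)))
```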
[0032] The capturing device 16, for example the camera, records the user 24 and processes the recording during the video call 12. The algorithm digitally removes the headset and interprets the facial expressions of the upper part of the face 22, which is covered by the headset. It then renders these expressions into the three-dimensional images taken during the calibration stage and finally compiles everything into a final image, which may be called the augmented image 26 and which is closer to the natural face 22 of the user 24.
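Tying the stages together, the following end-to-end sketch processes a stream of frames with stub components standing in for headset detection, expression inference and rendering. All names, shapes and the fixed mask region are placeholders for illustration only.

```python
# End-to-end per-frame sketch (every component is a stub standing in for the
# stages described above; names and shapes are assumptions): record, detect the
# headset region, infer the hidden expression, render, composite, transmit.
import numpy as np


def detect_headset_mask(frame):            # stub: fixed region instead of detection
    mask = np.zeros(frame.shape[:2], dtype=bool)
    mask[100:220, 180:460] = True
    return mask


def infer_upper_expression(frame, mask):   # stub for the learned expression model
    return {"eye_open": 0.8, "brow_raise": 0.1}


def render_expression(calibration_face, expression):  # stub renderer
    return calibration_face                 # a real renderer would deform the 3D face


def compose(frame, rendered_face, mask):
    out = frame.copy()
    out[mask] = rendered_face[mask]
    return out


def process_stream(frames, calibration_face, transmit):
    for frame in frames:
        mask = detect_headset_mask(frame)
        expr = infer_upper_expression(frame, mask)
        rendered = render_expression(calibration_face, expr)
        transmit(compose(frame, rendered, mask))


if __name__ == "__main__":
    calib = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
    frames = (np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8) for _ in range(3))
    process_stream(frames, calib, transmit=lambda img: print("sent", img.shape))
```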
Reference Signs
10 communication device
12 video call
14 motor vehicle
16 capturing device
18 electronic computing device
20 augmented reality display device
22 face
24 user
26 augmented image
28 further communication device

Claims (6)

  1. A method for performing an augmented reality video call (12) in a motor vehicle (14) by a communication device (10), the method comprising the steps:
     - capturing an image of a face (22) of a user (24) of the communication device (10), without the user (24) wearing an augmented reality display device (20) of the communication device (10), by a capturing device (16);
     - capturing a current image of the face (22) of the user (24) whilst the user (24) is wearing the augmented reality display device (20) during the video call (12), wherein the augmented reality display device (20) is covering the face (22) at least partially, by the capturing device (16);
     - estimating the at least partially covered part of the face (22) depending on the captured image by an electronic computing device (18) of the communication device (10);
     - fusing the current image with the estimated part to an augmented image (26) by the electronic computing device (18); and
     - transmitting the augmented image (26) to a further communication device (28) for performing the augmented reality video call (12).
  2. The method according to claim 1, characterized in that the augmented reality display device (20) is provided as an augmented reality headset.
  3. The method according to claim 1 or 2, characterized in that the capturing device (16) is provided as a three-dimensional camera for capturing three-dimensional images.
  4. The method according to any one of claims 1 to 3, characterized in that a gesture of the covered face (22) of the user (24) is detected by the electronic computing device (18) and additionally the partially covered part of the face (22) is estimated depending on the captured facial gesture.
  5. The method according to any one of claims 1 to 4, characterized in that the electronic computing device (18) uses a machine learning technique for performing the estimation.
  6. A communication device (10) for performing an augmented reality video call (12) in a motor vehicle (14), the communication device (10) comprising at least one capturing device (16), one augmented reality display device (20), and one electronic computing device (18), wherein the communication device (10) is configured to perform a method according to any one of claims 1 to 5.
Application GB2017094.0A, filed 2020-10-28 (priority date 2020-10-28), published as GB2600683A (en); legal status: Withdrawn. Title: A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device.

Priority Applications (1)

Application Number: GB2017094.0A (published as GB2600683A (en))
Priority Date: 2020-10-28
Filing Date: 2020-10-28
Title: A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Applications Claiming Priority (1)

Application Number: GB2017094.0A (published as GB2600683A (en))
Priority Date: 2020-10-28
Filing Date: 2020-10-28
Title: A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Publications (2)

Publication Number Publication Date
GB202017094D0 GB202017094D0 (en) 2020-12-09
GB2600683A 2022-05-11

Family

ID=73727150

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2017094.0A Withdrawn GB2600683A (en) 2020-10-28 2020-10-28 A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Country Status (1)

Country Link
GB (1) GB2600683A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160217621A1 (en) * 2015-01-28 2016-07-28 Sony Computer Entertainment Europe Limited Image processing
EP3096208A1 (en) * 2015-05-18 2016-11-23 Samsung Electronics Co., Ltd. Image processing for head mounted display devices
US20180101227A1 (en) * 2016-10-06 2018-04-12 Google Inc. Headset removal in virtual, augmented, and mixed reality using an eye gaze database

Also Published As

Publication number Publication date
GB202017094D0 (en) 2020-12-09

Similar Documents

Publication Publication Date Title
EP3794851B1 (en) Shared environment for vehicle occupant and remote user
CN107004296B (en) Method and system for reconstructing an occlusion face of a virtual reality environment
US11887234B2 (en) Avatar display device, avatar generating device, and program
CN109952759B (en) Improved method and system for video conferencing with HMD
CN102647606B (en) Stereoscopic image processor, stereoscopic image interaction system and stereoscopic image display method
CN106066701B (en) A kind of AR and VR data processing equipment and method
US11170521B1 (en) Position estimation based on eye gaze
EP3547672A1 (en) Data processing method, device, and apparatus
US20030202686A1 (en) Method and apparatus for generating models of individuals
CN110969658B (en) Localization and mapping using images from multiple devices
EP3541068A1 (en) Head-mountable apparatus and methods
CN108762508A (en) A kind of human body and virtual thermal system system and method for experiencing cabin based on VR
CN113366491A (en) Eyeball tracking method, device and storage medium
JP6978289B2 (en) Image generator, head-mounted display, image generation system, image generation method, and program
CN107016730A (en) The device that a kind of virtual reality is merged with real scene
CN114356072A (en) System and method for detecting spatial orientation of wearable device
US7006102B2 (en) Method and apparatus for generating models of individuals
CN113906736A (en) Video distribution system, video distribution method, and display terminal
CN109765693A (en) Three-dimensional imaging display system and the display methods for showing stereopsis
CN106981100A (en) The device that a kind of virtual reality is merged with real scene
GB2600683A (en) A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device
JP2014071871A (en) Video communication system and video communication method
US20190028690A1 (en) Detection system
CN113170075A (en) Information processing apparatus, information processing method, and program
US20230230333A1 (en) 3d virtual-reality display device, head-mounted display, and 3d virtual-reality display method

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)