GB2600683A - A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device - Google Patents

A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Info

Publication number
GB2600683A
Authority
GB
United Kingdom
Prior art keywords
face
augmented reality
communication device
user
video call
Prior art date
Legal status
Withdrawn
Application number
GB2017094.0A
Other versions
GB202017094D0 (en)
Inventor
Gautam Utkarsh
Saney Kavita
Current Assignee
Mercedes Benz Group AG
Original Assignee
Daimler AG
Priority date
Filing date
Publication date
Application filed by Daimler AG
Priority to GB2017094.0A
Publication of GB202017094D0
Publication of GB2600683A
Legal status: Withdrawn

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 - Manipulating 3D models or images for computer graphics
    • G06T 19/006 - Mixed reality
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/14 - Systems for two-way working
    • H04N 7/141 - Systems for two-way working between two video terminals, e.g. videophone
    • H04N 7/147 - Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174 - Facial expression recognition
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/14 - Systems for two-way working
    • H04N 7/141 - Systems for two-way working between two video terminals, e.g. videophone
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20212 - Image combination
    • G06T 2207/20221 - Image fusion; Image merging
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30196 - Human being; Person
    • G06T 2207/30201 - Face
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/14 - Systems for two-way working
    • H04N 7/15 - Conference systems
    • H04N 7/157 - Conference systems defining a virtual conference space and using avatars or agents

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Making an augmented reality (AR) video call in a motor vehicle comprising: capturing an image of a user’s face while the user is not wearing an AR display; capturing a current image of the user’s face while the user is wearing an AR display; estimating the part of the face that is covered by the AR display based on the captured image; fusing the current image with the estimated part of the face; transmitting the fused image. The AR display may be a head mounted display (HMD), headset or glasses. Captured images may be three dimensional, obtained using a 3D camera such as a stereo camera or camera with a depth sensor. Facial expressions, movements or gestures of the covered face portion may be detected and recreated using machine learning. Artificial intelligence may interpret, in real time, expressions of the upper, covered face regions, such as eyes and eyebrows, based on movements of lower face areas, such as lips, mouth or cheeks. Recreated expressions may be rendered into the captured 3D images.

Description

A METHOD FOR PERFORMING AN AUGMENTED REALITY VIDEO CALL IN A MOTOR VEHICLE BY A COMMUNICATION DEVICE, AS WELL AS A CORRESPONDING COMMUNICATION DEVICE
FIELD OF THE INVENTION
[0001] The invention relates to the field of automobiles. More specifically, the invention relates to a method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device.
BACKGROUND INFORMATION
[0002] The face of a user, as it appears to other contacts during an augmented reality video call or virtual reality video call according to the state of the art, is either a three-dimensional avatar or, in the case that actual video of the user is transmitted, a face partially covered by the augmented reality/virtual reality headset.
[0003] From the online article "Facebook animates photo-realistic avatars to mimic VR users' faces" it is known to embed sensors in the virtual reality headset in order to capture the face inside the virtual reality headset. Furthermore, it is known to implement photorealistic avatars.
[0004] According to the state of the art there is therefore a dependency on the virtual reality/augmented reality headset.
SUMMARY OF THE INVENTION
[0005] It is an object of the invention to provide a method as well as a communication device by which a more realistic augmented reality video call may be realized.
[0006] This object is solved by a method as well as a communication device according to the independent claims. Advantageous embodiments are presented in the dependent claims.
[0007] One aspect of the invention relates to a method for performing an augmented reality video call in a motor vehicle by a communication device. An image of a face of a user of the communication device is captured by a capturing device before the augmented reality video call and without the user wearing an augmented reality display device. A current image of the face of the user is captured whilst the user is wearing the augmented reality display device, wherein the display device is covering the face at least partially. The at least partially covered part of the face is estimated by an electronic computing device of the communication device depending on the image captured before the augmented reality video call. The current image is fused with the estimated part into an augmented image by the electronic computing device. The augmented image is transmitted to a further communication device for performing the augmented reality video call.
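By way of a purely illustrative, non-limiting example, the following Python sketch shows one way the estimating, fusing and transmitting steps could be wired together once the two images have been captured. The function names, the boolean headset mask and the placeholder estimation strategy are assumptions made for this sketch and are not prescribed by the method.

```python
# Purely illustrative sketch (not part of the claims): the claimed steps
# expressed as functions. All names and the boolean headset mask are assumptions.
import numpy as np


def estimate_covered_part(reference_face: np.ndarray,
                          headset_mask: np.ndarray) -> np.ndarray:
    """Estimate the pixels hidden by the AR display device.

    Placeholder: take the corresponding pixels from the reference image that
    was captured before the call; a learned model could be used instead.
    """
    estimate = np.zeros_like(reference_face)
    estimate[headset_mask] = reference_face[headset_mask]
    return estimate


def fuse(current_frame: np.ndarray,
         estimated_part: np.ndarray,
         headset_mask: np.ndarray) -> np.ndarray:
    """Fuse the live frame with the estimated covered part into an augmented image."""
    augmented = current_frame.copy()
    augmented[headset_mask] = estimated_part[headset_mask]
    return augmented


def process_call_frame(reference_face, current_frame, headset_mask, transmit):
    """One frame of the call: estimate, fuse and hand the result to the sender."""
    estimated = estimate_covered_part(reference_face, headset_mask)
    augmented = fuse(current_frame, estimated, headset_mask)
    transmit(augmented)  # transmission to the further communication device
    return augmented


if __name__ == "__main__":
    h, w = 480, 640
    reference = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)  # face without headset
    current = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)    # face with headset
    mask = np.zeros((h, w), dtype=bool)
    mask[100:220, 180:460] = True                                      # assumed headset region
    process_call_frame(reference, current, mask, transmit=lambda img: None)
```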
[0008] The method provides a solution for presenting the person's actual face in the augmented reality video call, even when the user is wearing the headset. The communication device uses an external camera, which removes any dependency on the augmented reality headset and ensures that multiple users with different headsets can see each other in a more realistic way.
[0009] In an embodiment the augmented reality display device is provided as an augmented reality headset. The augmented reality headset may be provided as augmented reality glasses.
[0010] In an embodiment the capturing device is provided as a three-dimensional camera for capturing three-dimensional images.
[0011] In another embodiment a facial gesture of the at least partially covered face is detected by the electronic computing device, and the partially covered part is additionally estimated depending on the detected facial gesture.
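By way of illustration only, the following sketch derives simple gesture cues from the uncovered lower half of the face. The landmark layout and indices are hypothetical; in practice they would come from whatever face-landmark detector the communication device uses.

```python
# Illustrative sketch (assumed landmark indices and detector): derive simple
# facial-gesture features from the uncovered lower half of the face.
import numpy as np


def lower_face_gesture_features(landmarks: np.ndarray) -> dict:
    """Compute coarse gesture cues (mouth opening, corner raise, cheek spread)
    from an (N, 2) array of lower-face landmark coordinates in pixels.

    Assumed layout: 0/1 = left/right mouth corner, 2/3 = upper/lower lip centre,
    4/5 = left/right cheek points. Real detectors use different indexings.
    """
    left_corner, right_corner = landmarks[0], landmarks[1]
    upper_lip, lower_lip = landmarks[2], landmarks[3]
    left_cheek, right_cheek = landmarks[4], landmarks[5]

    mouth_width = np.linalg.norm(right_corner - left_corner)
    mouth_open = np.linalg.norm(lower_lip - upper_lip) / (mouth_width + 1e-6)
    # Smiles typically raise the mouth corners (smaller image y) relative to the lip centre.
    corner_raise = ((upper_lip[1] - left_corner[1]) +
                    (upper_lip[1] - right_corner[1])) / (2 * mouth_width + 1e-6)
    cheek_spread = np.linalg.norm(right_cheek - left_cheek) / (mouth_width + 1e-6)

    return {"mouth_open": float(mouth_open),
            "corner_raise": float(corner_raise),
            "cheek_spread": float(cheek_spread)}


if __name__ == "__main__":
    demo = np.array([[200, 300], [280, 300], [240, 290], [240, 320],
                     [170, 260], [310, 260]], dtype=float)
    print(lower_face_gesture_features(demo))
```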
[0012] In another embodiment the electronic computing device uses a machine learning technique for performing the estimation.
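As one purely illustrative possibility for such a machine learning technique, the sketch below trains a tiny encoder-decoder to fill in a masked face region from the visible pixels, using calibration frames of the bare face with synthetically masked bands. The architecture, tensor sizes and the synthetic-mask training trick are assumptions, not the patented model.

```python
# Illustrative sketch of one possible machine learning technique for the
# estimation (assumed architecture and sizes): learn to reconstruct a masked
# face region from the visible pixels of calibration frames.
import torch
import torch.nn as nn


class MaskedFaceInpainter(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),     # RGB + mask channel in
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),  # RGB estimate out
        )

    def forward(self, rgb, mask):
        x = torch.cat([rgb * (1 - mask), mask], dim=1)     # hide the masked pixels
        return self.net(x)


def training_step(model, opt, rgb_batch):
    """One step: mask a random horizontal band (a stand-in for the headset),
    ask the model to reconstruct it, and penalise the error inside the mask."""
    b, _, h, w = rgb_batch.shape
    mask = torch.zeros(b, 1, h, w)
    top = torch.randint(0, h // 2, (1,)).item()
    mask[:, :, top:top + h // 4, :] = 1.0
    pred = model(rgb_batch, mask)
    loss = ((pred - rgb_batch) ** 2 * mask).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    model = MaskedFaceInpainter()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    fake_faces = torch.rand(8, 3, 64, 64)                  # synthetic calibration crops
    for step in range(5):
        print(training_step(model, opt, fake_faces))
```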
[0013] Another aspect of the invention relates to a communication device for performing an augmented reality video call in a motor vehicle, the communication device comprising at least one capturing device and one electronic computing device, wherein the communication device is configured to perform a method according to the preceding aspect. In particular the method is performed by the communication device.
[0014] Advantageous forms of the configuration of the method are to be regarded as advantageous forms of configuration of the communication device. The communication device therefore comprises means for performing the method.
[0015] Further advantages, features, and details of the invention derive from the following description of a preferred embodiment as well as from the drawing. The features and feature combinations previously mentioned in the description as well as the features and feature combinations mentioned in the following description of the figure and/or shown in the figure alone can be employed not only in the respectively indicated combination but also in any other combination or taken alone without leaving the scope of the invention.
BRIEF DESCRIPTION OF THE DRAWING
[0016] The novel features and characteristics of the disclosure are set forth in the independent claims. The accompanying drawing, which is incorporated in and constitutes part of this disclosure, illustrates an exemplary embodiment and together with the description, serves to explain the disclosed principles. In the figure, the same reference signs are used throughout the figure to refer to identical features and components. Some embodiments of the system and/or methods in accordance with embodiments of the present subject matter are now described below, by way of example only, and with reference to the accompanying figure.
[0017] Fig. 1 shows a schematic perspective view of an embodiment of the communication device.
[0018] In the figure same elements or elements having the same function are indicated by the same reference signs.
DETAILED DESCRIPTION
[0019] In the present document, the word "exemplary" is used to mean "serving as an example, instance, or illustration." Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
[0020] While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawing and will be described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the scope of the disclosure.
[0021] The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, so that a setup, device or method that comprises a list of components or steps does not include only those components or steps but may include other components or steps not expressly listed or inherent to such setup, device or method. In other words, one or more elements in a system or apparatus preceded by "comprises" or "comprise" do not, without more constraints, preclude the existence of other or additional elements in the system or method.
[0022] In the following detailed description of the embodiment of the disclosure, reference is made to the accompanying drawing that forms part hereof, and in which is shown by way of illustration a specific embodiment in which the disclosure may be practiced. This embodiment is described in sufficient detail to enable those skilled in the art to practice the disclosure, and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the present disclosure. The following description is, therefore, not to be taken in a limiting sense.
[0023] Fig. 1 shows a schematic view of an embodiment of a communication device 10. The communication device 10 is configured to perform an augmented reality video call 12 in a motor vehicle 14. In the example, the interior of the motor vehicle 14 is shown. The communication device 10 comprises at least one capturing device 16 and one electronic computing device 18. Furthermore, the communication device 10 comprises an augmented reality display device 20, which is shown as a headset or any equivalent thereof. In particular, the augmented reality display device 20 is provided as augmented reality glasses. In another embodiment, the augmented reality video call 12 may also be regarded as a virtual reality video call.
[0024] According to an embodiment, for performing the augmented reality video call 12 in the motor vehicle 14 by the communication device 10, an image of a face 22 of a user 24 of the communication device 10, without the user 24 wearing the augmented reality display device 20, is captured by the capturing device 16, in particular before the augmented reality video call 12. A current image of the face 22 of the user 24 is captured whilst the user 24 is wearing the augmented reality display device 20, wherein the augmented reality display device 20 is covering the face 22 at least partially. The at least partially covered part of the face 22 is estimated depending on the captured image by the electronic computing device 18. The current image is fused with the estimated part into an augmented image 26 by the electronic computing device 18, and the augmented image 26 is transmitted to a further communication device 28 for performing the augmented reality video call 12.
[0025] The capturing device 16 may be provided as a three-dimensional camera for capturing three-dimensional (3D) images. According to an embodiment a gesture of the covered face 22 is detected by the electronic computing device 18 and the partially covered part is additionally estimated depending on the captured facial gesture.
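For illustration, the sketch below shows one common way the output of such a three-dimensional camera could be turned into a 3D representation of the face: back-projecting a depth map into a point cloud with the pinhole camera model. The intrinsic parameters used here are made up for the example.

```python
# Illustrative sketch (assumed pinhole intrinsics): back-project a depth map
# from a 3D camera (stereo or depth sensor) into a point cloud of the face.
import numpy as np


def depth_to_point_cloud(depth_m: np.ndarray, fx: float, fy: float,
                         cx: float, cy: float) -> np.ndarray:
    """Convert an (H, W) depth map in metres into an (H*W, 3) point cloud
    using the standard pinhole camera model."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


if __name__ == "__main__":
    # Hypothetical 640x480 sensor with made-up intrinsics.
    depth = np.full((480, 640), 0.6)   # face roughly 0.6 m from the headrest camera
    cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
    print(cloud.shape)                 # (307200, 3)
```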
[0026] The embodiment shown in Fig. 1 provides a method for transmitting the current face 22 of the user 24 in the augmented reality video call 12, even when the user 24 is wearing the augmented reality display device 20. According to another embodiment, the electronic computing device 18 uses a machine learning technique for performing the estimation. Whereas three-dimensional avatars may be used to depict the facial expression, the presented embodiment uses machine learning to recreate the right facial expressions using the movements of the other parts of the face 22 which are not covered by the augmented reality display device 20.
[0027] According to an embodiment a three-dimensional camera, for example a stereo camera or a camera with a depth sensor, is embedded in the back of the headrest of the front seat of the motor vehicle 14.
[0028] In a first step, which may be called the calibration process, multiple three-dimensional images of the face 22 of the user 24, without the user 24 wearing the augmented reality display device 20, are captured by the capturing device 16. This first step is performed before the augmented reality video call 12.
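A minimal sketch of such a calibration capture loop is given below, assuming an ordinary OpenCV-accessible camera. The camera index, frame count and file layout are assumptions, and a real system would capture three-dimensional frames rather than the plain 2D images saved here.

```python
# Illustrative calibration sketch (camera index, frame count and file layout are
# assumptions): capture several reference images of the bare face before the call.
import os
import cv2


def capture_calibration_frames(num_frames: int = 20,
                               camera_index: int = 0,
                               out_dir: str = "calibration_frames") -> int:
    """Grab a handful of frames of the user's face without the headset."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(camera_index)
    saved = 0
    try:
        while saved < num_frames:
            ok, frame = cap.read()
            if not ok:
                break
            cv2.imwrite(os.path.join(out_dir, f"face_{saved:03d}.png"), frame)
            saved += 1
    finally:
        cap.release()
    return saved


if __name__ == "__main__":
    print(f"captured {capture_calibration_frames()} calibration frames")
```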
[0029] In the second step, during the augmented reality video call 12 and while the user 24 is wearing the augmented reality display device 20, an algorithm of the electronic computing device 18 aims to remove the headset from the displayed video of the user 24 and to present only the natural face 22 of the user 24 during the augmented reality video call 12.
[0030] The algorithm therefore overlays the three-dimensional images of the face 22 of the user 24 without the headset, taken in the calibration stage, over the current face 22 of the user 24 in the current video with the headset, and renders a final video with a natural image of the face 22 of the user 24 with the headset removed.
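The following sketch illustrates one simple way such an overlay could be composited, assuming the calibration image is already aligned to the live frame and a headset mask is available from some detector. The feathered alpha blend (using OpenCV's Gaussian blur on the mask) is an assumption chosen to hide the seam and is not mandated by the method.

```python
# Illustrative overlay sketch (assumed mask source and pre-aligned images):
# blend the calibration face into the live frame over the headset region.
import numpy as np
import cv2


def overlay_calibration_face(live_frame: np.ndarray,
                             calibration_face: np.ndarray,
                             headset_mask: np.ndarray,
                             feather_px: int = 15) -> np.ndarray:
    """Replace the masked (headset) region of the live frame with the
    corresponding region of the aligned calibration image."""
    # Soft alpha matte in [0, 1]: 1 inside the headset region, fading to 0 outside.
    alpha = headset_mask.astype(np.float32)
    k = 2 * feather_px + 1
    alpha = cv2.GaussianBlur(alpha, (k, k), 0)
    alpha = alpha[..., None]                       # broadcast over colour channels
    blended = alpha * calibration_face.astype(np.float32) + \
              (1.0 - alpha) * live_frame.astype(np.float32)
    return blended.astype(np.uint8)


if __name__ == "__main__":
    h, w = 480, 640
    live = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
    calib = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[100:220, 180:460] = 1                     # assumed headset bounding region
    out = overlay_calibration_face(live, calib, mask)
    print(out.shape, out.dtype)
```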
[0031] According to an embodiment, in order to give natural expressions to the digitally rendered face 22, the communication device 10 utilizes artificial intelligence and machine learning based methods to interpret, in real time, the facial expressions of the upper part of the face 22, for example the eyes and eyebrows, using the movements of the lower parts of the face 22, for example the lips, mouth or cheeks.
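One illustrative way to realize such an interpretation in real time is sketched below: a small regressor maps lower-face features to upper-face expression parameters, with exponential smoothing across frames to avoid jitter. The parameter names, feature layout and untrained demo weights are assumptions for the example only.

```python
# Illustrative real-time inference sketch (assumed feature layout, parameter
# names and demo weights): map lower-face movements to upper-face expression
# parameters frame by frame, with simple temporal smoothing.
import torch
import torch.nn as nn

UPPER_FACE_PARAMS = ["eye_open_l", "eye_open_r", "brow_raise_l",
                     "brow_raise_r", "brow_furrow", "squint"]


class UpperFaceFromLowerFace(nn.Module):
    """Small regressor from lower-face features (lips, mouth, cheeks) to
    upper-face expression parameters (eyes, eyebrows)."""
    def __init__(self, n_in: int = 8, n_out: int = len(UPPER_FACE_PARAMS)):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_in, 32), nn.ReLU(),
                                 nn.Linear(32, n_out), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)


class SmoothedPredictor:
    """Exponential moving average over per-frame predictions to avoid jitter."""
    def __init__(self, model: nn.Module, alpha: float = 0.3):
        self.model, self.alpha, self.state = model.eval(), alpha, None

    @torch.no_grad()
    def __call__(self, lower_face_features: torch.Tensor) -> dict:
        pred = self.model(lower_face_features.unsqueeze(0)).squeeze(0)
        self.state = pred if self.state is None else \
            self.alpha * pred + (1 - self.alpha) * self.state
        return dict(zip(UPPER_FACE_PARAMS, self.state.tolist()))


if __name__ == "__main__":
    predictor = SmoothedPredictor(UpperFaceFromLowerFace())  # untrained demo weights
    for _ in range(5):                                       # pretend frames
        print(predictor(torch.randn(8)))
```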
[0032] The capturing device 16, for example the camera, records the user 24 and processes the recording during the video call 12. The algorithm digitally removes the headset and interprets the facial expressions of the upper part of the face 22, which is covered by the headset. It then renders these expressions into the three-dimensional images taken during the calibration stage and finally compiles everything into a final image, which may be called the augmented image 26 and which is closer to the natural face 22 of the user 24.
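Tying the stages together, the following end-to-end sketch processes a stream of frames with stub components standing in for headset detection, expression inference and rendering. All names, shapes and the fixed mask region are placeholders for illustration only.

```python
# End-to-end per-frame sketch (every component is a stub standing in for the
# stages described above; names and shapes are assumptions): record, detect the
# headset region, infer the hidden expression, render, composite, transmit.
import numpy as np


def detect_headset_mask(frame):            # stub: fixed region instead of detection
    mask = np.zeros(frame.shape[:2], dtype=bool)
    mask[100:220, 180:460] = True
    return mask


def infer_upper_expression(frame, mask):   # stub for the learned expression model
    return {"eye_open": 0.8, "brow_raise": 0.1}


def render_expression(calibration_face, expression):  # stub renderer
    return calibration_face                 # a real renderer would deform the 3D face


def compose(frame, rendered_face, mask):
    out = frame.copy()
    out[mask] = rendered_face[mask]
    return out


def process_stream(frames, calibration_face, transmit):
    for frame in frames:
        mask = detect_headset_mask(frame)
        expr = infer_upper_expression(frame, mask)
        rendered = render_expression(calibration_face, expr)
        transmit(compose(frame, rendered, mask))


if __name__ == "__main__":
    calib = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
    frames = (np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8) for _ in range(3))
    process_stream(frames, calib, transmit=lambda img: print("sent", img.shape))
```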
Reference Signs
10 communication device
12 video call
14 motor vehicle
16 capturing device
18 electronic computing device
20 augmented reality display device
22 face
24 user
26 augmented image
28 further communication device

Claims (6)

  1. A method for performing an augmented reality video call (12) in a motor vehicle (14) by a communication device (10), the method comprising the steps:
     - capturing an image of a face (22) of a user (24) of the communication device (10), without the user (24) wearing an augmented reality display device (20) of the communication device (10), by a capturing device (16);
     - capturing a current image of the face (22) of the user (24) whilst the user (24) is wearing the augmented reality display device (20) during the video call (12), wherein the augmented reality display device (20) is covering the face (22) at least partially, by the capturing device (16);
     - estimating the at least partially covered part of the face (22) depending on the captured image by an electronic computing device (18) of the communication device (10);
     - fusing the current image with the estimated part to an augmented image (26) by the electronic computing device (18); and
     - transmitting the augmented image (26) to a further communication device (28) for performing the augmented reality video call (12).
  2. The method according to claim 1, characterized in that the augmented reality display device (20) is provided as an augmented reality headset.
  3. The method according to claim 1 or 2, characterized in that the capturing device (16) is provided as a three-dimensional camera for capturing three-dimensional images.
  4. The method according to any one of claims 1 to 3, characterized in that a gesture of the covered face (22) of the user (24) is detected by the electronic computing device (18) and additionally the partially covered part of the face (22) is estimated depending on the captured facial gesture.
  5. The method according to any one of claims 1 to 4, characterized in that the electronic computing device (18) uses a machine learning technique for performing the estimation.
  6. A communication device (10) for performing an augmented reality video call (12) in a motor vehicle (14), the communication device (10) comprising at least one capturing device (16), one augmented reality display device (20), and one electronic computing device (18), wherein the communication device (10) is configured to perform a method according to any one of claims 1 to 5.
Application GB2017094.0A, filed 2020-10-28 (priority date 2020-10-28), published as GB2600683A (en); legal status: Withdrawn. Title: A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device.

Priority Applications (1)

Application Number: GB2017094.0A (published as GB2600683A (en))
Priority Date: 2020-10-28
Filing Date: 2020-10-28
Title: A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Applications Claiming Priority (1)

Application Number: GB2017094.0A (published as GB2600683A (en))
Priority Date: 2020-10-28
Filing Date: 2020-10-28
Title: A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Publications (2)

Publication Number Publication Date
GB202017094D0 GB202017094D0 (en) 2020-12-09
GB2600683A 2022-05-11

Family

ID=73727150

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2017094.0A Withdrawn GB2600683A (en) 2020-10-28 2020-10-28 A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device

Country Status (1)

Country Link
GB (1) GB2600683A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160217621A1 (en) * 2015-01-28 2016-07-28 Sony Computer Entertainment Europe Limited Image processing
EP3096208A1 (en) * 2015-05-18 2016-11-23 Samsung Electronics Co., Ltd. Image processing for head mounted display devices
US20180101227A1 (en) * 2016-10-06 2018-04-12 Google Inc. Headset removal in virtual, augmented, and mixed reality using an eye gaze database

Also Published As

Publication number Publication date
GB202017094D0 (en) 2020-12-09

Similar Documents

Publication Publication Date Title
EP3794851B1 (en) Shared environment for vehicle occupant and remote user
CN107004296B (en) Method and system for reconstructing an occlusion face of a virtual reality environment
US11887234B2 (en) Avatar display device, avatar generating device, and program
CN109952759B (en) Improved method and system for video conferencing with HMD
CN102647606B (en) Stereoscopic image processor, stereoscopic image interaction system and stereoscopic image display method
CN106066701B (en) A kind of AR and VR data processing equipment and method
US11170521B1 (en) Position estimation based on eye gaze
EP3547672A1 (en) Data processing method, device, and apparatus
US20030202686A1 (en) Method and apparatus for generating models of individuals
CN110969658B (en) Localization and mapping using images from multiple devices
EP3541068A1 (en) Head-mountable apparatus and methods
CN108762508A (en) A kind of human body and virtual thermal system system and method for experiencing cabin based on VR
CN113366491A (en) Eyeball tracking method, device and storage medium
JP6978289B2 (en) Image generator, head-mounted display, image generation system, image generation method, and program
CN107016730A (en) The device that a kind of virtual reality is merged with real scene
CN114356072A (en) System and method for detecting spatial orientation of wearable device
US7006102B2 (en) Method and apparatus for generating models of individuals
CN113906736A (en) Video distribution system, video distribution method, and display terminal
CN109765693A (en) Three-dimensional imaging display system and the display methods for showing stereopsis
CN106981100A (en) The device that a kind of virtual reality is merged with real scene
GB2600683A (en) A method for performing an augmented reality video call in a motor vehicle by a communication device, as well as a corresponding communication device
JP2014071871A (en) Video communication system and video communication method
US20190028690A1 (en) Detection system
CN113170075A (en) Information processing apparatus, information processing method, and program
US20230230333A1 (en) 3d virtual-reality display device, head-mounted display, and 3d virtual-reality display method

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)