GB2539183A - A method, an apparatus, a computer program product for augmented reality - Google Patents

A method, an apparatus, a computer program product for augmented reality

Info

Publication number
GB2539183A
GB2539183A
Authority
GB
United Kingdom
Prior art keywords
parts
video
overlay
speaker
altered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1509476.6A
Other versions
GB201509476D0 (en)
Inventor
Vetek Akos
Uusitalo Mikko
Honkala Mikko
Karkkainen Leo Mikko Johannes
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to GB1509476.6A priority Critical patent/GB2539183A/en
Publication of GB201509476D0 publication Critical patent/GB201509476D0/en
Publication of GB2539183A publication Critical patent/GB2539183A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/47 Detecting features for summarising video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194 References adjustable by an adaptive method, e.g. learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/19 Sensors therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Ophthalmology & Optometry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Signal Processing (AREA)
  • Architecture (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to a method, an apparatus and a computer program product. The method comprises receiving video frames in electronic form; segmenting a video frame to detect interesting parts 520 such as the eyes, mouth or lips; processing the interesting parts, e.g. enlarging them, to obtain altered parts 530; generating an overlay 540 comprising data on the altered parts; and applying the overlay 540 to the video frame to produce a presentation of the video frames including the altered parts. The embodiments can be utilized as an augmented reality conversation aid that improves lip-reading by enlarging the mouth of a speaker. In an alternative embodiment the invention may be used to help interpret the feelings, emotions or gestures of a speaker.

Description

Intellectual Property Office Application No. GB1509476.6 RTM Date: 26 November 2015. The following terms are registered trade marks and should be read as such wherever they occur in this document: Bluetooth (Page 5), Firewire (Page 6). Intellectual Property Office is an operating name of the Patent Office, www.gov.uk/ipo
A METHOD, AN APPARATUS, A COMPUTER PROGRAM PRODUCT FOR AUGMENTED REALITY
Technical Field
The present embodiments relate to augmented reality.
Background
This section is intended to provide a background or context to the invention that is recited in the claims. The description herein may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and claims in this application and is not admitted to be prior art by inclusion in this section.
Augmented reality refers to a technology where the real-world environment is supplemented by computer-generated sensory input, such as sound, video or graphics. This is made possible by electronic devices comprising a processor, a display, sensors and input devices. Such electronic devices include smartphones, tablet computers, head-mounted displays, eyewear (e.g. smartglasses, bionic contact lenses), handheld displays, and speech and gesture recognition systems.
Summary
An improved method for an augmented reality conversation aid to enhance the understanding of speech, and technical equipment implementing the same, has been invented. Various aspects of the invention include a method, an apparatus, a server, a client and a computer readable medium comprising a computer program stored therein, which are characterized by what is stated in the independent claims. Various embodiments of the invention are disclosed in the dependent claims. The invention can also be used for application areas other than understanding speech, e.g. understanding other elements of communication.
According to a first aspect, there is provided a method, comprising receiving video frames in electronic form; segmenting a video frame to detect interesting parts; processing the interesting parts to obtain altered parts; generating an overlay comprising data on the altered parts; and applying the overlay to have a presentation of the video frames with altered parts.
According to a second aspect, there is provided an apparatus comprising at least one processor, memory including computer program code, the memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following: to receive video frames in electronic form; to segment a video frame to detect interesting parts; to process the interesting parts to obtain altered parts; to generate an overlay comprising data on the altered parts; and to apply the overlay to have a presentation of the video frames with altered parts.
According to a third aspect, there is provided an apparatus comprising at least processing means and memory means, the apparatus further comprising means for receiving video frames in electronic form; means for segmenting a video frame to detect interesting parts; means for processing the interesting parts to obtain altered parts; means for generating an overlay comprising data on the altered parts; and means for applying the overlay to have a presentation of the video frames with altered parts.
According to a fourth aspect, there is provided a computer program product embodied on a non-transitory computer readable medium, comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to receive video frames in electronic form; to segment a video frame to detect interesting parts; to process the interesting parts to obtain altered parts; to generate an overlay comprising data on the altered parts; and to apply the overlay to have a presentation of the video frames with altered parts.
According to an embodiment, the video frames present a person.
According to an embodiment, the overlay is applied on a display of an apparatus.
According to an embodiment, the processing of the interesting parts comprises one or more of the following: morphing, superimposing, amplifying, deamplifying.
According to an embodiment, the method comprises preprocessing the received video frame before segmentation.
According to an embodiment, the preprocessing comprises determining and amplifying at least one of the following: motion vectors; color changes.
According to an embodiment, a machine learning technique is used for segmenting the video frame.
According to an embodiment, the video frames are received via wireless data transfer network.
According to an embodiment, the method comprises recording the video frames.
According to an embodiment, the method comprises receiving the video frames via a data transfer network.
Description of the Drawings
In the following, various embodiments of the invention will be described in more detail with reference to the appended drawings, in which
Fig. 1 shows an apparatus according to an embodiment;
Fig. 2 shows a layout of the apparatus according to an embodiment;
Fig. 3 shows a layout of the apparatus according to another embodiment;
Fig. 4 shows a layout of the apparatus according to another embodiment;
Fig. 5 shows a method according to an embodiment; and
Fig. 6 is a flowchart of a method according to an embodiment.
Description of Example Embodiments
In the following, several embodiments of the invention will be described in the context of augmented reality utilizing a head-mounted display system. It is to be noted, however, that the invention is not limited to head-mounted display systems. In fact, and as will be shown, the different embodiments have applications in any environment where a conversation aid is needed, and with any display device, including smartphones, tablet computers, head-mounted displays, eyewear (e.g. smartglasses, bionic contact lenses), handheld displays, etc. The present embodiments propose an augmented reality (AR) technology that improves a listener's understanding of speech by altering (e.g. morphing, superimposing) parts of the face of the other party ("the speaker"). This enhances the features that aid lip-reading. In particular, the embodiments concern a wearable conversation aid that, by using AR techniques through a head-mounted display (HMD), enhances parts of the face of the speaker so that the listener has a better understanding of what the other person is saying and meaning.
An apparatus according to an embodiment is shown in Figure 1 as a simplified block chart.
The apparatus 50 is an electronic device, for example a head-mounted display, augmented reality glasses (optical see-through or virtual see-through glasses, or a near-eye display), or any apparatus that can be used to view a person either through a direct view, on a display, or both. The apparatus 50 may comprise a housing 30 for incorporating and protecting the device. The apparatus 50 may further comprise a display 32, for example a liquid crystal display or any other display technology capable of displaying images and/or videos. The display 32 can be a transparent (i.e. see-through) or non-transparent display, and it may comprise various optics, e.g. diffraction optics, holographic optics, polarized optics, and/or reflective optics.
The apparatus 50 may further comprise a keypad 34. According to another embodiment, any suitable data or user interface mechanism may be employed. For example, the user interface may be implemented as a virtual keyboard or data entry system as part of a touch-sensitive display. The apparatus may comprise a microphone 36 or any suitable audio input, which may accept a digital or analogue signal. The apparatus 50 may further comprise an audio output device, which may be any of the following: an earpiece 38, a speaker, or an analogue audio or digital audio output connection. The apparatus 50 may also comprise a battery (according to another embodiment, the device may be powered by any suitable mobile energy device, such as a solar cell, fuel cell or clockwork generator). The apparatus may comprise one or more cameras 42 capable of recording or capturing images and/or video, or may be connected to one. The apparatus may further comprise various sensors, e.g. a depth sensor, an acceleration sensor, a gyroscope sensor.
The apparatus 50 may comprise a controller 56 or processor for controlling the apparatus. The controller 56 may be connected to memory 58 which, according to an embodiment, may store both data in the form of image and audio data and/or may also store instructions for implementation on the controller 56. The controller 56 may further be connected to codec circuitry 54 suitable for carrying out coding and decoding of audio and/or video data, or for assisting in coding and decoding carried out by the controller 56.
According to an embodiment, the camera 42 is capable of recording or detecting individual video frames which are then passed to the codec 54 or controller for processing. According to an embodiment, the apparatus may receive the video image data for processing from another device prior to transmission and/or storage. According to an embodiment, the apparatus 50 may receive the images for processing either wirelessly or by a wired connection.
According to an embodiment, the apparatus 50 may comprise an infrared port for short range line of sight communication to other devices. According to an embodiment, the apparatus 50 may further comprise any suitable short range communication solution such as for example a Bluetooth wireless connection or a USB/firewire wired solution. The apparatus 50 may comprise radio interface circuitry 52 connected to the controller and suitable for generating wireless communication signals for example for communication with a cellular communications network, a wireless communications system or a wireless local area network. The apparatus 50 may further comprise an antenna 44 connected to the radio interface circuitry 52 for transmitting radio frequency signals generated at the radio interface circuitry 52 to other apparatus(es) and for receiving radio frequency signals from other apparatus(es).
As discussed, the present embodiments improve the understanding of speech by altering (e.g. morphing, superimposing) parts of the face of the other party ("the speaker"). This means that when a person ("the listener") needs a conversation aid according to the present embodiments, the listener puts on an apparatus, for example the one shown in Figure 2. The listener starts listening to the speaker and looking at the speaker through the apparatus. A camera of the apparatus may be configured to record video of the speaker. Alternatively, the apparatus may receive video from an external recording system. The video is processed according to the present embodiments to generate an overlay, and the result of the processing (i.e. the overlay) is rendered on the display of the apparatus. The display can be a see-through display, wherein the listener is able to see the speaker through the overlay, which alters the face of the speaker. Alternatively, the display can be a non-see-through display, wherein the video data (i.e. the speaker with the altered face) is played together with the overlay. It is appreciated that in both cases live video of the speaker is recorded for the purposes of the present embodiments. The speaker's voice can be captured by the microphone of the apparatus and reproduced in the earpiece, for example. Alternatively, the speaker's voice can be transmitted from the speaker's microphone to the listener's earpiece, e.g. via a wireless data transfer technology.
As mentioned, in the present embodiments, the face or any other interesting body part of the speaker is seen on the listener's apparatus, wherein parts of the face/body have been altered to enhance the verbal and non-verbal expressions.
Alteration can be accomplished by video amplification or attenuation of the selected parts, e.g. the mouth area and/or the eyebrows. A video of the speaker is recorded by a camera mounted on the listener's near-eye apparatus. The video is processed by the apparatus so that the movement of the selected parts can be made greater or smaller. The processing may start with a preprocessing phase, where motion vectors in the video are amplified or deamplified. Color changes in the video content may also be utilized in the alteration process. The alteration may not require prior information about the video content. It is also appreciated that the preprocessing is optional.
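As an illustration of this optional preprocessing phase, the following is a minimal sketch, assuming an OpenCV/NumPy implementation that amplifies small frame-to-frame colour changes; the function name, the amplification factor and the Gaussian smoothing are illustrative assumptions and not part of the claimed method. A corresponding motion-vector variant could use dense optical flow instead of colour differences.

```python
# Minimal sketch (illustrative only): amplify small temporal colour changes
# in a sequence of frames before segmentation.
import cv2
import numpy as np

def amplify_colour_changes(frames, alpha=10.0, blur_ksize=15):
    """Return frames whose low-amplitude temporal colour changes are amplified.

    frames : list of HxWx3 uint8 BGR images
    alpha  : amplification factor for the frame-to-frame colour difference
    """
    amplified = [frames[0]]
    prev = cv2.GaussianBlur(frames[0].astype(np.float32), (blur_ksize, blur_ksize), 0)
    for frame in frames[1:]:
        cur = cv2.GaussianBlur(frame.astype(np.float32), (blur_ksize, blur_ksize), 0)
        diff = cur - prev                       # small temporal colour change
        boosted = frame.astype(np.float32) + alpha * diff
        amplified.append(np.clip(boosted, 0, 255).astype(np.uint8))
        prev = cur
    return amplified
```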
In the next phase of the processing, machine learning models are used to semantically segment the video and amplify only the interesting parts, such as the mouth or eyebrows. As a segmentation technique, any known or future technology may be utilized, for example deep learning (e.g. Recurrent Neural Networks combined with Convolutional Neural Networks). In general, any unsupervised, semi-supervised or fully supervised clustering, segmentation or classification method may be utilized. If the processing does not comprise the preprocessing phase, the machine learning method can be applied to the raw input video signal. In one embodiment, a large corpus of annotated or semi-annotated video data is collected, and a machine learning algorithm is trained using this corpus.
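The description leaves the choice of segmentation model open. Below is a minimal sketch, assuming a small PyTorch encoder-decoder trained on an annotated corpus as mentioned above to produce a per-pixel mask of the interesting parts (e.g. mouth and eyebrows); the architecture, names and training step are illustrative assumptions rather than the prescribed implementation.

```python
import torch
import torch.nn as nn

class PartSegmenter(nn.Module):
    """Tiny encoder-decoder giving a per-pixel probability that a pixel
    belongs to an 'interesting part' (e.g. mouth or eyebrows)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):                 # x: Bx3xHxW, values in [0, 1]
        return torch.sigmoid(self.decoder(self.encoder(x)))

def train_step(model, optimiser, images, masks):
    """One supervised step over a batch of images and binary part masks."""
    optimiser.zero_grad()
    pred = model(images)
    loss = nn.functional.binary_cross_entropy(pred, masks)
    loss.backward()
    optimiser.step()
    return loss.item()
```

In use, the predicted mask (thresholded, e.g. at 0.5) would select the regions passed on to the alteration and overlay generation steps described below.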
The detection of facial information can be further improved by integrating sensors into the headset, such as a camera (or cameras) looking down at the face, capacitive sensors, or EMG (electromyography) electrodes. For example, in a situation where the other party is also wearing a head-mounted display, facial expressions can be picked up with a camera-based approach (e.g. small cameras mounted along the frame of the glasses, pointing at the face), with electrodes attached to the skin measuring muscle activity bio-electrically, or with a depth-sensing approach (e.g. capacitive sensors mounted on the frames).
In addition to the alteration, further effects can be applied, such as modifying the appearance of the skin (e.g. "removing" facial hair that otherwise impedes lip-reading) to make it more transparent and reveal more information on the mouth, teeth, tongue, etc.
According to an embodiment, the listener is able to decide how much alteration is applied to the image. Alternatively, the apparatus may automatically make the decision based on the listener's earlier selections. Those selections may also be transmitted to a service that collects multiple users' selections and is thus able to make better default selections for new users and new kinds of content.
If the audio signal is also available, especially in noisy conditions, this information can also be used (based on knowledge of speech production and related generative models) to augment or modify the appearance of the face of the speaker even further. Such a further enhancement reveals more information about the vocal apparatus and thereby improves lip-reading. The audio signal can be obtained by a microphone in the glasses of the listener, or in noisy conditions the audio signal can be transmitted from the speaker's personal microphone (e.g. if both parties wear AR glasses or some other wearable device with audio recording capability) to the listener.
Figure 2 illustrates an embodiment of an apparatus. Figures 3 and 4 illustrate yet further embodiments of the apparatus. The apparatus of Figure 3 represents optically see-through AR glasses, while the apparatus of Figure 4 represents a virtual/video see-through solution with a camera. The present embodiments may be implemented in both of them.
Figure 5 illustrates a method according to an embodiment. The method comprises:
- receiving a video stream comprising video frames 510, said video frames 510 presenting a person;
- segmenting a video frame to detect semantically interesting parts 520;
- altering the semantically interesting parts to obtain altered parts 530;
- generating an overlay 540 comprising data on the altered parts 530;
- applying the overlay on a display 550 of the apparatus to have a presentation (video or direct view) of the person with the altered parts.
The video stream is continuously analyzed, and the overlay and the person are aligned both spatially and temporally.
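A minimal sketch of the alteration and overlay steps for a single frame is given below, assuming an OpenCV/NumPy implementation in which the segmentation has produced a bounding box for one interesting part (e.g. the mouth); the enlargement factor, the RGBA overlay format and the compositing used for the non-see-through display case are illustrative assumptions only.

```python
import cv2
import numpy as np

def make_overlay(frame, part_box, scale=1.6):
    """Enlarge one interesting part of the frame and return it as an RGBA overlay.

    frame    : HxWx3 BGR frame of the speaker
    part_box : (x, y, w, h) bounding box of the segmented part, e.g. the mouth
    scale    : enlargement factor for the altered part
    """
    frame_h, frame_w = frame.shape[:2]
    x, y, w, h = part_box
    part = frame[y:y + h, x:x + w]
    enlarged = cv2.resize(part, None, fx=scale, fy=scale,
                          interpolation=cv2.INTER_LINEAR)
    enlarged = enlarged[:frame_h, :frame_w]        # keep it inside the frame
    eh, ew = enlarged.shape[:2]
    # Keep the altered part centred on the original part location.
    cx, cy = x + w // 2, y + h // 2
    x0 = max(0, min(frame_w - ew, cx - ew // 2))
    y0 = max(0, min(frame_h - eh, cy - eh // 2))
    overlay = np.zeros((frame_h, frame_w, 4), dtype=np.uint8)  # transparent canvas
    overlay[y0:y0 + eh, x0:x0 + ew, :3] = enlarged
    overlay[y0:y0 + eh, x0:x0 + ew, 3] = 255                   # opaque where altered
    return overlay

def apply_overlay(frame, overlay):
    """Composite the RGBA overlay onto a video frame (non-see-through display case)."""
    alpha = overlay[:, :, 3:4].astype(np.float32) / 255.0
    blended = (frame.astype(np.float32) * (1.0 - alpha)
               + overlay[:, :, :3].astype(np.float32) * alpha)
    return blended.astype(np.uint8)
```

For a see-through display, only the overlay itself would be rendered, registered to the speaker's face, while the rest of the view remains the direct optical view.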
As mentioned, the image frames can be received from a camera of the apparatus recording the person. Alternatively, the image frames can be received from an external camera via a wireless communication network. Such an embodiment is useful e.g. in an auditorium, where a speaker is giving a speech at the front and a camera system of the auditorium is recording video of the speaker. This video is passed to an apparatus (a head-mounted display, a mobile device, etc.) of the listener (who may be sitting far from the speaker), and the apparatus is configured to process the video to generate an overlay for the received video, where the overlay comprises altered (e.g. facial) parts of the speaker. The overlay is laid over the received video and the combination is displayed to the listener.
Yet, as a further alternative, the image frames can be broadcast to the apparatus (e.g. a head-mounted display, a mobile device, etc.). For example, when a person is watching the news, a talk show, or any other program being presented on a television, the program can be transmitted to the apparatus. The apparatus is configured to process the transmitted image frames of the program according to the present embodiments in order to alter the face(s) appearing in the image frames. This would enhance the television watching experience because there would be no need to provide subtitles for the speech, even where that is possible.
Yet further, Figure 6 is a flowchart of a method according to an embodiment.
The method comprises:
- receiving video frames 610 in electronic form;
- segmenting a video frame to detect interesting parts 620;
- processing the interesting parts to obtain altered parts 630;
- generating an overlay comprising data on the altered parts 640;
- applying the overlay 650 to have a presentation of the video frames with the altered parts.
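The steps of the flowchart could be tied together in a simple per-frame loop as sketched below; the helpers `segment_interesting_parts`, `make_overlay` and `apply_overlay` stand for the segmentation, alteration and overlay steps discussed above and are assumptions for illustration, not part of the claims.

```python
import cv2

def run_conversation_aid(source=0, strength=1.6):
    """Illustrative per-frame driver for the method of Figure 6."""
    capture = cv2.VideoCapture(source)                  # 610: receive video frames
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        part_boxes = segment_interesting_parts(frame)   # 620: segmentation (assumed helper)
        for box in part_boxes:                          # 630: process the interesting parts
            overlay = make_overlay(frame, box, scale=strength)   # 640: generate the overlay
            frame = apply_overlay(frame, overlay)                # 650: apply the overlay
        cv2.imshow("presentation with altered parts", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    capture.release()
    cv2.destroyAllWindows()
```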
The present embodiments rely on people's natural ability to use cues from the face (e.g. from the mouth area, but not limited to it), and enhance (magnify or distort in a meaningful way) those parts of the other person's face using AR overlays to aid the process of understanding what the others say and mean.
In the above, alterations of the mouth area and eyebrows have been discussed. It is appreciated that alterations can be applied to other parts of the face as well. For example, it is often difficult for autistic people to understand and interpret other people's thoughts, feelings and emotions from their gestures and expressions. By the present embodiments, it is possible to help autistic people better read other people's faces and to interpret their emotions and intents. It has been realized that autistic people may observe other people's mouth area expressions better than e.g. eye expressions. Thus, by the present embodiments, the mouth area of another person may be enhanced more than the eye area for an autistic person who is co-operating or communicating with that person.
In addition, the present embodiments may be utilized vice versa. Sometimes, the expressions and gestures of an autistic person are stiff and hard to interpret. In such a situation, the present embodiments may be applied to the video image of an autistic person to enhance his/her expressions to another person.
For reading the emotion of the other party, the present embodiments may utilize integrated sensors, e.g. motion sensors, gaze and pupil dilation tracking by a video camera, or bioelectrical sensors, e.g. EOG (electrooculography), heart rate, EEG (electroencephalogram). In addition, other sensors providing affective or emotional cues can be mounted, for example, in the temple area of the headset. Further, audio information can be used as well (if available), and possibly other information that can be picked up by sensors in the headset. In addition, other effects can be applied, such as modifying the appearance of the skin, for example to become more transparent and reveal more information on the mouth, teeth, tongue or any other body parts.
Yet further, in a similar fashion, gestures in general can be enhanced, e.g. to emphasize spoken information or to amplify, de-amplify or normalize cultural differences in gesturing, since people rely on a multitude of non-verbal cues in face-to-face communication. In such an embodiment, the whole body of the speaker is recorded, and the segmentable interesting parts include not only the face of the person but also the head, arms, pose, etc.
As a further example, the present embodiments may be applied in other fields, for example in the medical field. For example, by using a camera for capturing a video stream of the inner ear or of the eyeground, the video provided to a displaying apparatus may be altered by the present embodiments. The medical expert may define the interesting parts to be segmented, e.g. the optic nerve head in an eyeground. According to an embodiment, the segmented parts may be altered (e.g. enlarged) in order to generate an overlay (with a better view of the part) to be applied to the video stream to improve observation. According to another embodiment, an overlay may contain data on a pre-defined normal scale. Such an overlay, applied to a video frame of the corresponding part of the patient, may give indications on whether there are abnormalities in the patient's organ. Other data may also be provided in such an overlay in real time, e.g. guidance or other reference data (e.g. the patient's diagnosis history, measurements, etc.).
Yet further, in the foregoing it has been discussed that the listener is wearing a head-mounted display or similar, in which the video captured of the speaker is processed. However, it is also possible that both the listener and the speaker are wearing head-mounted display systems for the purposes of the present embodiments.
The various embodiments may provide advantages. The present embodiments provide a solution where the understanding of speech is enhanced, e.g. by improving the listener's possibilities for lip-reading. This further improves the quality of life for people with hearing impairment, or for autistic people and the people around them. The present embodiments also improve the ability and safety of people interacting in noisy environments.
The various embodiments of the invention can be implemented with the help of computer program code that resides in a memory and causes the relevant apparatuses to carry out the invention. For example, a device may comprise circuitry and electronics for handling, receiving and transmitting data, computer program code in a memory, and a processor that, when running the computer program code, causes the device to carry out the features of an embodiment. Yet further, a network device like a server may comprise circuitry and electronics for handling, receiving and transmitting data, computer program code in a memory, and a processor that, when running the computer program code, causes the network device to carry out the features of an embodiment.
It is obvious that the present invention is not limited solely to the above-presented embodiments, but it can be modified within the scope of the appended claims.
GB1509476.6A 2015-06-02 2015-06-02 A method, an apparatus, a computer program product for augmented reality Withdrawn GB2539183A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1509476.6A GB2539183A (en) 2015-06-02 2015-06-02 A method, an apparatus, a computer program product for augmented reality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1509476.6A GB2539183A (en) 2015-06-02 2015-06-02 A method, an apparatus, a computer program product for augmented reality

Publications (2)

Publication Number Publication Date
GB201509476D0 GB201509476D0 (en) 2015-07-15
GB2539183A true GB2539183A (en) 2016-12-14

Family

ID=53677596

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1509476.6A Withdrawn GB2539183A (en) 2015-06-02 2015-06-02 A method, an apparatus, a computer program product for augmented reality

Country Status (1)

Country Link
GB (1) GB2539183A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100153847A1 (en) * 2008-12-17 2010-06-17 Sony Computer Entertainment America Inc. User deformation of movie character images
US20140184644A1 (en) * 2013-01-03 2014-07-03 Qualcomm Incorporated Rendering augmented reality based on foreground object
WO2014205239A1 (en) * 2013-06-20 2014-12-24 Elwha Llc Systems and methods for enhancement of facial expressions

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108563333A (en) * 2018-04-12 2018-09-21 BOE Technology Group Co., Ltd. A kind of wearable device and its control method
CN108563333B (en) * 2018-04-12 2022-02-01 京东方科技集团股份有限公司 Wearable device and control method thereof

Also Published As

Publication number Publication date
GB201509476D0 (en) 2015-07-15

Similar Documents

Publication Publication Date Title
Olszewski et al. High-fidelity facial and speech animation for VR HMDs
US9807291B1 (en) Augmented video processing
US11205426B2 (en) Information processing device, information processing method, and program
CN110874129A (en) Display system
US20160210407A1 (en) Method and device for processing content based on bio-signals
KR20190038900A (en) Word Flow Annotation
US20080253695A1 (en) Image storage processing apparatus, image search apparatus, image storage processing method, image search method and program
US11482238B2 (en) Audio-visual sound enhancement
KR20160105439A (en) Systems and methods for gaze-based media selection and editing
CN112034977A (en) Method for MR intelligent glasses content interaction, information input and recommendation technology application
US9794475B1 (en) Augmented video capture
JP5729692B2 (en) Robot equipment
US20210113129A1 (en) A system for determining emotional or psychological states
Bulling et al. Eyewear computers for human-computer interaction
CN110199244B (en) Information processing apparatus, information processing method, and program
Nijholt Capturing obstructed nonverbal cues in augmented reality interactions: a short survey
Bello et al. Inmyface: Inertial and mechanomyography-based sensor fusion for wearable facial activity recognition
US11755277B2 (en) Daydream-aware information recovery system
US11947722B2 (en) Devices and headsets
US20230282080A1 (en) Sound-based attentive state assessment
GB2539183A (en) A method, an apparatus, a computer program product for augmented reality
US11227148B2 (en) Information processing apparatus, information processing method, information processing program, and information processing system
US20200250498A1 (en) Information processing apparatus, information processing method, and program
Zhang et al. I Am an Earphone and I Can Hear My User’s Face: Facial Landmark Tracking Using Smart Earphones
CN113906368A (en) Modifying audio based on physiological observations

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)