WO2017209777A1 - Face and eye tracking and facial animation using facial sensors within a head-mounted display - Google Patents


Info

Publication number
WO2017209777A1
Authority
WO
WIPO (PCT)
Prior art keywords
facial
face
user
hmd
light sources
Prior art date
Application number
PCT/US2016/046375
Other languages
English (en)
French (fr)
Inventor
Dov Katz
Michael John TOKSVIG
Ziheng Wang
Timothy Paul OMERNICK
Torin Ross HERNDON
Original Assignee
Oculus Vr, Llc
Priority date
Filing date
Publication date
Priority claimed from US15/172,484 external-priority patent/US10430988B2/en
Priority claimed from US15/172,473 external-priority patent/US9959678B2/en
Application filed by Oculus Vr, Llc filed Critical Oculus Vr, Llc
Priority to KR1020187037042A priority Critical patent/KR102144040B1/ko
Priority to JP2018563028A priority patent/JP6560463B1/ja
Priority to CN201680088273.3A priority patent/CN109643152B/zh
Priority to EP16200100.2A priority patent/EP3252566B1/en
Publication of WO2017209777A1 publication Critical patent/WO2017209777A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/0093Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • G02B27/0172Head mounted characterised by optical features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/0138Head-up displays characterised by optical features comprising image capture systems, e.g. camera

Definitions

  • the present disclosure generally relates to head-mounted displays (HMDs), and specifically relates to eye and facial tracking within a HMD.
  • the present disclosure further relates to virtual rendering, and specifically relates to virtual animation of a portion of a user's face within a HMD.
  • Virtual reality (VR) systems typically include a display screen that presents virtual reality images, which may depict elements such as objects and users of the systems. Users can be represented by an avatar in the virtual environment.
  • avatars are depicted with only one facial expression, e.g., a default smiling or neutral facial expression, which prevents the user from having a fully immersive experience in a virtual environment.
  • Facial tracking systems provide a more immersive interface.
  • Existing systems that track facial expressions of users include a dedicated peripheral such as a camera, in addition to markers that must be positioned on the face of a user being tracked. These traditional peripherals and markers artificially separate the user from the virtual environment. Thus, existing facial tracking systems are unsuitable for use in a portable, lightweight, and high-performance virtual reality headset.
  • a head mounted display (HMD) in a VR system includes sensors for tracking the eyes and face of a user wearing the HMD.
  • the VR system records calibration attributes such as landmarks of the face of the user. For instance, a landmark describes the location of the user's nose relative to the user's face.
  • the calibration attributes can also be retrieved from an online database of global calibration attributes.
  • the HMD includes facial sensors positioned inside the HMD, and in some embodiments, the HMD also includes light sources that are also positioned inside the HMD. The light sources illuminate portions of the user's face covered by the HMD.
  • the facial sensors capture facial data describing the illuminated portions of the face.
  • the facial data can be images referred to as facial data frames.
  • the VR system analyzes the facial data frames to determine the orientation of planar sections (i.e., small portions of the user's face that are approximated with a plane) of the illuminated portions of face.
  • the VR system uses pixel brightness information to determine the orientation of surfaces.
  • the pixel brightness depends on the position and/or orientation of the light sources because reflected light is brightest when the angle of incidence equals the angle of reflectance.
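The brightness-to-orientation step described above can be sketched in code. The following is a minimal, hypothetical illustration (the function names and the Lambertian reflectance model are our assumptions, not the patent's): given one planar section's brightness under several known light-source directions, a least-squares fit recovers the direction the section faces.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system with Gauss-Jordan elimination and
    partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[r][3] / M[r][r] for r in range(3)]

def estimate_normal(intensities, light_dirs):
    """Estimate the unit normal of one planar facial section from its
    pixel brightness under several known unit light directions, assuming
    the Lambertian model I = albedo * dot(n, l). Returns (normal, albedo)."""
    # Normal equations for the least-squares fit of g = albedo * n:
    # (L^T L) g = L^T I, where the rows of L are the light directions.
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for I, l in zip(intensities, light_dirs):
        for i in range(3):
            b[i] += l[i] * I
            for j in range(3):
                A[i][j] += l[i] * l[j]
    g = solve3(A, b)
    albedo = math.sqrt(sum(x * x for x in g))
    return [x / albedo for x in g], albedo
```

With more light sources than the three unknowns, the same normal equations average out sensor noise, which is one motivation for placing several light sources around the optics block.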
  • the VR system aggregates the planar sections of the face and maps the planar sections to landmarks of the face to generate facial animation information describing the face of the user.
  • a HMD in a VR system includes a facial tracking system for tracking a portion of a user's face within the HMD.
  • the facial tracking system illuminates, via one or more light sources, portions of the user's face inside the HMD.
  • the facial tracking system captures a plurality of facial data of the portion of the face.
  • the facial data is captured using one or more facial sensors located inside the HMD.
  • the facial sensors may be imaging sensors, non-imaging sensors, or a combination thereof.
  • the facial tracking system identifies a plurality of planar sections (i.e., small portions of the user's face that are approximated with a plane) of the portion of the face based at least in part on the plurality of facial data.
  • the facial tracking system may map the plurality of planar sections to one or more landmarks of the face, and generate facial animation information based at least in part on the mapping.
  • the facial animation information describes the portion of the user's face (e.g., the portion captured in the facial data).
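As a sketch of the mapping step, each recovered planar section (a center point plus a unit normal) can be assigned to its nearest facial landmark, and the per-landmark normals aggregated into a coarse orientation summary. The landmark names, coordinates, and data layout below are illustrative assumptions rather than anything the disclosure specifies:

```python
import math

# Hypothetical landmark set: named reference points on the face, in an
# arbitrary head-centered coordinate frame (illustrative values).
LANDMARKS = {
    "nose_tip": (0.0, 0.0, 1.0),
    "left_cheek": (-3.0, -1.0, 0.5),
    "right_cheek": (3.0, -1.0, 0.5),
}

def map_sections_to_landmarks(sections, landmarks=LANDMARKS):
    """Assign each planar section (center, unit normal) to its nearest
    landmark and average the normals per landmark, giving a coarse
    per-region orientation summary as stand-in 'facial animation
    information'."""
    sums = {name: [0.0, 0.0, 0.0] for name in landmarks}
    counts = {name: 0 for name in landmarks}
    for center, normal in sections:
        nearest = min(
            landmarks,
            key=lambda name: sum((c - p) ** 2 for c, p in zip(center, landmarks[name])),
        )
        for i in range(3):
            sums[nearest][i] += normal[i]
        counts[nearest] += 1
    info = {}
    for name in landmarks:
        if counts[name]:
            s = sums[name]
            norm = math.sqrt(sum(x * x for x in s))
            info[name] = tuple(x / norm for x in s)
    return info
```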
  • the facial tracking system provides the facial data to, e.g., a console which generates the facial animation.
  • the facial tracking system provides the facial animation to a display of the HMD for presentation to the user.
  • the facial animation information may be used to update portions of a face of a virtual avatar of the user.
  • the user views the virtual avatar when using the HMD, and thus experiences an immersive VR experience.
  • the VR system can also track the eyes of the user wearing the HMD. Accordingly, the facial animation information can be used to update eye orientation of the virtual avatar of the user.
  • a head mounted display comprises:
  • a display element configured to display content to a user wearing the HMD
  • an optics block configured to direct light from the display element to an exit pupil of the HMD;
  • a plurality of light sources positioned at discrete locations around the optics block, the plurality of light sources configured to illuminate portions of a face, inside the HMD, of the user;
  • a facial sensor configured to capture one or more facial data of a portion of the face illuminated by one or more of the plurality of light sources
  • a controller configured to:
  • generate facial animation information describing the portion of the face of the user based at least in part on the plurality of captured facial data.
  • a head mounted display may comprise:
  • a display element configured to display content to a user wearing the HMD
  • an optics block configured to direct light from the display element to an exit pupil of the HMD;
  • a plurality of light sources positioned at discrete locations around the optics block, the plurality of light sources configured to illuminate portions of a face, inside the HMD, of the user;
  • a facial sensor configured to capture one or more facial data of a portion of the face illuminated by one or more of the plurality of light sources
  • a controller configured to:
  • the HMD further may comprise an optics block configured to direct light from the display element to an exit pupil of the HMD.
  • the controller may be further configured to:
  • facial animation information describing the portion of the face of the user is further based on the one or more landmarks.
  • the display element may be configured to display an avatar to the user, and the avatar's face may be based on the facial animation information.
  • the controller may be further configured to:
  • wherein the mapping maps the plurality of surfaces to one or more landmarks describing a section of the face; and wherein the facial animation information describing the portion of the face of the user is further based on the mapping.
  • the facial sensor may be selected from a group consisting of: a camera, an audio sensor, a strain gauge, an electromagnetic sensor, and a proximity sensor.
  • the controller may be further configured to:
  • the plurality of light sources may be positioned in a ring arrangement around the optics block, and the instructions provided to the plurality of light sources may be coordinated such that only one light source of the plurality of light sources illuminates the portions of the face at any given time.
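The coordination described in this embodiment — exactly one light source of the ring lit per captured facial data frame — might be sketched as a small controller that cycles through the sources. The class name and the capture callback are hypothetical stand-ins for the real facial sensor interface:

```python
class RingIlluminationController:
    """Coordinates a ring of light sources with a facial sensor so that
    exactly one source is lit per captured frame. The capture_fn callback
    stands in for the real facial sensor; names are illustrative."""

    def __init__(self, num_sources, capture_fn):
        self.num_sources = num_sources
        self.capture_fn = capture_fn
        self.active = None  # index of the most recently lit source

    def step(self, frame_index):
        # Light exactly one source (cycling around the ring), capture a
        # frame, and remember which source was on so brightness in the
        # frame can later be attributed to that single source.
        source = frame_index % self.num_sources
        self.active = source
        frame = self.capture_fn(source)
        return source, frame
```

Because the controller records which source was active for each frame, later processing can attribute the brightness in that frame to a single, known illumination direction.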
  • the controller may be further configured to:
  • the facial animation information may be further based on the position of the eye of the user.
  • a head mounted display comprises:
  • a display element configured to display content to a user wearing the HMD
  • a plurality of light sources positioned at discrete locations outside a line of sight of the user, the plurality of light sources configured to illuminate portions of a face, inside the HMD, of the user;
  • a facial sensor configured to capture one or more facial data of a portion of the face illuminated by one or more of the plurality of light sources
  • a controller configured to:
  • generate facial animation information describing the portion of the face of the user based at least in part on the plurality of captured facial data.
  • a head mounted display comprises:
  • a display element configured to display content to a user wearing the HMD
  • an optics block configured to direct light from the display element to an exit pupil of the HMD;
  • a plurality of light sources positioned at discrete locations around the optics block, the plurality of light sources configured to illuminate portions of a face, inside the HMD, of the user;
  • a facial sensor configured to capture one or more facial data of a portion of the face illuminated by one or more of the plurality of light sources
  • a controller configured to:
  • a method, preferably for providing facial animation information of a user wearing a head mounted display (HMD), comprises: illuminating, via one or more light sources, portions of a face, inside the HMD, of a user wearing the HMD;
  • mapping the plurality of planar sections to one or more landmarks of the face; and generating facial animation information based at least in part on the mapping, the facial animation information describing a portion of a virtual face corresponding to the portion of the user's face.
  • a method may comprise:
  • mapping the plurality of planar sections to one or more landmarks of the face and generating facial animation information based at least in part on the mapping, the facial animation information describing a portion of a virtual face corresponding to the portion of the user's face;
  • the method further may comprise: updating a virtual face of an avatar using the facial animation information; and providing the virtual face to a display element of the HMD for presentation to the user.
  • the method further may comprise: providing instructions to the user to perform one or more facial expressions in a calibration process;
  • generating the facial animation information is further based on the one or more landmarks of the face.
  • the method further may comprise storing the calibration attributes in an online database including global calibration attributes received from a plurality of HMDs.
  • the one or more landmarks of the face may describe the location of one or more of the following: an eye of the user, an eyebrow of the user, a nose of the user, a mouth of the user, and a cheek of the user.
  • the facial animation information may describe a three-dimensional virtual representation of the portion of the user's face.
  • the facial data may describe frames of an image, the image comprising a plurality of pixels, each pixel associated with a coordinate (x, y) location of the image, and identifying the plurality of planar sections of the portion of the face based at least in part on the plurality of facial data may comprise:
  • generating the facial animation information may further be based on the virtual surface.
  • the one or more facial sensors may be selected from a group consisting of:
  • Illuminating the portions of the face may comprise:
  • the plurality of light sources may be positioned in a ring arrangement, and the instructions provided to the plurality of light sources may be coordinated such that only one light source of the plurality of light sources illuminates the portions of the face at any given time.
  • the method further may comprise: receiving specular reflection information from the plurality of facial data; and
  • the facial animation information may further be based on the position of the eye of the user.
  • a method may comprise:
  • calibration attributes including one or more landmarks of a face, inside a head mounted display (HMD), of a user wearing the HMD;
  • facial animation information based at least in part on the mapping, the facial animation information describing a virtual face including the portion of the user's face; and providing the facial animation information to a display of the HMD for presentation to the user.
  • the method further may comprise storing the calibration attributes in an online database including global calibration attributes received from a plurality of HMDs.
  • the method further may comprise: receiving specular reflection information from the plurality of facial data; and
  • Capturing the plurality of facial data of the portion of the face using the one or more facial sensors located inside the HMD and off the line of sight of the user further may comprise illuminating the portions of the face using a plurality of light sources.
  • a method, preferably for providing facial animation information of a user wearing a head mounted display (HMD), may comprise:
  • one or more computer-readable non-transitory storage media embody software that is operable when executed to perform a method according to the invention or any of the above mentioned embodiments.
  • a system comprises: one or more processors; and at least one memory coupled to the processors and comprising instructions executable by the processors, the processors operable when executing the instructions to perform a method according to the invention or any of the above mentioned embodiments.
  • a computer program product, preferably comprising a computer-readable non-transitory storage medium, is operable when executed on a data processing system to perform a method according to the invention or any of the above mentioned embodiments.
  • FIG. 1 is a block diagram of a VR system in accordance with an embodiment.
  • FIG. 2 is a block diagram of a facial tracking system of a VR system in accordance with an embodiment.
  • FIG. 3 is a wire diagram of a virtual reality HMD in accordance with an embodiment.
  • FIG. 4 is a wire diagram of an embodiment of a front rigid body of the virtual reality HMD shown in FIG. 3, in accordance with an embodiment.
  • FIG. 5 is a cross section of the front rigid body of the virtual reality HMD in FIG. 4, in accordance with an embodiment.
  • FIG. 6 is a flow chart illustrating a process of facial animation, in accordance with an embodiment.
  • FIG. 1 is a block diagram of a VR system 100 in accordance with an embodiment.
  • the VR system 100 operates in augmented reality (AR) and/or mixed reality (MR) environments.
  • the system 100 shown in FIG. 1 comprises a head mounted display (HMD) 105, an imaging device 135, and a VR input interface 140 that are each coupled to a console 110. While FIG. 1 shows an example system 100 including one HMD 105, one imaging device 135, and one VR input interface 140, in other embodiments, any number of these components are included in the system 100.
  • For example, there may be multiple HMDs 105, each having an associated VR input interface 140 and being monitored by one or more imaging devices 135, with each HMD 105, VR input interface 140, and imaging device 135 communicating with the console 110.
  • Different and/or additional components may be included in the system 100.
  • the HMD 105 presents content to a user. Examples of content presented by the HMD 105 include one or more images, video, audio, or some combination thereof. In some embodiments, audio is presented via an external device (e.g., speakers and/or headphones) that receives audio information from the HMD 105, the console 110, or both, and presents audio data based on the audio information.
  • An embodiment of the HMD 105 is further described below in conjunction with FIG. 3 through FIG. 5.
  • the HMD 105 comprises one or more rigid bodies, which are rigidly or non-rigidly coupled to each other. A rigid coupling between rigid bodies causes the coupled rigid bodies to act as a single rigid entity. In contrast, a non-rigid coupling between rigid bodies allows the rigid bodies to move relative to each other.
  • the HMD 105 includes an electronic display 115, an optics block 118, one or more locators 120, one or more position sensors 125, an inertial measurement unit (IMU) 130, and a facial tracking system 160.
  • the electronic display 115 displays images to the user in accordance with data received from the console 110.
  • the electronic display 115 may comprise a single electronic display or multiple electronic displays (e.g., a display for each eye of a user). Examples of the electronic display 115 include: a liquid crystal display (LCD), an organic light emitting diode (OLED) display, an active-matrix organic light-emitting diode display (AMOLED), some other display, or some combination thereof.
  • the optics block 118 magnifies received image light from the electronic display 115, corrects optical errors associated with the image light, and presents the corrected image light to a user of the HMD 105.
  • the optics block 118 includes one or more optical elements and/or combinations of different optical elements.
  • an optical element is an aperture, a Fresnel lens, a convex lens, a concave lens, a filter, or any other suitable optical element that affects the image light emitted from the electronic display 115.
  • one or more of the optical elements in the optics block 118 may have one or more coatings, such as anti-reflective coatings.
  • Magnification of the image light by the optics block 118 allows the electronic display 115 to be physically smaller, weigh less, and consume less power than larger displays. Additionally, magnification may increase a field of view of the displayed content. For example, the field of view of the displayed content is such that the displayed content is presented using almost all (e.g., 110 degrees diagonal), and in some cases all, of the user's field of view.
  • the optics block 118 is designed so its effective focal length is larger than the spacing to the electronic display 115, which magnifies the image light projected by the electronic display 115. Additionally, in some embodiments, the amount of magnification is adjusted by adding or removing optical elements.
  • the optics block 118 is designed to correct one or more types of optical errors in addition to fixed pattern noise (i.e., the screen door effect).
  • optical errors include: two-dimensional optical errors, three-dimensional optical errors, or some combination thereof.
  • Two-dimensional errors are optical aberrations that occur in two dimensions.
  • Example types of two-dimensional errors include: barrel distortion, pincushion distortion, longitudinal chromatic aberration, transverse chromatic aberration, or any other type of two-dimensional optical error.
  • Three-dimensional errors are optical errors that occur in three dimensions.
  • Example types of three-dimensional errors include spherical aberration, comatic aberration, field curvature, astigmatism, or any other type of three-dimensional optical error.
  • content provided to the electronic display 115 for display is pre-distorted, and the optics block 118 corrects the distortion when it receives image light from the electronic display 115 generated based on the content.
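The pre-distortion scheme can be illustrated with a one-coefficient radial distortion model (our assumption; the disclosure does not name a model): if the optics block maps a normalized radius r to r·(1 + k·r²), the display content is warped by the inverse mapping so the lens cancels the warp.

```python
def predistort_point(x, y, k):
    """Pre-distort a normalized display coordinate so that a lens applying
    radial distortion r_out = r_in * (1 + k * r_in**2) lands the point back
    at (x, y). Inverts the radial model by fixed-point iteration."""
    r_target = (x * x + y * y) ** 0.5
    if r_target == 0.0:
        return 0.0, 0.0
    # Iterate r_in = r_target / (1 + k * r_in^2); this converges quickly
    # for the modest k values typical of a single-coefficient model.
    r_in = r_target
    for _ in range(20):
        r_in = r_target / (1.0 + k * r_in * r_in)
    scale = r_in / r_target
    return x * scale, y * scale
```

Applying the forward lens model to the pre-distorted point returns (approximately) the original coordinate, which is what makes the displayed content appear undistorted through the optics block.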
  • the locators 120 are objects located in specific positions on the HMD 105 relative to one another and relative to a specific reference point on the HMD 105.
  • a locator 120 may be a light emitting diode (LED), a corner cube reflector, a reflective marker, a type of light source that contrasts with an environment in which the HMD 105 operates, or some combination thereof.
  • the locators 120 may emit light in the visible band (i.e., ~380 nm to 750 nm), in the infrared (IR) band (i.e., ~750 nm to 1 mm), in the ultraviolet band (i.e., 10 nm to 380 nm), some other portion of the electromagnetic spectrum, or some combination thereof.
  • the locators 120 are located beneath an outer surface of the HMD 105, which is transparent to wavelengths of light emitted or reflected by the locators 120 or is thin enough not to substantially attenuate the wavelengths of light emitted or reflected by the locators 120. Additionally, in some embodiments, the outer surface or other portions of the HMD 105 are opaque in the visible band of wavelengths of light. Thus, the locators 120 may emit light in the IR band under an outer surface that is transparent in the IR band but opaque in the visible band.
  • the IMU 130 is an electronic device that generates fast calibration attributes based on measurement signals received from one or more of the position sensors 125.
  • a position sensor 125 generates one or more measurement signals in response to motion of the HMD 105.
  • Examples of position sensors 125 include: one or more accelerometers, one or more gyroscopes, one or more magnetometers, another suitable type of sensor that detects motion, a type of sensor used for error correction of the IMU 130, and/or some combination thereof.
  • the position sensors 125 may be located external to the IMU 130, internal to the IMU 130, and/or some combination thereof.
  • Based on the one or more measurement signals from one or more position sensors 125, the IMU 130 generates fast calibration attributes indicating an estimated position of the HMD 105 relative to an initial position of the HMD 105.
  • the position sensors 125 include multiple accelerometers to measure translational motion (forward/back, up/down, and left/right) and multiple gyroscopes to measure rotational motion (e.g., pitch, yaw, and roll).
  • the IMU 130 rapidly samples the measurement signals and calculates the estimated position of the HMD 105 from the sampled data.
  • the IMU 130 integrates the measurement signals received from the accelerometers over time to estimate a velocity vector and integrates the velocity vector over time to determine an estimated position of a reference point on the HMD 105.
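That double integration can be sketched as follows, under the simplifying assumption that accelerometer samples are already in the world frame with gravity removed (a real IMU pipeline first rotates body-frame samples using gyroscope-derived orientation):

```python
def integrate_imu(accel_samples, dt):
    """Integrate accelerometer samples (3-vectors, world frame, gravity
    removed) over time into an estimated velocity vector, and integrate
    that velocity into an estimated position of the HMD reference point.
    Uses simple Euler steps of length dt seconds."""
    velocity = [0.0, 0.0, 0.0]
    position = [0.0, 0.0, 0.0]
    for a in accel_samples:
        for i in range(3):
            velocity[i] += a[i] * dt          # acceleration -> velocity
            position[i] += velocity[i] * dt   # velocity -> position
    return velocity, position
```

Small biases in the samples grow quadratically in the position estimate under this scheme, which is the drift error that the calibration parameters received from the console are meant to rein in.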
  • the IMU 130 provides the sampled measurement signals to the console 110, which determines the fast calibration attributes.
  • the reference point is a point that is used to describe the position of the HMD 105. While the reference point may generally be defined as a point in space, in practice the reference point is defined as a point within the HMD 105 (e.g., a center of the IMU 130).
  • the IMU 130 receives one or more calibration parameters from the console 110. As further discussed below, the one or more calibration parameters are used to maintain tracking of the HMD 105. Based on a received calibration parameter, the IMU 130 may adjust one or more IMU parameters (e.g., sample rate). In some embodiments, certain calibration parameters cause the IMU 130 to update an initial position of the reference point so it corresponds to a next calibrated position of the reference point. Updating the initial position of the reference point as the next calibrated position of the reference point helps reduce accumulated error associated with the determined estimated position.
  • drift error causes the estimated position of the reference point to "drift" away from the actual position of the reference point over time.
  • the facial tracking system 160 tracks portions of a face of a user (e.g., including eyes of the user).
  • the portions of the face are, for example, portions of the face covered by the HMD 105 worn by the user.
  • the facial tracking system 160 collects calibration attributes.
  • the calibration attributes describe landmarks (e.g., a location of an eyebrow or nose of the user) of the face covered by the HMD 105.
  • the facial tracking system 160 uses the tracked portions of the face (may also include eye position) and the calibration attributes to generate facial animation information describing the tracked portions of the user's face.
  • the facial tracking system 160 generates tracking information based on, e.g., tracked portions of a face of the user (may include eye position), calibration attributes, facial animation information, or some combination thereof. Tracking information is information passed to the console 110 that may be used for virtual animation of a portion of a user's face. The facial tracking system 160 passes the tracking information to the console 110. In some embodiments, the tracking information does not include facial animation information, which is instead generated by the console 110.
  • the facial tracking system 160 includes one or more light sources, one or more facial sensors, and a controller, which is further described in FIG. 2.
  • the facial tracking system 160 tracks eye movements of the user, e.g., corneal sphere tracking to track one or both eyes of a user while the user is wearing the HMD 105.
  • the light sources and the facial sensors are communicatively coupled to a controller that performs data processing for generating the facial animation, performing optical actions, and the like.
  • the imaging device 135 generates slow calibration attributes in accordance with calibration parameters received from the console 110.
  • Slow calibration attributes include one or more images showing observed positions of the locators 120 that are detectable by the imaging device 135.
  • the imaging device 135 includes one or more cameras, one or more video cameras, any other device capable of capturing images including one or more of the locators 120, or some combination thereof.
  • the imaging device 135 may include one or more filters (e.g., used to increase signal to noise ratio).
  • the imaging device 135 is configured to detect light emitted or reflected from locators 120 in a field of view of the imaging device 135.
  • the imaging device 135 may include a light source that illuminates some or all of the locators 120, which retro-reflect the light towards the light source in the imaging device 135. Slow calibration attributes are communicated from the imaging device 135 to the console 110, and the imaging device 135 receives one or more calibration parameters from the console 110 to adjust one or more imaging parameters (e.g., focal length, focus, frame rate, ISO, sensor temperature, shutter speed, aperture, etc.).
  • the VR input interface 140 is a device that allows a user to send action requests to the console 110.
  • An action request is a request to perform a particular action.
  • an action request may be to start or end an application or to perform a particular action within the application.
  • the VR input interface 140 may include one or more input devices.
  • Example input devices include: a keyboard, a mouse, a game controller, or any other suitable device for receiving action requests and communicating the received action requests to the console 110.
  • An action request received by the VR input interface 140 is communicated to the console 110, which performs an action corresponding to the action request.
  • the VR input interface 140 may provide haptic feedback to the user in accordance with instructions received from the console 110. For example, haptic feedback is provided when an action request is received, or the console 110 communicates instructions to the VR input interface 140 causing the VR input interface 140 to generate haptic feedback when the console 110 performs an action.
  • the console 110 provides content to the HMD 105 for presentation to a user in accordance with information received from one or more of: the imaging device 135, the HMD 105, and the VR input interface 140.
  • the console 110 includes an application store 145, a tracking module 150, and a VR engine 155.
  • Some embodiments of the console 110 have different modules than those described in conjunction with FIG. 1.
  • the functions further described below may be distributed among components of the console 110 in a different manner than is described here.
  • the application store 145 stores one or more applications for execution by the console 110.
  • An application is a group of instructions that, when executed by a processor, generates content for presentation to the user. Content generated by an application may be in response to inputs received from the user via movement of the HMD 105 or the VR input interface 140. Examples of applications include: gaming applications, conferencing applications, video playback applications, or other suitable applications.
  • the tracking module 150 calibrates the system 100 using one or more calibration parameters and may adjust one or more calibration parameters to reduce error in determining the position of the HMD 105.
  • the tracking module 150 determines the position of the HMD 105. For example, the tracking module 150 adjusts the focus of the imaging device 135 to obtain a more accurate position for observed locators on the HMD 105. Moreover, calibration performed by the tracking module 150 also accounts for information received from the IMU 130. Additionally, if tracking of the HMD 105 is lost (e.g., the imaging device 135 loses line of sight of at least a threshold number of the locators 120), the tracking module 150 re-calibrates some or all of the system 100.
  • the tracking module 150 tracks movements of the HMD 105 using slow calibration information from the imaging device 135.
  • the tracking module 150 determines positions of a reference point of the HMD 105 using observed locators from the slow calibration information and a model of the HMD 105.
  • the tracking module 150 also determines positions of a reference point of the HMD 105 using position information from the fast calibration information. Additionally, in some embodiments, the tracking module 150 uses portions of the fast calibration information, the slow calibration information, or some combination thereof, to predict a future location of the HMD 105.
  • the tracking module 150 provides the estimated or predicted future position of the HMD 105 to the VR engine 155.
  • the VR engine 155 executes applications within the system 100 and receives position information, acceleration information, velocity information, predicted future positions, or some combination thereof of the HMD 105 from the tracking module 150. Based on the received information, the VR engine 155 determines content to provide to the HMD 105 for presentation to a user. In some embodiments, the VR engine 155 generates facial animation information based on tracking information received from the HMD 105. In alternate embodiments, the VR engine 155 receives facial animation information directly from the HMD 105 as part of the tracking information. For example, the VR engine 155 receives facial animation information from the facial animation module 260 of the facial tracking system 160 (further described in FIG. 2).
  • Based on the facial animation information, the VR engine 155 generates a facial expression of an avatar and/or a virtual face of an avatar, including eye movements of the avatar, corresponding to a user of an HMD 105. For instance, a facial expression or eye movement of the avatar corresponds to a facial expression or eye movement that the user performs in real life.
  • the VR engine 155 provides the virtual face for presentation to the user via the electronic display 115 of the HMD 105.
  • if the received information indicates that the user has looked to the left, the VR engine 155 generates content for the HMD 105 that mirrors the user's movement in a virtual environment.
  • the VR engine 155 performs an action within an application executing on the console 110 in response to an action request received from the VR input interface 140 and provides feedback to the user that the action was performed.
  • the provided feedback includes visual or audible feedback via the HMD 105 or haptic feedback via the VR input interface 140.
  • FIG. 2 is a block diagram of the facial tracking system 160 of the VR system 100 in accordance with an embodiment.
  • the facial tracking system 160 includes one or more light sources 200, one or more facial sensors 210, and a controller 220.
  • the facial tracking system 160 may be part of a system that is different from the VR system 100.
  • the one or more light sources 200 illuminate portions of a face of a user wearing the HMD 105 that are covered by the HMD 105, and are positioned at discrete locations on the HMD 105.
  • the light sources 200 are positioned in a ring arrangement.
  • each light source 200 of the plurality of light sources is positioned on a circumference of a circle, e.g., a virtual circle overlaying an eyecup assembly of the HMD 105 (further described in FIG. 4).
  • each light source 200 is positioned at an hour hand position of a typical analog clock.
  • the one or more light sources 200 are light-emitting diodes (LEDs) that emit light in the visible band (i.e., ~380 nm to 750 nm), in the infrared (IR) band (i.e., ~750 nm to 1 mm), in the ultraviolet band (i.e., 10 nm to 380 nm), some other portion of the electromagnetic spectrum, or some combination thereof.
  • the light sources 200 may have different optical characteristics, either shared by all of the light sources 200 or varying between subsets of the light sources 200. An optical characteristic is a feature of the light sources 200.
  • an optical characteristic may be a wavelength of light emitted by the light sources 200, a temporal coherence that describes the correlation between light waves of the light sources 200 at different points in time, or some combination thereof.
  • the light from the light sources 200 can be modulated at different frequencies or amplitudes (i.e., varying intensity) and/or multiplexed in either time or frequency domain.
  • the one or more facial sensors 210 capture facial data of a user of the HMD 105.
  • Facial data describes features of a face of the user, e.g., features of portions of the face covered by the HMD 105.
  • the facial sensors 210 may be imaging type sensors and/or non-imaging type sensors.
  • Imaging type facial sensors 210 are, e.g., cameras that capture images of the portions of the face of the user. The images comprise a plurality of pixels, and the pixels each have a level of brightness.
  • Non-imaging type facial sensors 210 are, e.g., audio sensors, strain gauges, electromagnetic sensors, proximity sensors, or some other non-optical type sensors.
  • the facial sensors 210 may have a plurality of parameters such as focal length, focus, frame rate, ISO, sensor temperature, shutter speed, aperture, resolution, etc. In some embodiments, the facial sensors 210 have a high frame rate and high resolution.
  • imaging type facial sensors 210 are positioned such that reflections in response to light from the light sources 200 incident upon a user (e.g., incident upon portions of a face of the user covered by the HMD 105) can be captured over a range of user movements.
  • the facial sensors 210 are positioned off-axis such that they are outside of a line of sight of the user wearing the HMD 105, i.e., if the user looks at the display element 115 of the HMD 105, the facial sensors 210 are not located within the user's direct line of sight.
  • the facial sensors 210 are positioned within the line of sight of the user wearing the HMD 105, i.e., the user can see the facial sensors 210 while looking at the display element 115.
  • the facial tracking system 160 does not necessarily need the light sources 200.
  • the facial sensors 210 are proximity sensors based on ultrasound.
  • the facial sensors 210 capture facial data indicating a distance between a facial sensor 210 and portions of a face of a user.
  • the facial sensor 210 determines the distance based on a time that an ultrasound wave takes to reflect off the portions of the face and travel back to the facial sensor 210.
  • the facial sensor 210 emits ultrasound waves toward the face of the user, and the reflected ultrasound waves are detected by the facial sensor 210.
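The time-of-flight computation in the ultrasound embodiment above can be sketched as follows; the speed of sound, the function names, and the sample timing are illustrative assumptions, not details given in the text.

```python
# Hedged sketch of the ultrasound time-of-flight distance computation.
# The speed of sound and all names here are illustrative assumptions;
# the text does not specify the arithmetic.

SPEED_OF_SOUND_M_PER_S = 343.0  # approximate speed in air at ~20 °C

def distance_from_echo(round_trip_time_s: float) -> float:
    """Distance from the sensor to the face, given the time an
    ultrasound wave takes to reach the face and reflect back.
    The wave covers the distance twice, so halve the path length."""
    return SPEED_OF_SOUND_M_PER_S * round_trip_time_s / 2.0

# A 0.5 ms round trip corresponds to about 8.6 cm.
d = distance_from_echo(0.0005)
```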
  • the controller 220 controls the facial tracking system 160.
  • the controller 220 includes a facial tracking store 225, a facial data capture module 230, a calibration module 240, a facial sensor processing module 250, an eye tracking module 255, a facial animation module 260, and a display interface module 270.
  • different and/or additional components may be included in the controller 220.
  • the controller 220 is part of the facial tracking system 160, and thus also part of the HMD 105.
  • some or all of the controller 220 is outside of the HMD 105, e.g., the controller 220 is included as part of the console 110 or included in another component and/or system outside of the system 100.
  • Having the controller 220 outside of the HMD 105 may be advantageous in some embodiments because it reduces the amount of processing power the HMD 105 requires to execute functions of the controller; in embodiments where the HMD 105 is powered by a rechargeable battery, reducing the processing power increases the battery life of the HMD 105.
  • the facial tracking store 225 stores data recorded by, or used by, the facial tracking system 160.
  • Stored data may include, e.g., tracking information, facial data, eye tracking information, calibration attributes, facial animation information, some other information used for facial tracking, or some combination thereof.
  • Facial data includes information about tracked surfaces of a face of a user wearing the HMD 105.
  • Calibration attributes include information about landmarks of the user's face. Facial data and calibration attributes are further described below.
  • the facial tracking store 225 may store information retrieved from a source outside of the facial tracking system 160, e.g., from the console 110 or from an online source. Other modules of the facial tracking system 160 store information to the facial tracking store 225 and/or retrieve information from the facial tracking store 225.
  • the facial data capture module 230 receives facial data from the facial sensors 210.
  • the facial data capture module 230 provides instructions to one or more light sources 200 to illuminate portions of a face of the user.
  • the facial data capture module 230 also provides instructions to one or more facial sensors 210 to capture one or more facial data frames of the illuminated portions of the face (e.g., portions inside the HMD 105).
  • the facial data capture module 230 stores the captured facial data frames in the facial tracking store 225 and/or any other database on or off of the system 100 that the facial tracking system 160 can access.
  • the facial data capture module 230 provides instructions to the facial sensors 210 to capture facial data of a portion of a face of a user wearing a HMD 105.
  • facial tracking system 160 may not include the light sources 200.
  • the facial data capture module 230 does not provide instructions to the light sources 200 in conjunction with the instructions to the facial sensors 210.
  • the facial data capture module 230 coordinates control of each light source 200.
  • the facial data capture module 230 provides instructions to the light sources 200 such that only one light source emits light at any given time (e.g., per eye or per HMD 105).
  • the light sources in the ring emit light, i.e., illuminate portions of a face of the user, in sequential order, e.g., starting at one light source of the ring and emitting in a clockwise or counterclockwise direction around the eye of the user.
  • the light sources emit light in any other order or type of sequence.
  • each light source 200 is positioned in a ring arrangement, each light source corresponding to an hour hand position (i.e., 1 to 12) of a typical analog clock, around an eye of the user.
  • the light sources corresponding to the even numbers sequentially emit light first, and then the light sources corresponding to the odd numbers sequentially emit light next.
  • the order of light sources emitting is: 2, 4, 6, 8, 10, 12, 1, 3, 5, 7, 9, and 11.
  • the order of the light sources emitting is random and/or changes over time.
  • the facial data capture module 230 can repeat the same sequence (or different sequences) of illumination over a certain period of time at various rates of illumination.
  • the facial data capture module 230 repeats a clockwise illumination sequence at a rate of 60 illuminations per second for a period of ten seconds.
  • the light sources 200 may be positioned in any other arrangement pattern, or arbitrarily, in the HMD 105.
  • the facial data capture module 230 provides instructions to the facial sensors 210 to capture facial data corresponding to each illumination, e.g., each instance of a light source of the plurality illuminating a portion of the user corresponds to a frame of facial data.
  • the facial data capture module 230 must synchronize the illuminations and the frame captures. For example, if the light sources 200 are emitting light at a rate of 24 illuminations per second, then the facial sensors 210 capture frames at a rate of at least 24 frames per second to achieve a desired facial data resolution.
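The synchronized illuminate-then-capture loop described above might look like the following sketch; the `illuminate` and `capture_frame` callables are hypothetical stand-ins for the light-source and facial-sensor interfaces.

```python
# Illustrative sketch of the synchronized illuminate-then-capture loop,
# with only one light source emitting at any given time. The
# `illuminate` and `capture_frame` callables are hypothetical
# stand-ins for the hardware interfaces.

def capture_sequence(light_ids, illuminate, capture_frame):
    """Fire each light source in turn and capture exactly one facial
    data frame per illumination, so each frame maps to one source."""
    frames = {}
    for light_id in light_ids:
        illuminate(light_id)               # only this source emits now
        frames[light_id] = capture_frame()
    return frames

# Even clock positions first, then odd, as in the example sequence.
CLOCK_ORDER = [2, 4, 6, 8, 10, 12, 1, 3, 5, 7, 9, 11]
```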
  • the calibration module 240 calibrates the HMD 105 to a user.
  • the calibration module 240 retrieves calibration attributes from the facial tracking store 225, an online calibration server, or some combination thereof, using one or more selection parameters.
  • a selection parameter is a characteristic of a user that is mapped to calibration attributes.
  • a selection parameter may be, e.g., age, race, gender, citizenship, spoken language, some other characteristic that can affect facial expressions, or some combination thereof.
  • the calibration module 240 performs a quality check on the retrieved calibration attributes against captured calibration attributes.
  • the actual calibration attributes are captured during normal operation of the HMD 105.
  • the calibration module 240 generates instructions to guide the user through steps of a calibration process to capture calibration attributes.
  • the calibration module 240 compares the captured, i.e., actual, calibration attributes to the retrieved calibration attributes. For example, the retrieved (expected) calibration attributes indicate an expected set of coordinate points of a landmark
  • the actual calibration attributes indicate an actual, i.e., experimentally captured, set of coordinate points of the landmark. If a difference between the retrieved calibration attributes and the actual calibration attributes is less than a threshold value, the retrieved calibration attributes provide sufficient quality for effective use of the HMD 105. In contrast, if the difference between the retrieved calibration attributes and the actual calibration attributes is greater than the threshold value, then the calibration module 240 determines that the expected calibration attributes are too different from the actual calibration attributes for effective use of the HMD 105. The calibration module 240 then uploads the actual calibration attributes and the user's selection parameters to the online server.
  • the calibration module 240 uploads the selection parameters and the actual calibration attributes to the online server regardless of whether or not the difference between the retrieved calibration attributes and the actual calibration attributes is greater than the threshold value. In this manner the online calibration server is able to augment a global calibration attributes set that is built from information from many (e.g., thousands of) different users. As the global calibration attributes set grows larger, the accuracy of calibration attributes retrieved from it increases, thereby minimizing calibration time on individual HMDs 105.
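The threshold comparison described above can be sketched as a simple per-landmark check; the dict layout, the Euclidean metric, and all names are assumptions made for illustration.

```python
# Minimal sketch of the quality check on retrieved calibration
# attributes: each landmark's retrieved coordinates are compared
# against the actually captured coordinates, and the retrieved set is
# usable only if every landmark falls within a threshold distance.
# The data layout and distance metric are assumptions.

import math

def attributes_are_usable(retrieved, actual, threshold):
    """True if every retrieved landmark position is within
    `threshold` of the corresponding captured position."""
    for name, (rx, ry) in retrieved.items():
        ax, ay = actual[name]
        if math.hypot(rx - ax, ry - ay) >= threshold:
            return False
    return True

retrieved = {"left_eyebrow": (8.0, 46.0)}
captured = {"left_eyebrow": (8.5, 46.2)}
usable = attributes_are_usable(retrieved, captured, threshold=2.0)
```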
  • the calibration module 240 may capture actual calibration attributes via instructions that guide the user through steps of a calibration process.
  • the calibration module 240 provides the instructions to the electronic display 115 (e.g., via the display interface module 270) for presentation to the user.
  • the calibration module 240 instructs a user wearing the HMD 105 to perform one or more facial expressions such as blinking, squinting, raising an eyebrow, smiling, frowning, looking in a particular direction, or maintaining a neutral face (i.e., a resting face without any particular facial expressions to provide a baseline for comparison against faces with facial expressions).
  • the facial data capture module 230 works in conjunction with the calibration module 240 to capture facial data corresponding to portions of the user's face as the user is performing the one or more facial expressions. Then, the calibration module 240 maps the captured facial data to corresponding facial expressions, e.g., the calibration module 240 maps a blinking facial expression to facial data captured after the user was instructed to blink. The calibration module 240 stores the captured facial data and the mappings in the facial tracking store 225 and/or any other database on or off of the system 100 that the facial tracking system 160 can access. In another example use case, the calibration module 240 identifies landmarks of a face of a user based at least in part on the captured facial data and the mappings.
  • Landmarks include locations of, e.g., an eyebrow of the user, an eyelid of the user, a pupil of the user, a nose of the user, a cheek of the user, a forehead of the user, and the like.
  • the facial data is represented by images captured by imaging type facial sensors 210 (e.g., cameras).
  • the calibration module 240 identifies landmarks by determining that one or more features indicative of a landmark are shown in the captured images. For instance, a feature of a facial expression "raising an eyebrow" is that the user's eyebrow will move.
  • the calibration module 240 identifies images and/or portions of images that correspond to a moving eyebrow based on the brightness and/or intensity levels of pixels of the images.
  • the calibration module 240 determines coordinate points of the one or more pixels and maps the coordinate points to a landmark associated with a location of the eyebrow of the user. For instance, in a two-dimensional image of the plurality of captured images, pixels are organized on a plane with an x-axis and a y-axis. Coordinate (x, y) points (8, 46), (8, 47), and (8, 48) are mapped to the eyebrow landmark.
  • the calibration module 240 stores the mapping in the facial tracking store 225 and/or any other database on or off of the system 100 that the facial tracking system 160 can access.
  • if the calibration module 240 is unable to map captured images to a landmark, then the calibration module 240 generates instructions for the user to repeat a facial expression to recapture images corresponding to the facial expression.
  • the calibration module 240 may perform facial calibrations passively (e.g., without alerting a user) or actively (e.g., prompting a user to go through a series of expressions as previously described).
  • the calibration attributes are used in normal operation of the VR system 100.
  • the facial sensor processing module 250 processes the facial data captured by the facial data capture module 230.
  • the captured facial data is based on light reflected off the user.
  • Light emitted by a light source 200 reflects off of planar sections of a face and/or eye of a user, and the reflected light is captured by a facial sensor 210.
  • a planar section is a tiny portion of a user's face which the facial tracking system 160 approximates as a plane.
  • the captured light is brightest (e.g., highest intensity) when an angle of incidence on the surface equals the angle of the captured light.
  • the brightest pixel of the plurality of pixels is based on a position and/or orientation of a light source 200 from which reflected light originated relative to the planar section that the light was reflected from.
  • the brightest pixel is, e.g., the pixel that has the greatest intensity value where the intensity value represents the amount of light captured by the one or more facial sensors 210.
  • the facial sensor processing module 250 determines the brightest pixel (or brightest pixels) in captured facial data frames using, e.g., image processing techniques known to one skilled in the art.
  • the facial sensor processing module 250 preprocesses the captured facial data frames using noise reduction methods to improve the quality (e.g., resolution of pixel brightness) of the facial data frames, and thus resulting in a more accurate determination of brightness of pixels. For instance, if the facial data frame is too bright or too dim, the facial sensor processing module 250 applies an image brightness offset correction and/or an image filter to the captured facial data frame.
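One of the preprocessing steps mentioned above, correcting a too-dim or too-bright frame, can be sketched as a brightness-offset correction; the target mean and all names are assumptions, since the text does not specify the correction math.

```python
# Hedged sketch of a simple brightness-offset correction, one of the
# preprocessing steps mentioned above. The target mean value and the
# 8-bit intensity range are illustrative assumptions.

def brightness_offset_correction(frame, target_mean=128.0):
    """Shift every pixel so the frame's mean brightness equals
    target_mean, clamping results to the 0-255 range."""
    pixels = [p for row in frame for p in row]
    offset = target_mean - sum(pixels) / len(pixels)
    return [[min(255.0, max(0.0, p + offset)) for p in row]
            for row in frame]

dim_frame = [[10.0, 20.0], [30.0, 40.0]]  # mean brightness 25
corrected = brightness_offset_correction(dim_frame)
```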
  • the facial sensor processing module 250 may be configured to analyze facial data comprising a plurality of facial data frames captured by the facial sensors 210. Based on the analysis, the facial sensor processing module 250 generates information representing tracked portions of a face of a user wearing the HMD 105. In embodiments with imaging type facial sensors 210, the facial sensor processing module 250 determines which facial data frame of the plurality of facial data frames includes the brightest pixel at a particular pixel location. For instance, each frame of the plurality of frames is a two-dimensional image with pixel locations indicated by coordinate (x, y) pairs and has a dimension of 50 pixels by 50 pixels. Each coordinate pair maps to a particular planar section on the tracked portions of the user's face.
  • for each facial data frame of the plurality of facial data frames, the position and/or orientation of a light source 200 corresponding to the facial data frame is different.
  • each of the facial data frames may have been captured with a different light source 200 illuminating the tracked portions of the user's face.
  • the facial sensor processing module 250 determines which facial data frame of the plurality of facial data frames includes the brightest pixel at location (0, 0), and repeats this process for each pixel location of each facial data frame, e.g., (0, 1), (0, 2), (0, 3), etc.
  • the facial sensor processing module 250 is able to identify which of the one or more light sources 200 results in a brightest pixel for each coordinate pair, and hence for each corresponding planar section of the tracked portion of the user's face. In some embodiments, the facial sensor processing module 250 simply selects for each coordinate pair the light source 200 which resulted in the brightest pixel value. In alternate embodiments, the facial sensor processing module 250 generates an intensity curve for each coordinate pair using the pixel values from the captured facial data frames.
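The per-pixel selection described above can be sketched as follows; the frame layout (a dict from light-source id to a 2D intensity grid) is an assumption made for illustration.

```python
# Hedged sketch of the brightest-source selection: for each (x, y)
# pixel location, find which light source's frame contains the
# highest intensity at that location. The data layout is assumed.

def brightest_source_per_pixel(frames, width, height):
    """Map each (x, y) location to the id of the light source whose
    frame had the greatest intensity value there."""
    result = {}
    for x in range(width):
        for y in range(height):
            result[(x, y)] = max(frames, key=lambda src: frames[src][x][y])
    return result

frames = {
    1: [[10, 20], [30, 40]],  # frame captured under source 1
    2: [[50, 5],  [25, 60]],  # frame captured under source 2
}
selection = brightest_source_per_pixel(frames, 2, 2)
# selection[(0, 0)] is 2; selection[(0, 1)] is 1
```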
  • the facial sensor processing module 250 determines a normal vector to each planar section of the tracked portions of the user's face. Accordingly, for each pixel there is a corresponding normal vector to a planar section imaged by that pixel. In some embodiments, for a given pixel imaging portions of the user's face, the facial sensor processing module 250 determines a normal vector using the identified light source 200 that resulted in the brightest pixel value. The orientation of the identified light source 200 is constant and known relative to the facial sensor 210. The facial sensor processing module 250 uses the orientation to estimate a normal vector to the planar section of the user's face. The facial sensor processing module 250 determines a normal vector to a corresponding planar section of the user's face for each pixel. In some embodiments, the facial sensor processing module 250 determines the normal vectors using intensity curves for each pixel.
  • a normal vector to a planar section of the user's face describes an orientation of part of the user's face. Tracked portions of a user's face may be described using a plurality of planar sections, one for each pixel. The facial sensor processing module 250 determines normal vectors corresponding to each of those planar sections. The normal vectors may then be used to generate a virtual surface that describes the tracked portions of the user's face. The virtual surface describes an orientation of an area of the illuminated portions of the face. For example, the virtual surface describes the curvature of the user's nose, eyelid, or cheek.
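A highly simplified sketch of the per-pixel normal estimation: under the brightest-at-incidence observation above, the normal of a planar section can be approximated as pointing toward the light source that produced the brightest reflection. The geometry below is an assumption; the actual estimation used by the module is not spelled out in the text.

```python
# Illustrative sketch only: approximating a planar section's normal
# as the unit vector from the section toward the light source
# identified as producing the brightest pixel. This simplification
# follows from the brightest-at-incidence behavior described above.

import math

def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def estimate_normal(light_pos, planar_section_pos):
    """Approximate the normal of a planar section as pointing
    toward the identified brightest light source."""
    return unit(tuple(l - p for l, p in zip(light_pos, planar_section_pos)))

# A source directly above the section yields a straight-up normal.
normal = estimate_normal((0.0, 0.0, 2.0), (0.0, 0.0, 0.0))
```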
  • the facial sensor processing module 250 stores the information representing the tracked portions of the face, e.g., the virtual surface, in the facial tracking store 225 and/or any database accessible by the VR system 100.
  • the facial sensor processing module 250 may also provide the information representing the tracked portions to the facial animation module 260 for further processing.
  • the eye tracking module 255 processes the facial data captured by the facial data capture module 230.
  • the facial data describes a specular reflection of light, i.e., light from the light sources 200 reflected off the cornea of an eye of a user wearing the HMD 105.
  • the specular reflection depends on the originating light source 200.
  • a specular reflection corresponding to a first light source 200 in certain position and orientation in the HMD 105 is different than a specular reflection corresponding to a second light source 200 in a different position and/or orientation in the HMD 105.
  • the specular reflections differ because reflected light is brightest at the angle of incidence.
  • the eye tracking module 255 can map specular reflections to a particular position of a light source 200 from which light corresponding to a specular reflection originated. Based on the mappings, the eye tracking module 255 determines eye tracking information (e.g., a position and/or orientation of the eye of the user), for example, whether an eye is looking in the straight, left, right, upward, or downward direction.
  • the eye tracking module 255 determines eye tracking information by identifying the brightest pixel (or brightest pixels) in a plurality of captured facial data frames using similar steps as the facial sensor processing module 250.
  • the eye tracking module 255 maps information from the captured images (e.g., (x, y) coordinate points of pixels of the images) and/or light sources 200 to orientations of an eye of the user, e.g., the eye looking up towards a forehead of the user, the eye looking downwards towards a cheek of the user, and the like.
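The mapping from identified light source to eye orientation described above can be sketched as a lookup; the table below is invented for illustration, and in practice such a mapping would come from the calibration process.

```python
# Hypothetical lookup sketch: once the light source whose specular
# glint is brightest has been identified, map it to a coarse gaze
# direction. The clock positions and directions below are invented
# for illustration, not taken from the text.

GLINT_TO_GAZE = {
    12: "up",     # source at the 12 o'clock position around the eye
    3:  "right",
    6:  "down",
    9:  "left",
}

def gaze_direction(brightest_source_id, default="straight"):
    """Coarse eye orientation from the identified light source."""
    return GLINT_TO_GAZE.get(brightest_source_id, default)
```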
  • the VR system 100 can provide a more immersive experience to the user in a VR environment.
  • the eye tracking module 255 stores the eye tracking information in the facial tracking store 225 and/or any database accessible by the VR system 100.
  • the eye tracking module 255 may also provide the eye tracking information to the facial animation module 260 for further processing.
  • the facial animation module 260 generates a facial animation of some or all of the face of a user of the HMD 105.
  • the facial animation module 260 retrieves facial data representing tracked portions of a face of the user and/or eye tracking information from the facial tracking store 225 and/or any other database with the same data.
  • the facial animation module 260 also retrieves mappings of the calibration module 240 (e.g., landmark information) from the facial tracking store 225 and/or any other database with the same data.
  • the facial animation module 260 generates the facial animation information by aggregating the retrieved facial data and the retrieved mappings.
  • the facial animation module 260 determines a plurality of planar sections of the face corresponding to a location of a landmark of the face, e.g., the planar sections corresponding to the user's nose.
  • the facial animation module 260 combines a plurality of planar sections each corresponding to a location of a landmark.
  • the retrieved mappings include five mappings indicating one-to-one mappings between five landmarks of the face of the user, e.g., a left pupil, a right pupil, a left cheek, a right cheek, and a tip of a nose and five locations, e.g., (x, y) coordinate points and/or sets of coordinate points of a facial data frame.
  • the resulting facial animation information describes a graphical representation of the user's face (e.g., the whole face and/or portions of the face), which includes planar sections at the locations mapped to the landmarks.
  • the facial animation module 260 combines a plurality of planar sections with the eye tracking information.
  • the facial animation may also include a graphical representation of a position and/or orientation of the user's eyes.
  • the facial animation information can be used to create an avatar of the user, e.g., a 3D virtual avatar representing the user's face in real life or a 3D virtual avatar representing the user's whole body in real life.
  • the 3D virtual avatar does not resemble the likeness of the user (e.g., a generic avatar), and the facial animation information is used to generate a facial expression such as a blink or smile on the virtual avatar.
  • the facial animation module 260 receives the information representing the tracked portions of the face of the user directly from the facial sensor processing module 250 and the eye tracking information directly from the eye tracking module 255.
  • the facial animation module 260 generates the facial animation information by aggregating virtual surfaces generated by the facial sensor processing module 250. For example, the facial animation module 260 combines virtual surfaces of the user's nose, the user's eyes, and the user's cheeks. The facial animation module 260 may use calibration attributes to aggregate the virtual surfaces. For instance, the coordinates of a landmark of the calibration attributes describe an expected position of the virtual surface of the user's nose relative to the virtual surface of the user's eyes.
  • the facial animation module 260 generates subsections of the facial animation information describing a virtual portion of a face of a user not corresponding to a mapping by interpolating data between other planar sections of the face corresponding to a mapping determined by the facial sensor processing module 250.
  • the facial animation module 260 generates subsections of the facial animation information based on other information including information (e.g., from an external source outside of the VR system 100 previously stored in the facial tracking store 225) describing typical geometries and characteristics of a face of a user based on data from a population of users, e.g., the average length of a nose of a user for a certain demographic range.
  • the facial animation module 260 generates the subsections of the facial animation further based on the landmarks identified by the calibration module 240. For example, the facial animation module 260 retrieves, from the facial tracking store 225, a landmark indicating coordinate points of pixels in a 2D image corresponding to the location of a left nostril of the user. Then the facial animation module 260 generates subsections of the facial animation information corresponding to a right nostril of the user by reflecting the coordinate points corresponding to the location of the left nostril across a line corresponding to a location of a centerline of a nose of the user, e.g., because a user's left and right nostrils are typically symmetric about the centerline of the nose.
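The nostril-mirroring example above amounts to reflecting pixel coordinates across a vertical centerline. A minimal sketch, assuming the centerline is given as an x coordinate in the 2D image:

```python
import numpy as np

def mirror_landmark(points, centerline_x):
    """Reflect 2D pixel coordinates across the vertical line x = centerline_x.

    points: (N, 2) array of (x, y) pixel coordinates, e.g., a left nostril.
    Returns the mirrored coordinates, e.g., an estimated right nostril.
    """
    mirrored = np.asarray(points, dtype=float).copy()
    mirrored[:, 0] = 2.0 * centerline_x - mirrored[:, 0]
    return mirrored
```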
  • the display interface module 270 provides information from the facial tracking system 160 to the electronic display 115 for presentation to a user of the HMD 105.
  • the display interface module 270 provides the facial animation information generated by the facial animation module 260 to the electronic display 115.
  • FIG. 3 is a wire diagram of a virtual reality HMD 300 in accordance with an embodiment.
  • the HMD 300 is an embodiment of the HMD 105 and includes a front rigid body 305 and a band 310.
  • the front rigid body 305 includes the electronic display 115 (not shown in FIG. 3), the IMU 130, the one or more position sensors 125, and the locators 120.
  • the position sensors 125 are located within the IMU 130, and neither the IMU 130 nor the position sensors 125 are visible to the user.
  • the locators 120 are located in fixed positions on the front rigid body 305 relative to one another and relative to a reference point 315.
  • the reference point 315 is located at the center of the IMU 130.
  • Each of the locators 120 emits light that is detectable by the imaging device 135.
  • Locators 120, or portions of locators 120, are located on a front side 320A, a top side 320B, a bottom side 320C, a right side 320D, and a left side 320E of the front rigid body 305 in the example shown in FIG. 3.
  • FIG. 4 is a wire diagram of an embodiment of the front rigid body 305 of the virtual reality HMD 300 shown in FIG. 3, in accordance with an embodiment.
  • the front rigid body 305 includes eyecup assembly 400, eyecup assembly 405, light sources 410 and 415, and facial sensors 420 and 425.
  • the light sources 410, 415 are an embodiment of the light sources 200
  • the facial sensors 420, 425 are an embodiment of the facial sensors 210.
  • Eyecup assemblies 400 and 405 each comprise a plurality of light sources located outside the direct line of sight of a user wearing the HMD 300.
  • eyecup assembly 400 comprises a plurality of light sources including at least light source 410
  • eyecup assembly 405 comprises a plurality of light sources including at least light source 415.
  • five light sources of each plurality of light sources are illustrated at distinct positions around each eyecup assembly.
  • Eyecup assembly 400 is located on the right side of the front rigid body 305
  • eyecup assembly 405 is located on the left side of the front rigid body 305 from the perspective of the user.
  • Facial sensor 420 is located on the right side of the front rigid body 305
  • facial sensor 425 is located on the left side of the front rigid body 305 from the perspective of the user.
  • Similar to the light sources 410 and 415, the facial sensors 420 and 425 are located outside the direct line of sight of the user and oriented toward a face (and eyes) of the user.
  • FIG. 5 is a cross section 500 of the front rigid body 305 of the virtual reality HMD 300 in FIG. 4, in accordance with an embodiment.
  • the front rigid body 305 includes an electronic display element 115 that emits image light toward the optics block 118.
  • the optics block 118 magnifies the image light, and in some embodiments, also corrects for one or more additional optical errors (e.g., distortion, astigmatism, etc.). Then, the optics block 118 directs the altered image light to the exit pupil 505 for presentation to the user.
  • the exit pupil 505 is the location of the front rigid body 305 where an eye 510 of a user wearing the HMD 300 is positioned. For purposes of illustration, FIG. 5 shows a cross section 500 of the right side of the front rigid body 305 (from the perspective of the user) associated with a single eye 510, but another optics block, separate from the optics block 118, provides altered image light to the other eye (i.e., the left eye) of the user.
  • the controller 220 is communicatively coupled to the electronic display 115 so that the controller (e.g., via the display interface module 270) can provide media, e.g., image and/or video data such as the facial animation information generated by the facial animation module 260, for presentation to the user by the electronic display 115. Further, the controller 220 is also communicatively coupled to the light source 410 and facial sensor 420 such that the controller (e.g., via the facial data capture module 230) can provide instructions to the light source 410 and facial sensor 420 to illuminate and capture images of a portion of a face of the user.
  • a light ray 520 emitted from the light source 410 reflects off a planar section 530 of the face of the user (e.g., a lower eyelid of the user).
  • the light ray's angle of incidence equals the light ray's angle of reflection (i.e., both angles equal 45 degrees so that the angle shown in FIG. 5 is 90 degrees).
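The equal-angle relationship above is the standard law of specular reflection. As a sketch of the geometry (not code from the patent), the reflected ray direction can be computed from the incident direction and the surface normal of the planar section:

```python
import numpy as np

def reflect(incident, normal):
    """Return the specular reflection of an incident ray direction about a
    surface normal (law of reflection: angle of incidence = angle of reflection)."""
    d = np.asarray(incident, dtype=float)
    n = np.asarray(normal, dtype=float)
    d = d / np.linalg.norm(d)
    n = n / np.linalg.norm(n)
    # Standard reflection formula: r = d - 2 (d . n) n
    return d - 2.0 * np.dot(d, n) * n

# A ray arriving at 45 degrees to the surface normal leaves at 45 degrees,
# so the angle between the incident and reflected rays is 90 degrees,
# as in the configuration shown in FIG. 5.
```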
  • which particular pixel of a facial data frame is the brightest depends on the specific locations of the originating light source 410 and the facial sensor 420.
  • a facial data frame, captured by facial sensor 420, of the planar section 530 illuminated by light ray 540 has a different brightest pixel than that of another facial data frame of the same planar section 530 illuminated by a different light ray 520.
  • the location of the brightest pixel in the facial data frame, e.g., an (x, y) coordinate position for a 2D image of the facial data frame, therefore differs between the two facial data frames.
  • the two light rays reflect off different planar sections of the user's face (or eye), and thus have different angles of incidence and reflectance.
  • FIG. 6 is a flow chart illustrating a process 600 of facial animation in accordance with an embodiment.
  • the process 600 is used within the VR system 100 in FIG. 1.
  • the example process of FIG. 6 may be performed by the facial tracking system 160, the HMD 105, the console 110, and/or some other system (e.g., an AR or MR system).
  • Other entities perform some or all of the steps of the process in other embodiments.
  • in some embodiments, the process includes different and/or additional steps than those described in conjunction with FIG. 6, or performs the steps in different orders.
  • the facial tracking system 160 retrieves 610 calibration attributes describing one or more landmarks of a face of a user wearing a HMD 105, e.g., the location of an eyebrow or nose of the user.
  • the facial tracking system 160 retrieves the calibration attributes using a local calibration procedure, from the facial tracking store 225, and/or from an online calibration server including global calibration attributes.
  • the facial tracking system 160 illuminates 620 portions of the face using light sources 200. For instance, a portion of the face within the HMD 105 is an eye of the user, or the area around an eyebrow, nose, eye, and/or cheek of the user. In embodiments with nonimaging type facial sensors 210, the illuminating 620 is omitted.
  • the facial tracking system 160 (e.g., the facial data capture module 230) captures 630 a plurality of facial data frames of the portions of the face using facial sensors 210.
  • the facial tracking system 160 identifies 640 a plurality of planar sections of the face (or eye) based on the plurality of facial data frames. Each planar section of the face has a location on the face and orientation relative to the face.
  • the facial data frames are images each including a plurality of pixels. The plurality of planar sections are identified based on an analysis of the brightest pixels in each facial data frame of the plurality of facial data frames.
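A minimal sketch of the brightest-pixel analysis, assuming each facial data frame arrives as a 2D intensity array:

```python
import numpy as np

def brightest_pixel(frame):
    """Return the (x, y) pixel coordinate of the brightest pixel in a 2D
    facial data frame (an H x W array of intensities)."""
    # argmax gives a flat index; unravel_index converts it to (row, col).
    row, col = np.unravel_index(np.argmax(frame), frame.shape)
    return int(col), int(row)
```

The (x, y) position of this pixel, together with the known positions of the illuminating light source and the facial sensor, constrains the location and orientation of the reflecting planar section.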
  • the facial tracking system 160 maps 650 the plurality of planar sections to the one or more landmarks of the face. For instance, a planar section corresponding to a surface of a nose of the user is mapped to a landmark indicating the location of the nose of the user.
  • the facial tracking system 160 generates 660 facial animation information (e.g., in 2D or 3D) describing a portion of the face based at least in part on the mapping.
  • the portion of the face is the portion captured by the facial data frames.
  • the facial tracking system 160 combines mappings to construct an aggregate of planar sections of the face of the user. For instance, a mapping of a surface of the user's nose to a nose landmark is aggregated with mappings of surfaces of the user's eyes to eye landmarks, mappings of surfaces of the user's eyebrows to eyebrow landmarks, and so forth.
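The aggregation of per-landmark mappings described above can be sketched as follows; the dictionary-of-lists representation is an assumption for illustration, not the patent's data structure:

```python
def aggregate_mappings(mappings):
    """Merge per-landmark surface mappings into a single aggregate face model.

    mappings: iterable of (landmark_name, surface_points) pairs, where
    surface_points is a list of (x, y, z) tuples for one planar section.
    """
    face = {}
    for landmark, surface in mappings:
        # Sections mapped to the same landmark accumulate under one key.
        face.setdefault(landmark, []).extend(surface)
    return face
```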
  • the facial tracking system 160 provides 670 the facial animation information to a display of the HMD (e.g., electronic display 115 of HMD 300) for presentation to the user.
  • the facial animation information is used to generate a facial expression on a virtual avatar and/or a virtual avatar representing a face of the user in real life.
  • the virtual avatar helps provide an immersive VR experience for the user of the VR system 100.
  • the facial tracking system 160 stores, in a database, the facial animation information for future use. Further, the facial tracking system 160 can provide the facial animation information for presentation to other users of the VR system 100.
  • although the facial tracking system 160 is described as using light sources and facial sensors to track a user's facial expressions and eye movements, it should be noted that the process 600 can track a user's facial expressions and eye movements using other techniques that do not require light sources and/or facial sensors.
  • the other techniques use ultrasound sensors, proximity sensors, etc.
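Steps 610-670 of process 600 can be sketched as a single pipeline; the `tracker` object and its method names below are hypothetical stand-ins for the modules of the facial tracking system 160, not an API from the patent:

```python
def run_facial_animation_pipeline(tracker):
    """Hypothetical orchestration of steps 610-670 of process 600."""
    attributes = tracker.retrieve_calibration_attributes()             # step 610
    tracker.illuminate_face()                                          # step 620 (omitted for non-imaging sensors)
    frames = tracker.capture_facial_data_frames()                      # step 630
    sections = tracker.identify_planar_sections(frames)                # step 640
    mapping = tracker.map_sections_to_landmarks(sections, attributes)  # step 650
    animation = tracker.generate_facial_animation(mapping)             # step 660
    tracker.provide_to_display(animation)                              # step 670
    return animation
```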
  • a software module is implemented with a computer program product comprising a computer-readable medium containing computer program code, which can be executed by a computer processor for performing any or all of the steps, operations, or processes described.
  • Embodiments of the disclosure may also relate to an apparatus for performing the operations herein.
  • This apparatus may be specially constructed for the required purposes, and/or it may comprise a general-purpose computing device selectively activated or reconfigured by a computer program stored in the computer.
  • a computer program may be stored in a non-transitory, tangible computer readable storage medium, or any type of media suitable for storing electronic instructions, which may be coupled to a computer system bus.
  • any computing systems referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
  • Embodiments of the disclosure may also relate to a product that is produced by a computing process described herein.
  • a product may comprise information resulting from a computing process, where the information is stored on a non-transitory, tangible computer readable storage medium and may include any embodiment of a computer program product or other data combination described herein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Optics & Photonics (AREA)
  • Processing Or Creating Images (AREA)
  • User Interface Of Digital Computer (AREA)
  • Controls And Circuits For Display Device (AREA)
PCT/US2016/046375 2016-06-03 2016-08-10 Face and eye tracking and facial animation using facial sensors within a head-mounted display WO2017209777A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR1020187037042A KR102144040B1 (ko) 2016-06-03 2016-08-10 Face and eye tracking and facial animation using facial sensors within a head-mounted display
JP2018563028A JP6560463B1 (ja) 2016-06-03 2016-08-10 Face and eye tracking and facial animation using facial sensors within a head-mounted display
CN201680088273.3A CN109643152B (zh) 2016-06-03 2016-08-10 Face and eye tracking and facial animation using facial sensors within a head-mounted display
EP16200100.2A EP3252566B1 (en) 2016-06-03 2016-11-22 Face and eye tracking and facial animation using facial sensors within a head-mounted display

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US15/172,473 2016-06-03
US15/172,484 2016-06-03
US15/172,484 US10430988B2 (en) 2016-06-03 2016-06-03 Facial animation using facial sensors within a head-mounted display
US15/172,473 US9959678B2 (en) 2016-06-03 2016-06-03 Face and eye tracking using facial sensors within a head-mounted display

Publications (1)

Publication Number Publication Date
WO2017209777A1 true WO2017209777A1 (en) 2017-12-07

Family

ID=60477735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/046375 WO2017209777A1 (en) 2016-06-03 2016-08-10 Face and eye tracking and facial animation using facial sensors within a head-mounted display

Country Status (4)

Country Link
JP (1) JP6560463B1 (ko)
KR (1) KR102144040B1 (ko)
CN (1) CN109643152B (ko)
WO (1) WO2017209777A1 (ko)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11181973B2 (en) * 2019-05-09 2021-11-23 Apple Inc. Techniques related to configuring a display device
KR102362137B1 2019-10-30 2022-02-11 주식회사다스 BLDC motor system and drive device
US11054659B2 (en) * 2019-11-07 2021-07-06 Htc Corporation Head mounted display apparatus and distance measurement device thereof
JP7295045B2 (ja) * 2020-01-16 2023-06-20 株式会社コロプラ Program, computer-executed method, and computer
JP2022086027 (ja) * 2020-11-30 2022-06-09 株式会社電通 Information processing system
KR102501719B1 2021-03-03 2023-02-21 (주)자이언트스텝 Method and apparatus for generating facial animation using a learning model based on non-frontal images
WO2024071632A1 (ko) * 2022-09-30 2024-04-04 삼성전자 주식회사 Image display device for displaying a metaverse image and display method thereof
KR102547358B1 2022-11-15 2023-06-23 엠앤앤에이치 주식회사 Apparatus and method for avatar performance using volumetric video

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100290126A1 (en) * 2002-11-19 2010-11-18 Headplay (Barbados) Inc. Multiple Imaging Arrangements for Head Mounted Displays
US20130169683A1 (en) * 2011-08-30 2013-07-04 Kathryn Stone Perez Head mounted display with iris scan profiling
WO2013173531A1 (en) * 2012-05-16 2013-11-21 Keane Brian E Synchronizing virtual actor's performances to a speaker's voice
US20140118357A1 (en) * 2012-10-26 2014-05-01 The Boeing Company Virtual Reality Display System
US20160054791A1 (en) * 2014-08-25 2016-02-25 Daqri, Llc Navigating augmented reality content with a watch

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6850872B1 (en) * 2000-08-30 2005-02-01 Microsoft Corporation Facial image processing methods and systems
EP1599830A1 (en) * 2003-03-06 2005-11-30 Animetrics, Inc. Generation of image databases for multifeatured objects
US8752963B2 (en) * 2011-11-04 2014-06-17 Microsoft Corporation See-through display brightness control
JP6066676B2 (ja) * 2012-11-06 2017-01-25 株式会社ソニー・インタラクティブエンタテインメント ヘッドマウントディスプレイおよび映像提示システム
US10203762B2 (en) * 2014-03-11 2019-02-12 Magic Leap, Inc. Methods and systems for creating virtual and augmented reality

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113519154A (zh) * 2019-03-13 2021-10-19 Hewlett-Packard Development Company, L.P. Detecting eye tracking calibration errors
US11973927B2 (en) 2019-03-13 2024-04-30 Hewlett-Packard Development Company, L.P. Detecting eye tracking calibration errors

Also Published As

Publication number Publication date
JP2019525288A (ja) 2019-09-05
KR20190004806A (ko) 2019-01-14
CN109643152B (zh) 2020-03-13
CN109643152A (zh) 2019-04-16
KR102144040B1 (ko) 2020-08-13
JP6560463B1 (ja) 2019-08-14

Similar Documents

Publication Publication Date Title
US9959678B2 (en) Face and eye tracking using facial sensors within a head-mounted display
EP3252566B1 (en) Face and eye tracking and facial animation using facial sensors within a head-mounted display
US11604509B1 (en) Event camera for eye tracking
US10257507B1 (en) Time-of-flight depth sensing for eye tracking
US10430988B2 (en) Facial animation using facial sensors within a head-mounted display
US10614577B1 (en) Eye tracking system with single point calibration
CN109643152B (zh) Face and eye tracking and facial animation using facial sensors within a head-mounted display
US10268290B2 (en) Eye tracking using structured light
KR102062658B1 (ko) 안구 모델을 생성하기 위한 각막의 구체 추적
US10025384B1 (en) Eye tracking architecture for common structured light and time-of-flight framework
US10401625B2 (en) Determining interpupillary distance and eye relief of a user wearing a head-mounted display
US10684674B2 (en) Tracking portions of a user's face uncovered by a head mounted display worn by the user
US10120442B2 (en) Eye tracking using a light field camera on a head-mounted display
US10529113B1 (en) Generating graphical representation of facial expressions of a user wearing a head mounted display accounting for previously captured images of the user's facial expressions
US10725537B2 (en) Eye tracking system using dense structured light patterns
US10878594B1 (en) Boundary region glint tracking
US10957059B1 (en) Multi-pattern depth camera assembly
US10109067B2 (en) Corneal sphere tracking for generating an eye model
US10789777B1 (en) Generating content for presentation by a head mounted display based on data captured by a light field camera positioned on the head mounted display
US11435820B1 (en) Gaze detection pipeline in an artificial reality system
US10509467B1 (en) Determining fixation of a user's eyes from images of portions of the user's face enclosed by a head mounted display
US10495882B1 (en) Positioning cameras in a head mounted display to capture images of portions of a face of a user

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16904226

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018563028

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20187037042

Country of ref document: KR

Kind code of ref document: A

122 Ep: pct application non-entry in european phase

Ref document number: 16904226

Country of ref document: EP

Kind code of ref document: A1