WO2025019441A1 - Generation of object-based multi-sensory content for virtual worlds - Google Patents
Generation of object-based multi-sensory content for virtual worlds
- Publication number
- WO2025019441A1 (PCT/US2024/038077)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- virtual world
- content
- data
- examples
- objects
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/25—Output arrangements for video game devices
- A63F13/28—Output arrangements for video game devices responding to control signals received from the game device for affecting ambient conditions, e.g. for vibrating players' seats, activating scent dispensers or affecting temperature or light
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/016—Input arrangements with force or tactile feedback as computer generated output to the user
Definitions
- the present disclosure relates to providing multi-sensory (MS) experiences, and is more specifically directed to providing MS experiences corresponding to virtual worlds.
- Luminaires, for example, are used extensively as an expression of art and function for concerts. However, each installation is designed specifically for a unique set of luminaires. Delivering a lighting design beyond the set of fixtures the system was designed for is generally not feasible. Other systems that attempt to deliver light experiences more broadly do so simply by extending the screen visuals algorithmically; such experiences are not specifically authored.
- Haptics content is designed for a specific haptics apparatus. If another device, such as a game controller, mobile phone or even a different brand of haptics device is used, there has been no way to translate the creative intent of content to the different actuators.
- At least some aspects of the present disclosure may be implemented via methods, such as audio processing methods.
- the methods may be implemented, at least in part, by a control system such as those disclosed herein.
- Some such methods involve receiving, by a control system, virtual world data.
- the virtual world data may include virtual world object data corresponding to one or more virtual world objects.
- Some disclosed methods involve receiving, by the control system, virtual world state data corresponding to a virtual world state.
- Some disclosed methods involve generating, by the control system, object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data.
- Some disclosed methods involve providing, by the control system, the object-based MS content to an MS renderer.
- the object-based MS content may be, or may include, light-based content, haptic content, air flow content or combinations thereof.
- At least some of the virtual world data may correspond to virtual world object properties of one or more virtual world objects.
- at least some of the virtual world data may correspond to virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof.
- the virtual world state may be a state of a virtual world in which the one or more virtual world objects exist.
- Some disclosed methods may involve rendering, by the MS renderer, the object-based MS content to one or more control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays.
- the rendering may involve synchronizing the object-based MS content with the video data.
- audio data corresponding to the virtual world may be reproduced in the real-world environment.
- the rendering may involve synchronizing the object-based MS content with the audio data.
- Some disclosed methods may involve analyzing, by the control system and prior to the rendering, the virtual world data to determine rendering parameters corresponding to the virtual world data. According to some such examples, the rendering may be based, at least in part, on the rendering parameters.
- the rendering parameters may include scaling parameters.
- the scaling parameters may be based, at least in part, on maximum virtual world object parameter values.
- the one or more virtual world objects may include one or more virtual world entities, one or more interactive non-entity virtual world objects, one or more virtual world sound sources, one or more virtual world light sources, or combinations thereof.
- the virtual world state may be based, at least in part, on a state of a game being played in the virtual world, on one or more player objectives, on a geometry of a virtual world environment, on a virtual world time of day, on a virtual world season, or combinations thereof.
- Some disclosed methods may involve receiving, by the control system, virtual world event data associated with the virtual world state and with one or more virtual world objects, wherein generating the object-based MS content may be based, at least in part, on the virtual world event data.
- Some disclosed methods may involve adding MS content metadata to the one or more virtual world objects.
- the MS content metadata may correspond to the object-based MS content.
- Some disclosed methods may involve analyzing, by the control system, virtual world object content of the one or more virtual world objects.
- the virtual world object content may be, or may include, virtual world audio content, virtual world video content, or combinations thereof.
- generating the object-based MS content may be based, at least in part, on one or more results of the analyzing process.
- analyzing the virtual world object content may involve analyzing an envelope of a virtual world audio signal. Some disclosed methods may involve temporally aligning generated MS content with the virtual world audio signal.
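The following is a minimal, illustrative sketch of the envelope-analysis idea described above: the envelope of a virtual world audio signal drives the intensity of generated light content, with each generated value keyed to the audio timestamp so the MS content stays temporally aligned. The function names and the simple RMS follower are assumptions for illustration, not the disclosed implementation.

```python
# A minimal sketch (not the disclosed implementation) of deriving light-object
# intensity from the envelope of a virtual world audio signal, so that the
# generated MS content is temporally aligned with the audio.
import numpy as np

def audio_envelope(signal: np.ndarray, sample_rate: int, hop: int = 512):
    """Return (timestamps_seconds, envelope) using a simple RMS follower."""
    frames = [signal[i:i + hop] for i in range(0, len(signal) - hop, hop)]
    env = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    times = np.arange(len(env)) * hop / sample_rate
    return times, env / (env.max() or 1.0)

def light_objects_from_envelope(times: np.ndarray, env: np.ndarray) -> list[dict]:
    """Emit one object-based light 'keyframe' per analysis frame, keyed by timestamp."""
    return [{"type": "light", "timestamp": float(t), "intensity": float(e)}
            for t, e in zip(times, env)]

if __name__ == "__main__":
    sr = 48_000
    t = np.linspace(0, 1.0, sr, endpoint=False)
    explosion = np.sin(2 * np.pi * 60 * t) * np.exp(-4 * t)   # decaying rumble
    times, env = audio_envelope(explosion, sr)
    print(light_objects_from_envelope(times, env)[:3])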
- analyzing the virtual world object content may involve analyzing a virtual world object portion of a virtual world video signal corresponding to a virtual world object, analyzing a context of the virtual world object portion, or combinations thereof.
- Some such examples also may involve estimating, by the control system, a creative intent corresponding to the virtual world object.
- generating the object-based MS content may be based, at least in part, on the estimated creative intent.
- analyzing the virtual world object content may involve a virtual light source object analysis of one or more virtual world light source objects.
- generating the object-based MS content may be based, at least in part, on the virtual light source object analysis.
- Some disclosed methods may involve determining at least one of a player position or a player viewpoint.
- generating the object-based MS content may be based, at least in part, on the player position, the player viewpoint, or a combination thereof.
- generating the object-based MS content may be based, at least in part, on a machine learning process implemented by the control system.
- generating the object-based MS content may be performed by a neural network implemented by the control system.
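As a rough orientation to the flow described in the preceding paragraphs, the sketch below shows one hypothetical way a control system might receive virtual world object data and state data, generate object-based MS content from them, and hand that content to an MS renderer. All class, field and rule choices here are assumptions; the disclosure leaves the generation logic open (including machine learning approaches).

```python
# A minimal sketch, under assumed names, of the flow described above: receive
# virtual world object data and virtual world state data, generate
# object-based MS content, and provide it to an MS renderer.
from dataclasses import dataclass, field

@dataclass
class VirtualWorldObject:
    object_id: str
    object_type: str                    # e.g. "light_source", "entity", "sound_source"
    properties: dict = field(default_factory=dict)

@dataclass
class VirtualWorldState:
    time_of_day: str = "day"
    game_state: str = "exploring"

def generate_ms_content(objects, state):
    """Map virtual world objects plus world state to object-based MS content (dicts)."""
    content = []
    for obj in objects:
        if obj.object_type == "light_source":
            # Illustrative state dependence: dim virtual light sources at night.
            gain = 0.4 if state.time_of_day == "night" else 1.0
            content.append({
                "effect": "light",
                "metadata": {
                    "position": obj.properties.get("position", (0.0, 0.0, 0.0)),
                    "color": obj.properties.get("color", (1.0, 1.0, 1.0)),
                    "intensity": gain * obj.properties.get("intensity", 1.0),
                },
            })
    return content

def provide_to_ms_renderer(content):
    for sensory_object in content:      # stand-in for handing content to an MS renderer
        print(sensory_object)

provide_to_ms_renderer(generate_ms_content(
    [VirtualWorldObject("lamp_01", "light_source", {"intensity": 0.8})],
    VirtualWorldState(time_of_day="night")))
```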
- Some or all of the operations, functions and/or methods described herein may be performed by one or more devices according to instructions (e.g., software) stored on one or more computer-readable non-transitory media.
- Such non-transitory media may include one or more memory devices such as those described herein, including but not limited to one or more random access memory (RAM) devices, read-only memory (ROM) devices, etc. Accordingly, some innovative aspects of the subject matter described in this disclosure can be implemented in one or more computer-readable non-transitory media having software stored thereon.
- an apparatus may include an interface system and a control system.
- the control system may include one or more general purpose single- or multi-chip processors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or other programmable logic devices, discrete gates or transistor logic, discrete hardware components, or combinations thereof.
- the control system may be configured to perform some or all of the disclosed methods.
- Figure 1A is a block diagram that shows examples of components of an apparatus capable of implementing various aspects of this disclosure.
- Figure 1B shows example elements of an endpoint.
- Figure 2 shows examples of actuator elements.
- Figure 3 shows example elements of a system for the creation and playback of multi-sensory (MS) experiences.
- Figure 4 shows example elements of a system for the creation and playback of multi-sensory experiences corresponding to virtual worlds.
- Figure 5 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds.
- Figure 6 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds.
- Figure 7 is a flow diagram that outlines one example of a method that may be performed by an apparatus or system such as those disclosed herein.
- This disclosure describes devices, systems and methods for analyzing the scenes, objects and context of virtual worlds and generating object-based multi-sensory (MS) content.
- object-based MS data may be generalized for a wide range of playback environments with a wide range of actuator types, numbers of actuators, etc.
- object-based MS data may include object-based light data, object-based haptic data, object-based air flow data, or object-based positional actuator data, object-based olfactory data, object-based smoke data, object based data for one or more other types of sensor effects, or combinations thereof.
- the object-based MS data may include sensory objects and corresponding sensory metadata.
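The sketch below illustrates, under assumed names, what object-based MS data of the kinds listed above could look like as a data structure: an effect type drawn from the listed modalities plus corresponding sensory metadata, with no reference to any particular actuator.

```python
# A minimal sketch, assuming hypothetical names, of object-based MS data:
# a sensory object carries an effect type plus corresponding sensory
# metadata, rather than any channel assignment to a particular actuator.
from dataclasses import dataclass, field
from enum import Enum

class EffectType(Enum):
    LIGHT = "light"
    HAPTIC = "haptic"
    AIR_FLOW = "air_flow"
    POSITIONAL = "positional"
    OLFACTORY = "olfactory"
    SMOKE = "smoke"

@dataclass
class SensoryMetadata:
    position: tuple = (0.0, 0.0, 0.0)   # spatial position within the environment
    size: float = 0.0                   # area/volume hint for rendering
    intensity: float = 1.0
    priority: int = 0
    layer: str = "default"
    timestamp: float = 0.0
    extra: dict = field(default_factory=dict)

@dataclass
class SensoryObject:
    effect: EffectType
    metadata: SensoryMetadata

wind_gust = SensoryObject(EffectType.AIR_FLOW,
                          SensoryMetadata(position=(-1.0, 0.2, 0.0), intensity=0.6))
print(wind_gust)
```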
- the term “virtual world” refers generally to a computer-simulated environment. In some examples, the virtual world may be populated by one or more avatars corresponding to — but not necessarily resembling — one or more people in the real world.
- the terms “MS content,” “MS objects,” etc. generally refer to content other than audio content or video content.
- MS content also may be referred to herein as “sensory content,” “sensory objects,” etc., in part because in some instances only one type of sensory content may be involved, such as only light-based content, only haptics-based content, etc. However, in other examples combinations of sensory content may be provided.
- Figure 1A is a block diagram that shows examples of components of an apparatus capable of implementing various aspects of this disclosure.
- the types and numbers of elements shown in Figure 1A are merely provided by way of example. Other implementations may include more, fewer and/or different types and numbers of elements.
- the apparatus 101 may be, or may include, a device that is configured for performing at least some of the methods disclosed herein, such as a smart audio device, a laptop computer, a cellular telephone, a tablet device, a smart home hub, etc.
- the apparatus 101 may be, or may include, a server that is configured for performing at least some of the methods disclosed herein.
- the apparatus 101 includes at least an interface system 105 and a control system 110.
- the control system 110 may be configured for performing, at least in part, the methods disclosed herein.
- the control system 110 may be configured for receiving virtual world data, including virtual world object data corresponding to one or more virtual world objects. At least some of the virtual world data may correspond to virtual world object properties of one or more virtual world objects. In some examples, at least some of the virtual world data may correspond to virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof.
- control system 110 may be configured for receiving virtual world state data corresponding to a virtual world state.
- the virtual world state may be a state of a virtual world in which the one or more virtual world objects exist.
- the virtual world state may be based, at least in part, on a state of a game being played in the virtual world, on one or more player objectives, on a geometry of a virtual world environment, on a virtual world time of day, on a virtual world season, or combinations thereof.
- control system 110 may be configured for generating object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data.
- the control system 110 may be configured for providing the object-based MS content to an MS renderer.
- the MS renderer may, in some examples, be implemented by another instance of the control system 110.
- the MS renderer may be configured to render the object-based MS content to control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays.
- the rendering may involve synchronizing the object-based MS content with the video data.
- audio data corresponding to the virtual world is also being reproduced in the real-world environment.
- the rendering may involve synchronizing the object-based MS content with the audio data.
- the audio data may be, or may include, object-based audio data.
- the object-based audio data may include audio signals and corresponding audio metadata.
- the audio data may be, or may include, channel-based audio data.
- the MS renderer also may be referred to herein as a “sensory renderer,” because in some instances the MS renderer may be rendering only one type of MS data, such as object-based lighting data.
- the MS renderer may be configured for providing the actuator control signals to one or more controllable actuators of the set of controllable actuators.
- the object-based MS content may include sensory objects and sensory object metadata, which may also be referred to herein as object-based sensory metadata.
- the sensory object metadata may include sensory spatial metadata indicating at least a spatial position for rendering the object-based sensory metadata within the environment, an area or volume for rendering the object-based sensory metadata within the environment, or combinations thereof.
- the object-based sensory metadata does not correspond to any particular sensory actuator in the environment.
- the interface system 105 may include one or more network interfaces and/or one or more external device interfaces (such as one or more universal serial bus (USB) interfaces). According to some implementations, the interface system 105 may include one or more wireless interfaces. The interface system 105 may include one or more devices for implementing a user interface, such as one or more microphones, one or more speakers, a display system, a touch sensor system and/or a gesture sensor system. In some examples, the interface system 105 may include one or more interfaces between the control system 110 and a memory system, such as the optional memory system 115 shown in Figure 1A. However, the control system 110 may include a memory system in some instances.
- the control system 110 may, for example, include a general purpose single- or multichip processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, and/or discrete hardware components.
- control system 110 may reside in more than one device.
- a portion of the control system 110 may reside in a device within an environment (such as a laptop computer, a tablet computer, a smart audio device, etc.) and another portion of the control system 110 may reside in a device that is outside the environment, such as a server.
- a portion of the control system 110 may reside in a device within an environment and another portion of the control system 110 may reside in one or more other devices of the environment.
- Non-transitory media may include memory devices such as those described herein, including but not limited to random access memory (RAM) devices, read-only memory (ROM) devices, etc.
- the one or more non-transitory media may, for example, reside in the optional memory system 115 shown in Figure 1A and/or in the control system 110. Accordingly, various innovative aspects of the subject matter described in this disclosure can be implemented in one or more non-transitory media having software stored thereon.
- the software may, for example, include instructions for controlling at least one device to process audio data.
- the software may, for example, be executable by one or more components of a control system such as the control system 110 of Figure 1A.
- the apparatus 101 may include the optional microphone system 120 shown in Figure 1A.
- the optional microphone system 120 may include one or more microphones.
- one or more of the microphones may be part of, or associated with, another device, such as a speaker of the speaker system, a smart audio device, etc.
- the apparatus 101 may include the optional actuator system 125 shown in Figure 1A.
- the optional actuator system 125 may include one or more loudspeakers, one or more haptic devices, one or more light fixtures, also referred to herein as luminaires, one or more fans or other air-moving devices, one or more display devices, including but not limited to one or more televisions, one or more positional actuators, one or more other types of devices for providing an MS experience, or combinations thereof.
- the term “light fixture” as used herein refers generally to any actuator that is configured to provide light.
- light fixture encompasses various types of light sources, including individual light sources such as light bulbs, groups of light sources such as light strips, light panels such as light-emitting diode (LED) panels, projectors, display devices such as television (TV) screens, etc.
- a “light fixture” may be moveable, and therefore the word “fixture” in this context does not mean that a light fixture is necessarily in a fixed position in space.
- positional actuators refers generally to devices that are configured to change a position or orientation of a person or object, such as motion simulator seats.
- Loudspeakers may sometimes be referred to herein as “speakers.”
- the optional actuator system 125 may include a display system including one or more displays, such as one or more light-emitting diode (LED) displays, one or more organic light-emitting diode (OLED) displays, etc.
- the optional sensor system 130 may include a touch sensor system and/or a gesture sensor system proximate one or more displays of the display system.
- the control system 110 may be configured for controlling the display system to present a graphical user interface (GUI), such as a GUI related to implementing one of the methods disclosed herein.
- the apparatus 101 may include the optional sensor system 130 shown in Figure 1A.
- the optional sensor system 130 may include a touch sensor system, a gesture sensor system, one or more cameras, etc.
- This disclosure describes methods for rendering and delivering a flexibly scaled multi-sensory (MS) immersive experience (MSIE) to different playback environments, which also may be referred to herein as endpoints.
- endpoints may include a room, such as the living room of a home, a car, a cinema, a night club or other venue, an AR/VR headset, a PC, a mobile device, etc.
- Figure 1B shows example elements of an endpoint.
- the endpoint is a living room 1001 containing multiple actuators 008, some furniture 1010 and a person 1000 — also referred to herein as a user — who will consume a flexibly-scaled MS experience.
- Actuators 008 are devices capable of altering the environment 1001 that the user 1000 is in. Actuators 008 may include one or more haptic devices, one or more light fixtures, also referred to herein as luminaires, one or more fans or other air-moving devices, one or more display devices, including but not limited to one or more televisions, one or more positional actuators, one or more other types of devices for providing an MS experience, or combinations thereof.
- the number of actuators 008, the arrangement of actuators 008 and the capabilities of actuators 008 in the space 1001 may vary significantly between different endpoint types.
- the number, arrangement and capabilities of actuators 008 in a car will generally be different from the number, arrangement and capabilities of actuators 008 in a living room, a night club, etc.
- the number, arrangement and/or capabilities of actuators 008 may vary significantly between different instances of the same type, e.g., between a small living room with 2 actuators 008 and a large living room with 16 actuators 008.
- the present disclosure describes various methods for creating and delivering flexibly-scaled MSIEs to these non-homogenous endpoints.
- Figure 2 shows examples of actuator elements.
- the actuator is a luminaire 1100, which includes a network module 1101, a control module 1102 and a light emitter 1103.
- the light emitter 1103 includes one or more light-emitting devices, such as light-emitting diodes, which are configured to emit light into an environment in which the luminaire 1100 resides.
- the network module 1101 is configured to provide network connectivity to one or more other devices in the space, such as a device that sends commands to control the emission of light by the luminaire 1100.
- the network module 1101 is an instance of the interface system 105 of Figure 1A.
- the control module 1102 is configured to receive signals via the network module 1101 and to control the light emitter 1103 accordingly.
- the control module 1102 is an instance of the control system 110 of Figure 1A.
- actuators also may include a network module 1101 and a control module 1102, but may include other types of actuating elements. Some such actuators may include one or more haptic devices, one or more fans or other air-moving devices, one or more positional actuators, one or more loudspeakers, one or more display devices, etc.
- Figure 3 shows example elements of a system for the creation and playback of multi-sensory (MS) experiences.
- system 300 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein.
- system 300 may include one or more instances of the control system 110 of Figure 1A that are configured for performing at least some of the methods disclosed herein.
- creating and providing an object-based MS Immersive Experience (MSIE) involves the application of a suite of technologies for the creation, delivery and rendering of object-based sensory data, which may include sensory objects and corresponding sensory metadata, to the actuators 008.
- multi-sensory (MS) effects are represented using what may be referred to herein as multi-sensory (MS) objects, or simply as “sensory objects.”
- properties such as layer type and priority may be assigned to and associated with each sensory object, enabling content creators’ intent to be represented in the rendered experiences. Detailed examples of sensory object properties are described below.
- system 300 includes a content creation tool 000 that is configured for designing multi-sensory (MS) immersive content and for outputting object-based sensory data 005, either separately or in conjunction with corresponding audio data 011 and/or video data 012, depending on the particular implementation.
- the object-based sensory data 005 may include time stamp information, as well as information indicating the type of sensory object, the sensory object properties, etc.
- the object-based sensory data 005 is not “channel-based” data that corresponds to one or more particular sensory actuators in a playback environment, but instead is generalized for a wide range of playback environments with a wide range of actuator types, numbers of actuators, etc.
- the object-based sensory data 005 may include object-based light data, object-based haptic data, object-based air flow data, or object-based positional actuator data, object-based olfactory data, object-based smoke data, object based data for one or more other types of sensor effects, or combinations thereof.
- the object-based sensory data 005 may include sensory objects and corresponding sensory metadata.
- the object-based light data may include light object position metadata, light object color metadata, light object size metadata, light object intensity metadata, light object shape metadata, light object diffusion metadata, light object gradient metadata, light object priority metadata, light object layer metadata, or combinations thereof.
- the object-based sensory data 005 may include time data, such as time stamp information.
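For concreteness, here is one hypothetical serialization of a single object-based light object carrying the metadata fields listed above together with a time stamp. The key names are illustrative, not a published schema.

```python
# An illustrative (non-normative) serialization of one object-based light
# object with the metadata fields listed above plus a time stamp.
import json

light_object = {
    "effect": "light",
    "timestamp": 12.480,                   # seconds into the content
    "metadata": {
        "position": [0.0, 0.9, -1.0],      # room-relative coordinates
        "color": {"r": 1.0, "g": 0.45, "b": 0.1},
        "size": 0.25,
        "intensity": 0.8,
        "shape": "ellipse",
        "diffusion": 0.6,
        "gradient": "radial",
        "priority": 3,
        "layer": "ambiance",
    },
}
print(json.dumps(light_object, indent=2))
```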
- the content creation tool 000 is shown providing a stream of object-based sensory data 005 to the experience player 002 in this example, in alternative examples the content creation tool 000 may produce object-based sensory data 005 that is stored for subsequent use. Examples of control parameters exposed by a light-object-based content creation tool are described below.
- the “effect” of an MS object is a synonym for the type of MS object.
- An “effect” is, or indicates, the sensory effect that the MS object is providing. If an MS object is a light object, its effect will involve providing direct or indirect light. If an MS object is a haptic object, its effect will involve providing some type of haptic feedback. If an MS object is an air flow object, its effect will involve providing some type of air flow. As described in more detail below, some examples involve other “effect” categories.
- Some MS objects may contain a persistence property in their metadata. For example, as a moveable MS object moves around in a scene, the moveable MS object may persist for some period of time at locations that the moveable MS object passes through. That period of time may be indicated by persistence metadata.
- the MS renderer is responsible for constructing and maintaining the persistence state.
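A minimal sketch of the persistence behaviour described above, with assumed names: as a moveable MS object passes through locations, the renderer records each visited location and keeps it active until the interval given by the object's persistence metadata expires.

```python
# A minimal sketch, with assumed names, of a renderer-maintained persistence
# state: visited locations stay active until their persistence interval expires.
from dataclasses import dataclass

@dataclass
class PersistenceEntry:
    position: tuple
    intensity: float
    expires_at: float

class PersistenceState:
    def __init__(self) -> None:
        self._entries: list[PersistenceEntry] = []

    def note(self, position: tuple, intensity: float, now: float, persistence_s: float) -> None:
        self._entries.append(PersistenceEntry(position, intensity, now + persistence_s))

    def active(self, now: float) -> list[PersistenceEntry]:
        self._entries = [e for e in self._entries if e.expires_at > now]
        return self._entries

state = PersistenceState()
state.note((0.0, 0.0, 0.0), 1.0, now=0.0, persistence_s=2.0)   # object was here at t=0
state.note((0.5, 0.0, 0.0), 1.0, now=1.0, persistence_s=2.0)   # ...and here at t=1
print(len(state.active(now=1.5)), "locations still active at t=1.5")
print(len(state.active(now=2.5)), "locations still active at t=2.5")
```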
- Layers
- According to some examples, individual MS objects may be assigned to “layers,” in which MS objects are grouped together according to one or more shared characteristics. For example, layers may group MS objects together according to their intended effect or type, which may include but are not limited to the following:
- layers may be used to group MS objects together according to shared properties, which may include but are not limited to the following:
- MS objects may have a priority property that enables the renderer to determine which object(s) should take priority in an environment in which MS objects are contending for limited actuators. For example, if multiple light objects overlap with a single light fixture at a time during which all of the light objects are scheduled to be rendered, a renderer may refer to the priority of each light object in order to determine which light object(s) will be rendered.
- priority may be defined between layers or within layers. According to some examples, priority may be linked to specific properties such as intensity. In some examples, priority may be defined temporally: for example, the most recent MS object to be rendered may take precedence over MS objects that have been rendered earlier. According to some examples, priority may be used to specify MS objects or layers that should be rendered regardless of the limitations of a particular actuator system in a playback environment.
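The snippet below sketches one way the priority behaviour described above could be resolved when several light objects contend for a single fixture: pick the highest-priority object and break ties by recency. The field names and the tie-breaking rule are assumptions.

```python
# A minimal sketch, under assumed names, of priority-based contention
# resolution: highest priority wins, ties broken by most recent start time.
def resolve_contention(light_objects: list[dict]) -> dict:
    """Each dict is assumed to carry 'priority' and 'start_time' fields."""
    return max(light_objects, key=lambda o: (o["priority"], o["start_time"]))

contenders = [
    {"name": "ambient_wash", "priority": 1, "start_time": 10.0},
    {"name": "explosion_flash", "priority": 5, "start_time": 12.3},
    {"name": "ui_notification", "priority": 5, "start_time": 12.9},
]
print(resolve_contention(contenders)["name"])   # -> "ui_notification"
```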
- Spatial panning laws may define how an MS object’s position and movement across a space results in activating MS actuators as the MS object moves between the MS actuators, etc.
- the mixing mode may specify how multiple MS objects are multiplexed onto a single actuator.
- mixing modes may include one or more of the following:
- Max mode: select the MS object which activates an actuator the most;
- Mix mode: mix in some or all of the objects according to a rule set, for example by summing activation levels, taking the average of activation levels, mixing color according to activation level or priority level, etc.;
- MaxNmix: mix in the top N MS objects (by activation level), according to a rule set.
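A minimal sketch of the three mixing modes listed above, applied to the activation levels that several MS objects request from one actuator. The sum-and-clip rule used for the Mix mode is only one of the rule sets the text mentions.

```python
# A minimal sketch of the mixing modes listed above, for one actuator.
# Activation levels are assumed to be normalized floats.
def max_mode(activations: list[float]) -> float:
    return max(activations)

def mix_mode(activations: list[float]) -> float:
    return min(1.0, sum(activations))          # sum activation levels, clipped

def max_n_mix(activations: list[float], n: int = 2) -> float:
    top_n = sorted(activations, reverse=True)[:n]
    return min(1.0, sum(top_n))                # mix only the top N objects

levels = [0.2, 0.7, 0.4]                       # three MS objects hitting one actuator
print(max_mode(levels), mix_mode(levels), max_n_mix(levels, n=2))
```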
- MS content files may include metadata such as trim passes or mastering environment.
- trim controls may act as guidance on how to modulate the default rendering algorithm for specific environments or conditions at the endpoint.
- Trim controls may specify ranges and/or default values for various properties, including saturation, tone detail, gamma, etc.
- there may be automotive trim controls which provide specific defaults and/or rule sets for rendering in automotive environments, for example guidance that includes only objects of a certain priority or layer.
- Other examples may provide trim controls for environments with limited, complex or sparse multisensory actuators.
- a single piece of multisensory content may include metadata on the properties of the mastering environment such as room size, reflectivity and ambient bias lighting level. The specific properties may differ depending on the desired endpoint actuators. Mastering environment information can aid in providing reference points for rendering in a playback environment.
- MS Object Renderer
- Various disclosed implementations provide a renderer that is configured to render MS effects to actuators in a playback environment.
- system 300 includes an MS renderer 001 that is configured to render object-based sensory data 005 to actuator control signals 310, based at least in part on environment and actuator data 004.
- the MS renderer 001 is configured to output the actuator control signals 310 to MS controllers 003, which are configured to control the actuators 008.
- the MS renderer 001 may be configured to receive light objects and object-based lighting metadata indicating an intended lighting environment, as well as lighting information regarding a local lighting environment.
- the lighting information is one general type of environment and actuator data 004, and may include one or more characteristics of one or more controllable light sources in the local lighting environment.
- the MS renderer 001 may be configured to determine a drive level for each of the one or more controllable light sources that approximates the intended lighting environment.
- the MS renderer 001 (or one of the MS controllers 003) may be configured to output the drive level to at least one of the controllable light sources.
- Some alternative examples may include a separate renderer for each type of actuator 008, such as one renderer for light fixtures, another renderer for haptic devices, another renderer for air flow devices, etc.
- a single renderer may be configured as an MS renderer and as an audio renderer and/or as a video renderer.
- the MS renderer 001 may be configured to adapt to changing conditions.
- the environment and actuator data 004 may include what are referred to herein as “room descriptors” that describe actuator locations (e.g., according to an x,y,z coordinate system or a spherical coordinate system).
- the environment and actuator data 004 may indicate actuator orientation and/or placement properties (e.g., directional and north-facing, omnidirectional, occlusion information, etc.).
- the environment and actuator data 004 may indicate actuator orientation and/or placement properties according to a 3x3 matrix, in which three elements (for example, the elements of the first row) represent spatial position (x,y,z), three other elements (for example, the elements of the second row) represent orientation (roll, pitch, yaw), and three other elements (for example, the elements of the third row) indicate a scale or size (sx, sy, sz).
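The sketch below builds the 3x3 actuator descriptor described above, with row 0 holding spatial position (x, y, z), row 1 holding orientation (roll, pitch, yaw) and row 2 holding scale (sx, sy, sz). The helper accessors and the example values are assumptions.

```python
# A minimal sketch of the 3x3 actuator descriptor described above:
# row 0 = position (x, y, z), row 1 = orientation (roll, pitch, yaw),
# row 2 = scale/size (sx, sy, sz).
import numpy as np

def actuator_descriptor(position, orientation, scale) -> np.ndarray:
    return np.array([position, orientation, scale], dtype=float)

def position_of(descriptor: np.ndarray) -> np.ndarray:
    return descriptor[0]

def orientation_of(descriptor: np.ndarray) -> np.ndarray:
    return descriptor[1]

# A ceiling-mounted, downward-facing light panel, 0.6 m x 0.6 m (illustrative):
panel = actuator_descriptor(position=(1.5, 2.4, 0.0),
                            orientation=(0.0, -90.0, 0.0),   # pitch down, degrees
                            scale=(0.6, 0.6, 0.01))
print(position_of(panel), orientation_of(panel))
```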
- the environment and actuator data 004 may include device descriptors that describe the actuator properties relevant to the MS renderer 001, such as intensity range and color gamut of a light fixture, the air flow speed range and direction(s) for an air-moving device, etc.
- system 300 includes an experience player 002 that is configured to receive object-based sensory data 005’, audio data 011’ and video data 012’, and to provide object-based sensory data 005 to the MS renderer 001, to provide audio data 011 to the audio renderer 006 and to provide video data 012 to the video renderer 007.
- the reference numbers for the object-based sensory data 005’, audio data 011’ and video data 012’ received by the experience player 002 include primes (‘), in order to suggest that the data may in some instances be encoded.
- the object-based sensory data 005, audio data 011 and video data 012 output by the experience player 002 do not include primes, in order to suggest that the data may in some instances have been decoded by the experience player 002.
- the experience player 002 may be a media player, a game engine or personal computer or mobile device, or a component integrated in a television, DVD player, sound bar, set top box, or a service provider media device such as a Chromecast, Apple TV device, or Amazon Fire TV.
- the experience player 002 may be configured to receive encoded object-based sensory data 005’ along with encoded audio data 011’ and/or encoded video data 012’.
- the encoded object-based sensory data 005’ may be received as part of the same bitstream with the encoded audio data 011’ and/or the encoded video data 012’.
- the experience player 002 may be configured to extract the object-based sensory data 005 from the content bitstream and to provide decoded object-based sensory data 005 to the MS renderer 001, to provide decoded audio data 011 to the audio renderer 006 and to provide decoded video data 012 to the video renderer 007.
- time stamp information in the object-based sensory data 005 may be used — for example, by the experience player 002, the MS renderer 001, the audio renderer 006, the video renderer 007, or all of them — to synchronize effects relating to the object-based sensory data 005 with the audio data 011 and/or the video data 012, which may also include time stamp information.
- system 300 includes MS controllers 003 that are configured to communicate with a variety of actuator types using application program interfaces (APIs) or one or more similar interfaces.
- each actuator will require a specific type of control signal to produce the desired output from the renderer.
- the MS controllers 003 are configured to map outputs from the MS renderer 001 to control signals for each actuator.
- a Philips Hue™ light bulb receives control information in a particular format to turn the light on, with a particular saturation, brightness and hue, and a digital representation of the desired drive level.
- the MS renderer 001 also may be configured to implement some or all of the MS controllers 003.
- the MS renderer 001 also may be configured to implement one or more lighting-based APIs but not haptic-based APIs, or vice versa.
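As an illustration of the controller role described above, the sketch below maps a device-agnostic renderer output (an RGB colour and a drive level) into a message payload for a hypothetical smart bulb. The payload keys and value scales are placeholders and do not reproduce any actual product API.

```python
# A minimal sketch, with invented payload fields, of an MS controller
# translating renderer output into a fixture-specific message. This is not
# the actual Philips Hue protocol; the keys are placeholders.
import colorsys

def renderer_output_to_bulb_payload(rgb: tuple, drive_level: float) -> dict:
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return {
        "on": drive_level > 0.0,
        "hue": round(h * 65535),              # example 16-bit hue scale
        "saturation": round(s * 254),         # example 8-bit-ish scale
        "brightness": round(drive_level * v * 254),
    }

print(renderer_output_to_bulb_payload(rgb=(1.0, 0.4, 0.1), drive_level=0.75))
```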
- room descriptors also may describe the size and orientation of the playback environment itself, to establish a relative or absolute coordinate system within which all objects are positioned.
- a display screen may be regarded as the front, in some instances the front and center, and the floor and ceiling may be regarded as the vertical bounds.
- the room descriptors also may indicate bounds corresponding to the left, right, front and rear walls relative to the front position.
- the room descriptor also may be provided in terms of a matrix, such as a 3x3 matrix. This room descriptor information is useful in describing the physical dimensions of the playback environment, for example in physical units of distance such as meters.
- sensory object locations, sensory object sizes, and sensory object orientations may be described in units that are relative to the room size, for example in a range from -1 to 1.
- Room descriptors may also describe a preferred viewing position, in some instances according to a matrix.
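A minimal sketch, under assumed conventions, of converting a sensory object position expressed in room-relative units (each axis spanning -1 to 1) into physical metres using the room dimensions carried by a room descriptor.

```python
# A minimal sketch, under assumed conventions: -1..1 spans the full extent of
# each room dimension, centred on the room, and the room descriptor supplies
# physical dimensions in metres.
def relative_to_metres(rel_xyz: tuple, room_size_m: tuple) -> tuple:
    return tuple((v * 0.5) * size for v, size in zip(rel_xyz, room_size_m))

room = (5.0, 4.0, 2.6)                      # width, depth, height in metres
print(relative_to_metres((-1.0, 1.0, 0.0), room))   # left wall, front, mid-height
```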
- actuators 008 may include lights and/or light strips (also referred to herein as “luminaires”), vibrational motors, air flow generators, positional actuators, or combinations thereof.
- audio data 011 and video data 012 are rendered by the audio renderer 006 and the video renderer 007 to the loudspeakers 009 and display devices 010, respectively.
- the system 300 may include one or more instances of the control system 110 of Figure 1A that are configured for performing at least some of the methods disclosed herein.
- one instance of the control system 110 may implement the content creation tool 000 and another instance of the control system 110 may implement the experience player 002.
- one instance of the control system 110 may implement the audio renderer 006, the video renderer 007, the multi-sensory renderer 001, or combinations thereof.
- an instance of the control system 110 that is configured to implement the experience player 002 may also be configured to implement the audio renderer 006, the video renderer 007, the multi-sensory renderer 001, or combinations thereof.
- Object-based MS rendering involves different modalities being rendered flexibly to the endpoint/playback environment. Endpoints have differing capabilities according to various factors, including but not limited to the following:
- the kinds of actuators present, e.g., light fixture vs. air flow control device vs. haptic device;
- the types of those actuators, e.g., a white smart light vs. an RGB smart light, or a haptic vest vs. a haptic seat cushion.
- Object-based haptics content conveys sensory aspects of the scene through an abstract sensory representation rather than a channel-based scheme only. For example, instead of defining haptics content as a single-channel time-dependent amplitude signal only, that is in turn played out of a particular haptics actuator such as a vibro-tactile motor in a vest the user wears, object-based haptics content may be defined by the sensations that it is intended to convey. More specifically, in one example, we may have a haptic object representing a collision haptic sensory effect. Associated with this object is:
- the haptic object’s spatial location
- a haptic object of this type may be created automatically in an interactive experience such as a video game, e.g. in a car racing game when another car hits a player’s car from behind.
- the MS renderer will determine how to render the spatial modality of this effect to the set of haptic actuators in the endpoint. In some examples, the renderer does this according to information about the following:
- the type of each haptic device, e.g., haptic vest vs. haptic glove vs. haptic seat cushion vs. haptic controller;
- the location of each haptic device with respect to the user(s) (some haptic devices may not be coupled to the user(s), e.g., a floor- or seat-mounted shaker);
- the type of feedback each haptic device provides, e.g., kinesthetic vs. vibro-tactile;
- a haptic vest may have dozens of addressable haptics actuators distributed over the user’s torso;
- haptic sensors used to render closed-loop haptic effects (e.g., an active force-feedback kinesthetic haptic device).
- the haptics modality of the endpoint will inform the renderer how best to render a particular haptic effect.
- a haptic shockwave effect is spatially located at the place where the car has collided into the player. The shockwave vector is dictated by the relative velocity of the player’s car and the car that has hit the player.
- the spatial and temporal frequency spectra of the shockwave effect are authored according to the type of material the virtual cars are intended to be made of, amongst other virtual world properties.
- the renderer then renders this shockwave through the set of haptics devices in the endpoint, according to the shockwave vector and the physical location of the haptics devices relative to the user.
- the signals sent to each specific actuator are preferably provided so that the sensory effect is congruent across all of the (potentially heterogeneous) actuators available.
- the renderer may not render very high frequencies to just one of the haptic actuators (e.g., the haptic arm band) due to capabilities lacking in other actuators. Otherwise, as the shockwave moves through the player’s body, because the haptic vest and haptic gloves the user is wearing do not have the capability to render such high frequencies, there would be a degradation of the haptic effect perceived by the user as the wave moves through the vest, into the arm band and finally into the gloves.
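The sketch below illustrates the congruence idea described above under assumed device capabilities: the renderer determines the frequency range that every available haptic actuator can reproduce and band-limits the rendered shockwave accordingly, so the effect does not degrade as it moves from one device to another.

```python
# A minimal sketch, with assumed device descriptors, of limiting rendered
# frequency content to the range every available haptic actuator can
# reproduce, keeping the effect congruent across heterogeneous devices.
import numpy as np

devices = {                                  # illustrative per-device frequency limits (Hz)
    "arm_band": {"f_max": 1000.0},
    "vest":     {"f_max": 250.0},
    "gloves":   {"f_max": 250.0},
}

def common_f_max(device_caps: dict) -> float:
    return min(d["f_max"] for d in device_caps.values())

def band_limit(signal: np.ndarray, sample_rate: int, f_max: float) -> np.ndarray:
    """Crude FFT-based low-pass so every device receives reproducible content."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs > f_max] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

sr = 8000
t = np.arange(sr) / sr
shockwave = np.sin(2 * np.pi * 80 * t) + 0.5 * np.sin(2 * np.pi * 600 * t)
congruent = band_limit(shockwave, sr, common_f_max(devices))
print("kept content up to", common_f_max(devices), "Hz for all devices")
```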
- Some types of abstract haptic effects include:
- Barrier effects, such as haptic effects used to represent spatial limitations of a virtual world, for example in a video game. If there are kinesthetic actuators on input devices (e.g., force feedback on a steering wheel or joystick), either active or resistive, then rendering of such an effect can be done through the resistive force applied to the user’s input. If no such actuators are available in the endpoint, then in some examples vibro-tactile feedback may be rendered that is congruent with the collision of the in-game avatar with a barrier;
- Presence effects, for example to indicate the presence of a large object approaching the scene, such as a train.
- This type of haptic effect may be rendered using a low time-frequency rumbling of some haptic devices’ actuators.
- This type of haptic effect may also be rendered through contact spatial feedback applied as pressure from aircuffs;
- User interface feedback such as clicks from a virtual button.
- this type of haptic effect may be rendered to the closest actuator on the body of the user that performed the click, for example haptic gloves that the user is wearing.
- this type of haptic effect may also be rendered to a shaker coupled to the chair in which the user is sitting.
- This type of haptic effect may, for example, be defined using time-dependent amplitude signals. However, such signals may be altered (modulated, frequency-shifted, etc.) in order to best suit the haptic device(s) that will be providing the haptic effect;
- Motion effects: haptic effects designed so that the user perceives some form of motion.
- These haptic effects may be rendered by an actuator that actually moves the user, e.g. a moving platform/seat.
- an actuator may provide a secondary modality (via video, for example) to enhance the motion being rendered;
- Triggered sequences: these haptic effects are characterized mainly by their time-dependent amplitude signals. Such signals may be rendered to multiple actuators and may be augmented when doing so. Such augmentations may include splitting a signal in either time or frequency across multiple actuators. Some examples may involve augmenting the signal itself so that the sum of the haptic actuator outputs does not match the original signal.
- Spatial effects are those which are constructed in a way that convey some spatial information of the multi-sensory scene being rendered. For example, if the playback environment is a room, a shockwave moving through the room would be rendered differently to each haptic device given its location within the room, according to the position and size of one or more haptic objects being rendered at a particular time.
- Non-spatial effects may, in some examples, target particular locations on the user regardless of the user’s location or orientation.
- a haptic device that provides a swelling vibration on the user’s back to indicate immediate danger.
- a haptic device that provides a sharp vibration to indicate an injury to a particular body area.
- Some effects may be non-diegetic effects. Such effects are typically associated with user interface feedback, such as a haptic sensation to indicate the user completed a level or has clicked a button on a menu item. Non-diegetic effects may be either spatial or non- spatial.
- haptics device data indicating that the user is wearing both haptic gloves and a vibro-tactile vest — or at least local haptics device data indicating that haptic gloves and a vibro-tactile vest are present in the playback environment — allows the renderer to render a congruent recoil effect across the two devices when a user shoots a gun in a virtual world.
- the actual actuator control signals sent to the haptic devices may be different than in the situation where only a single device is available. For example, if the user is only wearing a vest, the actuator control signals used to actuate the vest may differ with regard to the timing of the onset, the maximum amplitude, frequency and decay time of the actuator control signals, or combinations thereof.
- Haptic devices can provide a range of different actuations and thus perceived sensations. These are typically classed in two basic categories:
- Kinesthetic, e.g., resistive or active force feedback.
- Either category of actuations may be static or dynamic, where dynamic effects are altered in real time according to some sensor input. Examples include a touch screen rendering a texture using a vibro-tactile actuator and a position sensor measuring the user’s finger position(s).
- Moreover, the physical construction of such actuators varies widely and affects many other attributes of the device. An example of this is the onset delay or time-frequency response that varies significantly across the following haptic device types:
- the renderer should be configured to account for the onset delay of a particular haptics device type when rendering signals to be actuated by the haptics devices in the endpoint.
- the onset delay of the haptic device refers to the delay between the time that an actuator control signal is sent to the device and the device’s physical response.
- the offset delay refers to the delay between the time that an actuator control signal is sent to zero the output of the device and the time the device stops actuating.
- the time-frequency response refers to the frequency range of the signal amplitude as a function of time that the haptic device can actuate at steady state.
- the spatial-frequency response refers to the frequency range of the signal amplitude as a function of the spacing of actuators of a haptic device. Devices with closely-spaced actuators have higher spatial-frequency responses.
- Dynamic range refers to the differences between the minimum and maximum amplitude of the physical actuation.
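A minimal sketch, with invented numbers, of how the device characteristics listed above might be taken into account when scheduling an actuation: the control signal is issued early by the device's onset delay so the perceived effect lands at the intended time, and the amplitude is clipped to the device's dynamic range.

```python
# A minimal sketch, with assumed numbers, of compensating for a haptic
# device's onset delay and limited dynamic range when scheduling actuation.
from dataclasses import dataclass

@dataclass
class HapticDeviceProfile:
    name: str
    onset_delay_s: float        # command-to-motion latency
    min_amplitude: float
    max_amplitude: float

def schedule_actuation(profile: HapticDeviceProfile,
                       target_time_s: float, amplitude: float) -> dict:
    send_time = max(0.0, target_time_s - profile.onset_delay_s)
    level = min(max(amplitude, profile.min_amplitude), profile.max_amplitude)
    return {"device": profile.name, "send_at": send_time, "amplitude": level}

vest = HapticDeviceProfile("vibro_vest", onset_delay_s=0.020,
                           min_amplitude=0.05, max_amplitude=1.0)
print(schedule_actuation(vest, target_time_s=3.500, amplitude=1.4))
```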
- Airflow
- Another modality that some multi-sensory immersive experiences (MSIE) may use is airflow.
- the airflow may, for example, be rendered congruently with one or more other modalities such as audio, video, light-effects and/or haptics.
- some airflow effects may be provided at other endpoints that typically include airflow capabilities, such as a car or a living room.
- the airflow sensory effects may be represented as an airflow object that may include properties such as:
- Some air flow objects may, for example, be used to represent the movement of a bird flying past.
- the MS renderer 001 may be provided with information regarding:
- the type of airflow devices, e.g., fan, air conditioning, heating;
- the capabilities of the airflow device, e.g., the airflow device’s ability to control direction, airflow and temperature;
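The sketch below shows one hypothetical way a renderer could use this device information to render a directional airflow object: each individually controllable vent is driven in proportion to how closely its outflow direction matches the airflow object's direction, scaled by the object's intensity. Vent names and directions are assumptions.

```python
# A minimal sketch, with assumed vent data, of rendering a directional
# airflow object to individually controllable vents.
import numpy as np

def render_airflow(direction: np.ndarray, intensity: float, vents: dict) -> dict:
    direction = direction / np.linalg.norm(direction)
    drives = {}
    for name, vent_dir in vents.items():
        alignment = float(np.dot(direction, vent_dir / np.linalg.norm(vent_dir)))
        drives[name] = round(intensity * max(0.0, alignment), 3)
    return drives

vents = {"dash_left": np.array([1.0, 0.0, 0.0]),
         "dash_right": np.array([-1.0, 0.0, 0.0]),
         "footwell": np.array([0.0, 0.0, -1.0])}
print(render_airflow(np.array([0.9, 0.0, -0.4]), intensity=0.8, vents=vents))
```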
- the object-based metadata can be used to create experiences such as:
- temperature changes may be possible to achieve over relatively shorter periods of time — as compared to temperature changes in a larger environment, such as a living room environment.
- the MS renderer 001 may cause an increasing air temperature as a player enters a “lava level” or other hot area during a game.
- Some examples may include other elements, such as confetti in the air vents to celebrate an event, such as the celebration of a goal made by the user’s favorite football team.
- airflow may be synchronized to the breathing rhythm of a guided meditation in one example.
- airflow may be synchronized to the intensity of a workout, with increased airflow or decreased temperature as intensity increases.
- there may be relatively less control over spatial aspects during rendering. This is a limitation of current installations of commercial airflow and heating technologies, which offer limited spatial resolution. For example, a given seating position in a car will tend to only be serviced by a limited number of individual fan positions, such as the footwell and dash.
- a user interface on the steering wheel or on a touchscreen near or in the dashboard.
- the following actuators may be present in the car:
- the modalities supported by these actuators include the following:
- Haptics, including:
- Steering wheel: tactile vibration feedback;
- Dash touchscreen: tactile vibration feedback and texture rendering; and
- Seats: tactile vibrations and movement.
- a live music stream is being rendered to four users sitting in the front seats.
- the MS renderer 001 attempts to optimize the experience for multiple viewing positions.
- the content contains:
- the light content contains ambient light objects that are moving slowly around the scene. These may be rendered using one of the ambient layer methods disclosed herein, for example such that there is no spatial priority given to any user’s perspective.
- the haptic content may be spatially concentrated in the lower time-frequency spectrum and may be rendered only by the vibro-tactile motors in the floor mats.
- pyrotechnic events during the music stream correspond to multi-sensory content including:
- Haptic objects to reinforce the dynamism of the pyrotechnics via a shockwave effect. In this example, the MS renderer 001 renders both the light objects and the haptic objects spatially.
- Light objects may, for example, be rendered in the car such that each person in the car perceives the light objects to come from the left if the pyrotechnics content is located at the left of the scene. In this example, only lights on the left of the car are actuated.
- Haptics may be rendered across both the seats and floor mats in a way that conveys directionality to each user individually.
- the pyrotechnics are present in the audio content and both pyrotechnics and confetti are present in the video content.
- the effect of the confetti firing may be rendered using the airflow modality.
- the individually controllable air flow vents of the HVAC system may be pulsed.
- a haptics vest that the user — also referred to as a player — is wearing;
- An addressable air-flow bar which includes an array of individually controllable fans directed to the user (similar to HVAC vents in the front dashboard of a car).
- the user is playing a first person shooter game and the game contains a scene in which a destructive hurricane moves through the level. As it does so, in-game objects are thrown around and some hit the player. Haptics objects rendered by the MS renderer 001 cause a shockwave effect to be provided through all of the haptics devices that the user can perceive.
- the actuator control signals sent to each device may be optimized according to the intensity of the impact of the in-game objects, the direction(s) of the impact and the capabilities and location of each actuator (as described earlier).
- the multi-sensory content contains a haptic object corresponding to a non-spatial rumble, one or more airflow objects corresponding to directional airflow; and one or more light objects corresponding to lightning.
- the MS renderer 001 renders the non-spatial rumble to the haptics devices.
- the actuator control signals sent to each haptics device may be rendered such that the ensemble of actuator control signals across the haptics array is congruent in perceived onset time, intensity and frequency.
- the frequency content of the actuator control signals sent to the smart watch may be low-pass filtered, so that they are congruent with the frequency-limited capability of the vest, which is proximate to the watch.
- the MS renderer 001 may render the one or more airflow objects to actuator control signals for the AFB such that the air flow in the room is congruent with the location and look direction of the player in the game, as well as the hurricane direction itself.
- Lightning may be rendered across all modalities as (1) a white flash across lights that are located in suitable locations, e.g., in or on the ceiling; and (2) an impulsive rumble in the user’s wearable haptics and seat shaker.
- a directional shockwave may be rendered to the haptics devices.
- a corresponding airflow impulse may be rendered.
- a damage-taken effect, indicating the amount of damage caused to the player by being struck by the in-game object, may be rendered by the lights.
- signals may be rendered spatially to the haptics devices such that a perceived shockwave moves across the player’s body and the room.
- the MS renderer 001 may provide such effects according to actuator location information indicating the haptics devices’ locations relative to one another.
- the MS renderer 001 may provide the shockwave vector and position according to the actuator location information in addition to actuator capability information.
- a non-directional air flow impulse may be rendered, e.g., all the air vents of the AFB may be turned up briefly to reinforce the haptic modality.
- a red vignette may be rendered to the light strip surrounding the TV, indicating to the player that the player took damage in the game.
- Figure 4 shows example elements of a system for the creation and playback of multi-sensory (MS) experiences corresponding to virtual worlds.
- the types and numbers of elements shown in Figure 4 are merely provided by way of example.
- Other implementations may include more, fewer and/or different types and numbers of elements.
- other implementations may include multiple video displays 010, which are also referred to herein as display devices 010.
- system 400 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein.
- system 400 may include one or more instances of the control system 110 of Figure 1 A that are configured for performing at least some of the methods disclosed herein.
- the MS creation tool 000 is configured to create object-based sensory data 005 — also referred to herein as object-based MS content 005 — corresponding to the virtual world objects 14001.
- Virtual world objects 14001 may also be referred to herein as “in-world objects 14001.”
- the MS creation tool 000 may not be involved in the actual MS content creation process, but instead may add properties, metadata, etc., to the virtual world objects 14001.
- Virtual worlds are dynamic and may change based on the actions of the user, which means the object-based MS content 005 will sometimes also be dynamic. For such dynamic situations, the MS creation tool 000 performs processing at run-time to produce the appropriate object-based MS content 005.
- the MS creation tool 000 is configured to pre-author the object-based MS content 005, to varying degrees.
- the MS creation tool 000 may, in some examples, be configured to completely design the object-based MS content 005 as an authoring step, and to associate the object-based MS content 005 with a virtual world object (VWO) 14001.
- the object-based MS content 005 may, in some such examples, simply be passed to the MS renderer 001 at the appropriate time, as suggested by Figure 4.
- the MS creation tool 000 may indicate — for example, via a tag — that a particular VWO 14001 should be associated with a particular effect, such as a haptic effect.
- the actual object-based MS content 005 may be generated — for example, by the MS Content Generator 14003 shown in Figure 5 — based on that VWO 14001’s tag and any other relevant VWOs 14001 that have the same tag.
- the object-based MS content 005 may include object-based light data, object-based haptic data, object-based air flow data, object-based positional actuator data, object-based olfactory data, object-based smoke data, object-based data for one or more other types of sensor effects, or combinations thereof.
- the object-based MS content 005 may include sensory objects and corresponding sensory metadata.
- the object-based light data may include light object position metadata, light object color metadata, light object size metadata, light object intensity metadata, light object shape metadata, light object diffusion metadata, light object gradient metadata, light object priority metadata, light object layer metadata, or combinations thereof.
- the object-based MS content 005 may include time data, such as time stamp information.
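As one way to picture how a light object and its metadata might be carried, here is a hypothetical data structure whose fields mirror the metadata types listed above (position, color, size, intensity, shape, diffusion, gradient, priority, layer and time data). The field names, types and units are assumptions made for illustration, not a schema defined by this disclosure.

```python
# Illustrative light-object record only; not the disclosure's actual format.
from dataclasses import dataclass

@dataclass
class LightObject:
    object_id: str
    position: tuple[float, float, float]        # normalised room coordinates (assumed)
    color: tuple[float, float, float]           # linear RGB, 0..1 (assumed)
    intensity: float = 1.0
    size: float = 0.1                           # fraction of the room the object spans
    shape: str = "point"                        # e.g. "point", "bar", "wash"
    diffusion: float = 0.5
    gradient: tuple[float, float, float] | None = None
    priority: int = 0
    layer: str = "dynamic"                      # e.g. "mood" or "dynamic"
    timestamp_ms: int = 0                       # time data for synchronisation

# Example instance: a brief warm flash tied to an in-game event.
muzzle_flash = LightObject(
    object_id="muzzle_flash_01",
    position=(0.5, 0.2, 0.9),
    color=(1.0, 0.85, 0.6),
    intensity=0.9,
    priority=10,
    timestamp_ms=120_500,
)
print(muzzle_flash)
```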
- VWOs 14001 may include, but are not limited to, the following:
- the avatars of non-playable characters such as those controlled by an artificial intelligence
- Interactive objects such as doors, vehicles or obstacles
- “Practical lighting” light sources that are themselves visible to the player, such as an ornate lamp that is rendered in-world, such that the fixture itself can be observed but also emits light in-world. Properties of the light that may be observed include light color, light intensity and radiation pattern;
- Non-practical lighting light sources that are not themselves visible to the player but are seen indirectly by the resultant lighting of the in-game objects and environment. Properties of the light that may be observed include light color, light intensity and radiation pattern; and
- Sound sources, such as those that represent sound effects tied to other in-game objects or dialogue spoken by in-game characters.
- Figure 5 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds.
- the types and numbers of elements shown in Figure 5 are merely provided by way of example.
- Other implementations may include more, fewer and/or different types and numbers of elements.
- other implementations may include multiple video displays 010, which are also referred to herein as display devices 010.
- system 500 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein.
- system 500 may include one or more instances of the control system 110 of Figure 1 A that are configured for performing at least some of the methods disclosed herein.
- the MS content generator 14003 is configured to generate object-based MS content 005 based at least in part on virtual world object data 510 corresponding to the virtual world objects (VWOs) 14001 and virtual world state data 505 corresponding to the virtual world state (VWS) 14002.
- the virtual world object data 510 may, for example, include information regarding virtual world object properties of one or more virtual world objects 14001.
- the MS content generator 14003 may be configured to generate the object-based MS content 005 based at least in part on virtual world data, including but not necessarily limited to the virtual world object data 510.
- the virtual world data also may include virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof.
- Virtual world surface type information may, for example, indicate whether a virtual surface is a floor, a ceiling, a wall, a window, a door, a seat, a table, etc.
- Virtual world object instantiation rules may, for example, indicate what virtual world surface type(s), if any, on which a particular type of virtual world object may be “spawned” or replicated. For example, a tree-type virtual world object may only be allowed to spawn on an upward-facing surface.
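A possible encoding of such instantiation rules is sketched below, assuming each virtual surface exposes a type label and a unit normal vector; the rule table, thresholds and object names are hypothetical.

```python
# Sketch of one possible instantiation-rule check; the rule table is illustrative.
SPAWN_RULES = {
    "tree": {"allowed_surfaces": {"ground", "floor"}, "min_up_normal": 0.9},
    "wall_lamp": {"allowed_surfaces": {"wall"}, "min_up_normal": -1.0},
}

def may_spawn(object_type: str, surface_type: str, surface_normal: tuple) -> bool:
    rule = SPAWN_RULES.get(object_type)
    if rule is None:
        return True  # no rule recorded: no restriction
    if surface_type not in rule["allowed_surfaces"]:
        return False
    # The y component of the unit normal tells us how upward-facing the surface is.
    return surface_normal[1] >= rule["min_up_normal"]

print(may_spawn("tree", "ground", (0.0, 1.0, 0.0)))   # True: upward-facing ground
print(may_spawn("tree", "wall", (1.0, 0.0, 0.0)))     # False: not an allowed surface
```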
- a virtual world state (VWS) 14002 may be associated with one or more of the following:
- Player objective(s): typically a video game will feature objectives that a player works towards as they play. These objectives can change as the game progresses, as they are completed and as the game story unfolds. Often there will be a current objective that the player is actively pursuing. This is accounted for by the state of the objectives;
- Room geometry: as the virtual world is navigated, the player may move through different rooms. The geometry of these rooms may vary, both in size and shape;
- In-world time of day clock: many video games build in a time-of-day clock that leads to changes in the game. The current time of day indicated by this clock can affect many aspects, such as light levels, objectives and various events;
- Weather system: some games will mimic the “real world” by incorporating a weather system that provides the in-world weather conditions. For example, it may rain in-world or it may be sunny;
- In-world health of the player: often the in-game character being played will have an associated amount of health. If this health becomes depleted past a certain point, the character may die and need to re-spawn. A player’s health level will normally be indicated to the player in order to allow the player to act on this information, for example by attempting to increase the player’s health.
- the MS content generator 14003 may be configured to generate the object-based MS content 005 based at least in part on virtual world events (VWEs).
- VWEs are generally associated with both the VWOs 14001 and VWS 14002, and may include one or more of the following:
- An entity firing a weapon: this event would involve at least one VWO 14001 (for example, the weapon) and may affect the VWS 14002 (for example, by decreasing the amount of ammunition available to the weapon);
- An interactive object being actioned: for example, a virtual world object that is located by a player may be picked up, moved, destroyed, etc.;
- A change in player objective: as the game progresses, the player’s objectives may be updated;
- A player taking damage, for instance by falling, being struck, or being exposed to fire or ice: this may result from the interaction of VWOs 14001 and may lead to a change in the VWS 14002. For example, two players (VWOs 14001) may collide (a VWE), leading to a decrease in the in-game health of both players (a change of the VWS 14002).
- the MS content generator 14003 may be configured for analyzing information about the VWS 14002 and VWOs 14001, which may include VWEs, and for producing the object-based MS content 005 based on the results of this analysis.
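One way this analysis could be organised is as an event-to-objects mapping, sketched below under assumed event and object fields (none of which are defined by the disclosure): a weapon-fired event yields a brief light object plus a non-spatial haptic rumble, and a player-damage event yields a red vignette whose intensity scales with how low the player's health is.

```python
# Hedged sketch: turning virtual-world events into object-based MS content.
# Event fields, object dictionaries and values are illustrative assumptions.
def generate_ms_objects(event: dict, world_state: dict) -> list[dict]:
    ms_objects = []
    if event["type"] == "weapon_fired":
        ms_objects.append({
            "modality": "light",
            "color": (1.0, 0.9, 0.7),
            "position": event["direction"],      # unit vector in player space
            "intensity": 0.8,
            "duration_ms": 80,
            "timestamp_ms": event["timestamp_ms"],
        })
        ms_objects.append({
            "modality": "haptic",
            "spatial": False,                    # non-spatial recoil rumble
            "intensity": 0.6,
            "frequency_hz": 60,
            "duration_ms": 120,
            "timestamp_ms": event["timestamp_ms"],
        })
    if event["type"] == "player_damaged":
        # Scale intensity with how depleted the player's health is after the hit.
        danger = 1.0 - world_state.get("player_health", 1.0)
        ms_objects.append({
            "modality": "light",
            "shape": "vignette",
            "color": (1.0, 0.0, 0.0),
            "intensity": 0.3 + 0.7 * danger,
            "duration_ms": 400,
            "timestamp_ms": event["timestamp_ms"],
        })
    return ms_objects

print(generate_ms_objects({"type": "weapon_fired", "direction": (0.0, 0.0, 1.0),
                           "timestamp_ms": 5_000}, {"player_health": 0.8}))
```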
- the MS content generator 14003 may be configured for associating the VWOs 14001 with layers.
- the MS content generator 14003 may be configured for assigning priorities to the VWOs 14001.
- the MS content generator 14003 may be configured for generating the object-based MS content 005 for both gameplay and for “cutscenes,” if the cutscenes are rendered using an engine, such as a game engine, that is providing the virtual world experience.
- a cutscene is generally a pre-authored audiovisual sequence, which may be associated with a game event such as a celebration, a change in level, etc.
- the object-based MS content 005 may be provided by the virtual world content creator, such as the game studio, because, unlike during user gameplay, the game engine may not be active during cutscenes.
- the game studio may run the MS content generator 14003 and may provide the object-based MS content 005 along with the audio and video data for the virtual world cutscene presentation, e.g., in the same manner as described herein with reference to Figure 3.
- the MS content generator 14003 may be configured to inject a configuration corresponding to one or more user preferences.
- the user preferences may be obtained explicitly, may be learned by the MS content generator 14003 or by another instance of the control system 110, or both.
- a user configuration may, for example, indicate user preferences such as "no flashing lights," so that when the MS Content generator 14003 is creating the object-based MS content 005, the MS Content generator 14003 will ensure there are no flashing lights indicated by the object-based MS content 005.
- User preferences may, in some examples, be explicitly indicated by a user or learned from a user profile.
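A minimal sketch of how such a preference configuration might be applied to generated content before it reaches the renderer follows; the preference keys and the rule for what counts as a flash-like light are illustrative assumptions.

```python
# Sketch: filtering and limiting generated MS content with a user-preference
# configuration. Keys and thresholds are hypothetical.
def apply_preferences(ms_objects: list[dict], prefs: dict) -> list[dict]:
    filtered = []
    for obj in ms_objects:
        if prefs.get("no_flashing_lights") and obj.get("modality") == "light" \
                and obj.get("duration_ms", 1000) < 200:
            continue  # drop short, flash-like light objects entirely
        if prefs.get("max_haptic_intensity") is not None \
                and obj.get("modality") == "haptic":
            obj = {**obj, "intensity": min(obj.get("intensity", 1.0),
                                           prefs["max_haptic_intensity"])}
        filtered.append(obj)
    return filtered

prefs = {"no_flashing_lights": True, "max_haptic_intensity": 0.5}
content = [{"modality": "light", "duration_ms": 80, "intensity": 0.9},
           {"modality": "haptic", "intensity": 0.9}]
print(apply_preferences(content, prefs))
```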
- Figure 6 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds.
- the types and numbers of elements shown in Figure 6 are merely provided by way of example.
- Other implementations may include more, fewer and/or different types and numbers of elements.
- other implementations may include multiple video displays 010, which are also referred to herein as display devices 010.
- system 600 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein.
- system 600 may include one or more instances of the control system 110 of Figure 1 A that are configured for performing at least some of the methods disclosed herein.
- the MS content generator 14003 is configured to generate object-based MS content 005 based not only on virtual world object data 510 and virtual world state data 505, but also based on audio data 011 and/or video data 012 corresponding with virtual world objects 14001, for example the audio data 011 and/or video data 012 that is used to present a virtual world that includes the virtual world objects 14001.
- the MS content generator 14003 may be configured to extract relevant information from the audio data 011 in various ways, which may include one or more of the following:
- the MS content generator 14003 may be configured to temporally align generated MS content with the audio. For instance, if the generated MS content included a light flash that was to be associated with the firing of a weapon, or the clash of two swords, then the envelope of the audio signal from these sound events could guide the timing of the light event, giving temporal alignment (see the sketch following this list);
- the MS content generator 14003 may be configured to generate appropriate spatial frequencies or velocities in the generated MS content. For instance, frequency analysis may indicate that the audio data 011 includes predominantly low frequencies such as the rumble of a volcano that is about to erupt. A slow spatial frequency may be appropriate for the generated MS content, perhaps increasing as the volcano erupts;
- the MS content generator 14003 may be configured to directly read or derive spatial information that can influence the spatial properties of generated MS content associated with this audio signal;
- the signal may be categorized and a semantic tag may be attached to that audio signal.
- the tag can then influence the MS content 005 generated from that audio signal.
- the classifier may determine that the audio signal is the sound of wind blowing through trees.
- generated MS content 005 might include airflow.
- If the classification categorizes the signal as being a “strong” rather than a “weak” wind, this categorization could guide the strength of the airflow MS content 005.
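The envelope-based alignment and the classification-driven airflow strength mentioned in this list can be illustrated with the short sketch below. It uses only the standard library, synthesises a stand-in "sword clash" signal, and stubs the classifier with a fixed tag; the function names, tag labels and mapping values are assumptions, not part of the disclosure.

```python
# Sketch: (1) find the peak of a simple RMS envelope to time-align a light flash,
# (2) map a (stubbed) classifier tag to an airflow strength.
import math

def envelope_peak_time(samples: list[float], sample_rate: int, window: int = 256) -> float:
    """Return the time (s) at which a block-wise RMS envelope peaks."""
    best_rms, best_index = 0.0, 0
    for start in range(0, max(len(samples) - window, 1), window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        if rms > best_rms:
            best_rms, best_index = rms, start
    return best_index / sample_rate

AIRFLOW_BY_TAG = {"wind_weak": 0.3, "wind_strong": 0.9}   # illustrative mapping only

sample_rate = 48_000
# Synthetic stand-in for a sword clash: a 900 Hz tone under a Gaussian envelope
# peaking at 0.25 s.
sword_clash = [math.sin(2 * math.pi * 900 * n / sample_rate) *
               math.exp(-((n - 12_000) / 3_000) ** 2) for n in range(48_000)]
flash_time = envelope_peak_time(sword_clash, sample_rate)
print(f"schedule the light flash at +{flash_time:.3f} s")          # close to 0.25 s
print("airflow strength for 'wind_strong':", AIRFLOW_BY_TAG["wind_strong"])
```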
- the MS content generator 14003 may be configured to extract relevant information from the video data 012 in various ways, which may include one or more of the following:
- Analyzing the mesh of a virtual world object 14001 as rendered in the viewport can indicate various properties of the virtual world object 14001, for example:
- the category of the object (which may be determined by classifying the mesh), indicating that it is, for example, a truck or a lamp-post. This category property can then guide the generation of an MS effect. For instance, if the MS content generator 14003 determines that the virtual world object 14001 is a semi-trailer truck, the MS content generator 14003 may be configured to determine that a rumbling haptic effect would be appropriate;
- the scale of the virtual world object 14001 which can be understood relative to itself over time.
- the MS content generator 14003 may be configured to determine that the rumbling haptic effect will increase in intensity at a rate corresponding to the truck’s rate of approach.
- the scale of the virtual world object 14001 can also be understood relative to the scale of other visible meshes, which may indicate the importance of this virtual world object 14001;
- the MS content generator 14003 may be configured to derive information about movement of the virtual world object 14001 by analyzing how the mesh corresponding to the virtual world object 14001 changes over time. For example, the virtual world object 14001 may be rotating at a certain rate. This rate of rotation can then guide the MS content generator 14003’s generation of an MS effect, such as the rate of moving pattern of light in the real-world playback environment;
- the MS content generator 14003 may be configured to map the virtual world object 14001’s spatial location to an appropriate haptics spatial location in the real-world playback environment. For example, if the virtual world object 14001 is located at the bottom of the video, then the MS content generator 14003 may be configured to map this location to real-world haptics under the user’s feet, rather than to haptics that are in the user’s head-rest;
- The MS content generator 14003 may be configured to perform a time-based analysis that indicates that a virtual world object 14001 is moving.
- the MS content generator 14003 may be configured to analyze the trajectory of the virtual world object 14001. The speed and direction of this trajectory may then be used by the MS content generator 14003 to generate a moving lightscape effect in the real world that has a speed and direction that comports with that of the moving virtual world object 14001, to generate an air flow effect consistent with the moving virtual world object 14001, etc.;
- the MS content generator 14003 may be configured to use this information to guide the generation of an MS effect, such as creating a real-world lightscape with a blue and red color scheme that is in harmony with that of the virtual world object 14001, creating a real-world lightscape that includes lights that correspond with headlights of the virtual world object 14001, etc.
- the MS content generator 14003 may be configured to extract or estimate one or more of the following types of information from the video data 012:
- the MS content generator 14003 may be configured to generate an MS effect that is intended to foster a sense of immediate danger, such as a high-frequency haptic “shaking” effect, a strobing light effect, a throbbing red lighting effect, etc.;
- the pacing of the current virtual world scene: for example, if the MS content generator 14003 detects that the current virtual world scene may be rapidly changing, indicating excitement and action, the MS content generator 14003 may be configured to generate an MS effect that is intended to reflect this pacing, also being rapidly changing;
- the optimal, or at least appropriate, type of MS effect: for example, if the MS content generator 14003 detects that heavy virtual objects in the current virtual world scene are falling before hitting the ground and breaking apart, the MS content generator 14003 may be configured to determine that an optimal — or at least an appropriate — MS effect would be low-frequency haptics coinciding with the impacts of the heavy virtual objects;
- the general color palette of the virtual world environment: for instance, if the MS content generator 14003 determines that the current virtual location is underwater, the MS content generator 14003 may be configured to provide lighting effects with a generally blue color scheme.
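As a concrete illustration of the color-palette case, here is a rough sketch that averages pixel values of a synthetic frame and derives a simple two-color lightscape from the result; a real implementation would operate on decoded video frames and would likely use a more robust palette estimate than a plain average. Everything here is an assumption for illustration.

```python
# Sketch: estimate a dominant color from a frame (nested list of RGB tuples)
# and turn it into a basic lighting color scheme.
def dominant_color(frame):
    """Average the frame's pixels; crude, but it illustrates the idea."""
    n = 0
    totals = [0.0, 0.0, 0.0]
    for row in frame:
        for r, g, b in row:
            totals[0] += r
            totals[1] += g
            totals[2] += b
            n += 1
    return tuple(c / n for c in totals)

# Synthetic 4x4 bluish frame standing in for an underwater scene.
frame = [[(0.05, 0.25, 0.6) for _ in range(4)] for _ in range(4)]
base = dominant_color(frame)
lightscape = {
    "ambient_color": base,
    "accent_color": tuple(min(c * 1.4, 1.0) for c in base),  # brighter accent lights
}
print(lightscape)
```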
- Some implementations of the multi-sensory renderer 001 may be configured for rendering based, at least in part, on local context information. This type of rendering may be referred to herein as “context-based rendering” or as “context-aware rendering.”
- the MS content generator 14003 may be configured for context-aware generation of MS content.
- the “context” may be, or may include, local context information regarding the local playback environment, regarding one or more people in the local playback environment, etc.
- one aspect of the local context information may be the time of day in a region that includes the local playback environment, the weather in an area that includes the local playback environment, etc.
- the context may be, or may include, information regarding one or more people in the local playback environment, such as the apparent level of engagement with played-back content.
- information may be provided explicitly to allow the MS content generator 14003 to provide context-aware generation of MS content, to allow the multi-sensory renderer 001 to provide context-based rendering, or both.
- explicit information may be, or may include, user input for managing one or more aspects of the rendering process, the MS content creation process, or both, such as information regarding the overall immersion level and/or interactivity level.
- explicit information may be, or may include, input from a device or system that has access to information regarding one or more aspects of the local context information, from a device or system that is configured to learn local context information by analyzing sensor data, patterns of user behavior, etc.
- Managing the immersion and/or interactivity of the sensory experience can be achieved by changing the way the multi-sensory renderer 001, the MS content generator 14003, or both, manage temporal, frequency, intensity, input, spatial effects (e.g. color or vibration) dimensions, or combinations thereof.
- the multi-sensory renderer 001 and/or the MS content generator 14003 may be configured to manage the immersion and/or interactivity of the sensory experience automatically, for example by applying a low-pass filter to one or more of those dimensions.
- one or more relatively more dynamic or relatively more spatial layers of sensory content may be excluded and only an ambient layer may be used when a user prefers a less immersive experience.
- one or more relatively more dynamic or relatively more spatial layers of sensory content may be reduced in intensity. These actions may be taken with regard to any combination of audio, visual and sensory experiences.
- a context-aware MS renderer 001 and/or the MS content generator 14003 may be configured to provide an MS experience based at least in part on local context information, which may include one or more of the following:
- the object-based MS content 005 may not correspond to extremely bright light.
- the MS content generator 14003 may avoid generating light content with a strong blue component during the last hour or so before a viewer’s bedtime, because this may hinder sleep. If it is daytime in the current location, the MS renderer 001 — or another device — may be configured to control automated blinds to reduce the ambient light from outdoors, depending on the content and experience;
- Local weather information: as determined by an Internet weather forecast, by live internet weather observations from a nearby weather station, by local weather information supplied by a LAN, WLAN or Bluetooth-connected weather station onsite, etc. For example, if it is very sunny outside the playback environment, high-intensity lighting may be required to overcome the light leaking into a viewing room through the windows. If it is very cloudy, it may be dark in the room so less light intensity may be required;
- Observations of human behavior, which may in some examples include observations from multiple days, weeks or months. For example, based on historical information from a home’s security system it may be possible to determine that at the present time it is highly likely that only one person is present in a living room and that the person is likely to be playing a video game on a TV.
- the context-aware MS content generator 14003 and/or MS renderer 001 may determine that this is a time to augment the gaming experience using sensory effects that include the living room’s lighting system, because it is likely that no one else in the house is trying to do something different in the living room at the same time;
- a person may have indicated preferences as to one or more types of sensory experiences.
- the person may be a photosensitive or colorblind viewer for whom the lighting experience should be toned down or otherwise personalized.
- the person may react badly to “jump-scares” or other content that is intended to induce fear and/or excitement, for example because the person has had heart trouble, has already had one or more heart attacks, etc.
- the presence of a particular viewer may be determined by a Bluetooth or Wi-Fi beacon from a phone or smart watch, by talker identification using a microphone, by face identification, etc.
- the context-aware MS content generator 14003 may be configured for one or more of the following: a. Avoiding the generation of MS content that can cause jump-scares; b. Altering the palette, e.g., the color palette, of generated MS content;
- Information such as light sensor information — indicating ambient light from internal (within the playback environment) sources, from external (e.g., outdoor) sources, or both.
- a context-aware MS content generator 14003 and/or MS renderer 001 may be configured to adapt the lightscape provided by controllable light fixtures in the living room due to the light from the kitchen lights.
- a context-aware MS content generator 14003 and/or MS renderer 001 may be configured to cause light fixtures near the kitchen to be relatively brighter than those farther from the kitchen in order to compensate for the kitchen lights.
- Such compensatory techniques may be particularly relevant if the MS renderer 001 is mixing colors. For example, if the kitchen light is somewhat orange, but a white light was desired, the MS renderer 001 may make the side lights slightly greenish so that in a user’s peripheral vision the colors mix to white;
- the context-aware MS content generator 14003 and/or MS renderer 001 may cause only lower-level/less immersive sensory objects/types to be shown in order to prevent driver distraction.
- If the context-aware MS renderer 001 receives information — such as sensor information or user input — indicating one or more people are playing a video game on a TV and no other nearby person is trying to do anything constructive, such as housework or homework, the context-aware MS content generator 14003 and/or the context-aware MS renderer 001 may determine that this is a time for relatively more immersive sensory content playback corresponding to the gaming content provided via the TV, whereas if another nearby person is trying to do something constructive, the context-aware MS content generator 14003 and/or the context-aware MS renderer 001 may determine that this is a time for relatively more ambient, or completely ambient, sensory content playback corresponding to the gaming content provided via the TV;
- the sensory object metadata may also contain information to assist the renderer to provide MS experiences for various contexts or user-selectable levels of immersion.
- the context-aware MS content generator 14003 may be configured to generate sensory object metadata that includes a “mood” object type to denote that the role of the object is to set an ambience layer.
- the sensory object metadata may include a “dynamic” object type that may be used to signal to the renderer that the role of the object is to bring change and movement.
- the context-aware MS content generator 14003 and/or MS renderer 001 may use this information to provide different types of sensory experiences.
- the full gamut of mood and dynamic sensory objects may be used to render actuator signals for the sensory experience.
- the context-aware MS content generator 14003 may provide — or the context-aware MS renderer 001 may use — all the mood objects, but only a subset of the dynamic objects or none at all.
- this immersion control may be continuous from “immersion off” to “fully immersive,” for example within a range from zero to ten, from zero to one hundred, etc.
- Immersion level may change the sensory objects used, the intensity or amplitude of sensory actuator playback, the number of actuators used, etc.
- the immersion level may change the light objects used, the brightness of the lights, the number of light fixtures used, etc.
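One plausible way to realise a continuous immersion control over mood and dynamic layers is sketched below; the layer names follow the object types discussed above, while the scaling rule and the dictionary fields are assumptions.

```python
# Sketch of a continuous immersion control (0..1): mood-layer objects are always
# kept, dynamic-layer objects are dropped at "immersion off" and scaled otherwise.
def apply_immersion(ms_objects: list[dict], immersion: float) -> list[dict]:
    immersion = max(0.0, min(1.0, immersion))
    out = []
    for obj in sorted(ms_objects, key=lambda o: o.get("priority", 0), reverse=True):
        if obj.get("layer") == "dynamic":
            if immersion == 0.0:
                continue                      # ambient (mood) layer only
            obj = {**obj, "intensity": obj.get("intensity", 1.0) * immersion}
        out.append(obj)
    return out

objects = [
    {"layer": "mood", "intensity": 0.4},
    {"layer": "dynamic", "intensity": 1.0, "priority": 5},
]
print(apply_immersion(objects, immersion=0.25))
```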
- some light sources are considered part of the constructed world and any associated narrative. These are typically “diegetic” light sources, meaning that they occur within the context of the virtual world — for example, within the game or the story — and are able to be seen by the characters in the virtual world.
- diegetic light sources typically include sunlight, emanating from an in-world (virtual world) source, creating the background lighting and changing as the in-world time changes.
- Other examples of diegetic light sources in a virtual world include lightning flashes emanating from an in-world electrical storm.
- Still other examples may result from the action of an in-world character, such as that played in a video game: one example of this would be flashes of light from the firing of an in-world weapon such as a laser cannon. All these light sources would normally be spatially dynamic and therefore some implementations involve making a spatially dynamic mapping of light sources from in-world to in-room (in the real-world playback environment).
- the MS content generator 14003 may be configured to use this mapping when producing the object-based MS content 005, particularly with regard to the placement of the virtual world objects 14001.
- Other graphical elements within the 3D world may exist to convey information. Such informational graphical elements will sometimes be diegetic, such as the information presented through a Head Up Display (HUD) being worn by an in-world avatar. Other informational graphical elements may be non-diegetic, such as a health meter that indicates the in-world health of an avatar, which may affect the avatar’s progress in the virtual world. Another example of a non-diegetic informational graphical element is a progress bar that indicates the degree of progress on an in-world task.
- non-diegetic informational graphical elements may be abstracted from the large scale in-world geometry, may be overlaid on a 2-D display such as a monitor and may be unaffected by, for example, the dynamic perspective of an avatar navigating the virtual world.
- These non-diegetic informational graphical elements/light sources will generally be unaffected by, and will not interact with, other diegetic light sources.
- non-diegetic informational light sources are not spatially dynamic and will have a spatially non-dynamic mapping from in-world to in-room.
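The difference between the two mapping styles can be sketched in a few lines; assume, purely for illustration, that light directions are expressed as azimuth angles in the horizontal plane and that fixtures are addressed by name.

```python
# Sketch contrasting the mappings: a diegetic light source is re-mapped into
# room-relative coordinates from the player's current yaw each frame, while a
# non-diegetic element keeps a fixed in-room fixture assignment. Conventions
# and fixture names are assumptions.
def diegetic_to_room_azimuth(light_azimuth_deg: float, player_yaw_deg: float) -> float:
    """Room-relative azimuth of an in-world light, given the player's yaw."""
    return (light_azimuth_deg - player_yaw_deg) % 360.0

NON_DIEGETIC_FIXTURES = {"health_meter": "tv_light_strip"}   # fixed, never rotated

# A lightning flash at in-world azimuth 120 degrees, with the player facing 90
# degrees, lands 30 degrees to one side of the screen-facing direction.
print(diegetic_to_room_azimuth(120.0, 90.0))          # 30.0
print(NON_DIEGETIC_FIXTURES["health_meter"])          # always the same fixture
```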
- Some implementations of the MS content generator 14003 may be configured to analyze the following properties of in-world objects 14001: the player position; and
- the light source, including its type, color, intensity and emission pattern.
- the “player” is a person for whom a virtual world experience is being provided, so the “player position” is the person’s position in the virtual world and the “player’s viewpoint” is the person’s current point of view in the virtual world.
- the MS content generator 14003 may also be configured to analyze one or more aspects of the virtual world state 14002, such as the sizes, geometries and properties of the surfaces of the part of the virtual world — for example, the room — that the player is currently in.
- the MS content generator 14003 may be configured to analyze audio data, for example audio data corresponding to one or more audio objects, in order to provide timing cues for light content that the MS content generator 14003 may generate.
- the MS content generator 14003 may be configured to analyze audio signals of an audio object corresponding to a door slamming, a window breaking, a car crashing, a gun firing, etc.
- the time corresponding to the peak of the envelope of the audio signals may be used to align the light object content.
- the MS content generator 14003 may be configured to analyze one or more of the following properties of in-world objects 14001:
- the in-world object meshes currently visible in the viewport
- the MS content generator 14003 may be configured to analyze one or more properties of in-world objects 14001 that are not in the viewport, but that could still affect the player, such as the location, size, speed and trajectory of one or more such in-world objects 14001.
- According to some such implementations, the MS content generator 14003 may also be configured to analyze one or more aspects of the virtual world state 14002, such as the associated health level of any in-world object 14001 that has a health level.
- the MS content generator 14003 may be configured to generate haptics signals to coincide with the time at which one in-world object’s mesh collides with another in-world object’s mesh, e.g., if the collision involves the player and results in a decrease in the player’s health level.
- the MS content generator 14003 may be configured to increase the intensity of the haptic signals as the player’s health level decreases, indicating that the player is in increasing danger.
- the MS content generator 14003 may be configured to determine or estimate the spatial position, trajectory, intensity, or combinations thereof, of a collision based on an analysis of one or more properties of in-world objects 14001 and to map such information to spatially arranged haptics actuators of one or more haptics devices.
- the MS content generator 14003 may be configured to activate haptic actuators on the left side of a haptics suit worn by the player corresponding to, and synchronized with, damage caused by an in-world object 14001 that collides with the left side of the player.
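A toy sketch of that spatial mapping follows; the suit zones, actuator names and the player-frame convention (x to the player's right, z forwards) are assumptions made only to illustrate the idea.

```python
# Sketch: pick haptics-suit actuators from the side of the body a collision hits.
SUIT_ZONES = {
    "left": ["suit_left_shoulder", "suit_left_ribs", "suit_left_hip"],
    "right": ["suit_right_shoulder", "suit_right_ribs", "suit_right_hip"],
    "front": ["suit_chest_upper", "suit_chest_lower"],
    "back": ["suit_back_upper", "suit_back_lower"],
}

def collision_actuators(impact_dir_x: float, impact_dir_z: float) -> list[str]:
    """impact_dir is the direction the impact comes from, in the player's frame."""
    if abs(impact_dir_x) >= abs(impact_dir_z):
        return SUIT_ZONES["left"] if impact_dir_x < 0 else SUIT_ZONES["right"]
    return SUIT_ZONES["front"] if impact_dir_z > 0 else SUIT_ZONES["back"]

# An object striking the player's left side drives the left-zone actuators.
print(collision_actuators(impact_dir_x=-1.0, impact_dir_z=0.1))
```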
- Figure 7 is a flow diagram that outlines one example of a method that may be performed by an apparatus or system such as those disclosed herein.
- the blocks of method 700, like those of other methods described herein, are not necessarily performed in the order indicated. In some implementations, one or more of the blocks of method 700 may be performed concurrently. Moreover, some implementations of method 700 may include more or fewer blocks than shown and/or described.
- the blocks of method 700 may be performed by one or more devices, which may be (or may include) one or more instances of a control system such as the control system 110 that is shown in Figure 1 A and described above. For example, at least some aspects of method 700 may be performed by an instance of the control system 110 that is configured to implement the MS content generator 14003 of Figure 5 or Figure 6.
- the virtual world data includes virtual world object data corresponding to one or more virtual world objects.
- at least some of the virtual world data may correspond to virtual world object properties of one or more virtual world objects.
- at least some of the virtual world data may be, or may include, the virtual world object data 510 that is described herein.
- the one or more virtual world objects may include one or more virtual world entities, one or more interactive non-entity virtual world objects, one or more virtual world sound sources, one or more virtual world light sources, or combinations thereof.
- at least some of the virtual world data may correspond to virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof.
- block 710 involves receiving, by the control system, virtual world state data corresponding to a virtual world state.
- the virtual world state may be a state of a virtual world in which the one or more virtual world objects exist.
- the virtual world state may be based, at least in part, on a state of a game being played in the virtual world, on one or more player objectives, on a geometry of a virtual world environment, on a virtual world time of day, on a virtual world season, or combinations thereof.
- the virtual world state data may be, or may include, the virtual world state data 505 that is described herein.
- block 715 involves generating, by the control system, object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data.
- the object-based MS content may be, or may include, light-based content, haptic content, air flow content or combinations thereof.
- block 720 involves providing, by the control system, the object-based MS content to an MS renderer.
- the MS renderer may, for example, be an instance of the MS renderer 001 that is disclosed herein.
- the MS controller APIs 003 that are shown in Figure 3 may be implemented via the MS renderer 001 and actuator-specific signals may be provided to the actuators 008 by the MS renderer 001.
- the MS renderer 001 may provide actuator control signals 310 to the MS controller APIs 003 and the MS controller APIs 003 may provide actuator-specific control signals to the actuators 008.
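Read together, these blocks amount to a simple pipeline. The sketch below strings hypothetical callables together in that order; it illustrates the control flow only, with placeholder names rather than APIs defined by the disclosure.

```python
# High-level sketch of the flow outlined for method 700: receive virtual world
# data and state, generate object-based MS content, provide it to an MS renderer.
def run_method_700(receive_world_data, receive_world_state, generate_ms_content,
                   provide_to_renderer):
    world_data = receive_world_data()         # virtual world object data, etc.
    world_state = receive_world_state()       # block 710: virtual world state data
    ms_content = generate_ms_content(world_data, world_state)   # block 715
    provide_to_renderer(ms_content)           # block 720: hand off to the MS renderer
    return ms_content

# Example wiring with trivial stand-ins:
content = run_method_700(
    lambda: {"objects": []},
    lambda: {"time_of_day": "dusk"},
    lambda data, state: [{"modality": "light", "intensity": 0.2}],
    lambda ms: print("to renderer:", ms),
)
```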
- method 700 also may involve providing, by the sensory actuators in the environment, the sensory effects.
- method 700 may involve rendering, by the MS renderer, the object-based MS content to control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays.
- the rendering may involve synchronizing the object-based MS content with the video data, for example according to time stamps or other time-related data.
- audio data corresponding to the virtual world may also be reproduced in the real-world environment.
- the rendering may involve synchronizing the object-based MS content with the audio data, for example according to time stamps or other time-related data.
- method 700 may involve analyzing, by the control system and prior to the rendering, the virtual world data to determine rendering parameters corresponding to the virtual world data.
- the rendering may be based, at least in part, on the rendering parameters.
- the rendering parameters may include scaling parameters.
- the scaling parameters may be based, at least in part, on maximum virtual world object parameter values.
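One simple interpretation of such scaling parameters, sketched under the assumption that object intensities are plain numbers: derive a scale factor from the maximum value seen, then apply it so rendered intensities stay within actuator range. Field names are illustrative.

```python
# Sketch: derive scaling parameters from maximum object parameter values.
def scaling_parameters(ms_objects: list[dict]) -> dict:
    max_intensity = max((o.get("intensity", 0.0) for o in ms_objects), default=1.0)
    return {"intensity_scale": 1.0 / max_intensity if max_intensity > 0 else 1.0}

def apply_scaling(ms_objects: list[dict], params: dict) -> list[dict]:
    return [{**o, "intensity": o.get("intensity", 0.0) * params["intensity_scale"]}
            for o in ms_objects]

objects = [{"intensity": 2.5}, {"intensity": 0.5}]
print(apply_scaling(objects, scaling_parameters(objects)))   # peak normalised to 1.0
```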
- method 700 may involve receiving, by the control system, virtual world event data associated with the virtual world state and with one or more virtual world objects.
- generating the object-based MS content may be based, at least in part, on the virtual world event data.
- method 700 may involve adding MS content metadata to the one or more virtual world objects.
- the MS content metadata may correspond to the object-based MS content.
- method 700 may involve analyzing, by the control system, virtual world object content of the one or more virtual world objects.
- the virtual world object content may include virtual world audio content, virtual world video content, or combinations thereof.
- generating the object-based MS content may be based, at least in part, on one or more results of the analyzing process.
- analyzing the virtual world object content may involve analyzing an envelope of a virtual world audio signal.
- method 700 may involve temporally aligning generated MS content with the virtual world audio signal.
- analyzing the virtual world object content may involve analyzing a virtual world object portion of a virtual world video signal corresponding to a virtual world object, analyzing a context of the virtual world object portion, or combinations thereof.
- method 700 may involve estimating, by the control system, a creative intent corresponding to the virtual world object.
- generating the object-based MS content may be based, at least in part, on estimated creative intent.
- analyzing the virtual world object content may involve a virtual light source object analysis of one or more virtual world light source objects.
- generating the object-based MS content may be based, at least in part, on the virtual light source object analysis.
- method 700 may involve determining at least one of a player position or a player viewpoint.
- generating the object-based MS content may be based, at least in part, on the player position, the player viewpoint, or a combination thereof.
- generating the object-based MS content may be based, at least in part, on a machine learning process implemented by the control system. For example, generating the object-based MS content may be performed, at least in part, by a neural network implemented by the control system.
Abstract
Some methods involve receiving virtual world data, including virtual world object data corresponding to one or more virtual world objects, and receiving virtual world state data corresponding to a virtual world state. At least some of the virtual world data may correspond to virtual world object properties of one or more virtual world objects. Some methods involve generating object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data, and providing the object-based MS content to an MS renderer. Some methods involve rendering, by the MS renderer, the object-based MS content to control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays. The actuators may include one or more light fixtures, haptic devices, air flow control devices, or combinations thereof.
Description
GENERATION OF OBJECT-BASED MULTI-SENSORY CONTENT FOR VIRTUAL WORLDS
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of priority to U.S. Provisional Application No. 63/669,234 filed July 10, 2024, and U.S. Provisional Application No. 63/514,097 filed July 17, 2023, each of which are hereby incorporated by reference in their entireties.
TECHNICAL FIELD
[0002] The present disclosure relates to providing multi-sensory (MS) experiences, and is more specifically directed to providing MS experiences corresponding to virtual worlds.
BACKGROUND
[0003] Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted as prior art by inclusion in this section.
[0004] Media content delivery has generally focused on audio and screen-based visual experiences. There has been limited delivery of multi-sensory content due to the bespoke nature of actuation. Luminaires, for example, are used extensively as an expression of art and function for concerts. However, each installation is designed specifically for a unique set of luminaires. Delivering a lighting design beyond the set of fixtures the system was designed for is generally not feasible. Other systems that attempt to deliver light experiences more broadly simply do so by extending the screen visuals algorithmically, but are not specifically authored. Haptics content is designed for a specific haptics apparatus. If another device, such as a game controller, mobile phone or even a different brand of haptics device is used, there has been no way to translate the creative intent of content to the different actuators.
SUMMARY
[0005] At least some aspects of the present disclosure may be implemented via methods, such as audio processing methods. In some instances, the methods may be implemented, at least in part, by a control system such as those disclosed herein. Some such methods involve receiving, by a control system, virtual world data. The virtual world data may include
virtual world object data corresponding to one or more virtual world objects. Some disclosed methods involve receiving, by the control system, virtual world state data corresponding to a virtual world state. Some disclosed methods involve generating, by the control system, object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data. Some disclosed methods involve providing, by the control system, the object-based MS content to an MS renderer. According to some examples, the object-based MS content may be, or may include, light-based content, haptic content, air flow content or combinations thereof.
[0006] In some examples, at least some of the virtual world data may correspond to virtual world object properties of one or more virtual world objects. According to some examples, at least some of the virtual world data may correspond to virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof. In some examples, the virtual world state may be a state of a virtual world in which the one or more virtual world objects exist.
[0007] Some disclosed methods may involve rendering, by the MS renderer, the object-based MS content to one or more control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays. In some examples, the rendering may involve synchronizing the object-based MS content with the video data.
[0008] According to some examples, audio data corresponding to the virtual world may be reproduced in the real-world environment. In some such examples, the rendering may involve synchronizing the object-based MS content with the audio data.
[0009] Some disclosed methods may involve analyzing, by the control system and prior to the rendering, the virtual world data to determine rendering parameters corresponding to the virtual world data. According to some such examples, the rendering may be based, at least in part, on the rendering parameters.
[0010] In some examples, the rendering parameters may include scaling parameters. In some such examples, the scaling parameters may be based, at least in part, on maximum virtual world object parameter values.
[0011] According to some examples, the one or more virtual world objects may include one or more virtual world entities, one or more interactive non-entity virtual world objects, one or more virtual world sound sources, one or more virtual world light sources, or combinations thereof. According to some such examples, the virtual world state may be
based, at least in part, on a state of a game being played in the virtual world, on one or more player objectives, on a geometry of a virtual world environment, on a virtual world time of day, on a virtual world season, or combinations thereof.
[0012] Some disclosed methods may involve receiving, by the control system, virtual world event data associated with the virtual world state and with one or more virtual world objects, wherein generating the object-based MS content may be based, at least in part, on the virtual world event data.
[0013] Some disclosed methods may involve adding MS content metadata to the one or more virtual world objects. In some examples, the MS content metadata may correspond to the object-based MS content.
[0014] Some disclosed methods may involve analyzing, by the control system, virtual world object content of the one or more virtual world objects. In some examples, the virtual world object content may be, or may include, virtual world audio content, virtual world video content, or combinations thereof. In some examples, generating the object-based MS content may be based, at least in part, on one or more results of the analyzing process.
[0015] According to some examples, analyzing the virtual world object content may involve analyzing an envelope of a virtual world audio signal. Some disclosed methods may involve temporally aligning generated MS content with the virtual world audio signal.
[0016] In some examples, analyzing the virtual world object content may involve analyzing a virtual world object portion of a virtual world video signal corresponding to a virtual world object, analyzing a context of the virtual world object portion, or combinations thereof.
Some such examples also may involve estimating, by the control system, a creative intent corresponding to the virtual world object. In some examples, generating the object-based MS content may be based, at least in part, on the estimated creative intent.
[0017] According to some examples, analyzing the virtual world object content may involve a virtual light source object analysis of one or more virtual world light source objects. In some such examples, generating the object-based MS content may be based, at least in part, on the virtual light source object analysis.
[0018] Some disclosed methods may involve determining at least one of a player position or a player viewpoint. In some such examples, generating the object-based MS content may be based, at least in part, on the player position, the player viewpoint, or a combination thereof. [0019] In some examples, generating the object-based MS content may be based, at least in part, on a machine learning process implemented by the control system. According to some
examples, generating the object-based MS content may be performed by a neural network implemented by the control system.
[0020] Some or all of the operations, functions and/or methods described herein may be performed by one or more devices according to instructions (e.g., software) stored on one or more computer-readable non-transitory media. Such non-transitory media may include one or more memory devices such as those described herein, including but not limited to one or more random access memory (RAM) devices, read-only memory (ROM) devices, etc. Accordingly, some innovative aspects of the subject matter described in this disclosure can be implemented in one or more computer-readable non-transitory media having software stored thereon.
[0021] At least some aspects of the present disclosure may be implemented via apparatus. For example, one or more devices may be capable of performing, at least in part, the methods disclosed herein. In some implementations, an apparatus may include an interface system and a control system. The control system may include one or more general purpose single- or multi-chip processors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or other programmable logic devices, discrete gates or transistor logic, discrete hardware components, or combinations thereof. The control system may be configured to perform some or all of the disclosed methods.
[0022] Details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages will become apparent from the description, the drawings, and the claims. Note that the relative dimensions of the following figures may not be drawn to scale.
BRIEF DESCRIPTION OF THE DRAWINGS
[0023] Disclosed embodiments will now be described, by way of example only, with reference to the accompanying drawings.
[0024] Figure 1 A is a block diagram that shows examples of components of an apparatus capable of implementing various aspects of this disclosure.
[0025] Figure 1B shows example elements of an endpoint.
[0026] Figure 2 shows examples of actuator elements.
[0027] Figure 3 shows example elements of a system for the creation and playback of multi-sensory (MS) experiences.
[0028] Figure 4 shows example elements of a system for the creation and playback of multi-sensory experiences corresponding to virtual worlds.
[0029] Figure 5 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds.
[0030] Figure 6 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds.
[0031] Figure 7 is a flow diagram that outlines one example of a method that may be performed by an apparatus or system such as those disclosed herein.
DETAILED DESCRIPTION
[0032] Described herein are techniques related to providing multi-sensory media content. In the following description, for purposes of explanation, numerous examples and specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be evident, however, to one skilled in the art that the present disclosure as defined by the claims may include some or all of the features in these examples alone or in combination with other features described below, and may further include modifications and equivalents of the features and concepts described herein.
[0033] In the following description, various methods, processes and procedures are detailed. Although particular steps may be described in a certain order, such order is mainly for convenience and clarity. A particular step may be repeated more than once, may occur before or after other steps (even if those steps are otherwise described in another order), and may occur in parallel with other steps. A second step is required to follow a first step only when the first step must be completed before the second step is begun. Such a situation will be specifically pointed out when not clear from the context.
[0034] In this document, the terms “and”, “or” and “and/or” are used. Such terms are to be read as having an inclusive meaning. For example, “A and B” may mean at least the following: “both A and B”, “at least both A and B”. As another example, “A or B” may mean at least the following: “at least A”, “at least B”, “both A and B”, “at least both A and B”. As another example, “A and/or B” may mean at least the following: “A and B”, “A or B”. When an exclusive-or is intended, such will be specifically noted (e.g., “either A or B”, “at most one of A and B”).
[0035] This document describes various processing functions that are associated with structures such as blocks, elements, components, circuits, etc. In general, these structures may be implemented by one or more processors controlled by one or more computer programs.
[0036] As noted above, media content delivery has generally been focused on audio and video experiences. There has been limited delivery of multi-sensory (MS) content due to the customized nature of actuation.
[0037] This disclosure describes devices, systems and methods for analyzing the scenes, objects and context of virtual worlds and generating object-based multi-sensory (MS) content. As used herein, terms such as “object-based MS content,” “object-based MS data,” etc., are used to distinguish such content from, for example, channel-based data that corresponds to one or more particular sensory actuators in a playback environment. In contrast, object-based MS data may be generalized for a wide range of playback environments with a wide range of actuator types, numbers of actuators, etc. In some examples, object-based MS data may include object-based light data, object-based haptic data, object-based air flow data, object-based positional actuator data, object-based olfactory data, object-based smoke data, object-based data for one or more other types of sensor effects, or combinations thereof. According to some examples, the object-based MS data may include sensory objects and corresponding sensory metadata. Various examples are disclosed herein. The term “virtual world” refers generally to a computer-simulated environment. In some examples, the virtual world may be populated by one or more avatars corresponding to — but not necessarily resembling — one or more people in the real world. The terms “MS content,” “MS objects,” etc., generally refer to content other than audio content or video content. The terms “MS content,” “MS objects,” etc., also may be referred to herein as “sensory content,” “sensory objects,” etc., in part because in some instances only one type of sensory content may be involved, such as only light-based content, only haptics-based content, etc. However, in other examples combinations of sensory content may be provided.
[0038] Acronyms
MS - multi-sensory
MSIE - MS Immersive Experience
AR - Augmented Reality
VR - Virtual Reality
PC - personal computer
[0039] Figure 1 A is a block diagram that shows examples of components of an apparatus capable of implementing various aspects of this disclosure. As with other figures provided herein, the types and numbers of elements shown in Figure 1 A are merely provided by way of example. Other implementations may include more, fewer and/or different types and numbers of elements. According to some examples, the apparatus 101 may be, or may include, a device that is configured for performing at least some of the methods disclosed herein, such as a smart audio device, a laptop computer, a cellular telephone, a tablet device, a smart home hub, etc. In some such implementations the apparatus 101 may be, or may include, a server that is configured for performing at least some of the methods disclosed herein.
[0040] In this example, the apparatus 101 includes at least an interface system 105 and a control system 110. In some implementations, the control system 110 may be configured for performing, at least in part, the methods disclosed herein. In some examples, the control system 110 may be configured for receiving virtual world data, including virtual world object data corresponding to one or more virtual world objects. At least some of the virtual world data may correspond to virtual world object properties of one or more virtual world objects. In some examples, at least some of the virtual world data may correspond to virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof.
[0041] According to some examples, the control system 110 may be configured for receiving virtual world state data corresponding to a virtual world state. The virtual world state may be a state of a virtual world in which the one or more virtual world objects exist. The virtual world state may be based, at least in part, on a state of a game being played in the virtual world, on one or more player objectives, on a geometry of a virtual world environment, on a virtual world time of day, on a virtual world season, or combinations thereof.
[0042] In some examples, the control system 110 may be configured for generating object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data. In some such examples, the control system 110 may be configured for providing the object-based MS content to an MS renderer.
[0043] The MS renderer may, in some examples, be implemented by another instance of the control system 110. The MS renderer may be configured to render the object-based MS content to control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays. The rendering may involve synchronizing the object-based MS content with the video data. In some examples, audio data corresponding to the virtual world is also being reproduced in the real-world environment. In some such examples, the rendering may involve synchronizing the object-based MS content with the audio data. According to some examples, the audio data may be, or may include, object-based audio data. The object-based audio data may include audio signals and corresponding audio metadata. In some examples, the audio data may be, or may include, channel-based audio data. The MS renderer also may be referred to herein as a “sensory renderer,” because in some instances the MS renderer may be rendering only one type of MS data, such as object-based lighting data. According to some examples, the MS renderer may be configured for providing the actuator control signals to one or more controllable actuators of the set of controllable actuators.
[0044] According to some examples, the object-based MS content may include sensory objects and sensory object metadata, which may also be referred to herein as object-based sensory metadata. In some examples, the sensory object metadata may include sensory spatial metadata indicating at least a spatial position for rendering the object-based sensory metadata within the environment, an area or volume for rendering the object-based sensory metadata within the environment, or combinations thereof. In some implementations, the object-based sensory metadata does not correspond to any particular sensory actuator in the environment.
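By way of illustration only, the object-based structure described above may be pictured as a simple data structure. The following Python sketch is a hypothetical, simplified representation; the class and field names are assumptions for illustration and do not correspond to any particular format defined in this disclosure.

    from dataclasses import dataclass, field
    from typing import Optional, Tuple

    @dataclass
    class SensoryObjectMetadata:
        # Spatial position for rendering, in room-relative coordinates (assumption: range -1..1).
        position: Tuple[float, float, float] = (0.0, 0.0, 0.0)
        # Optional area/volume over which the effect should be rendered (assumption: half-extents).
        size: Optional[Tuple[float, float, float]] = None
        priority: int = 0          # Higher values win contention for limited actuators.
        layer: str = "ambience"    # For example "ambience", "informational", "attention".

    @dataclass
    class SensoryObject:
        effect: str                # For example "light", "haptic", "airflow".
        start_time: float          # Time stamp, in seconds, for synchronization with A/V.
        metadata: SensoryObjectMetadata = field(default_factory=SensoryObjectMetadata)

    # Example: a light object positioned front-left of the room, not tied to any actuator.
    flash = SensoryObject(effect="light", start_time=12.5,
                          metadata=SensoryObjectMetadata(position=(-0.8, 1.0, 0.3), priority=5))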
[0045] The interface system 105 may include one or more network interfaces and/or one or more external device interfaces (such as one or more universal serial bus (USB) interfaces). According to some implementations, the interface system 105 may include one or more wireless interfaces. The interface system 105 may include one or more devices for implementing a user interface, such as one or more microphones, one or more speakers, a display system, a touch sensor system and/or a gesture sensor system. In some examples, the interface system 105 may include one or more interfaces between the control system 110 and a memory system, such as the optional memory system 115 shown in Figure 1 A. However, the control system 110 may include a memory system in some instances.
[0046] The control system 110 may, for example, include a general purpose single- or multi-chip processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, and/or discrete hardware components.
[0047] In some implementations, the control system 110 may reside in more than one device. For example, a portion of the control system 110 may reside in a device within an environment (such as a laptop computer, a tablet computer, a smart audio device, etc.) and another portion of the control system 110 may reside in a device that is outside the environment, such as a server. In other examples, a portion of the control system 110 may reside in a device within an environment and another portion of the control system 110 may reside in one or more other devices of the environment.
[0048] Some or all of the methods described herein may be performed by one or more devices according to instructions (e.g., software) stored on one or more non-transitory media. Such non-transitory media may include memory devices such as those described herein, including but not limited to random access memory (RAM) devices, read-only memory (ROM) devices, etc. The one or more non-transitory media may, for example, reside in the optional memory system 115 shown in Figure 1 A and/or in the control system 110. Accordingly, various innovative aspects of the subject matter described in this disclosure can be implemented in one or more non-transitory media having software stored thereon. The software may, for example, include instructions for controlling at least one device to process audio data. The software may, for example, be executable by one or more components of a control system such as the control system 110 of Figure 1A.
[0049] In some examples, the apparatus 101 may include the optional microphone system 120 shown in Figure 1 A. The optional microphone system 120 may include one or more microphones. In some implementations, one or more of the microphones may be part of, or associated with, another device, such as a speaker of the speaker system, a smart audio device, etc.
[0050] According to some implementations, the apparatus 101 may include the optional actuator system 125 shown in Figure 1 A. The optional actuator system 125 may include one or more loudspeakers, one or more haptic devices, one or more light fixtures, also referred to herein as luminaires, one or more fans or other air-moving devices, one or more display devices, including but not limited to one or more televisions, one or more positional
actuators, one or more other types of devices for providing an MS experience, or combinations thereof. The term “light fixture” as used herein refers generally to any actuator that is configured to provide light. The term “light fixture” encompasses various types of light sources, including individual light sources such as light bulbs, groups of light sources such as light strips, light panels such as light-emitting diode (LED) panels, projectors, display devices such as television (TV) screens, etc. A “light fixture” may be moveable, and therefore the word “fixture” in this context does not mean that a light fixture is necessarily in a fixed position in space. The term “positional actuators” as used herein refers generally to devices that are configured to change a position or orientation of a person or object, such as motion simulator seats. Loudspeakers may sometimes be referred to herein as “speakers.” In some implementations, the optional actuator system 125 may include a display system including one or more displays, such as one or more light-emitting diode (LED) displays, one or more organic light-emitting diode (OLED) displays, etc. In some examples wherein the apparatus 101 includes a display system, the optional sensor system 130 may include a touch sensor system and/or a gesture sensor system proximate one or more displays of the display system. According to some such implementations, the control system 110 may be configured for controlling the display system to present a graphical user interface (GUI), such as a GUI related to implementing one of the methods disclosed herein.
[0051] In some implementations, the apparatus 101 may include the optional sensor system 130 shown in Figure 1 A. The optional sensor system 130 may include a touch sensor system, a gesture sensor system, one or more cameras, etc.
[0052] This disclosure describes methods for rendering and delivering a flexibly scaled multi-sensory (MS) immersive experience (MSIE) to different playback environments, which also may be referred to herein as endpoints. Such endpoints may include a room, such as the living room of a home, a car, a cinema, a night club or other venue, an AR/VR headset, a PC, a mobile device, etc.
[0053] Figure 1B shows example elements of an endpoint. In this example, the endpoint is a living room 1001 containing multiple actuators 008, some furniture 1010 and a person 1000 — also referred to herein as a user — who will consume a flexibly-scaled MS experience. Actuators 008 are devices capable of altering the environment 1001 that the user 1000 is in. Actuators 008 may include one or more haptic devices, one or more light fixtures, also
referred to herein as luminaires, one or more fans or other air-moving devices, one or more display devices, including but not limited to one or more televisions, one or more positional actuators, one or more other types of devices for providing an MS experience, or combinations thereof.
[0054] The number of actuators 008, the arrangement of actuators 008 and the capabilities of actuators 008 in the space 1001 may vary significantly between different endpoint types. For example, the number, arrangement and capabilities of actuators 008 in a car will generally be different from the number, arrangement and capabilities of actuators 008 in a living room, a night club, etc. In many implementations, the number, arrangement and/or capabilities of actuators 008 may vary significantly between different instances of the same type, e.g., between a small living room with 2 actuators 008 and a large living room with 16 actuators 008. The present disclosure describes various methods for creating and delivering flexibly-scaled MSIEs to these non-homogenous endpoints.
[0055] Figure 2 shows examples of actuator elements. In this example, the actuator is a luminaire 1100, which includes a network module 1101, a control module 1102 and a light emitter 1103. According to this example, the light emitter 1103 includes one or more light-emitting devices, such as light-emitting diodes, which are configured to emit light into an environment in which the luminaire 1100 resides. In this example, the network module 1101 is configured to provide network connectivity to one or more other devices in the space, such as a device that sends commands to control the emission of light by the luminaire 1100. According to this example, the network module 1101 is an instance of the interface system 105 of Figure 1A. In this example, the control module 1102 is configured to receive signals via the network module 1101 and to control the light emitter 1103 accordingly. According to this example, the control module 1102 is an instance of the control system 110 of Figure 1A.
[0056] Other examples of actuators also may include a network module 1101 and a control module 1102, but may include other types of actuating elements. Some such actuators may include one or more haptic devices, one or more fans or other air-moving devices, one or more positional actuators, one or more loudspeakers, one or more display devices, etc.
[0057] Figure 3 shows example elements of a system for the creation and playback of multi-sensory (MS) experiences. As with other figures provided herein, the types and numbers of elements shown in Figure 3 are merely provided by way of example. Other implementations
may include more, fewer and/or different types and numbers of elements. According to some examples, system 300 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein. In some examples, system 300 may include one or more instances of the control system 110 of Figure 1 A that are configured for performing at least some of the methods disclosed herein.
[0058] According to the examples in the present disclosure, creating and providing an object-based MS Immersive Experience (MSIE) involves the application of a suite of technologies for the creation, delivery and rendering of object-based sensory data, which may include sensory objects and corresponding sensory metadata, to the actuators 008. Some examples are described in the following paragraphs.
[0059] Object-Based Representation: In various disclosed implementations, multi-sensory (MS) effects are represented using what may be referred to herein as multi-sensory (MS) objects, or simply as “sensory objects.” According to some such implementations, properties such as layer-type and priority may be assigned to and associated with each sensory object, enabling content creators’ intent to be represented in the rendered experiences. Detailed examples of sensory object properties are described below.
[0060] In this example, system 300 includes a content creation tool 000 that is configured for designing multi-sensory (MS) immersive content and for outputting object-based sensory data 005, either separately or in conjunction with corresponding audio data 011 and/or video data 012, depending on the particular implementation. The object-based sensory data 005 may include time stamp information, as well as information indicating the type of sensory object, the sensory object properties, etc. In this example, the object-based sensory data 005 is not “channel-based” data that corresponds to one or more particular sensory actuators in a playback environment, but instead is generalized for a wide range of playback environments with a wide range of actuator types, numbers of actuators, etc. In some examples, the object-based sensory data 005 may include object-based light data, object-based haptic data, object-based air flow data, object-based positional actuator data, object-based olfactory data, object-based smoke data, object-based data for one or more other types of sensory effects, or combinations thereof. According to some examples, the object-based sensory data 005 may include sensory objects and corresponding sensory metadata. For example, if the object-based sensory data 005 includes object-based light data, the object-based light data may include light object position metadata, light object color metadata, light object size
metadata, light object intensity metadata, light object shape metadata, light object diffusion metadata, light object gradient metadata, light object priority metadata, light object layer metadata, or combinations thereof. In some examples, the object-based sensory data 005 may include time data, such as time stamp information. Although the content creation tool 000 is shown providing a stream of object-based sensory data 005 to the experience player 002 in this example, in alternative examples the content creation tool 000 may produce object-based sensory data 005 that is stored for subsequent use. Examples of control parameters exposed by a light-object-based content creation tool are described below.
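As a further illustration, a single light object with the metadata fields listed above might be expressed, in a hypothetical and simplified form, as the following Python dictionary; the key names and values are illustrative assumptions only.

    # A hypothetical light object, using illustrative key names for the metadata
    # fields listed above (position, color, size, intensity, shape, diffusion,
    # gradient, priority, layer) plus a time stamp. Values are arbitrary examples.
    light_object = {
        "effect": "light",
        "timestamp": 41.2,                  # seconds, for A/V synchronization
        "position": (0.2, 0.9, 0.5),        # room-relative coordinates
        "color": (255, 140, 0),             # RGB
        "size": 0.3,                        # fraction of room extent
        "intensity": 0.8,                   # normalized 0..1
        "shape": "ellipsoid",
        "diffusion": 0.5,
        "gradient": "radial",
        "priority": 3,
        "layer": "punctuational",
    }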
Examples of MS Object Properties
[0061] Following is a non-exhaustive list of possible properties of MS objects:
• Priority;
• Layer;
• Mixing Mode;
• Persistence;
• Effect; and
• Spatial Panning Law.
[0062] Effect
As used herein, the “effect” of an MS object is a synonym for the type of MS object. An “effect” is, or indicates, the sensory effect that the MS object is providing. If an MS object is a light object, its effect will involve providing direct or indirect light. If an MS object is a haptic object, its effect will involve providing some type of haptic feedback. If an MS object is an air flow object, its effect will involve providing some type of air flow. As described in more detail below, some examples involve other “effect” categories.
[0063] Persistence
Some MS objects may contain a persistence property in their metadata. For example, as a moveable MS object moves around in a scene, the moveable MS object may persist for some period of time at locations that the moveable MS object passes through. That period of time may be indicated by persistence metadata. In some implementations, the MS renderer is responsible for constructing and maintaining the persistence state.
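By way of illustration only, the following simplified Python sketch shows one possible way an MS renderer could maintain a persistence state, fading out previously visited locations over the period indicated by persistence metadata. The structure and names are assumptions for illustration.

    import time

    class PersistenceState:
        """Tracks locations an MS object has passed through and fades them out."""
        def __init__(self, persistence_seconds):
            self.persistence = persistence_seconds
            self.trail = []  # list of (position, timestamp) pairs

        def visit(self, position):
            self.trail.append((position, time.monotonic()))

        def active_positions(self):
            """Return (position, weight) pairs, with weight decaying linearly to zero."""
            now = time.monotonic()
            out = []
            for pos, t in self.trail:
                age = now - t
                if age < self.persistence:
                    out.append((pos, 1.0 - age / self.persistence))
            return out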
[0064] Layers
According to some examples, individual MS objects may be assigned to “layers,” in which MS objects are grouped together according to one or more shared characteristics. For example, layers may group MS objects together according to their intended effect or type, which may include but are not limited to the following:
Mood/Ambience
Informational
Punctuational/Attention
Alternatively, or additionally, in some examples, layers may be used to group MS objects together according to shared properties, which may include but are not limited to the following:
Color
Intensity
Size
Shape
Position
Region in space
[0065] Priority
In some examples, MS objects may have a priority property that enables the renderer to determine which object(s) should take priority in an environment in which MS objects are contending for limited actuators. For example, if multiple light objects overlap with a single light fixture at a time during which all of the light objects are scheduled to be rendered, a renderer may refer to the priority of each light object in order to determine which light object(s) will be rendered. In some examples, priority may be defined between layers or within layers. According to some examples, priority may be linked to specific properties such as intensity. In some examples, priority may be defined temporally: for example, the most recent MS object to be rendered may take precedence over MS objects that have been rendered earlier. According to some examples, priority may be used to specify MS objects or layers that should be rendered regardless of the limitations of a particular actuator system in a playback environment.
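As a simplified illustration of one possible contention rule, the following Python sketch keeps the highest-priority light object for a fixture, breaking ties by recency. This is an illustrative assumption rather than a required algorithm, and the field names are hypothetical.

    def resolve_contention(overlapping_objects):
        """Pick the light object to render on a fixture from a list of candidates.

        Each candidate is a dict with 'priority' and 'timestamp' keys (illustrative
        names). Highest priority wins; ties go to the most recently rendered object.
        """
        return max(overlapping_objects, key=lambda o: (o["priority"], o["timestamp"]))

    # Example: the newer of two equal-priority objects is selected.
    chosen = resolve_contention([
        {"id": "flash", "priority": 5, "timestamp": 10.0},
        {"id": "glow", "priority": 5, "timestamp": 12.0},
    ])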
[0066] Spatial Panning Laws
Spatial panning laws may define how an MS object’s position and movement across a space results in activating MS actuators as the MS object moves between the MS actuators, etc.
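As one simplified example of a spatial panning law, the following Python sketch computes a distance-based activation gain for each actuator; actual panning laws may be considerably more sophisticated, and the function and parameter names are illustrative assumptions.

    import math

    def distance_panning_gains(object_pos, actuator_positions, rolloff=2.0):
        """Return a normalized gain per actuator based on distance to the MS object."""
        gains = []
        for ax, ay, az in actuator_positions:
            d = math.dist(object_pos, (ax, ay, az))
            gains.append(1.0 / (1.0 + d) ** rolloff)
        total = sum(gains)
        return [g / total for g in gains] if total > 0 else gains

    # Example: an object near the left wall activates the left luminaire most strongly.
    print(distance_panning_gains((-0.9, 0.0, 0.0), [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))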
[0067] Mixing Mode
The mixing mode may specify how multiple MS objects are multiplexed onto a single actuator. In some examples, mixing modes may include one or more of the following (a simplified sketch follows the list):
Max mode: select the MS object which activates an actuator the most;
Mix mode: mix in some or all of the objects according to a rule set, for example by summing activation levels, taking the average of activation levels, mixing color according to activation level or priority level, etc.;
MaxNmix: mix in the top N MS objects (by activation level), according to a rule set.
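The following simplified Python sketch illustrates the three mixing modes listed above, using average-based mixing as one possible rule set; the names and rule choices are illustrative assumptions.

    def mix_activations(activations, mode="max", n=2):
        """Combine several MS objects' activation levels for a single actuator.

        'activations' is a list of floats in 0..1 (illustrative). Modes follow the
        list above: 'max', 'mix' (average, as one possible rule set) and 'maxNmix'.
        """
        if not activations:
            return 0.0
        if mode == "max":
            return max(activations)
        if mode == "mix":
            return min(1.0, sum(activations) / len(activations))
        if mode == "maxNmix":
            top = sorted(activations, reverse=True)[:n]
            return min(1.0, sum(top) / len(top))
        raise ValueError(f"unknown mixing mode: {mode}")

    # Example: two light objects overlapping one luminaire.
    print(mix_activations([0.9, 0.4], mode="maxNmix", n=2))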
[0068] According to some examples, more general metadata may be defined for an entire multi-sensory content file, instead of (or in addition to) per-object metadata. For example, MS content files may include metadata such as trim passes or mastering environment information.
[0069] Trim Controls
[0070] What are referred to in the context of Dolby Vision™ as “trim controls” may act as guidance on how to modulate the default rendering algorithm for specific environments or conditions at the endpoint. Trim controls may specify ranges and/or default values for various properties, including saturation, tone detail, gamma, etc. For example, there may be automotive trim controls, which provide specific defaults and/or rule sets for rendering in automotive environments, for example guidance to include only objects of a certain priority or layer. Other examples may provide trim controls for environments with limited, complex or sparse multisensory actuators.
[0071] Mastering Environment
A single piece of multisensory content may include metadata on the properties of the mastering environment such as room size, reflectivity and ambient bias lighting level. The specific properties may differ depending on the desired endpoint actuators. Mastering environment information can aid in providing reference points for rendering in a playback environment.
[0072] MS Object Renderer: Various disclosed implementations provide a renderer that is configured to render MS effects to actuators in a playback environment. According to this example, system 300 includes an MS renderer 001 that is configured to render object-based sensory data 005 to actuator control signals 310, based at least in part on environment and actuator data 004. In this example, the MS renderer 001 is configured to output the actuator control signals 310 to MS controllers 003, which are configured to control the actuators 008. In some examples, the MS renderer 001 may be configured to receive light objects and object-based lighting metadata indicating an intended lighting environment, as well as lighting information regarding a local lighting environment. The lighting information is one general type of environment and actuator data 004, and may include one or more characteristics of one or more controllable light sources in the local lighting environment. In some examples, the MS renderer 001 may be configured to determine a drive level for each of the one or more controllable light sources that approximates the intended lighting environment. According to some examples, the MS renderer 001 (or one of the MS controllers 003) may be configured to output the drive level to at least one of the controllable light sources. Some alternative examples may include a separate renderer for each type of actuator 008, such as one renderer for light fixtures, another renderer for haptic devices, another renderer for air flow devices, etc. In other implementations, a single renderer may be configured as an MS renderer and as an audio renderer and/or as a video renderer. In some implementations, the MS renderer 001 may be configured to adapt to changing conditions. Some examples of MS renderer 001 implementations are described in more detail below.
[0073] The environment and actuator data 004 may include what are referred to herein as “room descriptors” that describe actuator locations (e.g., according to an x,y,z coordinate system or a spherical coordinate system). In some examples, the environment and actuator data 004 may indicate actuator orientation and/or placement properties (e.g., directional and north-facing, omnidirectional, occlusion information, etc.). According to some examples, the environment and actuator data 004 may indicate actuator orientation and/or placement properties according to a 3x3 matrix, in which three elements (for example, the elements of the first row) represent spatial position (x,y,z), three other elements (for example, the elements of the second row) represent orientation (roll, pitch, yaw), and three other elements (for example, the elements of the third row) indicate a scale or size (sx, sy, sz). In some examples, the environment and actuator data 004 may include device descriptors that
describe the actuator properties relevant to the MS renderer 001, such as intensity range and color gamut of a light fixture, the air flow speed range and direction(s) for an air-moving device, etc.
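By way of illustration only, the 3x3 placement descriptor described above may be represented as in the following Python sketch, assuming the row order given in the example (position, orientation, scale); the variable names and values are illustrative.

    # Hypothetical actuator placement descriptor: rows are (x, y, z) position,
    # (roll, pitch, yaw) orientation in degrees, and (sx, sy, sz) scale.
    light_strip_placement = [
        [0.0, 2.4, -1.5],    # position in meters
        [0.0, 0.0, 90.0],    # orientation
        [1.2, 0.05, 0.05],   # physical size/scale
    ]

    position, orientation, scale = light_strip_placement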
[0074] In this example, system 300 includes an experience player 002 that is configured to receive object-based sensory data 005’, audio data 011’ and video data 012’, and to provide object-based sensory data 005 to the MS renderer 001, to provide audio data 011 to the audio renderer 006 and to provide video data 012 to the video renderer 007. In this example, the reference numbers for the object-based sensory data 005’, audio data 011’ and video data 012’ received by the experience player 002 include primes (’), in order to suggest that the data may in some instances be encoded. Likewise, the object-based sensory data 005, audio data 011 and video data 012 output by the experience player 002 do not include primes, in order to suggest that the data may in some instances have been decoded by the experience player 002. According to some examples, the experience player 002 may be a media player, a game engine or personal computer or mobile device, or a component integrated in a television, DVD player, sound bar, set top box, or a service provider media device such as a Chromecast, Apple TV device, or Amazon Fire TV. In some examples, the experience player 002 may be configured to receive encoded object-based sensory data 005’ along with encoded audio data 011’ and/or encoded video data 012’. In some such examples, the encoded object-based sensory data 005’ may be received as part of the same bitstream with the encoded audio data 011’ and/or the encoded video data 012’. Some examples are described in more detail below. According to some examples, the experience player 002 may be configured to extract the object-based sensory data 005 from the content bitstream and to provide decoded object-based sensory data 005 to the MS renderer 001, to provide decoded audio data 011 to the audio renderer 006 and to provide decoded video data 012 to the video renderer 007. In some examples, time stamp information in the object-based sensory data 005 may be used — for example, by the experience player 002, the MS renderer 001, the audio renderer 006, the video renderer 007, or all of them — to synchronize effects relating to the object-based sensory data 005 with the audio data 011 and/or the video data 012, which may also include time stamp information.
[0075] According to this example, system 300 includes MS controllers 003 that are configured to communicate with a variety of actuator types using application program interfaces (APIs) or one or more similar interfaces. Generally speaking, each actuator will require a specific type of control signal to produce the desired output from the renderer.
According to this example, the MS controllers 003 are configured to map outputs from the MS renderer 001 to control signals for each actuator. For example, a Philips Hue™ light bulb receives control information in a particular format to turn the light on, with a particular saturation, brightness and hue, and a digital representation of the desired drive level. In some alternative examples, the MS renderer 001 also may be configured to implement some or all of the MS controllers 003. For example, the MS renderer 001 also may be configured to implement one or more lighting-based APIs but not haptic-based APIs, or vice versa.
[0076] In some examples, room descriptors also may describe the size and orientation of the playback environment itself, to establish a relative or absolute coordinate system to which all objects are positioned. For example, in a living room a display screen may be regarded as the front, in some instances the front and center, and the floor and ceiling may be regarded as the vertical bounds. In some such examples, the room descriptors may also indicate bounds corresponding with the left, right, front and rear walls relative to the front position. According to some examples, the room descriptor also may be provided in terms of a matrix, such as a 3x3 matrix. This room descriptor information is useful in describing the physical dimensions of the playback environment, for example in physical units of distance such as meters. In some such examples, sensory object locations, sensory object sizes, and sensory object orientations may be described in units that are relative to the room size, for example in a range from -1 to 1. Room descriptors may also describe a preferred viewing position, in some instances according to a matrix.
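As a simplified illustration of the relative coordinate convention mentioned above, the following Python sketch maps a room-relative position in the range -1 to 1 to physical meters using a hypothetical room size; the names are illustrative assumptions.

    def to_physical_meters(relative_pos, room_dimensions_m):
        """Map a room-relative position in -1..1 per axis to meters from the room center."""
        return tuple(0.5 * rel * dim for rel, dim in zip(relative_pos, room_dimensions_m))

    # Example: the front-left ceiling corner of a 5 m x 3 m x 2.4 m living room.
    print(to_physical_meters((-1.0, 1.0, 1.0), (5.0, 3.0, 2.4)))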
[0077] The types, numbers and arrangements of the actuators 008 will generally vary according to the particular implementation. In some examples, actuators 008 may include lights and/or light strips (also referred to herein as “luminaires”), vibrational motors, air flow generators, positional actuators, or combinations thereof.
[0078] Similarly, the types, numbers and arrangements of the loudspeakers 009 and the display devices 010 will generally vary according to the particular implementation. In the examples shown in Figure 3, audio data 011 and video data 012 are rendered by the audio renderer 006 and the video renderer 007 to the loudspeakers 009 and display devices 010, respectively.
[0079] As noted above, according to some implementations the system 300 may include one or more instances of the control system 110 of Figure 1 A that are configured for performing at least some of the methods disclosed herein. In some such examples, one instance of the
control system 110 may implement the content creation tool 000 and another instance of the control system 110 may implement the experience player 002. In some examples, one instance of the control system 110 may implement the audio renderer 006, the video renderer 007, the multi-sensory renderer 001, or combinations thereof. According to some examples, an instance of the control system 110 that is configured to implement the experience player 002 may also be configured to implement the audio renderer 006, the video renderer 007, the multi-sensory renderer 001, or combinations thereof.
Multi-Sensory Rendering Synchronization
[0080] Object-based MS rendering involves different modalities being rendered flexibly to the endpoint/playback environment. Endpoints have differing capabilities according to various factors, including but not limited to the following:
• The number of actuators,
• The modalities of those actuators (e.g., light fixture vs. air flow control device vs. haptic device);
• The types of those actuators (e.g., a white smart light vs. an RGB smart light, or a haptic vest vs. a haptic seat cushion); and
• The location/layout of those actuators.
[0081] In order to render object-based sensory content to any endpoint, some processing of the object signals, e.g. intensities, colors, patterns etc., will generally need to be done. The processing of each modality’s signal path should not alter the relative phase of certain features within the object signals. For example, suppose that a lightning strike is presented in both the haptics and lightscape modalities. The signal processing chain for the corresponding actuator control signals should not result in a time delay of either type of sensory object signal — haptic or light — sufficient to alter the perceived synchronization of the two modalities. The level of required synchronization may depend on various factors, such as whether the experience is interactive and what other modalities are involved in the experience. Maximum time difference values may, for example, range from approximately 10ms to 100ms, depending on the particular context.
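The following simplified Python sketch illustrates the kind of cross-modality synchronization check described above, given the end-to-end latency of each modality's signal chain; the 10 ms tolerance used here is an assumption drawn from the range mentioned above.

    def modalities_in_sync(chain_latencies_ms, tolerance_ms=10.0):
        """Return True if all modality signal chains stay within the sync tolerance.

        'chain_latencies_ms' maps a modality name to its end-to-end latency in
        milliseconds, e.g. {"light": 18.0, "haptic": 22.0}.
        """
        latencies = list(chain_latencies_ms.values())
        return (max(latencies) - min(latencies)) <= tolerance_ms

    # Example: a lightning strike rendered to both luminaires and a haptic vest.
    print(modalities_in_sync({"light": 18.0, "haptic": 22.0}))   # True
    print(modalities_in_sync({"light": 18.0, "haptic": 45.0}))   # False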
HAPTICS
Rendering of object-based haptics content
[0082] Object-based haptics content conveys sensory aspects of the scene through an abstract sensory representation rather than through a channel-based scheme only. For example, instead of defining haptics content only as a single-channel time-dependent amplitude signal that is in turn played out of a particular haptics actuator, such as a vibro-tactile motor in a vest that the user wears, object-based haptics content may be defined by the sensations that it is intended to convey. More specifically, in one example, we may have a haptic object representing a collision haptic sensory effect. Associated with this object are the following (a simplified sketch follows the list):
• The haptic object’s spatial location;
• The spatial direction/vector of the haptic effect;
• The intensity of the haptic effect;
• Haptic spatial and temporal frequency data; and
• A time-dependent amplitude signal.
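By way of illustration only, such a collision haptic object might be represented as in the following simplified Python sketch; the class and field names are illustrative assumptions.

    from dataclasses import dataclass
    from typing import Tuple, List

    @dataclass
    class HapticObject:
        """Hypothetical collision haptic effect; field names are illustrative only."""
        location: Tuple[float, float, float]      # spatial location of the effect
        direction: Tuple[float, float, float]     # direction/vector of the haptic effect
        intensity: float                          # normalized 0..1
        spatial_frequency_hz: float               # spatial frequency content
        temporal_frequency_hz: float              # temporal frequency content
        amplitude_signal: List[float]             # time-dependent amplitude samples

    # Example: a collision from behind, as in the car racing example below.
    rear_collision = HapticObject(
        location=(0.0, -1.0, 0.0), direction=(0.0, 1.0, 0.0),
        intensity=0.9, spatial_frequency_hz=5.0, temporal_frequency_hz=60.0,
        amplitude_signal=[0.0, 0.8, 0.9, 0.5, 0.2, 0.0])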
[0083] According to some examples, a haptic object of this type may be created automatically in an interactive experience such as a video game, e.g. in a car racing game when another car hits a player’s car from behind. In this example, the MS renderer will determine how to render the spatial modality of this effect to the set of haptic actuators in the endpoint. In some examples, the renderer does this according to information about the following (a simplified sketch of such device information follows the list):
• The type(s) of haptic devices available, e.g., haptic vest vs. haptic glove vs. haptic seat cushion vs. haptic controller;
• The locale of each haptic device with respect to the user(s) (some haptic devices may not be coupled to the user(s), e.g., a floor- or seat-mounted shaker);
• The type of actuation each haptic device provides, e.g. kinesthetic vs. vibro-tactile;
• The on- and off-set delay of each haptic device (in other words, how fast each haptic device can turn on and off);
• The dynamic response of each haptic device (how much the amplitude can vary);
• The time-frequency response of each haptic device (what time-frequencies the haptic device can provide);
• The spatial distribution of addressable actuators within each haptic device: for example, a haptic vest may have dozens of addressable haptics actuators distributed over the user’s torso; and
• The time-response of any haptic sensors used to render closed-loop haptic effects (e.g., an active force-feedback kinesthetic haptic device).
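By way of illustration only, the endpoint information listed above might be represented per device as in the following simplified Python sketch; the field names and example values are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class HapticDeviceDescriptor:
        """Hypothetical endpoint description of one haptic device (illustrative fields)."""
        device_type: str            # e.g. "vest", "glove", "seat_shaker"
        worn_by_user: bool          # False for floor- or seat-mounted shakers
        actuation: str              # "vibro-tactile" or "kinesthetic"
        onset_delay_ms: float       # how quickly the device starts actuating
        offset_delay_ms: float      # how quickly it stops actuating
        max_frequency_hz: float     # upper limit of its time-frequency response
        num_actuators: int          # addressable actuators within the device

    # Example descriptors a renderer might receive for a vest and a glove.
    vest = HapticDeviceDescriptor("vest", True, "vibro-tactile", 15.0, 30.0, 250.0, 32)
    glove = HapticDeviceDescriptor("glove", True, "vibro-tactile", 8.0, 12.0, 800.0, 6)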
[0084] These attributes of the haptics modality of the endpoint will inform the renderer how best to render a particular haptic effect. Consider the car crash effect example again. In this example, a player is wearing a haptic vest, a haptic arm band and haptic gloves. According to this example, a haptic shockwave effect is spatially located at the place where the car has collided into the player. The shockwave vector is dictated by the relative velocity of the player’s car and the car that has hit the player. The spatial and temporal frequency spectra of the shockwave effect are authored according to the type of material the virtual cars are intended to be made of, amongst other virtual world properties. The renderer then renders this shockwave through the set of haptics devices in the endpoint, according to the shockwave vector and the physical location of the haptics devices relative to the user.
[0085] The signals sent to each specific actuator are preferably provided so that the sensory effect is congruent across all of the (potentially heterogeneous) actuators available. For example, the renderer may not render very high frequencies to just one of the haptic actuators (e.g., the haptic arm band) due to capabilities lacking in other actuators. Otherwise, as the shockwave moves through the player’s body, because the haptic vest and haptic gloves the user is wearing do not have the capability to render such high frequencies, there would be a degradation of the haptic effect perceived by the user as the wave moves through the vest, into the arm band and finally into the gloves.
[0086] Some types of abstract haptic effects include:
• Shockwave effects, such as described above;
• Barrier effects, such as haptic effects which are used to represent spatial limitations of a virtual world, for example in a video game. If there are kinesthetic actuators on input devices (e.g., force feedback on a steering wheel or joystick), either active or resistive, then rendering of such an effect can be done through the resistive force applied to the user’s input. If no such actuators are available in the endpoint, then in some examples vibro-tactile feedback may be rendered that is congruent with the collision of the in-game avatar with a barrier;
• Presence, for example to indicate the presence of a large object approaching the scene such as a train. This type of haptic effect may be rendered using a low time-frequency rumbling of some haptic devices’ actuators. This type of haptic effect may also be rendered through contact spatial feedback applied as pressure from air cuffs;
• User interface feedback, such as clicks from a virtual button. For example, this type of haptic effect may be rendered to the closest actuator on the body of the user that performed the click, for example haptic gloves that the user is wearing. Alternatively, or additionally, this type of haptic effect may also be rendered to a shaker coupled to the chair in which the user is sitting. This type of haptic effect may, for example, be defined using time-dependent amplitude signals. However, such signals may be altered (modulated, frequency-shifted, etc.) in order to best suit the haptic device(s) that will be providing the haptic effect;
• Movement. These haptic effects are designed so that the user perceives some form of motion. These haptic effects may be rendered by an actuator that actually moves the user, e.g. a moving platform/seat. In some examples, an actuator may provide a secondary modality (via video, for example) to enhance the motion being rendered; and
• Triggered sequences. These haptic effects are characterized mainly by their time-dependent amplitude signals. Such signals may be rendered to multiple actuators and may be augmented when doing so. Such augmentations may include splitting a signal in either time or frequency across multiple actuators. Some examples may involve augmenting the signal itself so that the sum of the haptic actuator outputs does not match the original signal.
Spatial and Non-Spatial Effects
[0087] Spatial effects are those which are constructed in a way that convey some spatial information of the multi-sensory scene being rendered. For example, if the playback environment is a room, a shockwave moving through the room would be rendered differently to each haptic device given its location within the room, according to the position and size of one or more haptic objects being rendered at a particular time.
[0088] Non-spatial effects may, in some examples, target particular locations on the user regardless of the user’s location or orientation. One example is a haptic device that provides a swelling vibration on the user’s back to indicate immediate danger. Another example is a haptic device that provides a sharp vibration to indicate an injury to a particular body area.
[0089] Some effects may be non-diegetic effects. Such effects are typically associated with user interface feedback, such as a haptic sensation to indicate the user completed a level or
has clicked a button on a menu item. Non-diegetic effects may be either spatial or non-spatial.
Haptic Device Type
[0090] Receiving information regarding the different types of haptics devices available at the endpoint enables the renderer to determine what kinds of perceived effects and rendering strategies are available to it. For example, local haptics device data indicating that the user is wearing both haptic gloves and a vibro-tactile vest — or at least local haptics device data indicating that haptic gloves and a vibro-tactile vest are present in the playback environment — allows the renderer to render a congruent recoil effect across the two devices when a user shoots a gun in a virtual world. The actual actuator control signals sent to the haptic devices may be different than in the situation where only a single device is available. For example, if the user is only wearing a vest, the actuator control signals used to actuate the vest may differ with regard to the timing of the onset, the maximum amplitude, frequency and decay time of the actuator control signals, or combinations thereof.
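The following simplified Python sketch illustrates the kind of adaptation described above, mapping the same recoil effect differently depending on which haptic devices are reported at the endpoint; the device names and parameter values are illustrative assumptions.

    def plan_recoil_effect(available_devices):
        """Return per-device drive parameters for a recoil effect (illustrative values)."""
        plans = {}
        if "gloves" in available_devices and "vest" in available_devices:
            # Split the effect: sharp onset in the gloves, delayed lower-amplitude echo in the vest.
            plans["gloves"] = {"onset_ms": 0, "amplitude": 1.0, "decay_ms": 80}
            plans["vest"] = {"onset_ms": 40, "amplitude": 0.6, "decay_ms": 200}
        elif "vest" in available_devices:
            # Only a vest: carry the whole effect with a stronger, longer signal.
            plans["vest"] = {"onset_ms": 0, "amplitude": 0.9, "decay_ms": 300}
        return plans

    print(plan_recoil_effect({"vest"}))
    print(plan_recoil_effect({"gloves", "vest"}))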
Location of the Devices
[0091] Knowledge of the location of the haptics devices across the endpoint enables the renderer to render spatial effects congruently. For example, knowledge of the location of the shaker motors in a lounge enables the renderer to produce actuator control signals to each of the shaker motors in the lounge in a way that conveys spatial effects such as a shockwave propagating through the room. Additionally, knowledge of where wearable haptics devices are located, whilst implicit in their type (e.g., a glove is on the user’s hand), may also be used by the renderer to convey spatial effects in addition to non-spatial effects.
Types of Actuation Provided by Haptic Devices
[0092] Haptic devices can provide a range of different actuations and thus perceived sensations. These are typically classed in two basic categories:
1. Vibro-tactile, e.g., vibrations; or
2. Kinesthetic, e.g., resistive or active force feedback.
[0093] Either category of actuations may be static or dynamic, where dynamic effects are altered in real time according to some sensor input. Examples include a touch screen rendering a texture using a vibro-tactile actuator and a position sensor measuring the user’s finger position(s).
[0094] Moreover, the physical construction of such actuators varies widely and affects many other attributes of the device. An example of this is the onset delay or time-frequency response that varies significantly across the following haptic device types:
• Eccentric rotating mass;
• Linear resonant actuator;
• Piezoelectric actuator; and
• Linear magnetic ram.
[0095] The renderer should be configured to account for the onset delay of a particular haptics device type when rendering signals to be actuated by the haptics devices in the endpoint.
The On- and Off-Set Delays of the Haptic Devices
[0096] The onset delay of the haptic device refers to the delay between the time that an actuator control signal is sent to the device and the device’s physical response. The off-set delay refers to the delay between the time that an actuator control signal is sent to zero the output of the device and the time the device stops actuating.
The Time-Frequency Response
[0097] The time-frequency response refers to the frequency range of the signal amplitude as a function of time that the haptic device can actuate at steady state.
The Spatial-Frequency Response
[0098] The spatial-frequency response refers to the frequency range of the signal amplitude as a function of the spacing of actuators of a haptic device. Devices with closely-spaced actuators have higher spatial-frequency responses.
Dynamic Range
[0099] Dynamic range refers to the differences between the minimum and maximum amplitude of the physical actuation.
Characteristics of Sensors in Closed-Loop Haptics Devices
[0100] Some dynamic effects use sensors to update the actuation signal as a function of some observed state. The sampling frequencies, both temporal and spatial, along with the noise characteristics, will limit the capability of the control loop updating the actuator providing the dynamic effect.
AIRFLOW
[0101] Another modality that some multi-sensory immersive experiences (MSIE) may use is airflow. The airflow may, for example, be rendered congruently with one or more other modalities such as audio, video, light effects and/or haptics. Rather than only the specialized (e.g., channel-based) setups for 4D experiences in cinemas, which may include “wind effects,” some airflow effects may be provided at other endpoints that typically include airflow, such as a car or a living room. Rather than using a channel-based system, the airflow sensory effects may be represented as an airflow object that may include properties such as the following (a simplified sketch follows the list):
• Spatial location;
• Direction of the intended airflow effect;
• Intensity/airflow speed; and/or
• Air temperature.
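By way of illustration only, such an airflow object might be represented as in the following simplified Python sketch; the field names and example values are illustrative assumptions.

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class AirflowObject:
        """Hypothetical airflow object; field names are illustrative assumptions."""
        location: Tuple[float, float, float]    # spatial location of the effect
        direction: Tuple[float, float, float]   # intended airflow direction
        speed: float                            # normalized intensity/airflow speed, 0..1
        temperature_c: float                    # desired air temperature, degrees Celsius

    # Example: a bird flying past from left to right at seated head height.
    bird_flyby = AirflowObject(location=(-0.8, 0.2, 0.4),
                               direction=(1.0, 0.0, 0.0),
                               speed=0.3, temperature_c=22.0)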
[0102] An air flow object may, for example, be used to represent the movement of a bird flying past. To render to the airflow actuators at the endpoint, the MS renderer 001 may be provided with information regarding:
• The type of airflow devices e.g. fan, air conditioning, heating;
• The position of each airflow device relative to the user’s location, or relative to an expected location of the user;
• The capabilities of the airflow device, e.g., the airflow device’s ability to control direction, airflow and temperature;
• The level of control of each actuator, e.g., airflow speed, temperature range; and
• The response time of each actuator, e.g., how long does it take to reach a chosen speed.
Some Examples of Airflow Use in Different Endpoints
[0103] In a vehicle such as a car, the object-based metadata can be used to create experiences such as:
• Mimicking “chills down your spine” during a horror movie or piece of gaming content, with airflow down the chair;
• Simulating the movement of a bird flying past; and/or
• Creating a gentle breeze in a seascape.
[0104] In the small enclosed space of a typical vehicle, temperature changes may be possible to achieve over relatively shorter periods of time — as compared to temperature changes in a larger environment, such as a living room environment. In one example, the MS renderer 001 may cause an increasing air temperature as a player enters a “lava level” or other hot area during a game. Some examples may include other elements, such as confetti in the air vents to celebrate an event, such as the celebration of a goal made by the user’s favorite football team.
[0105] In a living space or other room, airflow may be synchronized to the breathing rhythm of a guided meditation in one example. In another example, airflow may be synchronized to the intensity of a workout, with increased airflow or decreased temperature as intensity increases. In some examples, there may be relatively less control over spatial aspects during rendering. This is a limitation of current installations of commercial airflow and heating technologies, which offer limited spatial resolution. For example, a given seating position in a car will tend to be serviced by only a limited number of individual fan positions, such as the footwell and dash.
Combinations of Lights, Airflow and Haptics
Car Examples
[0106] In some examples, there may be a user interface on the steering wheel or on a touchscreen near or in the dashboard. According to some examples, the following actuators may be present in the car:
1. Individually addressable lights, spatially distributed around the car as follows:
o on the dashboard;
o under the footwells;
o on the doors; and
o in the center console.
2. Individually controllable air conditioning/heating outlets distributed around the car as follows:
o In the front dashboard;
o Under the footwells;
o In the center console facing the rear seats;
o On the side pillars;
o In the seats; and
o Directed to the windscreens (for defogging).
3. Individually controllable seats with vibro-tactile haptics; and
4. Individually controllable floor mats with vibro-tactile haptics.
[0107] In this example, the modalities supported by these actuators include the following:
• Lights across the individually addressable LEDs in the car, plus the indicator lights on the dash and steering wheel;
• Air flow via the controllable air conditioning vents;
• Haptics, including:
o Steering wheel: tactile vibration feedback;
o Dash touchscreen: tactile vibration feedback and texture rendering; and
o Seats: tactile vibrations and movement.
[0108] In one example, a live music stream is being rendered to four users sitting in the front seats. In this example, the MS renderer 001 attempts to optimize the experience for multiple viewing positions. During the build-up before the artist has taken the stage and the previous acts have finished, the content contains:
• Interlude music;
• Low intensity lighting; and
• Haptic content representing the moshing of the crowd.
[0109] In addition to the rendered audio and video stream, the light content contains ambient light objects that are moving slowly around the scene. These may be rendered using one of the ambient layer methods disclosed herein, for example such that there is no spatial priority given to any user’s perspective. In some examples, the haptic content may be concentrated in the lower time-frequency spectrum and may be rendered only by the vibro-tactile motors in the floor mats.
[0110] According to this example, pyrotechnic events during the music stream correspond to multi-sensory content including:
• Light objects that spatially correspond to the location of the pyrotechnics at the event; and
• Haptic objects to reinforce the dynamism of the pyrotechnics via a shockwave effect.
[0111] In this example, the MS renderer 001 renders both the light objects and the haptic objects spatially. Light objects may, for example, be rendered in the car such that each person in the car perceives the light objects to come from the left if the pyrotechnics content is located at the left of the scene. In this example, only lights on the left of the car are actuated. Haptics may be rendered across both the seats and floor mats in a way that conveys directionality to each user individually.
[0112] At the end of the concert the pyrotechnics are present in the audio content and both pyrotechnics and confetti are present in the video content. In addition to rendering light objects and haptic objects corresponding to the pyrotechnics as above, the effect of the confetti firing may be rendered using the airflow modality. For example, the individually controllable air flow vents of the HVAC system may be pulsed.
Living Room Examples
[0113] In this implementation, in addition to an audio/visual (AV) system that includes multiple loudspeakers and a television, the following actuators and related controls are available in the living room:
• A haptics vest that the user — also referred to as a player — is wearing;
• Haptics shakers mounted to the seat in which the player is sitting;
• A (haptics) controllable smart watch;
• Smart lights spatially distributed around the room;
• A wireless controller; and
• An addressable air-flow bar (AFB), which includes an array of individually controllable fans directed to the user (similar to HVAC vents in the front dashboard of a car).
[0114] In this example, the user is playing a first-person shooter game and the game contains a scene in which a destructive hurricane moves through the level. As it does so, in-game objects are thrown around and some hit the player. Haptics objects rendered by the MS renderer 001 cause a shockwave effect to be provided through all of the haptics devices that the user can perceive. The actuator control signals sent to each device may be optimized according to the intensity of the impact of the in-game objects, the direction(s) of the impact and the capabilities and location of each actuator (as described earlier).
[0115] At a time before the user is struck by an in-game object, the multi-sensory content contains a haptic object corresponding to a non-spatial rumble; one or more airflow objects corresponding to directional airflow; and one or more light objects corresponding to lightning. The MS renderer 001 renders the non-spatial rumble to the haptics devices. The actuator control signals sent to each haptics device may be rendered such that the ensemble of actuator control signals across the haptics array is congruent in perceived onset time, intensity and frequency. In some examples, the frequency content of the actuator control signals sent to the smart watch may be low-pass filtered, so that they are congruent with the frequency-limited capability of the vest, which is proximate to the watch. The MS renderer 001 may render the one or more airflow objects to actuator control signals for the AFB such that the air flow in the room is congruent with the location and look direction of the player in the game, as well as the hurricane direction itself. Lightning may be rendered across all modalities as (1) a white flash across lights that are located in suitable locations, e.g., in or on the ceiling; and (2) an impulsive rumble in the user’s wearable haptics and seat shaker.
[0116] When the user is struck by an in-game object, a directional shockwave may be rendered to the haptics devices. In some examples, a corresponding airflow impulse may be rendered. According to some examples, a damage-taken effect, indicating the amount of damage caused to the player by being struck by the in-game object, may be rendered by the lights.
[0117] In some such examples, signals may be rendered spatially to the haptics devices such that a perceived shockwave moves across the player’s body and the room. The MS renderer 001 may provide such effects according to actuator location information indicating the haptics devices’ locations relative to one another. The MS renderer 001 may provide the shockwave vector and position according to the actuator location information in addition to actuator capability information. According to some examples, a non-directional air flow impulse may be rendered, e.g., all the air vents of the AFB may be turned up briefly to reinforce the haptic modality. In some examples, at the same time, a red vignette may be rendered to the light strip surrounding the TV, indicating to the player that the player took damage in the game.
[0118] Figure 4 shows example elements of a system for the creation and playback of multi-sensory (MS) experiences corresponding to virtual worlds. As with other figures provided herein, the types and numbers of elements shown in Figure 4 are merely provided by way of
example. Other implementations may include more, fewer and/or different types and numbers of elements. For example, other implementations may include multiple video displays 010, which are also referred to herein as display devices 010. According to some examples, system 400 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein. In some examples, system 400 may include one or more instances of the control system 110 of Figure 1 A that are configured for performing at least some of the methods disclosed herein.
[0119] In the example shown in Figure 4, as in Figure 3, the MS creation tool 000 is configured to create object-based sensory data 005 — also referred to herein as object-based MS content 005 — corresponding to the virtual world objects 14001. Virtual world objects 14001 may also be referred to herein as “in-world objects 14001.” According to some examples, the MS creation tool 000 may not be involved in the actual MS content creation process, but instead may add properties, metadata, etc., to the virtual world objects 14001. Virtual worlds are dynamic and may change based on the actions of the user, which means the object-based MS content 005 will sometimes also be dynamic. For such dynamic situations, the MS creation tool 000 may process content at run time to produce the appropriate object-based MS content 005. The MS creation tool 000 may also be configured to pre-author the object-based MS content 005, to varying degrees. The MS creation tool 000 may, in some examples, be configured to completely design the object-based MS content 005 as an authoring step, and to associate the object-based MS content 005 with a virtual world object (VWO) 14001. The object-based MS content 005 may, in some such examples, simply be passed to the MS renderer 001 at the appropriate time, as suggested by Figure 4. Alternatively, the MS creation tool 000 may indicate — for example, via a tag — that a particular VWO 14001 should be associated with a particular effect, such as a haptic effect. At run time, the actual object-based MS content 005 may be generated — for example, by the MS Content Generator 14003 shown in Figure 5 — based on that VWO 14001’s tag and any other relevant VWOs 14001 that have the same tag.
[0120] In some examples, the object-based MS content 005 may include object-based light data, object-based haptic data, object-based air flow data, object-based positional actuator data, object-based olfactory data, object-based smoke data, object-based data for one or more other types of sensory effects, or combinations thereof. According to some examples, the object-based MS content 005 may include sensory objects and corresponding sensory metadata. For example, if the object-based MS content 005 includes object-based light data,
the object-based light data may include light object position metadata, light object color metadata, light object size metadata, light object intensity metadata, light object shape metadata, light object diffusion metadata, light object gradient metadata, light object priority metadata, light object layer metadata, or combinations thereof. In some examples, the object-based MS content 005 may include time data, such as time stamp information.
[0121] According to some examples, virtual world objects (VWOs) 14001 may include, but are not limited to, the following:
The avatar of a person acting within the virtual world;
The avatars of other persons that are also acting within the virtual world, for instance in the case where the world supports multiple concurrent users;
The avatars of non-playable characters, such as those controlled by an artificial intelligence;
Interactive objects such as doors, vehicles or obstacles;
“Practical lighting” light sources that are themselves visible to the player, such as an ornate lamp that is rendered in-world such that the fixture itself can be observed but also emits light in-world. Properties of the light that may be observed include light color, light intensity and radiation pattern;
“Non-practical lighting” light sources that are not themselves visible to the player but are seen indirectly by the resultant lighting of the in-game objects and environment. Properties of the light that may be observed include light color, light intensity and radiation pattern; and
Sound sources, such as those that represent sound effects tied to other in-game objects, or dialogue spoken by in-game characters.
[0122] Figure 5 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds. As with other figures provided herein, the types and numbers of elements shown in Figure 5 are merely provided by way of example. Other implementations may include more, fewer and/or different types and numbers of elements. For example, other implementations may include multiple video displays 010, which are also referred to herein as display devices 010. According to some examples, system 500 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein. In some examples, system 500 may include one or more instances of the control system 110 of Figure 1 A that are configured for performing at least some of the methods disclosed herein.
[0123] In the example shown in Figure 5, the MS content generator 14003 is configured to generate object-based MS content 005 based at least in part on virtual world object data 510 corresponding to the virtual world objects (VWOs) 14001 and virtual world state data 505 corresponding to the virtual world state (VWS) 14002. The virtual world object data 510 may, for example, include information regarding virtual world object properties of one or more virtual world objects 14001.
[0124] According to some examples, the MS content generator 14003 may be configured to generate the object-based MS content 005 based at least in part on virtual world data, including but not necessarily limited to the virtual world object data 510. The virtual world data also may include virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof. Virtual world surface type information may, for example, indicate whether a virtual surface is a floor, a ceiling, a wall, a window, a door, a seat, a table, etc. Virtual world object instantiation rules may, for example, indicate the virtual world surface type(s), if any, on which a particular type of virtual world object may be “spawned” or replicated. For example, a tree-type virtual world object may only be allowed to spawn on an upward-facing surface.
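The following sketch illustrates, under assumed rule and field names, how a virtual world object instantiation rule such as the tree example above might be expressed and checked; it is an illustration only, not a format defined by this disclosure.

```python
# Hedged sketch of an instantiation rule: a "tree" object may only spawn on an
# upward-facing "floor"- or "terrain"-type surface. The rule format is an assumption.
SPAWN_RULES = {
    "tree": {"allowed_surface_types": {"floor", "terrain"},
             "requires_upward_facing": True},
}


def may_spawn(object_type: str, surface_type: str, surface_normal: tuple) -> bool:
    rule = SPAWN_RULES.get(object_type)
    if rule is None:
        return True  # no rule registered: unconstrained
    if surface_type not in rule["allowed_surface_types"]:
        return False
    if rule["requires_upward_facing"] and surface_normal[1] <= 0.0:
        return False  # the y component of the surface normal must point up
    return True


print(may_spawn("tree", "terrain", (0.0, 1.0, 0.0)))   # True
print(may_spawn("tree", "ceiling", (0.0, -1.0, 0.0)))  # False
```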
[0125] According to some examples, a virtual world state (VWS) 14002 may be associated with one or more of the following (an example data structure follows this list):
Player objective(s): typically a video game will feature objectives that a player works towards as they play. These objectives can change as the game progresses, as they are completed and as the game story unfolds. Often there will be a current objective that the player is actively pursuing. This is accounted for by the state of the objectives;
Room geometry: as the virtual world is navigated, the player may move through different rooms. The geometry of these rooms may vary, both in size and shape;
In-world time of day clock: many video games build in a time of day clock that leads to changes in the game. The current time of day indicated by this clock can affect many aspects, such as light levels, objectives and various events;
Weather system: some games will mimic the “real world” by incorporating a weather system that provides the in-world weather conditions. For example, it may rain in-world or it may be sunny;
In-world health of the player: often the in-game character being played will have an associated amount of health. If this health becomes depleted past a certain point, the character may die and need to re-spawn. A player’s health level will normally be indicated to the player in order to allow the player to act on this information, for example by attempting to increase the player’s health.
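The virtual world state aspects listed above could, purely for illustration, be collected into a structure such as the following; all field names and defaults are assumptions for this sketch.

```python
# A minimal, assumed representation of the virtual world state (VWS) aspects
# listed above, showing what an MS content generator might consume.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VirtualWorldState:
    current_objective: str = ""                 # player objective being pursued
    room_dimensions: Tuple[float, float, float] = (0.0, 0.0, 0.0)  # room geometry
    time_of_day_hours: float = 12.0             # in-world clock, 0.0-24.0
    weather: str = "clear"                      # e.g. "clear", "rain", "storm"
    player_health: float = 1.0                  # 1.0 = full health, 0.0 = depleted
    active_events: List[str] = field(default_factory=list)


state = VirtualWorldState(current_objective="escape the cave",
                          room_dimensions=(8.0, 3.0, 12.0),
                          time_of_day_hours=22.5, weather="storm",
                          player_health=0.35)
print(state)
```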
[0126] In some examples, the MS content generator 14003 may be configured to generate the object-based MS content 005 based at least in part on virtual world events (VWEs). VWEs are generally associated with both the VWOs 14001 and VWS 14002, and may include one or more of the following:
An entity firing a weapon. This event would involve at least one VWO 14001 (for example, the weapon) and may affect the VWS 14002 (for example, by decreasing the amount of ammunition available to the weapon);
An interactive object being actioned. For example, a virtual world object that is located by a player may be picked up, moved, destroyed, etc.;
A change in player objective. As the game progresses, the player’s objectives may be updated;
A player taking damage, for instance by falling, being struck, being exposed to fire or ice. This may result from the interaction of VWOs 14001 and may lead to a change in the VWS 14002. For example, two players (VWOs 14001) may collide (a VWE), leading to a decrease in the in-game health of both players (a change of the VWS 14002).
[0127] Accordingly, the MS content generator 14003 may be configured for analyzing information about the VWS 14002 and VWOs 14001, which may include VWEs, and for producing the object-based MS content 005 based on the results of this analysis. In some examples, the MS content generator 14003 may be configured for associating the VWOs 14001 with layers. According to some examples, the MS content generator 14003 may be configured for assigning priorities to the VWOs 14001.
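As a hypothetical illustration of the layer and priority assignment described above, the following sketch annotates VWOs using simple heuristics over virtual world events; the heuristics, field names and numeric values are assumptions only and do not represent the disclosed analysis.

```python
# Illustrative sketch: assigning a layer and a priority to VWOs based on simple
# heuristics over virtual world events, before object-based MS content is produced.
def assign_layer_and_priority(vwo: dict, events: list) -> dict:
    """Return the VWO annotated with 'layer' and 'priority' fields."""
    involved_in_event = any(vwo["object_id"] in e.get("objects", []) for e in events)
    if involved_in_event:
        layer, priority = "dynamic", 10    # event-driven objects bring change
    elif vwo.get("category") == "light_source":
        layer, priority = "ambience", 5    # background lighting sets the mood
    else:
        layer, priority = "ambience", 1
    return {**vwo, "layer": layer, "priority": priority}


events = [{"type": "weapon_fired", "objects": ["laser_01"]}]
vwos = [{"object_id": "laser_01", "category": "weapon"},
        {"object_id": "sun", "category": "light_source"}]
print([assign_layer_and_priority(v, events) for v in vwos])
```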
[0128] In some examples, the MS content generator 14003 may be configured for generating the object-based MS content 005 for both gameplay and for “cutscenes,” if the cutscenes are rendered using an engine, such as a game engine, that is providing the virtual world experience. A cutscene is generally a pre-authored audiovisual sequence, which may be associated with a game event such as a celebration, a change in level, etc. In some instances, the object-based MS content 005 may be provided by the virtual world content creator, such as the game studio, because the game engine may not be active during cutscenes as it is during user gameplay. In some such examples, the game studio may run the MS content generator 14003 and may provide the object-based MS content 005 along with the audio and video data for the virtual world cutscene presentation, e.g., in the same manner as described herein with reference to Figure 3.
[0129] In some examples, the MS content generator 14003 may be configured to inject a configuration corresponding to one or more user preferences. The user preferences may be obtained explicitly, may be learned by the MS content generator 14003 or by another instance of the control system 110, or both. A user configuration may, for example, indicate user preferences such as "no flashing lights," so that when the MS content generator 14003 is creating the object-based MS content 005, the MS content generator 14003 will ensure there are no flashing lights indicated by the object-based MS content 005. User preferences may, in some examples, be explicitly indicated by a user or learned from a user profile.
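A minimal sketch of such preference injection, assuming a hypothetical preference format and effect fields, might look as follows; it is an illustration only, not the disclosed configuration mechanism.

```python
# Hedged sketch of injecting a user-preference configuration such as
# "no flashing lights" into MS content generation.
USER_PREFERENCES = {"no_flashing_lights": True, "max_light_intensity": 0.7}


def apply_preferences(ms_objects: list, prefs: dict) -> list:
    adjusted = []
    for obj in ms_objects:
        if obj["type"] == "light":
            if prefs.get("no_flashing_lights") and obj.get("modulation") == "strobe":
                obj = {**obj, "modulation": "steady"}   # replace strobing with steady light
            cap = prefs.get("max_light_intensity", 1.0)
            obj = {**obj, "intensity": min(obj["intensity"], cap)}
        adjusted.append(obj)
    return adjusted


content = [{"type": "light", "intensity": 1.0, "modulation": "strobe"}]
print(apply_preferences(content, USER_PREFERENCES))
```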
[0130] Figure 6 shows another example of system elements configured for the creation and playback of MS experiences corresponding to virtual worlds. As with other figures provided herein, the types and numbers of elements shown in Figure 6 are merely provided by way of example. Other implementations may include more, fewer and/or different types and numbers of elements. For example, other implementations may include multiple video displays 010, which are also referred to herein as display devices 010. According to some examples, system 600 may be, or may include, one or more devices configured for performing at least some of the methods disclosed herein. In some examples, system 600 may include one or more instances of the control system 110 of Figure 1A that are configured for performing at least some of the methods disclosed herein.
[0131] In the example shown in Figure 6, the MS content generator 14003 is configured to generate object-based MS content 005 based not only on virtual world object data 510 and virtual world state data 505, but also based on audio data 011 and/or video data 012 corresponding with virtual world objects 14001, for example the audio data 011 and/or video data 012 that is used to present a virtual world that includes the virtual world objects 14001.
[0132] The MS content generator 14003 may be configured to extract relevant information from the audio data 011 in various ways, which may include one or more of the following (an illustrative sketch follows this list):
By analyzing the envelope of audio signals in the audio data 011, the MS content generator 14003 may be configured to temporally align generated MS content with the audio. For instance if the generated MS content included a light flash that was to be
associated with the firing of a weapon, or the clash of two swords, then the envelope of the audio signal from these sound events could guide the timing of the light event, giving temporal alignment;
By analyzing the frequency content of the audio data 011, the MS content generator 14003 may be configured to generate appropriate spatial frequencies or velocities in the generated MS content. For instance, frequency analysis may indicate that the audio data 011 includes predominantly low frequencies, such as the rumble of a volcano that is about to erupt. A slow spatial frequency may be appropriate for the generated MS content, perhaps increasing as the volcano erupts;
By analyzing the properties of an audio object, the MS content generator 14003 may be configured to directly read or derive spatial information that can influence the spatial properties of generated MS content associated with this audio signal;
By analyzing the audio signal with a machine learning classifier, the signal may be categorized and a semantic tag may be attached to that audio signal. The tag can then influence the MS content 005 generated from that audio signal. For instance, the classifier may determine that the audio signal is the sound of wind blowing through trees. In this case, generated MS content 005 might include airflow. Further, if the classification categorizes the signal as being a “strong” rather than a “weak” wind, this categorization could guide the strength of the airflow MS content 005.
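The sketch referenced above illustrates one possible envelope-based timing analysis; the sample rate, window size and synthetic test signal are assumptions for the sketch and are not part of this disclosure. The time returned by flash_time_seconds could then be used to time-stamp a generated light object so that it coincides with the sound event.

```python
# Sketch, for illustration only: deriving a timing cue from an audio signal's
# envelope so that a generated light flash can be aligned with a sound event
# (e.g., a weapon firing). Uses a simple rectified moving-average envelope.
import numpy as np

SAMPLE_RATE = 48_000  # assumed audio sample rate in Hz


def envelope(signal: np.ndarray, window_ms: float = 10.0) -> np.ndarray:
    window = int(SAMPLE_RATE * window_ms / 1000.0)
    kernel = np.ones(window) / window
    return np.convolve(np.abs(signal), kernel, mode="same")


def flash_time_seconds(signal: np.ndarray) -> float:
    """Return the time of the envelope peak, used to time-align a light flash."""
    env = envelope(signal)
    return float(np.argmax(env)) / SAMPLE_RATE


# Synthetic "gunshot": a short burst of noise 0.5 s into a 1 s clip.
clip = np.zeros(SAMPLE_RATE)
clip[24_000:24_480] = np.random.default_rng(0).normal(0.0, 1.0, 480)
print(f"light flash at ~{flash_time_seconds(clip):.3f} s")
```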
[0133] The MS content generator 14003 may be configured to extract relevant information from the video data 012 in various ways, which may include one or more of the following (an illustrative sketch follows this list):
Analyzing the mesh of a virtual world object 14001 as rendered in the viewport (the view presented to the user). This analysis can indicate various properties of the virtual world object 14001, for example:
o The category of the object (which may be determined by classifying the mesh), indicating that it is, for example, a truck or a lamp-post. This category property can then guide the generation of an MS effect. For instance, if the MS content generator 14003 determines that the virtual world object 14001 is a semi-trailer truck, the MS content generator 14003 may be configured to determine that a rumbling haptic effect would be appropriate;
o The scale of the virtual world object 14001, which can be understood relative to itself over time. For example, if the MS content generator 14003 determines that the virtual world object 14001 is a truck that is approaching the location of a person’s avatar in the virtual world, the MS content generator 14003 may be configured to determine that the rumbling haptic effect will increase in intensity at a rate corresponding to the truck’s rate of approach. The scale of the virtual world object 14001 can also be understood relative to the scale of other visible meshes, which may indicate the importance of this virtual world object 14001;
o The MS content generator 14003 may be configured to derive information about movement of the virtual world object 14001 by analyzing how the mesh corresponding to the virtual world object 14001 changes over time. For example, the virtual world object 14001 may be rotating at a certain rate. This rate of rotation can then guide the MS content generator 14003’s generation of an MS effect, such as the rate of a moving pattern of light in the real-world playback environment;
Analyzing the virtual world object 14001’s location as rendered in the viewport. This analysis can provide spatial information that can be used for MS effect generation. For example:
o The MS content generator 14003 may be configured to map the virtual world object 14001’s spatial location to an appropriate haptics spatial location in the real-world playback environment. For example, if the virtual world object 14001 is located at the bottom of the video, then the MS content generator 14003 may be configured to map this location to real-world haptics under the user’s feet, rather than to haptics that are in the user’s head-rest;
o The MS content generator 14003 may be configured to perform a time-based analysis that indicates that a virtual world object 14001 is moving. In some such examples, the MS content generator 14003 may be configured to analyze the trajectory of the virtual world object 14001. The speed and direction of this trajectory may then be used by the MS content generator 14003 to generate a moving lightscape effect in the real world that has a speed and direction that comports with that of the moving virtual world object 14001, to generate an air flow effect consistent with the moving virtual world object 14001, etc.;
Analyzing the color palette of a virtual world object 14001. For instance, if the MS content generator 14003 determines that the virtual world object 14001 is a blue and red car, the MS content generator 14003 may be configured to use this information to guide the generation of an MS effect, such as creating a real-world lightscape with a blue and red color scheme that is in harmony with that of the virtual world object
14001, creating a real-world lightscape that includes lights that correspond with headlights of the virtual world object 14001, etc.
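The sketch referenced above illustrates two of these video-derived mappings in simplified form; the zone names, thresholds and palette format are assumptions made for illustration, not part of the disclosed analysis.

```python
# Illustrative sketch: mapping a VWO's viewport position to a real-world haptic
# zone, and picking a lightscape color from the object's color palette.
def haptic_zone_for_viewport_position(x_norm: float, y_norm: float) -> str:
    """x_norm, y_norm are the object's viewport coordinates in [0, 1], (0, 0) = top-left."""
    if y_norm > 0.66:
        return "floor_actuators"     # object low in the frame -> haptics under the feet
    if y_norm < 0.33:
        return "headrest_actuators"  # object high in the frame -> haptics in the head-rest
    return "seat_back_actuators"


def lightscape_color_from_palette(palette: list) -> tuple:
    """Use the most prominent palette entry (weight, rgb) to color the lightscape."""
    weight, rgb = max(palette, key=lambda entry: entry[0])
    return rgb


print(haptic_zone_for_viewport_position(0.4, 0.9))              # 'floor_actuators'
print(lightscape_color_from_palette([(0.6, (0.1, 0.1, 0.9)),    # mostly blue car
                                     (0.4, (0.9, 0.1, 0.1))]))
```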
[0134] According to some examples, the MS content generator 14003 may be configured to extract or estimate one or more of the following types of information from the video data 012:
The intent of the MS effect: for example, if the MS content generator 14003 detects an aggressive virtual world object 14001 (for example, a virtual world object 14001 with a lot of sharp teeth, a virtual world object 14001 that appears to be threatening or attacking the user’s avatar or other avatars in the virtual world), the MS content generator 14003 may be configured to generate an MS effect that is intended to foster a sense of immediate danger, such as a high-frequency haptic “shaking” effect, a strobing light effect, a throbbing red lighting effect, etc.;
The pacing of the current virtual world scene: for example if the MS content generator 14003 detects that the current virtual world scene may be rapidly changing, indicating excitement and action, the MS content generator 14003 may be configured to generate an MS effect that is intended to reflect this pacing, also being rapidly changing;
The optimal, or at least appropriate, type of the MS effect: for example if the MS content generator 14003 detects that heavy virtual objects in the current virtual world scene are falling before hitting the ground and breaking apart, the MS content generator 14003 may be configured to determine that an optimal — or at least an appropriate — MS effect would be low-frequency haptics coinciding with the impacts of the heavy virtual objects;
The general color palette of the virtual world environment: for instance, if the MS content generator 14003 determines that the current virtual location is underwater, the MS content generator 14003 may be configured to provide lighting effects with a generally blue color scheme.
Context-Aware Generation of MS Content
[0135] Some implementations of the multi-sensory renderer 001 may be configured for rendering based, at least in part, on local context information. This type of rendering may be referred to herein as “context-based rendering” or as “context-aware rendering.” Alternatively, or additionally, in some examples the MS content generator 14003 may be configured for context-aware generation of MS content. The “context” may be, or may
include, local context information regarding the local playback environment, regarding one or more people in the local playback environment, etc. For example, one aspect of the local context information may be the time of day in a region that includes the local playback environment, the weather in an area that includes the local playback environment, etc. Alternatively, or additionally, the context may be, or may include, information regarding one or more people in the local playback environment, such as the apparent level of engagement with played-back content.
[0136] In some examples, information may be provided explicitly to allow the MS content generator 14003 to provide context-aware generation of MS content, to allow the multi-sensory renderer 001 to provide context-based rendering, or both. In some examples, such explicit information may be, or may include, user input for managing one or more aspects of the rendering process, the MS content creation process, or both, such as information regarding the overall immersion level and/or interactivity level. According to some examples, such explicit information may be, or may include, input from a device or system that has access to information regarding one or more aspects of the local context information, from a device or system that is configured to learn local context information by analyzing sensor data, patterns of user behavior, etc.
[0137] Managing the immersion and/or interactivity of the sensory experience can be achieved by changing the way the multi-sensory renderer 001, the MS content generator 14003, or both, manage the temporal, frequency, intensity, input, or spatial effect (e.g., color or vibration) dimensions, or combinations thereof. In some examples, the multi-sensory renderer 001 and/or the MS content generator 14003 may be configured to manage the immersion and/or interactivity of the sensory experience automatically, for example by applying a low-pass filter to one or more of those dimensions.
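As a minimal sketch of such automatic management, assuming a single intensity dimension and a hypothetical normalized immersion parameter, a simple single-pole low-pass smoother could be applied as follows; this is one possible realization, not the disclosed filter.

```python
# Minimal sketch: reducing immersion by low-pass filtering the intensity dimension
# of generated MS content, so rapid changes are softened for a less immersive mode.
def smooth_intensities(intensities: list, immersion: float) -> list:
    """immersion in [0, 1]: 1.0 passes the signal through, lower values smooth it."""
    alpha = max(0.05, immersion)   # smoothing coefficient; smaller = heavier smoothing
    smoothed, previous = [], intensities[0]
    for value in intensities:
        previous = alpha * value + (1.0 - alpha) * previous
        smoothed.append(round(previous, 3))
    return smoothed


raw = [0.1, 1.0, 0.1, 1.0, 0.1, 1.0]            # rapidly flashing content
print(smooth_intensities(raw, immersion=1.0))   # unchanged
print(smooth_intensities(raw, immersion=0.2))   # heavily smoothed, less immersive
```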
[0138] In some examples, one or more relatively more dynamic or relatively more spatial layers of sensory content may be excluded and only an ambient layer may be used when a user prefers a less immersive experience. According to some examples, one or more relatively more dynamic or relatively more spatial layers of sensory content may be reduced in intensity. These actions may be taken with regard to any combination of audio, visual and sensory experiences.
[0139] Accordingly, a context-aware MS renderer 001 and/or the MS content generator 14003 may be configured to provide an MS experience based at least in part on local context information, which may include one or more of the following:
• Local time of day. For example, if it is late at night in the current location, the object-based MS content 005 may not correspond to extremely bright light. In some examples, the MS content generator 14003 may avoid generating light content with a strong blue component during the last hour or so before a viewer’s bedtime, because this may hinder sleep. If it is daytime in the current location, the MS renderer 001 — or another device — may be configured to control automated blinds to reduce the ambient light from outdoors, depending on the content and experience;
• Local weather information as determined by an Internet weather forecast, by live internet weather observations from a nearby weather station, by local weather information supplied by a LAN, WLAN or Bluetooth-connected weather station on-site, etc. For example, if it is very sunny outside the playback environment, high-intensity lighting may be required to overcome the light leaking into a viewing room through the windows. If it is very cloudy, it may be dark in the room so less light intensity may be required;
• Observations of human behavior, which may in some examples include observations from multiple days, weeks or months. For example, based on historical information from a home’s security system it may be possible to determine that at the present time it is highly likely that only one person is present in a living room and that the person is likely to be playing a video game on a TV. The context-aware MS content generator 14003 and/or MS renderer 001 may determine that this is a time to augment the gaming experience using sensory effects that include the living room’s lighting system, because it is likely that no one else in the house is trying to do something different in the living room at the same time;
• Explicit input regarding mode switches, such as an explicit instruction — for example, received via user input from a person in the playback environment — to not augment a gaming experience because another member of the household is trying to do their homework in the same room;
• Information indicating the presence of a particular person in the playback environment. For example, a person may have indicated preferences as to one or more types of sensory experiences. In some examples, the person may be a
photosensitive or colorblind viewer for whom the lighting experience should be toned down or otherwise personalized. In some instances, the person may react badly to “jump-scares” or other content that is intended to induce fear and/or excitement, for example because the person has had heart trouble, has already had one or more heart attacks, etc. The presence of a particular viewer may be determined by a Bluetooth or Wi-Fi beacon from a phone or smart watch, by talker identification using a microphone, by face identification, etc. According to some examples, the context-aware MS content generator 14003 may be configured for one or more of the following: a. Avoiding the generation of MS content that can cause jump-scares; b. Altering the palette, e.g., the color palette, of generated MS content;
• Information — such as light sensor information — indicating ambient light from internal (within the playback environment) sources, from external (e.g., outdoor) sources, or both. For example, if a home has smart lights in the living room, but the kitchen is just to the side of the living room and has a separate light setup, the kitchen lights may interfere with the light from controllable light fixtures in the living room. In some such examples, a context-aware MS content generator 14003 and/or MS renderer 001 may be configured to adapt the lightscape provided by controllable light fixtures in the living room due to the light from the kitchen lights. For example, a context-aware MS content generator 14003 and/or MS renderer 001 may be configured to cause light fixtures near the kitchen to be relatively brighter than those farther from the kitchen in order to compensate for the kitchen lights.
Such compensatory techniques may be particularly relevant if the MS renderer 001 is mixing colors. For example, if the kitchen light is somewhat orange, but a white light was desired, the MS renderer 001 may make the side lights slightly greenish so that in a user’s peripheral vision the colors mix to white;
• Information indicating the current context of one or more people in the playback environment. For example, if the current context information indicates that a passenger is playing a video game in a car while another person is driving the car, the context-aware MS content generator 14003 and/or MS renderer 001 may cause only lower-level/less immersive sensory objects/types to be shown in order to prevent driver distraction. In another example, if the context-aware MS renderer 001 receives information — such as sensor information or user input — indicating one or
more people are playing a video game on a TV and no other nearby person is trying to do anything constructive, such as housework or homework, the context-aware MS content generator 14003 and/or the context-aware MS renderer 001 may determine that this is a time for relatively more immersive sensory content playback corresponding to the gaming content provided via the TV, whereas if another nearby person is trying to do something constructive, the context-aware MS content generator 14003 and/or the context-aware MS renderer 001 may determine that this is a time for relatively more ambient, or completely ambient, sensory content playback corresponding to the gaming content provided via the TV.
[0140] As noted elsewhere herein, the sensory object metadata may also contain information to assist the renderer in providing MS experiences for various contexts or user-selectable levels of immersion. In some examples, the context-aware MS content generator 14003 may be configured to generate sensory object metadata that includes a “mood” object type to denote that the role of the object is to set an ambience layer. In some examples, the sensory object metadata may include a “dynamic” object type that may be used to signal to the renderer that the role of the object is to bring change and movement. In different contexts, the context-aware MS content generator 14003 and/or MS renderer 001 may use this information to provide different types of sensory experiences. For example, in an “immersive” context in which one or more people want to have a completely immersive sensory experience — for example, while playing a video game in a living room of a home — the full gamut of mood and dynamic sensory objects may be used to render actuator signals for the sensory experience. If, however, an “ambient” context is selected by the user or the system, then the context-aware MS content generator 14003 may provide — or the context-aware MS renderer 001 may use — all the mood objects, but only a subset of the dynamic objects or none at all. According to some implementations, this immersion control may be continuous from “immersion off” to “fully immersive,” for example within a range from zero to ten, from zero to one hundred, etc. In some examples, different contexts may be sensed, categorized and programmed to correspond to various immersion levels. Immersion level may change the sensory objects used, the intensity or amplitude of sensory actuator playback, the number of actuators used, etc. For example, the immersion level may change the light objects used, the brightness of the lights, the number of light fixtures used, etc.
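A hypothetical sketch of this mood/dynamic selection, with assumed object fields and a normalized immersion level, is shown below; the selection heuristic is an illustration only.

```python
# Illustrative sketch of the mood/dynamic object-type distinction: "mood" objects
# are always rendered, while the proportion of "dynamic" objects used scales with
# a continuous immersion level.
def select_objects_for_immersion(sensory_objects: list, immersion: float) -> list:
    """immersion in [0.0, 1.0]; 0.0 = ambience only, 1.0 = fully immersive."""
    mood = [o for o in sensory_objects if o["object_type"] == "mood"]
    dynamic = sorted((o for o in sensory_objects if o["object_type"] == "dynamic"),
                     key=lambda o: o.get("priority", 0), reverse=True)
    keep = int(round(immersion * len(dynamic)))   # highest-priority dynamic objects first
    return mood + dynamic[:keep]


objects = [{"id": "wash", "object_type": "mood"},
           {"id": "sweep", "object_type": "dynamic", "priority": 2},
           {"id": "strobe", "object_type": "dynamic", "priority": 5}]
print([o["id"] for o in select_objects_for_immersion(objects, immersion=0.5)])
```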
Analysis of Virtual World Light-Source Objects
[0141] Within a virtual world, some light sources are considered part of the constructed world and any associated narrative. These are typically “diegetic” light sources, meaning that they occur within the context of the virtual world — for example, within the game or the story — and are able to be seen by the characters in the virtual world. One example would be sunlight, emanating from an in-world (virtual world) sun, creating the background lighting and changing as the in-world time changes. Other examples of diegetic light sources in a virtual world include lightning flashes emanating from an in-world electrical storm. Still other examples may result from the action of an in-world character, such as one played in a video game: one example of this would be flashes of light from the firing of an in-world weapon such as a laser cannon. All these light sources would normally be spatially dynamic, and therefore some implementations involve making a spatially dynamic mapping of light sources from in-world to in-room (in the real-world playback environment). In some such examples, the MS content generator 14003 may be configured to use this mapping when producing the object-based MS content 005, particularly with regard to the placement of the virtual world objects 14001.
[0142] Other graphical elements within the 3D world may exist to convey information. Such informational graphical elements will sometimes be diegetic, such as the information presented through a Head-Up Display (HUD) being worn by an in-world avatar. Other informational graphical elements may be non-diegetic, such as a health meter that indicates the in-world health of an avatar, which may affect the avatar’s progress in the virtual world. Another example of a non-diegetic informational graphical element is a progress bar that indicates the degree of progress on an in-world task. These non-diegetic informational graphical elements may be abstracted from the large-scale in-world geometry, may be overlaid on a 2-D display such as a monitor and may be unaffected by, for example, the dynamic perspective of an avatar navigating the virtual world. These non-diegetic informational graphical elements/light sources will generally be unaffected by, and will not interact with, diegetic light sources. In many examples, non-diegetic informational light sources are not spatially dynamic and will have a spatially non-dynamic mapping from in-world to in-room.
Lightscapes for Virtual Worlds
[0143] Some implementations of the MS content generator 14003 may be configured to analyze the following properties of in-world objects 14001:
The player position;
The player’s viewpoint; and
The light source:
o Type;
o Color;
o Intensity; and
o Emission pattern.
In this context, the “player” is a person for whom a virtual world experience is being provided, so the “player position” is the person’s position in the virtual world and the “player’s viewpoint” is the person’s current point of view in the virtual world.
[0144] According to some such implementations, the MS content generator 14003 may also be configured to analyze one or more aspects of the virtual world state 14002, such as the sizes, geometries and properties of the surfaces of the part of the virtual world — for example, the room — that the player is currently in.
[0145] In some examples, the MS content generator 14003 may be configured to analyze audio data, for example audio data corresponding to one or more audio objects, in order to provide timing cues for light content that the MS content generator 14003 may generate.
For example, the MS content generator 14003 may be configured to analyze audio signals of an audio object corresponding to a door slamming, a window breaking, a car crashing, a gun firing, etc. The time corresponding to the peak of the envelope of the audio signals may be used to align the light object content.
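The following sketch illustrates, under assumed geometric conventions, how an in-world light source's position and the player's viewpoint could be reduced to a relative azimuth that a renderer might later map to in-room light fixtures; it is an illustration only, not the disclosed mapping.

```python
# Hedged sketch: express an in-world light source's position relative to the
# player's viewpoint as an azimuth angle (top-down x/z plane; conventions assumed).
import math


def light_azimuth_degrees(player_pos, player_yaw_deg, light_pos) -> float:
    """Azimuth of the light source relative to the player's view direction.

    0 = straight ahead, positive = to the player's right."""
    dx = light_pos[0] - player_pos[0]
    dz = light_pos[2] - player_pos[2]
    bearing = math.degrees(math.atan2(dx, dz))       # world-space bearing to the light
    relative = (bearing - player_yaw_deg + 180.0) % 360.0 - 180.0
    return relative


player = (0.0, 1.7, 0.0)          # player position in the virtual world
light = (5.0, 3.0, 5.0)           # e.g. an in-world lamp
print(f"{light_azimuth_degrees(player, player_yaw_deg=0.0, light_pos=light):.1f} deg")
```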
Haptics for Virtual Worlds
[0146] Some implementations of the MS content generator 14003 may be configured to analyze one or more of the following properties of in-world objects 14001:
The in-world object meshes currently visible in the viewport;
The position of the in-world object meshes within the viewport; and
The speed and direction of the trajectory of the in-world objects 14001 within the viewport.
In some examples, the MS content generator 14003 may be configured to analyze one or more properties of in-world objects 14001 that are not in the viewport, but that could still affect the player, such as the location, size, speed and trajectory of one or more such in-world objects 14001.
[0147] According to some such implementations, the MS content generator 14003 may also be configured to analyze one or more aspects of the virtual world state 14002, such as the associated health level of any in-world object 14001 that has a health level.
[0148] When generating the object-based MS content, the MS content generator 14003 may be configured to generate haptics signals to coincide with the time at which one in-world object’s mesh collides with another in-world object’s mesh, e.g., if the collision involves the player and results in a decrease in the player’s health level. In some examples, the MS content generator 14003 may be configured to increase the intensity of the haptic signals as the player’s health level decreases, indicating that the player is in increasing danger. In some examples, the MS content generator 14003 may be configured to determine or estimate the spatial position, trajectory, intensity, or combinations thereof, of a collision based on an analysis of one or more properties of in-world objects 14001 and to map such information to spatially arranged haptics actuators of one or more haptics devices. For example, the MS content generator 14003 may be configured to activate haptic actuators on the left side of a haptics suit worn by the player corresponding to, and synchronized with, damage caused by an in-world object 14001 that collides with the left side of the player.
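By way of a hypothetical sketch only, a collision-driven haptic command of the kind described above might be formed as follows; the actuator names and scaling are assumptions made for illustration.

```python
# Illustrative sketch: a collision on the player's left maps to left-side actuators
# of a haptics device, and the intensity grows as the player's health decreases.
def haptic_command_for_collision(collision_side: str, impact_strength: float,
                                 player_health: float) -> dict:
    """collision_side: 'left' or 'right'; impact_strength, player_health in [0, 1]."""
    danger_boost = 1.0 + (1.0 - player_health)        # lower health -> stronger cue
    intensity = min(1.0, impact_strength * danger_boost)
    actuators = (("suit_left_arm", "suit_left_torso") if collision_side == "left"
                 else ("suit_right_arm", "suit_right_torso"))
    return {"actuators": actuators, "intensity": round(intensity, 2),
            "waveform": "impact"}


print(haptic_command_for_collision("left", impact_strength=0.5, player_health=0.3))
```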
[0149] Figure 7 is a flow diagram that outlines one example of a method that may be performed by an apparatus or system such as those disclosed herein. The blocks of method 700, like other methods described herein, are not necessarily performed in the order indicated. In some implementations, one or more of the blocks of method 700 may be performed concurrently. Moreover, some implementations of method 700 may include more or fewer blocks than shown and/or described. The blocks of method 700 may be performed by one or more devices, which may be (or may include) one or more instances of a control system, such as the control system 110 that is shown in Figure 1A and described above. For example, at least some aspects of method 700 may be performed by an instance of the control system 110 that is configured to implement the MS content generator 14003 of Figure 5 or Figure 6.
[0150] In this example, block 705 involves receiving, by a control system, virtual world data. In this example, the virtual world data includes virtual world object data corresponding to one or more virtual world objects. In some examples, at least some of the virtual world data may correspond to virtual world object properties of one or more virtual world objects. For example, at least some of the virtual world data may be, or may include, the virtual world
object data 510 that is described herein. According to some examples, the one or more virtual world objects may include one or more virtual world entities, one or more interactive non-entity virtual world objects, one or more virtual world sound sources, one or more virtual world light sources, or combinations thereof. In some examples, at least some of the virtual world data may correspond to virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof.
[0151] According to this example, block 710 involves receiving, by the control system, virtual world state data corresponding to a virtual world state. The virtual world state may be a state of a virtual world in which the one or more virtual world objects exist. According to some examples, the virtual world state may be based, at least in part, on a state of a game being played in the virtual world, on one or more player objectives, on a geometry of a virtual world environment, on a virtual world time of day, on a virtual world season, or combinations thereof. In some instances, the virtual world state data may be, or may include, the virtual world state data 505 that is described herein.
[0152] In this example, block 715 involves generating, by the control system, object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data. In some examples, the object-based MS content may be, or may include, light-based content, haptic content, air flow content or combinations thereof.
[0153] According to this example, block 720 involves providing, by the control system, the object-based MS content to an MS renderer. The MS renderer may, for example, be an instance of the MS renderer 001 that is disclosed herein. In some examples, the MS controller APIs 003 that are shown in Figure 3 may be implemented via the MS renderer 001 and actuator-specific signals may be provided to the actuators 008 by the MS renderer 001. In some alternative examples, the MS renderer 001 may provide actuator control signals 310 to the MS controller APIs 003 and the MS controller APIs 003 may provide actuator-specific control signals to the actuators 008. In some examples, method 700 also may involve providing, by the sensory actuators in the environment, the sensory effects.
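The following non-normative sketch summarizes the flow of blocks 705 through 720 with stand-in stubs for the generator and renderer; it is not the claimed method, merely an illustration of the data flow.

```python
# Non-normative sketch of method 700: receive virtual world data and state data,
# generate object-based MS content, and hand it to a renderer (both stubbed here).
def generate_ms_content(vw_object_data: list, vw_state: dict) -> list:
    """Stub for block 715: one haptic object per VWO, scaled by a state value."""
    scale = 1.0 - vw_state.get("player_health", 1.0)
    return [{"type": "haptic", "source": o["object_id"], "intensity": scale}
            for o in vw_object_data]


def render_ms_content(ms_content: list) -> None:
    """Stub for the MS renderer receiving the content in block 720."""
    for obj in ms_content:
        print("render:", obj)


def method_700(vw_object_data: list, vw_state: dict) -> None:
    # Block 705: receive virtual world data (passed in here as vw_object_data).
    # Block 710: receive virtual world state data (vw_state).
    ms_content = generate_ms_content(vw_object_data, vw_state)   # block 715
    render_ms_content(ms_content)                                # block 720


method_700([{"object_id": "truck_01"}], {"player_health": 0.4})
```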
[0154] In some examples, method 700 may involve rendering, by the MS renderer, the object-based MS content to control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays. According to some examples, the rendering may involve synchronizing
the object-based MS content with the video data, for example according to time stamps or other time-related data.
[0155] According to some examples, audio data corresponding to the virtual world may also be reproduced in the real-world environment. According to some such examples, the rendering may involve synchronizing the object-based MS content with the audio data, for example according to time stamps or other time-related data.
[0156] In some examples, method 700 may involve analyzing, by the control system and prior to the rendering, the virtual world data to determine rendering parameters corresponding to the virtual world data. According to some such examples, the rendering may be based, at least in part on the rendering parameters. In some examples, the rendering parameters may include scaling parameters. In some such examples, the scaling parameters may be based, at least in part, on maximum virtual world object parameter values.
[0157] According to some examples, method 700 may involve receiving, by the control system, virtual world event data associated with the virtual world state and with one or more virtual world objects. In some such examples, generating the object-based MS content may be based, at least in part, on the virtual world event data.
[0158] In some examples, method 700 may involve adding MS content metadata to the one or more virtual world objects. In some such examples, the MS content metadata may correspond to the object-based MS content.
[0159] According to some examples, method 700 may involve analyzing, by the control system, virtual world object content of the one or more virtual world objects. In some such examples, the virtual world object content may include virtual world audio content, virtual world video content, or combinations thereof. In some examples, generating the object-based MS content may be based, at least in part, on one or more results of the analyzing process.
[0160] In some examples, analyzing the virtual world object content may involve analyzing an envelope of a virtual world audio signal. In some such examples, method 700 may involve temporally aligning generated MS content with the virtual world audio signal. In some examples, analyzing the virtual world object content may involve analyzing a virtual world object portion of a virtual world video signal corresponding to a virtual world object, analyzing a context of the virtual world object portion, or combinations thereof. According
to some examples, method 700 may involve estimating, by the control system, a creative intent corresponding to the virtual world object. In some such examples, generating the object-based MS content may be based, at least in part, on estimated creative intent.
[0161] According to some examples, analyzing the virtual world object content may involve a virtual light source object analysis of one or more virtual world light source objects. In some such examples, generating the object-based MS content may be based, at least in part, on the virtual light source object analysis.
[0162] In some examples, method 700 may involve determining at least one of a player position or a player viewpoint. In some such examples, generating the object-based MS content may be based, at least in part, on the player position, the player viewpoint, or a combination thereof.
[0163] According to some examples, generating the object-based MS content may be based, at least in part, on a machine learning process implemented by the control system. For example, generating the object-based MS content may be performed, at least in part, by a neural network implemented by the control system.
[0164] The above description illustrates various embodiments of the present disclosure along with examples of how aspects of the present disclosure may be implemented. The above examples and embodiments should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of the present disclosure as defined by the following claims. Based on the above disclosure and the following claims, other arrangements, embodiments, implementations and equivalents will be evident to those skilled in the art and may be employed without departing from the spirit and scope of the disclosure as defined by the claims.
Claims
1. A method, comprising:
receiving, by a control system, virtual world data, the virtual world data including virtual world object data corresponding to one or more virtual world objects;
receiving, by the control system, virtual world state data corresponding to a virtual world state;
generating, by the control system, object-based multi-sensory (MS) content based, at least in part, on the virtual world object data and the virtual world state data; and
providing, by the control system, the object-based MS content to an MS renderer.
2. The method of claim 1, wherein at least some of the virtual world data corresponds to virtual world object properties of one or more virtual world objects.
3. The method of claim 1 or claim 2, wherein at least some of the virtual world data corresponds to virtual world mesh information, virtual world surface type information, virtual world surface orientation information, virtual world surface texture information, virtual world object instantiation rules, or combinations thereof.
4. The method of any one of claims 1-3, further comprising rendering, by the MS renderer, the object-based MS content to one or more control signals for one or more actuators residing in a real-world environment in which video data corresponding to a virtual world is being presented on one or more displays.
5. The method of claim 4, wherein the rendering involves synchronizing the object-based MS content with the video data.
6. The method of claim 4 or claim 5, wherein audio data corresponding to the virtual world is also being reproduced in the real-world environment.
7. The method of claim 6, wherein the rendering involves synchronizing the object-based MS content with the audio data.
8. The method of any one of claims 4-7, further comprising analyzing, by the control system and prior to the rendering, the virtual world data to determine rendering parameters
corresponding to the virtual world data, wherein the rendering is based, at least in part, on the rendering parameters.
9. The method of claim 8, wherein the rendering parameters include scaling parameters and wherein the scaling parameters are based, at least in part, on maximum virtual world object parameter values.
10. The method of any one of claims 1-9, wherein the object-based MS content comprises light-based content, haptic content, air flow content or combinations thereof.
11. The method of any one of claims 1-10, wherein the one or more virtual world objects include one or more virtual world entities, one or more interactive non-entity virtual world objects, one or more virtual world sound sources, one or more virtual world light sources, or combinations thereof.
12. The method of any one of claims 1-11, wherein the virtual world state is a state of a virtual world in which the one or more virtual world objects exist.
13. The method of claim 12, wherein the virtual world state is based, at least in part, on a state of a game being played in the virtual world, on one or more player objectives, on a geometry of a virtual world environment, on a virtual world time of day, on a virtual world season, or combinations thereof.
14. The method of any one of claims 1-13, further comprising receiving, by the control system, virtual world event data associated with the virtual world state and with one or more virtual world objects, wherein generating the object-based MS content is based, at least in part, on the virtual world event data.
15. The method of any one of claims 1-14, further comprising adding MS content metadata to the one or more virtual world objects, the MS content metadata corresponding to the object-based MS content.
16. The method of any one of claims 1-15, further comprising analyzing, by the control system, virtual world object content of the one or more virtual world objects, the virtual world object content comprising virtual world audio content, virtual world video content, or combinations thereof, wherein generating the object-based MS content is based, at least in part, on one or more results of the analyzing.
17. The method of claim 16, wherein analyzing the virtual world object content comprises analyzing an envelope of a virtual world audio signal, further comprising temporally aligning generated MS content with the virtual world audio signal.
18. The method of claim 16 or claim 17, wherein analyzing the virtual world object content comprises analyzing a virtual world object portion of a virtual world video signal corresponding to a virtual world object, analyzing a context of the virtual world object portion, or combinations thereof.
19. The method of claim 18, further comprising estimating, by the control system, a creative intent corresponding to the virtual world object, wherein generating the object-based MS content is based, at least in part, on the estimated creative intent.
20. The method of any one of claims 16-19, wherein analyzing the virtual world object content comprises a virtual light source object analysis of one or more virtual world light source objects, wherein generating the object-based MS content is based, at least in part, on the virtual light source object analysis.
21. The method of any one of claims 1-20, further comprising determining at least one of a player position or a player viewpoint, wherein generating the object-based MS content is based, at least in part, on the player position, the player viewpoint, or a combination thereof.
22. The method of any one of claims 1-21, wherein generating the object-based MS content is based, at least in part, on a machine learning process implemented by the control system.
23. The method of any one of claims 1-22, wherein generating the object-based MS content is performed by a neural network implemented by the control system.
24. An apparatus configured to perform the method of any one of claims 1-23.
25. A system configured to perform the method of any one of claims 1-23.
26. One or more non-transitory computer-readable media having instructions stored thereon for controlling one or more devices to perform the method of any one of claims 1-23.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202363514097P | 2023-07-17 | 2023-07-17 | |
US63/514,097 | 2023-07-17 | ||
US202463669234P | 2024-07-10 | 2024-07-10 | |
US63/669,234 | 2024-07-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2025019441A1 true WO2025019441A1 (en) | 2025-01-23 |
Family
ID=92258702
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2024/038077 WO2025019441A1 (en) | 2023-07-17 | 2024-07-15 | Generation of object-based multi-sensory content for virtual worlds |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2025019441A1 (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210216132A1 (en) * | 2018-11-13 | 2021-07-15 | Spark Xr, Inc. | Systems and Methods for Generating Sensory Input Associated with Virtual Objects |
US20220113801A1 (en) * | 2019-04-26 | 2022-04-14 | Hewlett-Packard Development Company, L.P. | Spatial audio and haptics |
WO2022100985A1 (en) * | 2020-11-12 | 2022-05-19 | Interdigital Ce Patent Holdings, Sas | Representation format for haptic object |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 24752221; Country of ref document: EP; Kind code of ref document: A1 |
DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) |