GB2566946A - Provision of virtual reality objects - Google Patents

Provision of virtual reality objects

Info

Publication number
GB2566946A
Authority
GB
United Kingdom
Prior art keywords
user
relevant
virtual objects
modalities
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1715608.4A
Other versions
GB201715608D0 (en)
Inventor
Tytgat Donny
Aerts Martin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to GB1715608.4A priority Critical patent/GB2566946A/en
Publication of GB201715608D0 publication Critical patent/GB201715608D0/en
Publication of GB2566946A publication Critical patent/GB2566946A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality

Abstract

A method of selecting modalities for virtual or mixed reality objects involves identifying users that are potentially relevant to a first user, filtering the potentially relevant users by policies of the first user, identifying candidate virtual objects for each relevant user, filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user, determining relevant object modalities, e.g. visual, audible or touchable modalities, for the virtual objects of relevance to the first user and providing said relevant object modalities to said first user. Identifying users may comprise use of geo-location, device triangulation and visual detection and filtering policies for relevant users include location, speed, visual detection and other sensor data. Filtering object modalities takes into account sensory information which may be useful to the first user at a current time or within a future time interval. The method may also involve determining variables of a virtual object interaction model relevant for at least some of the virtual objects of relevance to the first user. The method reduces computational load and bandwidth by reducing the amount of modal data which needs to be used.

Description

FIG. 4
At least one drawing originally filed was informal and the print reproduced here is taken from a later filed formal copy.
[Drawing sheets 1/6 to 6/6 reproduce Figures 1 to 9. The Figure 7 sheet shows the sensory flow (User Context → Estimate dynamic range → Filter modalities → Modality transformation (optional) → Encode and transmit modalities → Decode modalities → Virtual object sensory view) alongside the interaction flow (Create interaction view → Interaction view transformation (optional) → Encode and transmit interaction view → Decode view → Virtual object interaction view). The Figure 9 sheet carries reference numerals 103 and 105-110.]
Provision of Virtual Reality Objects
Field
This invention relates to virtual reality, particularly the delivery of objects to users within a virtual reality or mixed reality environment.
Background
Virtual reality (VR) is a rapidly developing area of technology in which video content is provided to a VR display system. As is known, a VR display system may be provided with a live or stored feed from a video content source, the feed representing a VR space or world for immersive output through the display system. In some embodiments, audio is provided, which may be spatial audio. A virtual space or virtual world is any computer-generated version of a space, for example a captured real world space, in which a user can be immersed through a display system such as a VR headset. A VR headset may be configured to provide VR video and audio content to the user, e.g.
through the use of a pair of video screens and headphones incorporated within the headset.
Mixed Reality (MR) is an area of technology in which real and virtual worlds are combined such that physical and digital objects co-exist and interact in real time. For example, a virtual dog may be owned by a user within the Mixed Reality environment, but other users can not only see the virtual dog, but also interact with it. In such a world, there may be a large number of users and a large number of virtual objects that co-exist. At least some communication amongst the users within the Mixed Reality environment is typically required in order to allow such interaction.
Augmented Reality (AR) refers to a real-world view that is augmented by computer-generated sensory input. In the context of the present specification, the term Mixed Reality is intended to encompass Augmented Reality.
Summary
In a first aspect, this specification describes a method comprising: identifying users that are potentially relevant to a first user; filtering the potentially relevant users by policies of the first user to identify relevant users; identifying candidate virtual objects for each relevant user; filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user; determining relevant object modalities for at least some of the virtual objects of relevance to the first user; and providing said relevant object modalities to said first user. Identifying users that are potentially relevant to the first user may comprise the use of one or more of geo-location, device triangulation and visual detection. The policies of the first user used for filtering the potentially relevant users may include one or more of location, speed, visual detection and other sensor data.
Object modalities may comprise one or more of visible, audible and touchable modalities. The object modalities may be modified for said first user.
The filtering of said object modalities may be based on a dynamic range of the first user. The dynamic range may be indicative of sensory information that is useful to the first user at the current time or within a future time interval.
The relevant object modalities of a virtual object that are provided to the first user may be selected based on context data. The context data may include a position and a viewpoint of the first user relative to the respective virtual object. The context data may include capabilities of a viewing device being used by the first user. The context data may include preferences of the first user.
The first aspect may further comprise determining variables of a virtual object interaction model relevant for at least some of the virtual objects of relevance to the first user and providing said variables of said virtual object interaction models to said first user. The virtual object interaction model may include symbolic interaction capabilities.
In a second aspect, this specification describes an apparatus configured to perform any method as described with reference to the first aspect.
In a third aspect, this specification describes computer-readable instructions which, when executed by computing apparatus, cause the computing apparatus to perform any method as described with reference to the first aspect.
In a fourth aspect, this specification describes a computer-readable medium having computer-readable code stored thereon, the computer readable code, when executed by at least one processor, causes performance of: identifying users that are potentially
relevant to a first user; filtering the potentially relevant users by policies of the first user to identify relevant users; identifying candidate virtual objects for each relevant user; filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user; determining relevant object modalities for at least some of the virtual objects of relevance to the first user; and providing said relevant object modalities to said first user.
In a fifth aspect, this specification describes an apparatus comprising: at least one processor; and at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to: identify users that are potentially relevant to a first user; filter the potentially relevant users by policies of the first user to identify relevant users; identify candidate virtual objects for each relevant user; filter the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user;
determine relevant object modalities for at least some of the virtual objects of relevance to the first user; and provide said relevant object modalities to said first user.
In a sixth aspect, the specification describes an apparatus comprising: means for identifying users that are potentially relevant to a first user; means for filtering the potentially relevant users by policies of the first user to identify relevant users; means for identifying candidate virtual objects for each relevant user; means for filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user; means for determining relevant object modalities for at least some of the virtual objects of relevance to the first user;
and means for providing said relevant object modalities to said first user.
Brief description of the drawings
Example embodiments will now be described, by way of non-limiting example, with reference to the following schematic drawings, in which:
Figure 1 is a perspective view of a VR or MR display system, useful for understanding the embodiments;
Figure 2 is a block diagram of a computer network including the Figure 1 display system, according to exemplary embodiments;
Figure 3 is a block diagram of a Mixed Reality environment in which exemplary embodiments may be used;
Figure 4 is a flow chart of an algorithm in accordance with an exemplary embodiment;
Figure 5 is a flow chart showing details of a peer discovery operation in accordance with an exemplary embodiment;
Figure 6 is a flow chart showing details of a virtual object discovery operation in accordance with an exemplary embodiment;
Figure 7 is a flow chart showing details of a provision of views of virtual objects operation in accordance with an exemplary embodiment;
Figure 8 shows exemplary views in accordance with an exemplary embodiment; and
Figure 9 is a block diagram of a system in accordance with an exemplary embodiment.
Detailed Description of Preferred Embodiments
Figure 1 is a schematic illustration of a VR or MR display system 1 which represents user-end equipment. The system 1 includes a user device in the form of a VR or MR headset 20, for displaying visual data for a VR or MR space, and a media player 10 for rendering visual data on the headset 20. Headset 20 may comprise augmented reality (AR) or mixed reality (MR) glasses, which may enable visual content, for example one or more virtual objects, to be projected or displayed on top of a see-through portion of the glasses. In some example embodiments, a separate user control (not shown) may be associated with the display system 1, e.g. a hand-held controller.
The headset 20 receives the VR or MR content data from the media player 10. The media player 10 may be part of a separate device which is connected to the headset 20 by a wired or wireless connection. For example, the media player 10 may include a games console, or a PC configured to communicate visual data to the headset 20.
Alternatively, the media player 10 may form part of the headset 20.
Here, the media player 10 may comprise a mobile phone, smartphone or tablet computer configured to play content through its display. For example, the media player 10 may be a touchscreen device having a large display over a major surface of the device, through which video content can be displayed. The media player 10 may be inserted into a holder of a headset 20. With such headsets 20, a smart phone or tablet computer may display visual data which is provided to a user’s eyes via respective lenses in the headset 20. The VR or MR display system 1 may also include hardware configured to convert the device to operate as part of display system 1. Alternatively, the media player 10 may be
implemented in software. In some example embodiments, a device comprising VR (or MR) media player software is referred to as the VR (or MR) media player 10.
The display system 1 may include means for determining the spatial position of the user and/or orientation of the user’s head. This may be by means of determining the spatial position and/or orientation of the headset 20. Over successive time frames, a measure of movement may therefore be calculated and stored. Such means may comprise part of the media player 10. Alternatively, the means may comprise part of the headset 20. For example, the headset 20 may incorporate motion tracking sensors which may include one or more of gyroscopes, accelerometers and structured light systems. These sensors may generate position data from which a current visual field-of-view (FOV) is determined and updated as the user, and so the headset 20, changes position and/or orientation. The headset 20 will typically comprise two digital screens for displaying stereoscopic video images of the virtual world in front of respective eyes of the user, and also two speakers for delivering audio, if provided from the media player 10. The example embodiments herein are not limited to a particular type of headset 20.
In some example embodiments, the display system 1 may determine the spatial position and/or orientation of the user’s head using the known six degrees of freedom (6DoF) method. As shown in Figure 1, these include measurements of pitch 22, roll 23 and yaw 24 and also translational movement in Euclidean space along side-to-side, front-to-back and up-and-down axes 25, 26, 27.
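Purely as an illustration (the class and field names below are not taken from the specification), such a 6DoF pose could be represented as a small data structure that the media player polls each frame. The sketch assumes a simple yaw/pitch convention for deriving the view direction.

```python
import math
from dataclasses import dataclass

@dataclass
class HeadPose:
    """Hypothetical 6DoF head pose: three rotations (radians) and three translations (metres)."""
    pitch: float = 0.0  # rotation 22 in Figure 1
    roll: float = 0.0   # rotation 23
    yaw: float = 0.0    # rotation 24
    x: float = 0.0      # translation along axis 25 (side-to-side)
    y: float = 0.0      # translation along axis 26 (front-to-back)
    z: float = 0.0      # translation along axis 27 (up-and-down)

    def view_direction(self):
        """Unit view vector, assuming yaw about the vertical axis and pitch about the side-to-side axis."""
        cp = math.cos(self.pitch)
        return (math.sin(self.yaw) * cp, math.cos(self.yaw) * cp, math.sin(self.pitch))

# The media player could poll such a pose each frame and re-centre the field of
# view when the headset's motion sensors report a change.
pose = HeadPose(yaw=math.radians(15.0))
print(pose.view_direction())
```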
The display system 1 may be configured to display VR or MR content data to the headset 20 based on spatial position and/or the orientation of the headset 20. A detected change in spatial position and/or orientation, i.e. a form of movement, may result in a corresponding change in the visual data to reflect a position or orientation transformation of the user with reference to the space into which the visual data is projected. This allows VR content data to be consumed with the user experiencing a 3D VR or MR environment.
Audio data may also be provided to headphones provided as part of the headset 20. The audio data may represent spatial audio source content. Spatial audio may refer to directional rendering of audio in the VR or MR space such that a detected change in the user’s spatial position or in the orientation of their head may result in a corresponding
change in the spatial audio rendering to reflect a transformation with reference to the space in which the spatial audio data is rendered.
The angular extent of the environment observable through the headset 20 is called the visual field of view (FOV). The actual FOV observed by a user depends on the interpupillary distance and on the distance between the lenses of the headset 20 and the user’s eyes, but the FOV can be considered to be approximately the same for all users of a given display device when the headset 20 is being worn by the user.
Referring to Figure 2, a first remote content provider 30 and a second remote content provider 31 may each store and transmit streaming VR/MR content data for output to the headset 20. Responsive to receive or download requests sent by the media player 10, the content providers 30 and 31 stream the VR/MR data over a data network 33, which may be any network, for example an IP network such as the Internet.
The representation and ownership of physical items in the physical world is, in general, a straightforward matter. When someone carries a physical object, this can generally be seen by other people within the space. There is one instance of the item and any manipulations made to that item are persistent and consistent to all users.
In a virtual or augmented world, seeing an item does not necessarily mean that other users you are interacting with can see the same item. One needs to explicitly transfer the item to the relevant interface(s) of the other user(s) in order for the item to become visible. Further, in order to allow other users to interact with or manipulate a particular virtual object, permissions to allow this interaction may be granted. With large numbers of users and objects, scalability, bandwidth, transfer time and security issues may make such transfers undesirable.
Figure 3 is a block diagram of a Virtual Reality or Mixed Reality environment, indicated generally by the reference numeral 40. The environment 40 includes a first user 41, a second user 42 and a third user 43, each including a headset similar to the headset 20 described above. The first user 41 has a first virtual object A1 and a second virtual object A2 that may be shared within the environment 40. The second user 42 has a first virtual object B1 and a second virtual object B2 that may be shared within the environment 40. The third user 43 has a first virtual object C1 and a second virtual object C2 that may be shared within the environment 40.
The environment 40 can, for example, range from small scale environments such as a house, up to large scale environments such as an office, a train station, a complete city or even larger environments. A user mapping made for a particular environment is not typically user specific. Rather, data from many sensors and users may be aggregated to create an overall view of an environment that is as accurate as possible. The gathered information is not restricted to location information. One can also, for example, gather information such as the current means of transportation of a user as this might influence the selection process described in further detail below. For example, Mixed Reality users that are passing through an environment in a car might be handled differently from Mixed Reality users that are stationary, or moving on foot. In some embodiments, a user may not be physically present in environment 40 and may therefore be visible to other users only when the other users wear the headset 20 (or similar device).
By way of example, assume that the first user 41 within the environment 40 is stationary, the second user 42 has a virtual dog B1 and the third user 43 has a virtual dog C1. Assume also that the second user 42 is walking through the environment and that the third user 43 is travelling in a fast-moving car. In this scenario, the first user 41 will typically be more interested in viewing the virtual dog B1 of the second user 42 than the virtual dog C1 of the third user 43.
Figure 4 is a flow chart showing an algorithm, indicated generally by the reference numeral 50, for delivering information to a first user regarding virtual objects in accordance with an exemplary embodiment.
The algorithm 50 starts at operation 52 where a peer discovery operation is carried out. The peer discovery operation identifies other potentially relevant users that are near (in a physical or virtual environment) to the first user. By way of example, a peer discovery operation conducted by the first user 41 in the mixed reality environment 40 may find the second user 42 and the third user 43 as potentially relevant users.
Next, at operation 54, the algorithm 50 performs a virtual object discovery operation in which relevant virtual objects owned by the users identified at operation 52 are identified. Thus, in the example above, the operation 54 may identify the virtual objects B1, B2, C1 and C2.
Finally, at operation 56, the algorithm 50 provides views of the virtual objects identified in operation 54 to the first user (such as the first user 41 in the example above).
The operations 52, 54 and 56 are described in more detail below with reference to Figures 5, 6 and 7 respectively.
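By way of a non-limiting illustration, the three operations of the algorithm 50 can be thought of as a simple pipeline. The Python sketch below assumes hypothetical dictionary shapes and policy callables (none of these names come from the specification) and is not a definitive implementation of the claimed method.

```python
# Sketch of algorithm 50: peer discovery (operation 52), virtual object
# discovery (operation 54) and provision of views (operation 56).

def discover_peers(first_user, environment):
    """Operation 52: nearby users filtered by the first user's policies."""
    nearby = [u for u in environment["users"] if u is not first_user]
    return [u for u in nearby if first_user["peer_policy"](u)]

def discover_virtual_objects(first_user, peers):
    """Operation 54: candidate objects of each peer, filtered by both sides' policies."""
    objects = []
    for peer in peers:
        for obj in peer["objects"]:
            if peer["share_policy"](obj, first_user) and first_user["object_policy"](obj):
                objects.append(obj)
    return objects

def provide_views(first_user, objects):
    """Operation 56: return the relevant modalities of each relevant object."""
    return {obj["id"]: obj["modalities"] for obj in objects}

# Example loosely mirroring the environment 40 of Figure 3.
user_41 = {"peer_policy": lambda u: u["speed_mps"] < 3.0,
           "object_policy": lambda o: True}
user_42 = {"speed_mps": 1.2,
           "objects": [{"id": "B1", "modalities": ["visual", "audio"]}],
           "share_policy": lambda o, viewer: True}
user_43 = {"speed_mps": 20.0,
           "objects": [{"id": "C1", "modalities": ["visual", "audio"]}],
           "share_policy": lambda o, viewer: True}
environment_40 = {"users": [user_41, user_42, user_43]}

peers = discover_peers(user_41, environment_40)
objects = discover_virtual_objects(user_41, peers)
print(provide_views(user_41, objects))  # {'B1': ['visual', 'audio']}
```

In this toy data, the stationary user only ends up with the dog B1 of the walking user 42, mirroring the scenario described with reference to Figure 3.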
Figure 5 is a flow chart showing more details of the peer discovery operation 52 in accordance with an exemplary embodiment. As noted above, the peer discovery operation 52 identifies other users that are near to the first user.
As shown in Figure 5, the peer discovery operation 52 starts at operation 61 where a query is made to identify other users that are near to the first user (e.g. in a physical or a virtual environment).
In one implementation, a user mapping will already have taken place for the relevant environment in which the query 61 is made. The operation 61 may be implemented by interrogating the data that makes up this user mapping. The user mapping may be derived from one or more sensors that provide location data. These sensors may include one or more of the following (although the invention is not limited to the use of the listed sensors and is not limited to location data):
• Geo-location sensors. The physical location of a user may be used to determine their absolute and relative locations. The sensors used may include global positioning system (GPS), assisted GPS (A-GPS), indoor GPS solutions etc. While a singular GPS measurement might not always give the accuracy that is required, it can give a good baseline for other methods to start from. Further, a consensus may be constructed based on multiple available geo-sensors.
• Device triangulation. A particular VR or MR device may be able to sense other devices that are nearby and use triangulation to determine the relative location of these other VR or MR device(s). The relative location information can be combined with other techniques that provide absolute location (e.g. geo-sensors) in order to create a more reliable map.
• Visual detection. As VR and MR devices typically include embedded cameras, these cameras may be used to identify users and/or the VR or MR devices of other users.
• Prior information. Additional information may be used to aid consensus regarding the different sensors. Zones with a low probability of users can, for example, be indicated in the system. For example, it may be reasonable to assume that there are no users located outside of a building on the third floor. Other examples of prior information might be that the density of users is unlikely to be higher than a certain number per square metre. Prior information can also include statistical information for detecting user activity, such as walking, cycling, running, driving etc.
If a user is not physically located in environment 40, the peer discovery operation 52 may identify the other potentially relevant users based on an indication of a virtual location of the remotely-located user in environment 40. For example, the query for nearby users 61 may include sending a request for and/or receiving a message including virtual location data of a remotely located user and/or virtual objects associated with the remotely located user.
Once the mapping of an environment has been made, a first user can query the mapping information (in operation 61) in order to identify other users within the vicinity of the first user.
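One possible sketch of the query of operation 61 is given below, under the assumption that the user mapping simply stores weighted position estimates per user; the function names and weights are invented for illustration and are not the patent's implementation.

```python
# Location estimates from several sensors (geo-location, device triangulation,
# visual detection) are aggregated into a consensus position per user, and the
# first user then asks for everyone within a radius.
import math

def consensus_position(estimates):
    """Weighted average of (x, y, weight) estimates from different sensors."""
    total = sum(w for _, _, w in estimates)
    x = sum(px * w for px, _, w in estimates) / total
    y = sum(py * w for _, py, w in estimates) / total
    return x, y

def nearby_users(first_pos, user_map, radius_m):
    """Operation 61 (sketch): users whose consensus position is within the radius."""
    result = []
    for user_id, estimates in user_map.items():
        ux, uy = consensus_position(estimates)
        if math.hypot(ux - first_pos[0], uy - first_pos[1]) <= radius_m:
            result.append(user_id)
    return result

# A lone GPS fix gets a lower weight than triangulation against a nearby device.
user_map = {
    "user42": [(10.0, 2.0, 0.3), (9.5, 2.2, 0.7)],   # GPS + triangulation
    "user43": [(250.0, 40.0, 1.0)],                   # GPS only
}
print(nearby_users((8.0, 1.0), user_map, radius_m=50.0))  # ['user42']
```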
With the nearby users identified in operation 61, the peer discovery operation 52 moves to operation 62 where the nearby users are filtered in some way in order to provide a list of relevant users (as defined by a given set of rules, such as policies of either the first user or of another user). Thus, the relevant users need not only be defined in terms of location. By way of example, the first user might prefer not to see virtual objects from users who are driving a car when the first user is not in that car. Further, policies might be in place to prevent certain officials (e.g. police officers) from sharing virtual objects when they are working. Policies might also take into account the social context of each of the users. For example, different policies for handling virtual objects might be in place for particular groups of people, for example for friends, family, colleagues or strangers. Clearly, the filtering operation 62 is highly flexible and can include many additional policies not described herein.
The identification (operation 61) and filtering (operation 62) of nearby users provide an indication of VR- or MR-enabled users that meet the policies of the first user. These users are referred to as “peers” and are included in a peer subscription store (see operation 63 of the peer discovery operation 52). The peer subscription store 63 is
typically updated by subscribing to newly identified users and/or unsubscribing from users who no longer meet the requirements of operations 61 and 62 (e.g. they have moved too far away from the first user).
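A minimal sketch of how such a subscription store might be kept current is shown below; the set-difference approach and the function name are assumptions rather than the patent's implementation.

```python
# Operation 63 (sketch): subscribe to users that now pass operations 61 and 62,
# unsubscribe from those that no longer do.

def update_peer_subscriptions(store, relevant_users):
    """Return the updated store plus the users to subscribe to / unsubscribe from."""
    current = set(relevant_users)
    to_subscribe = current - store
    to_unsubscribe = store - current
    return current, to_subscribe, to_unsubscribe

store = {"user42", "user44"}
# user44 has moved too far away; user43 has just become relevant.
store, added, removed = update_peer_subscriptions(store, ["user42", "user43"])
print(sorted(added), sorted(removed))  # ['user43'] ['user44']
```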
Figure 6 provides further details of the virtual object discovery operation 54 in accordance with an exemplary embodiment. As described above, the peer discovery operation 52 identifies VR- or MR-enabled users who might have virtual objects of relevance to the first user. The virtual object discovery operation 54 seeks to identify those virtual objects.
The virtual object discovery operation 54 starts at operation 71 where virtual objects of the users within the peer subscription store (updated in operation 63) are queried.
The virtual objects identified in the query operation 71 are filtered (in filtering virtual objects operation 72) to identify virtual objects with which the relevant user should subscribe or which should be provided to the relevant user. The filtering may be carried out on the basis of policies of the owner of the virtual object and/or on the basis of policies of the user seeking access to the virtual object.
For example, assume that the first user 41 has carried out the peer discovery operation and identified the second user 42 and the third user 43. The query operation 71 will identify the virtual objects B1, B2, C1 and C2. The second user 42 may have policies that restrict access to one or more of the virtual objects B1 and B2. Similarly, the third user 43 may have policies that restrict access to one or more of the virtual objects C1 and C2. Furthermore, the first user may have policies that restrict access to one or more of the virtual objects B1, B2, C1 and C2. For example, a policy associated with a user may restrict access to his/her virtual objects based on an identity of the user seeking access to the virtual object. Other example policies may restrict the access based on a time of day, a type of device seeking the access, or a password. Yet another example relates to the type of virtual object that is being shared. For children, for instance, one could restrict access to virtual objects that expose features that are not suitable for children.
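The example policies above could, for instance, be composed as simple predicate functions evaluated during the filtering operation 72. The following sketch is illustrative only; the field names (blocked_viewers, visible_hours and so on) are hypothetical.

```python
# An owner-side policy (identity, time of day, child suitability) combined with
# a viewer-side policy, as one possible realisation of operation 72.
from datetime import time

def owner_allows(obj, viewer, now):
    if viewer["id"] in obj.get("blocked_viewers", set()):
        return False
    start, end = obj.get("visible_hours", (time(0, 0), time(23, 59)))
    if not (start <= now <= end):
        return False
    if viewer.get("is_child") and obj.get("adult_content"):
        return False
    return True

def viewer_allows(obj, viewer):
    return obj.get("owner") not in viewer.get("muted_owners", set())

obj_B1 = {"id": "B1", "owner": "user42", "visible_hours": (time(8, 0), time(22, 0))}
viewer_41 = {"id": "user41", "is_child": False, "muted_owners": set()}
print(owner_allows(obj_B1, viewer_41, now=time(20, 30)) and viewer_allows(obj_B1, viewer_41))  # True
```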
At operation 73, the object subscription store of the relevant user (the first user 41 in the example above) is updated on the basis of the identified and filtered virtual objects.
The object subscription store identifies the virtual objects that the relevant user has access to.
With the relevant virtual objects identified (at operation 54), the algorithm 50 moves to operation 56 where views of the relevant virtual objects are provided to the relevant user in accordance with an exemplary embodiment.
Delivering virtual objects by downloading them on each of the relevant VR- or MR-devices and instantiating them accordingly has a number of disadvantages. When displaying virtual objects from users that are passing by, for instance, it may not be efficient to first completely download these objects and then only show those objects for a few seconds. The delay that is introduced by first downloading and instantiating might also not be acceptable in some use cases. The VR- or MR-device restrictions should also be taken into account as it can be expected that these devices will evolve towards lightweight untethered devices that facilitate near-transparent human-machine interface paradigms. Instantiating virtual objects and keeping them in memory is expected to be an issue, especially considering the envisioned mixed reality where virtual objects are ubiquitous and viewing them is often short-term. Privacy and ownership are also potential issues with full object transfer towards VR or MR devices.
Figure 7 provides further details of the provide views of virtual objects operation 56 of the algorithm 50 (generally referred to below as the view operation 56) in accordance with an exemplary embodiment. The view provided to the user may be based on one or both of two different models: a virtual object sensory model and a virtual object interaction model. The use of each of these models is described in further detail below. However, in essence, the “sensory model” includes data that specifies how to stimulate a user’s senses to convey the presence of the virtual object. The sensory model can include visual information, audio information, haptics (touch) etc. These data are collectively referred to herein as modalities. In contrast, the virtual object interaction model is a computational model defining how a user can interact with the virtual object.
Consider first the sensory model. Instead of transferring the complete model describing the virtual object, data (i.e. modalities) relevant to the MR user concerned may be streamed. It is also possible that only a subset of data belonging to a particular modality is provided to the user. The subset of data may be determined such that the
virtual object may be sufficiently displayed to the user, taking into account context of the user, as described below.
The selection of which data to stream may be based on a user context description that is received from the user. The context description (which can be continuously updated as it can change over time) can include (but is not limited to):
• The relative position and/or viewpoint of the user compared with the virtual object. This can restrict many of the modalities (data) that need to be sent to the MR user. For example, visual information may not need to be sent for self-occluded portions of the object. Further, haptic information may not be needed when the user is too far away to be able to touch the object (such data can be sent if the user moves closer).
• Spatial context. The environment in which the user is seeing the virtual object can influence the way that object will be perceived. When there is a wall in between the user and the virtual object, for instance, visual information may not need to be sent (although there may be a need to send audio data).
• VR or MR device characteristics. The devices that are used to consume the virtual objects are unlikely to be identical. Some VR and MR devices may have better resolution; others may have more colours. Also, computational capacity can be different between devices. The data that is sent to a VR or MR device can thus be adapted to these characteristics. A high-resolution VR or MR device can use high-resolution textures, for example, and a device with low computational capacity might benefit from a different type of representation.
• User preferences. The user might have certain preferences with regards to how the virtual objects are displayed on his/her device and this might have consequences on how the view of the virtual object is constructed. A user could, for example, prefer that virtual objects from people known to the user be of higher quality (e.g. having higher resolution rendering) than those from people unknown to the user.
The user contexts (such as those outlined above) are collectively used to estimate the so-called dynamic range of the view. In terms of the sensory model, the dynamic range may be the sensory information that is useful to the user at the current time, adapted to take into account the system latency. The dynamic range may refer to data or data modalities expected to be useful in near future, for example within a time interval corresponding to the system latency.
As shown in Figure 7, the user context 80 described above is received as an input and used to estimate the dynamic range (operation 81 of the view operation 56). For example, if the user context indicates that the user is substantially stationary and not associated with a vehicle, the dynamic range may include data that enables the rendering of virtual objects in locations within a short distance from the user’s current location. Taking into account the spatial behaviour of the user, one may furthermore restrict portions of virtual objects from being sent because these are not expected to be seen in the near future because they are self-occluded (e.g. no need to send the back of a person when (s)he is facing the user).
The dynamic range may be used to filter the modalities that need to be sent to the user in order to provide the required sensory information for display on the VR or MR device (operation 82 of the view operation 56). The modalities are, of course, a subset of the available modalities and the smaller this subset can be made, the lower the data transfer requirements.
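Under strong simplifying assumptions (the dynamic range reduced to a single reachable distance, and fixed per-modality useful ranges), operations 81 and 82 might look like the following sketch; the numbers and names are invented for illustration.

```python
# Sketch of dynamic range estimation (operation 81) and modality filtering
# (operation 82) for a single virtual object.

def estimate_dynamic_range(user_context, latency_s):
    """Operation 81 (sketch): how far the user can plausibly be from the object soon."""
    reach = user_context["distance_to_object_m"] - user_context["speed_mps"] * latency_s
    return max(reach, 0.0)

def filter_modalities(available, dynamic_range_m):
    """Operation 82 (sketch): keep only modalities useful at the estimated range."""
    useful_range = {"visual": 50.0, "audio": 20.0, "haptic": 1.0}
    return [m for m in available if dynamic_range_m <= useful_range.get(m, 0.0)]

context = {"distance_to_object_m": 12.0, "speed_mps": 1.4}  # walking towards the object
print(filter_modalities(["visual", "audio", "haptic"],
                        estimate_dynamic_range(context, latency_s=0.5)))
# ['visual', 'audio'] – haptics only become relevant once the user is within reach
```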
Once operation 82 is complete, the view operation 56 either moves to the optional modality transformation operation 90 (discussed further below) or an encode and transmit modalities operation 83. At operation 83, the filtered modalities are encoded and transmitted to the user device (e.g. over a network, such as the Internet). The transmitted modalities are then decoded at the user device (operation 84) and included in the view presented to the user (operation 85).
Sending the user context 80 to a virtual object provider, creating the proper views and delivering them to the user takes time. As such, the data that is received by the user might already be outdated. When, for example, only sending the visual information from the point of view of the user, there might be an issue when the VR or MR device exhibits fast movement. The user might need visual information that is not available at that point (because it was occluded in the previous viewpoint). This is illustrated in Figure 8 where a non-zero latency is assumed and where no dynamic range estimation is performed. The user is therefore assumed to remain static up to time instance t0, after which the user moves to the location shown at time t1. It can be seen that when the user moves, there is a blind spot of data that is not available to be shown. This is one reason why the dynamic range estimation may be used as it can be used to analyse the typical behaviour pattern of the user(s) and already send data that is likely to be needed
in the near future. Note that although this issue has been described with reference to visual data, the same issues arise with other sensory data.
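One way to picture this, as a rough sketch rather than the patent's method, is to widen the streamed sector by the head rotation predicted over one round-trip; the angular figures below are assumptions for illustration.

```python
# Instead of streaming only the sector visible right now, the provider widens
# the sector by the angular movement predicted over the system latency, so the
# blind spot of Figure 8 is covered while the updated view is still in flight.

def sector_to_stream(current_yaw_deg, yaw_rate_deg_s, fov_deg, latency_s):
    """Return the (min, max) yaw covered, widened for predicted head rotation."""
    predicted_shift = yaw_rate_deg_s * latency_s
    lo = current_yaw_deg - fov_deg / 2 - max(-predicted_shift, 0.0)
    hi = current_yaw_deg + fov_deg / 2 + max(predicted_shift, 0.0)
    return lo, hi

# Static user at t0: stream exactly the 90-degree field of view.
print(sector_to_stream(0.0, yaw_rate_deg_s=0.0, fov_deg=90.0, latency_s=0.2))   # (-45.0, 45.0)
# User turning right at 60 deg/s: pre-send an extra 12 degrees so no blind spot
# appears at t1.
print(sector_to_stream(0.0, yaw_rate_deg_s=60.0, fov_deg=90.0, latency_s=0.2))  # (-45.0, 57.0)
```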
As described above, the virtual object sensory model represents the information that is fed to each of the senses of the user. A virtual model can, however, have interaction capabilities that are not captured by the virtual object sensory model. These interaction capabilities may be represented by a virtual object interaction model and represent a computational model about how a user can interact with the virtual object. Such a model can for example include interaction volumes in 3D space around the virtual object where certain gestures from the user are translated into action triggers for the virtual object. Higher level interactions are also possible with optional local reasoning. When a certain sequence of actions is required from the user in order to trigger a reaction in the virtual object, for instance, the model can contain this sequence of actions and the triggers to invoke these. By optionally enabling local reasoning, one can keep track of the sequence of the actions locally and only trigger the response when the appropriate actions have been executed. As such, one may limit the amount of network interactions between the user and the virtual object host.
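A hypothetical sketch of such an interaction model, with an interaction volume and a locally tracked action sequence, is given below; the class, gesture names and the single trigger message are invented for illustration.

```python
# Gestures are only accepted inside an interaction volume around the object,
# the required sequence is tracked on the device, and the host is contacted
# only when the whole sequence has been performed.

class InteractionModel:
    def __init__(self, centre, radius_m, required_sequence):
        self.centre = centre
        self.radius_m = radius_m
        self.required = list(required_sequence)
        self.progress = 0

    def observe(self, gesture, user_position):
        """Feed one local gesture; return an action trigger when the sequence completes."""
        dx = user_position[0] - self.centre[0]
        dy = user_position[1] - self.centre[1]
        if (dx * dx + dy * dy) ** 0.5 > self.radius_m:
            return None  # outside the interaction volume (2D here for brevity)
        if gesture == self.required[self.progress]:
            self.progress += 1
            if self.progress == len(self.required):
                self.progress = 0
                return "trigger_reaction"  # the only message sent to the object host
        else:
            self.progress = 0
        return None

dog = InteractionModel(centre=(0.0, 0.0), radius_m=1.5,
                       required_sequence=["crouch", "scratch_head"])
print(dog.observe("crouch", (0.5, 0.5)), dog.observe("scratch_head", (0.5, 0.5)))
# None trigger_reaction
```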
As shown in Figure 7, on the basis of the dynamic range estimated in operation 81 and an interaction model for the virtual object, an interaction view may be generated at operation 86. Once operation 86 is complete, the view operation 56 either moves to the optional interaction view transformation operation 91 (discussed further below) or the encode and transmit interaction view operation 87. At operation 87, the interaction view may be transmitted to the user device. The transmitted interaction view may then be decoded at the user device (operation 88) and may then be made available to the user in a virtual object interaction view (operation 89).
In the description of the use of object modalities in the view operation 56 described above, it is assumed that there is a direct mapping between the filtered modalities and what is perceived by the user. This is not required in all embodiments of the invention.
Not all users will perceive modalities in the same manner. For example, someone who is colour blind does not perceive visual stimuli in the same way as someone who is not. In the view operation 56, an optional modality transformation operation 90 may be provided, which modality transformation block can be used to modify filtered modalities in order to best fit that specific user.
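As a deliberately crude sketch of operation 90 (a real system would use a proper daltonisation or contrast step; the profile keys and palette mapping here are invented), the transformation might remap the visual modality for a red-green colour-blind user:

```python
# Before encoding, the filtered visual modality is remapped for a user whose
# profile indicates red-green colour blindness.

def transform_modalities(modalities, user_profile):
    transformed = dict(modalities)
    if user_profile.get("colour_vision") == "deuteranopia" and "visual" in transformed:
        remap = {"red": "orange-yellow", "green": "blue"}
        visual = dict(transformed["visual"])
        visual["palette"] = [remap.get(c, c) for c in visual["palette"]]
        transformed["visual"] = visual
    return transformed

modalities = {"visual": {"palette": ["red", "green", "grey"]}, "audio": {"volume": 0.8}}
print(transform_modalities(modalities, {"colour_vision": "deuteranopia"})["visual"]["palette"])
# ['orange-yellow', 'blue', 'grey']
```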
A similar indirection is possible in the virtual object interaction, by using the optional interaction view transformation operation 91 to translate one interaction by a user into a different action within the virtual object interaction model. One example of providing such a transformation is to add semantic tags to the exposed interactions for at least some of the virtual objects. Similarly, semantic tags may also be added to the interactions that a certain user has learned or configured. The transformation 91 may then attempt to map the exposed interactions from the virtual object to the known interactions from the user. This may be done by using a pre-configured semantic tree, for instance, that allows for generalization or specification of the semantic tags until a match is found. If one of the exposed interactions of a virtual object is tagged with a “confirmation” tag, for instance, the interactions of the user may be queried for the best matching semantic tag. A “thumbs up” tag might be associated with a certain gesture, for instance, which, when traversing the semantic tree, could match the “confirmation” tag and one could thus use thumbs up to trigger confirmation in the virtual object, even if the virtual object had configured a different type of gesture. As a higher level example, a user might interact with a virtual dog by scratching the head of the dog. A user might prefer to configure a certain gesture (e.g. moving a hand whilst looking at the virtual dog) to have the same effect as scratching the head of the dog. This would have the advantage that the user would not need to bend down in order to interact with the virtual dog. The transformation of one action (moving a hand whilst looking at the virtual dog) into another (scratching the virtual dog’s head) is handled within the optional interaction view transformation operation 91.
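A small sketch of this tag-matching idea is shown below; the semantic tree content and function names are assumptions rather than anything defined in the specification.

```python
# The user's configured gestures and the object's exposed interactions both
# carry tags; tags are generalised up a small pre-configured semantic tree
# until a match is found (operation 91, sketch).

SEMANTIC_PARENT = {                 # child tag -> more general tag
    "thumbs_up": "confirmation",
    "nod": "confirmation",
    "wave_at_object": "scratch_head",   # the user's configured shortcut for petting the dog
    "scratch_head": "touch",
}

def generalise(tag):
    """Yield the tag followed by its ancestors in the semantic tree."""
    while tag is not None:
        yield tag
        tag = SEMANTIC_PARENT.get(tag)

def map_gesture(user_gesture_tag, exposed_interaction_tags):
    """Return the exposed interaction whose tag matches a generalisation of the gesture."""
    for candidate in generalise(user_gesture_tag):
        if candidate in exposed_interaction_tags:
            return candidate
    return None

# A thumbs-up gesture generalises to "confirmation" and therefore triggers it.
print(map_gesture("thumbs_up", {"confirmation"}))        # confirmation
# Moving a hand whilst looking at the dog maps onto scratching its head.
print(map_gesture("wave_at_object", {"scratch_head"}))   # scratch_head
```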
Another example of the use of the interaction view transformation is the potential for harmonization of interaction with objects of different owners. For example, each user may have their own set of interaction primitives that he/she likes to use for particular actions. When interacting with an object owned by a different user, the interaction view transformation operation may be used to enable each user to use their own interaction paradigms (instead of those defined by the object owner).
Figure 9 is a schematic diagram showing a system, indicated generally by the reference numeral 100, that may be used to implement the various embodiments described herein. The system 100 comprises a first user 101, a second user 102, a server 103 and a network 104 (e.g. an IP network, such as the Internet). The first user 101 includes a controller 105, a memory 106 and a network interface 107. Similarly, the second user
102 includes a network interface 108, a memory 109 and a controller 110. Finally, the server 103 includes a controller 111, a memory 112 and a network interface 113.
The network interfaces 107, 108 and 113 may take any suitable form for providing connection to the network 104 and may, for example, be modems (which may be wired or wireless). Even though embodiments have been described using a centralized server hosting virtual objects as an example, it is appreciated that virtual objects could be alternatively or additionally hosted by a storage associated with at least one user, for example a VR headset of the user. A first user could, for example, retrieve one or more virtual objects from one or more other users through the respective network interfaces, for example based on a Uniform Resource Identifier (URI) that identifies the virtual object.
The memories 106, 109 and 112 may be non-volatile memories such as read only memory (ROM), a hard disk drive (HDD) or a solid state drive (SSD). The memories 106, 109 and 112 store, amongst other things, an operating system and may store software applications. In addition to (or instead of) non-volatile memory, the memories may include RAM used by the respective controllers for the temporary storage of data. An operating system may contain code which, when executed by the respective controller in conjunction with the RAM memory, controls operation of each of the hardware components.
The controllers 105, 110 and 111 may take any suitable form. For instance, they may be microcontrollers, plural microcontrollers, processors, plural processors, or combinations thereof. The controllers may be used to implement the various algorithms described herein.
Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic. The software, application logic and/or hardware may reside on memory, or any computer media. In an example embodiment, the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media. In the context of this document, a “memory” or “computer-readable medium” may be any non-transitory media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
Reference to, where relevant, “computer-readable storage medium”, “computer program product”, “tangibly embodied computer program” etc., or a “processor” or “processing circuitry” etc. should be understood to encompass not only computers having differing architectures such as single/multi-processor architectures and sequencers/parallel architectures, but also specialised circuits such as field programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other devices. References to computer program, instructions, code etc. should be understood to express software for a programmable processor or firmware, such as the programmable content of a hardware device, whether as instructions for a processor or as configuration settings for a fixed function device, gate array, programmable logic device, etc.
As used in this application, the term “circuitry” refers to all of the following: (a) hardware-only circuit implementations (such as implementations in only analogue and/or digital circuitry) and (b) to combinations of circuits and software (and/or firmware), such as (as applicable): (i) to a combination of processor(s) or (ii) to portions of processor(s)/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a server, to perform various functions) and (c) to circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
If desired, the different functions discussed herein may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the above-described functions may be optional or may be combined. Similarly, it will also be appreciated that the flow diagrams of Figures 4 to 7 are examples only and that various operations depicted therein may be omitted, reordered and/or combined.
By way of example, one or more (or all) of the following operations may be omitted from some embodiments of the invention: identifying users that are potentially relevant to a first user; filtering the potentially relevant users by policies of the first user to identify relevant users; identifying candidate virtual objects for each relevant user; and filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user.
In one exemplary aspect of the invention, a method may comprise: determining
relevant object modalities for at least some (e.g. all) of the virtual objects of relevance to a first user; and/or determining variables of a virtual object interaction model relevant for at least some (e.g. all) of the virtual objects of relevance to the first user.
It will be appreciated that the above described example embodiments are purely illustrative and are not limiting on the scope of the invention. Other variations and modifications will be apparent to persons skilled in the art upon reading the present specification.
Moreover, the disclosure of the present application should be understood to include any novel features or any novel combination of features either explicitly or implicitly disclosed herein or any generalization thereof and during the prosecution of the present application or of any application derived therefrom, new claims may be formulated to cover any such features and/or combination of such features.
Although various aspects of the invention are set out in the independent claims, other aspects of the invention comprise other combinations of features from the described embodiments and/or the dependent claims with the features of the independent claims, and not solely the combinations explicitly set out in the claims.
It is also noted herein that while the above describes various examples, these descriptions should not be viewed in a limiting sense. Rather, there are several variations and modifications which may be made without departing from the scope of the present invention as defined in the appended claims.

Claims (18)

Claims:
1. A method comprising: identifying users that are potentially relevant to a first user; filtering the potentially relevant users by policies of the first user to identify relevant users; identifying candidate virtual objects for each relevant user; filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user; determining relevant object modalities for at least some of the virtual objects of relevance to the first user; and providing said relevant object modalities to said first user.
2. A method as claimed in claim 1, wherein identifying users that are potentially relevant to the first user comprises the use of one or more of geo-location, device triangulation and visual detection.
3. A method as claimed in claim 1 or claim 2, wherein the policies of the first user used for filtering the potentially relevant users include one or more of location, speed, visual detection and other sensor data.
4. A method as claimed in any preceding claim, wherein object modalities comprise one or more of visible, audible and touchable modalities.
5. A method as claimed in any preceding claim, further comprising filtering said object modalities based on a dynamic range of the first user.
6. A method as claimed in claim 5, wherein the dynamic range is indicative of sensory information that is useful to the first user at the current time or within a future time interval.
7. A method as claimed in any preceding claim, wherein said object modalities are modified for said first user.
8. A method as claimed in any preceding claim, wherein the relevant object modalities of a virtual object that are provided to the first user are selected based on context data.
9. A method as claimed in claim 8, wherein said context data includes a position and/or a viewpoint of the first user relative to the respective virtual object.
10. A method as claimed in claim 8 or claim 9, wherein said context data includes capabilities of a viewing device being used by the first user.
11. A method as claimed in any one of claims 8 to 10, wherein said context data includes preferences of the first user.
12. A method as claimed in any preceding claim, further comprising determining variables of a virtual object interaction model relevant for at least some of the virtual objects of relevance to the first user and providing said variables of said virtual object interaction models to said first user.
13. A method as claimed in claim 12, wherein said virtual object interaction model includes symbolic interaction capabilities.
14. Apparatus configured to perform the method of any preceding claim.
15. Computer-readable instructions which, when executed by computing apparatus, cause the computing apparatus to perform a method according to any one of claims 1 to 13.
16. A computer-readable medium having computer-readable code stored thereon, the computer readable code, when executed by at least one processor, causes performance of: identifying users that are potentially relevant to a first user; filtering the potentially relevant users by policies of the first user to identify relevant users; identifying candidate virtual objects for each relevant user; filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user; determining relevant object modalities for at least some of the virtual objects of relevance to the first user; and providing said relevant object modalities to said first user.
17. Apparatus comprising: at least one processor; and at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to: identify users that are potentially relevant to a first user; filter the potentially relevant users by policies of the first user to identify relevant users; identify candidate virtual objects for each relevant user; filter the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user; determine relevant object modalities for at least some of the virtual objects of relevance to the first user; and provide said relevant object modalities to said first user.
18. Apparatus comprising: means for identifying users that are potentially relevant to a first user; means for filtering the potentially relevant users by policies of the first user to identify relevant users; means for identifying candidate virtual objects for each relevant user; means for filtering the candidate virtual objects by policies of the first user and/or the respective relevant user to identify virtual objects of relevance to the first user; means for determining relevant object modalities for at least some of the virtual objects of relevance to the first user; and means for providing said relevant object modalities to said first user.
GB1715608.4A 2017-09-27 2017-09-27 Provision of virtual reality objects Withdrawn GB2566946A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1715608.4A GB2566946A (en) 2017-09-27 2017-09-27 Provision of virtual reality objects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1715608.4A GB2566946A (en) 2017-09-27 2017-09-27 Provision of virtual reality objects

Publications (2)

Publication Number Publication Date
GB201715608D0 GB201715608D0 (en) 2017-11-08
GB2566946A true GB2566946A (en) 2019-04-03

Family

ID=60244491

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1715608.4A Withdrawn GB2566946A (en) 2017-09-27 2017-09-27 Provision of virtual reality objects

Country Status (1)

Country Link
GB (1) GB2566946A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013028908A1 (en) * 2011-08-24 2013-02-28 Microsoft Corporation Touch and social cues as inputs into a computer
US20140368537A1 (en) * 2013-06-18 2014-12-18 Tom G. Salter Shared and private holographic objects
US20160225189A1 (en) * 2015-02-04 2016-08-04 Seiko Epson Corporation Head mounted display, information processing apparatus, image display apparatus, image display system, method for sharing display of head mounted display, and computer program
WO2017027183A1 (en) * 2015-08-07 2017-02-16 Microsoft Technology Licensing, Llc Mixed reality social interaction

Also Published As

Publication number Publication date
GB201715608D0 (en) 2017-11-08


Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)